pdf2table
pdf2table is a Node.js library that attempts to extract tables from a PDF.
The ‘tables’ are extracted as an array of rows.
It uses pdf2json to extract the PDF data.
Install
You can install pdf2table using the Node Package Manager (npm):
    npm install pdf2table
Simple example
    var pdf2table = require('pdf2table');
    var fs = require('fs');

    fs.readFile('./test.pdf', function (err, buffer) {
        if (err) return console.log(err);

        pdf2table.parse(buffer, function (err, rows, rowsdebug) {
            if (err) return console.log(err);

            console.log(rows);
        });
    });
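Once the rows are available, a common next step is writing them out as CSV. The following is a minimal sketch of that, not part of pdf2table itself: the helpers escapeCsvCell and rowsToCsv are hypothetical names, and the code assumes each row is an array of cell values. sampleRows is made-up data standing in for the rows passed to the parse callback.

```javascript
// Hypothetical helpers, not part of pdf2table: turn parsed rows into CSV text.
// Assumption: `rows` is an array of rows, and each row is an array of cell values.

function escapeCsvCell(cell) {
  var text = String(cell);
  // Quote cells containing a comma, double quote, or line break,
  // and double any embedded double quotes.
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

function rowsToCsv(rows) {
  return rows
    .map(function (row) {
      return row.map(escapeCsvCell).join(',');
    })
    .join('\n');
}

// Made-up sample data standing in for the `rows` argument of the parse callback.
var sampleRows = [
  ['Name', 'Amount'],
  ['Widget, large', '12.50'],
  ['Gadget "Pro"', '7']
];

console.log(rowsToCsv(sampleRows));
```

Inside the parse callback, console.log(rows) could be replaced with something like fs.writeFile('./test.csv', rowsToCsv(rows), ...), provided the rows really do have the shape assumed here.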