
I'm new to Puppeteer. I used to use PhantomJS and CasperJS, but while setting up a newer server (FreeBSD 12) I found out that support for PhantomJS is gone and CasperJS gives me segmentation faults.

I was able to port my applications to Puppeteer just fine, but ran into a problem: when I capture data from a table, the output comes back incomplete or truncated.

I need all the info from a table but always end up getting less.

I have tried smaller tables, but the output still comes out truncated. I don't know whether the console.log buffer can be extended, or if there is a better way to get the values of all the tds in the table.

const data = await page.$$eval('table.dtaTbl tr td', tds =>
    tds.map(td => td.innerHTML)
);

console.log(data); 

I should be able to get all rows, but instead I get this:

[ 'SF xx/xxxx 3-3999 06-01-16',
'Sample text - POLE',
  '',

 /* tons of other rows (removed by me in this example) <- */

  '',

 /* end of output */ ... 86 more items ]

I need the other 86 items, because I have PHP pick the output up from stdout as the script executes.

ggorlen
John Ralston

4 Answers


Why console.log does not work

Under the hood, console.log uses util.inspect, which produces output intended for debugging. To keep that debugging output readable, the function truncates values that would be too long. To quote the docs:

The util.inspect() method returns a string representation of object that is intended for debugging. The output of util.inspect may change at any time and should not be depended upon programmatically.


Solution: Use process.stdout

If you want to write output to stdout you can use process.stdout which is a writable stream. It will not modify/truncate what you write on the stream. You can use it like this:

process.stdout.write(JSON.stringify(data) + '\n');

I added a line break at the end, as the function will not produce a line break itself (in contrast to console.log). If your script does not rely on it you can simply remove it.

Thomas Dondorf
  • @ThomasDondorf @JohnRalston FWIW, this `console.log()` behavior concerns direct object output; strings should not be truncated, so `console.log(JSON.stringify(data))` can suffice. – vsemozhebuty Apr 02 '19 at 19:04
  • @vsemozhetbyt Yeah, that should work too. But I would rather write to the stream directly, instead of relying on a function that says "should not be depended upon programmatically". Or is there any advantage to using `console.log`? – Thomas Dondorf Apr 02 '19 at 19:39
  • @ThomasDondorf I suppose there is no advantage except simplicity. – vsemozhebuty Apr 02 '19 at 20:13
  • Uncaught ReferenceError: process is not defined. Why? Please guide me. – cSharma Sep 03 '19 at 06:21
  • @cSharma Are you in the browser environment? This question was about Node.js. You can try `console.log` instead of `process.stdout.write` in the browser. – Thomas Dondorf Sep 03 '19 at 16:03
  • If you want to keep the colors that `console.log()` provides while controlling how your data is truncated, you can use `process.stdout.write(util.inspect(mydata, {colors: true, depth: 5, maxArrayLength: 20}))` – Cédric Jan 05 '20 at 08:48
  • It's still truncated for me in Node 15 with `process.stdout.write` – chovy Oct 31 '20 at 01:15
  • I tried your solution in a React Native app, but it does not work. Do you have an alternative? – Contestosis Feb 11 '22 at 13:13

You can also use

console.log(JSON.stringify(data, null, 4)); 

instead of

process.stdout.write(JSON.stringify(data) + '\n');
Waqqas Sharif

I know the question is from a couple of years ago, but this is an issue I've seen time and time again. Discovering (through this thread) the underlying util.inspect call helped me overcome it in the following way:

process.stdout.write(`${util.inspect(data, { maxArrayLength: 1000 })}\n`)

By default, maxArrayLength is 100, which is why the data is truncated for longer arrays.

Duncan

Do you absolutely have to use stdout? It's not recommended for this kind of handoff, because it's easy for stdout to overrun the buffer or produce incomplete output, as the problem you've seen illustrates.

Why not modify the PHP script to read from a file as a stream using the readfile function, and write to that stream from your JS code using fs?

  • Thanks for your reply. When I started with PhantomJS I used to write to a file and have PHP parse it afterwards, like you described; later I combined both into one step. I will revert back to that. I was hoping there was a way of printing the entire innerHTML while the script is running. Thanks again – John Ralston Apr 02 '19 at 04:19