I'm running a simple BigQuery query over my dataset, which is about 84 GB of log data.

The query takes approximately 110 seconds to complete. Is this normal for a dataset of this size?

anataliocs
aloo

1 Answer

After further investigation, it looks like your table was heavily fragmented. We usually have a coalesce process running to prevent this situation, but it had been off for a couple of weeks while we were verifying a bug fix. I've restarted the coalescer and run it against your table. Please let me know if you continue to see poor performance.

As a best practice, you may be better off importing somewhat less frequently in larger chunks, or splitting your data into time-based tables. BigQuery isn't really designed to handle high-volume small imports to the same table.
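To make the two suggestions above concrete, here is a minimal sketch in plain Python that plans loads instead of performing them. Everything in it is an assumption for illustration: the helper names `table_for` and `plan_loads`, the `log_YYYYMMDD` daily naming scheme, the `timestamp` field holding epoch milliseconds, and the chunk size. It does not call any BigQuery API; each planned chunk would be handed to whatever load mechanism you already use.

```python
from collections import defaultdict
from datetime import datetime, timezone


def table_for(ts_ms, prefix="log_"):
    """Map an epoch-millisecond timestamp to a daily table name.

    The prefix and the YYYYMMDD suffix are hypothetical naming choices.
    """
    day = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    return prefix + day.strftime("%Y%m%d")


def plan_loads(rows, max_rows_per_load=100_000):
    """Group rows by daily table, then cut each group into large chunks.

    Returns a list of (table_name, chunk) pairs, one pair per load job.
    Assumes each row is a dict with a "timestamp" key in epoch milliseconds.
    """
    by_table = defaultdict(list)
    for row in rows:
        by_table[table_for(row["timestamp"])].append(row)

    jobs = []
    for table in sorted(by_table):
        group = by_table[table]
        # Few large chunks per table rather than many tiny imports.
        for start in range(0, len(group), max_rows_per_load):
            jobs.append((table, group[start:start + max_rows_per_load]))
    return jobs
```

Buffering rows and calling something like `plan_loads(buffered_rows)` once per hour or per day keeps the number of imports per table small, and queries that only need recent data can then target just the relevant daily tables.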

Jordan Tigani
  • Project id: 326440123436 and the query was just a simple: `SELECT timestamp FROM [streaklogsdataset.log_faaf98_00000001353024000000_00000001355616000000] order by timestamp desc LIMIT 1;` – aloo Dec 06 '12 at 22:35
  • Jordan - any luck investigating the issue? – aloo Dec 09 '12 at 20:38
  • Sorry, that was a contrived query example. Here's a more realistic one: `SELECT errorType, errorTrace, uid, timestamp FROM [streaklogsdataset.log_faaf98_00000001353024000000_00000001355616000000] where httpStatus >= 500 order by uid asc, timestamp desc limit 500;` – aloo Dec 11 '12 at 23:01
  • OK, I just looked at your table. It is in 18k fragments. Our coalescer that compacts tables has been paused for a couple of weeks and only recently restarted. Suggestion: run a table copy job to copy it to a new table and use that, or do an export to JSON and re-import. (We're restarting the coalescer, but it may be 24 hours or so before it catches up.) – Jordan Tigani Dec 11 '12 at 23:23
  • OK your table has been fixed. Let me know if you continue to see performance problems with the table. – Jordan Tigani Dec 12 '12 at 01:42
  • The query ran but wasn't exactly fast - it took 82.1 seconds to run. – aloo Dec 12 '12 at 03:26
  • Jordan, hi. Is your comment `BigQuery isn't really designed to handle high-volume small imports to the same table` still valid in 2014, with the insertAll functionality that we have now? – Max Aug 06 '14 at 06:58