
I'm developing a web application with a lot of events to track. I'll install the application on about 10 web servers, and I need the tracked events saved in a database so they can be analyzed.

I'll receive 100K events/minute = 144 million events a day. An event row is (event type, user id, object id, context id, session id, timestamp).

I'm thinking of storing them in a MyISAM table and then, every day, renaming the table according to the date (i.e. log20090826, log20090827 and so on). Do you have other or better ideas? I can use another RDBMS if it performs better.
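To make the daily rename concrete, here is a minimal sketch that only builds the SQL text for one rotation. It assumes the live table is called `log` and uses a scratch table `log_new`; both names are hypothetical. It relies on MySQL's `CREATE TABLE ... LIKE` and on `RENAME TABLE` accepting several renames in one statement, so inserts never find the live table missing.

```python
from datetime import date


def rotation_statements(day: date, live: str = "log") -> list[str]:
    """Build the SQL that archives `live` under a dated name.

    The table names are assumptions for illustration only.
    """
    archive = f"{live}{day:%Y%m%d}"   # e.g. log20090826
    fresh = f"{live}_new"             # empty copy that becomes the new live table
    return [
        # Copy the table definition (columns, indexes, engine) without the rows.
        f"CREATE TABLE {fresh} LIKE {live}",
        # Swap both names in a single statement.
        f"RENAME TABLE {live} TO {archive}, {fresh} TO {live}",
    ]


for stmt in rotation_statements(date(2009, 8, 26)):
    print(stmt + ";")
```

The statements would be run once a day (for example from a scheduled job) by whatever client the application already uses.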

Another question: is there any way to know the timestamp of a row without storing it explicitly (so that it uses no space)?

thanks, Andrea

2 Answers


For such a large volume of data I would recommend writing text log files and parsing them. In 6 months you'll have about 26,000 million (26 billion) records, and I'm sure you won't be able to analyze them using SQL.
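As a rough illustration of the log-file approach, the sketch below streams lines and counts events per type without loading the whole file. The tab-separated layout and the field order (taken from the question's row description) are assumptions; the sample values are made up.

```python
import csv
from collections import Counter
from typing import Iterable

# Assumed field order, following the row described in the question.
FIELDS = ("event_type", "user_id", "object_id", "context_id", "session_id", "timestamp")


def count_by_event_type(lines: Iterable[str]) -> Counter:
    """Count events per type from tab-separated log lines, one line at a time."""
    counts = Counter()
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != len(FIELDS):
            continue  # skip malformed or truncated lines
        counts[row[0]] += 1
    return counts


# Made-up sample lines; in practice `lines` would be an open log file.
sample = [
    "click\t42\t7\t1\tabc\t1251280800",
    "view\t42\t8\t1\tabc\t1251280805",
    "click\t43\t7\t2\tdef\t1251280809",
]
print(count_by_event_type(sample))
```

Because the function accepts any iterable of lines, passing an open file handle processes the log in constant memory.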

Also, if this is important, you could try logging the data in BigTable or Hadoop. These non-SQL databases will do the job quite well given your data model.

Here is an article that may help you.

sorin
  • Thanks. I've never installed Hadoop: is it simple? Can it cooperate with .NET applications? What type of hardware is needed? –  Aug 28 '09 at 21:59

MyISAM is OK, just make sure you use the "INSERT DELAYED" prepared query, so the server relaxes a bit :)
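For reference, a small sketch of what such a statement could look like. The table name `log` and the column names are hypothetical, and the `%s` placeholders assume a Python MySQL driver that uses that parameter style. Note that `INSERT DELAYED` was deprecated in later MySQL releases, so this fits servers from around the time of this answer.

```python
# Hypothetical column names; the question only lists the logical fields.
COLUMNS = ("event_type", "user_id", "object_id", "context_id", "session_id", "ts")


def insert_delayed_sql(table: str = "log") -> str:
    """Build a parameterized INSERT DELAYED statement for the event row."""
    placeholders = ", ".join(["%s"] * len(COLUMNS))
    column_list = ", ".join(COLUMNS)
    return f"INSERT DELAYED INTO {table} ({column_list}) VALUES ({placeholders})"


print(insert_delayed_sql())
```

The resulting string would be passed to the driver together with a tuple of six values, so the values themselves are never spliced into the SQL text.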

kolypto
  • Thanks for your answer. What about the possibility of not inserting the timestamp? Andrea –  Aug 27 '09 at 13:32
  • A TIMESTAMP field is auto-filled with the current timestamp when you insert NULL into it :) See the manual, that's handy – kolypto Aug 28 '09 at 18:30
  • A row _MUST_ contain a timestamp, else there's no way to know it. If you don't need an exact timestamp - you may approximate it: if you have 1000 log entries for one day, then the first record is somewhere near 00:00 and the last is 23:59. Others are in the middle, linearly. No other methods to save space, sorry :) You can use compressed tables to save space, but this will slow down operations on that table. – kolypto Aug 28 '09 at 18:33
  • thanks. I think I should try some approximation technique. Andrea –  Aug 28 '09 at 19:37
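
The linear approximation suggested in the comments above can be sketched as follows. It assumes rows are stored in arrival order within one day's table and that events arrive at a roughly constant rate, which real traffic usually does not; the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta


def approximate_timestamp(row_index: int, row_count: int, day_start: datetime) -> datetime:
    """Estimate a row's timestamp from its position in one day's table.

    row_index is zero-based; the first row maps to 00:00:00 and the
    last row to 23:59:59, with the others spread linearly in between.
    """
    if row_count < 1 or not 0 <= row_index < row_count:
        raise ValueError("row_index must satisfy 0 <= row_index < row_count")
    if row_count == 1:
        return day_start
    last_second = 24 * 60 * 60 - 1  # 23:59:59 expressed in seconds
    offset = last_second * row_index / (row_count - 1)
    return day_start + timedelta(seconds=offset)


day = datetime(2009, 8, 26)
print(approximate_timestamp(0, 1000, day))    # first row of the day
print(approximate_timestamp(999, 1000, day))  # last row of the day
```

Storing one real timestamp every N rows and interpolating between those anchors would tighten the estimate at a small space cost.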