
This is a controversial issue for sure, since many might want to recommend the simple things right away: split up the tables! Split reads and writes into a master/slave configuration! Add more RAM to the server! And so forth... but let me explain the issue first:

I have a somewhat powerful server: 8GHz, 160GB storage, 8GB RAM (16GB Flexi RAM), RAID 10, 16GB Flexi-SSD, running MySQL, PHP, and Apache on Debian.

My current database consists of about 16 tables, where one in particular contains 1.7GB of information, with 23 million rows (indexed).

I run a service that requires daily, sometimes hourly, scans of data I receive from a third party, which produces anywhere from 100 new rows per minute up to (rarely) about 5,000 rows per minute. The data is fetched by crawlers pulling it from an API; these crawlers run automated, on a schedule, and sometimes ad hoc, so they are write-heavy against the master.

When people use the site, they get refreshed queries showing their latest analytics data, so when a lot of people are logged in this is extremely read-heavy (I have worked through the slow queries and reduced everything with indexes where I could). I produce these analytics on the fly from the DB (they are at most 24 hours old), and a single user's view can sum up to 5 million records. I don't think it would make sense to pre-render these queries, since I'd somehow also have to account for all the slicing/filtering in the pre-rendered HTML files... right? Or do people do this?
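If people do pre-aggregate instead of pre-rendering HTML, I imagine it looks roughly like the rollup table below, where slicing/filtering would then run against the much smaller summary table (all names are made up for illustration, not my real schema):

```sql
-- Hypothetical rollup: keep per-user, per-hour sums in a small summary table
-- and rebuild only the last 24 hours on a schedule, instead of summing up to
-- 5 million raw rows per user on every page view. Names are placeholders.
CREATE TABLE analytics_hourly (
    user_id     INT UNSIGNED NOT NULL,
    bucket_hour DATETIME     NOT NULL,
    metric_sum  BIGINT       NOT NULL,
    PRIMARY KEY (user_id, bucket_hour)
);

REPLACE INTO analytics_hourly (user_id, bucket_hour, metric_sum)
SELECT user_id,
       DATE_FORMAT(created_at, '%Y-%m-%d %H:00:00'),
       SUM(metric_value)
FROM   raw_events
WHERE  created_at >= NOW() - INTERVAL 24 HOUR
GROUP  BY user_id, DATE_FORMAT(created_at, '%Y-%m-%d %H:00:00');
```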

Now, sometimes I get a warning on my phone, log in to the server, and find that MySQL is down. I run a mysqlcheck and repair, which takes two hours or longer and finally finishes with a working database. I start everything up again and all is happy. I never find out why this happens; mostly it happens when a blog writes about the site and people go nuts and hit it with sign-ups, but there is no detailed log showing where it crashed or why it went down.

Other than rate-limiting the sign-up process (a waiting line), is there anything I can do to make sure that whatever happens, MySQL does NOT crash? Can I run some sort of auto-repair and optimize on a live instance hourly or so? I'd assume that blocks all access to the tables, which would be terrible?
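For reference, what I run after a crash today is basically a full repair, which locks the table for the whole run; if there is a lighter check worth scheduling instead, I assume it would look something like the second statement below (the table name is just a placeholder):

```sql
-- What I effectively run after a crash today; this locks the table while it runs:
REPAIR TABLE analytics_raw;

-- A lighter check I could presumably schedule instead; QUICK skips scanning
-- the rows for incorrect links, so it is faster but less thorough:
CHECK TABLE analytics_raw QUICK;
```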

I am really overwhelmed by this. I have split up reads and writes and could theoretically move all read-only users onto slave servers on EC2 instances. But then I have the problem that usage spikes up and down dramatically, and whenever I need a new EC2 instance I have to transfer up to 2GB of data to sync the slave database, which never works through the mysql-bin log if I shut down an EC2 instance and boot it up again after a pause of several days (presumably because the binlog has already rotated away by then).
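For context, when the binlog is already gone, re-seeding a slave means a full dump plus re-pointing replication, roughly like this; host, user, log file and position below are placeholder values you would normally take from the dump's header:

```sql
-- Run on the new EC2 slave after restoring a fresh dump of the master.
-- The coordinates below are placeholders; a dump taken with --master-data
-- records the real ones in its header.
STOP SLAVE;
CHANGE MASTER TO
    MASTER_HOST     = 'master.example.com',
    MASTER_USER     = 'repl',
    MASTER_PASSWORD = '...',
    MASTER_LOG_FILE = 'mysql-bin.000123',
    MASTER_LOG_POS  = 4;
START SLAVE;
```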

I have been able to keep up well until now, but even with EC2 and other technology at hand, I am now at the limit of my understanding and technical ability.

I'd love to share ALL the information required to make this a useful thread / document for later. Since not every website is a youtube/youporn/instagram/tumblr type of environment, I feel there's too little information out there for my type of site (high write/read, 500 to 5M records per user, at 3,000-10,000 users).

Thanks everyone, ask away and I will provide more information. I would love to hear your best-practices.

James Stone
  • Please post your `my.cnf` somewhere. What storage engine do you use, MyISAM or InnoDB? – Alex Aug 21 '12 at 12:25
  • MyISAM, my.cnf is here (is there a better text paste service?) http://pastebin.com/6nnKrsqT – James Stone Aug 21 '12 at 12:30
  • Is there anything useful in the MySQL error log and slow log? Basically I would recommend that you convert everything to InnoDB (roughly the `ALTER TABLE` sketch below these comments), but this implies a certain downtime, which can be quite high (it should be comparable to the full check time). But if you use native MySQL full-text search the conversion won't be possible, because InnoDB does not support FTS. – Alex Aug 21 '12 at 12:37
  • I might give this a try, but it's really hard to convert and then just wait for it to crash again... Is there a reason why MySQL crashes? There is enough disk space; disks are not the issue here at all. It must be pure overload, i.e. one query running while others are bombarding the server to process more, or something like that... – James Stone Aug 21 '12 at 12:51
  • I never use MyISAM in production especially for large setups because it's non-transactional and requires the recheck and repair procedure almost every time it crashes so I am not a big expert here. But there should be something relevant in the MySQL error log or maybe even in `/var/log/syslog`, many distributions log MySQL events right there. – Alex Aug 21 '12 at 13:01
  • So if InnoDB crashes, it doesn't need a repair? Or does it never crash? Interesting to see your comments here, this is very helpful. I have never been able to trace down the exact reason why the database crashed; I have the point in time, but never a reason... I'll try to make copies of all the logs next time and keep them for review here. Thanks so far. I'm going to see if others want to chime in as well... but the InnoDB route seems reasonable, albeit not straightforward. – James Stone Aug 21 '12 at 13:31
  • When InnoDB crashes it just replays transaction log records on restart, which is much faster than rechecking the whole table. The number of transaction log records to replay after a crash depends only on the write load, not on the total DB size. – Alex Aug 21 '12 at 13:34
  • Sorry, where do you have a SOMEWHAT POWERFUL SERVER? You have a low-end (for a database server) VM. A somewhat powerful server would be a dual Xeon with maybe 128GB of RAM these days - and that is not exactly a top mid-range model (those go to 256GB of memory). – TomTom Aug 25 '13 at 14:57
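(For anyone reading this thread later: the conversion Alex describes would, as I understand it, be one statement per table, roughly as below. The table name is a placeholder, and the ALTER rewrites the whole table, so on a 23-million-row table it will run for a long time and block writes meanwhile.)

```sql
-- Placeholder table name; ALTER ... ENGINE rebuilds the whole table.
ALTER TABLE analytics_raw ENGINE=InnoDB;

-- Lists the tables that are still on MyISAM in the current database:
SELECT table_name, engine
FROM   information_schema.tables
WHERE  table_schema = DATABASE()
  AND  engine = 'MyISAM';
```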

1 Answer


I think your my.cnf is misconfigured based on what you have presented in the comments. You are probably "giving" MySQL far more RAM than your system actually has available. `thread_stack=100M` is far greater than recommended. I'd bet that the OOM killer simply kills your mysqld to prevent the kernel from running out of memory.
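You can verify this on the live server without downtime, with something along these lines (read-only, so safe to run):

```sql
-- Read-only checks. thread_stack is allocated per connection thread, so
-- thread_stack * max_connections alone can already exceed physical RAM,
-- before the key buffer and per-connection sort/read buffers are counted.
SHOW VARIABLES WHERE Variable_name IN
    ('thread_stack', 'max_connections', 'key_buffer_size',
     'sort_buffer_size', 'read_buffer_size', 'read_rnd_buffer_size');
SHOW STATUS LIKE 'Max_used_connections';
```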

You should first check your MySQL configuration with mysqltuner and fine-tune it to avoid server crashes.

Running REPAIR, ANALYZE, OPTIMIZE, ... on the production environment on a cron basis is not recommended given the size of your data, but it would be good practice to FLUSH TABLES now and then.
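If you do put anything on a schedule, keep it to the lightweight statements, for example (the table name is just a placeholder):

```sql
-- Reasonable to run from cron: closes all open tables and forces tables in
-- use to be closed, without rebuilding anything.
FLUSH TABLES;

-- By contrast, these rebuild or lock the table and are better left for a
-- maintenance window on a table that size:
-- OPTIMIZE TABLE analytics_raw;
-- REPAIR TABLE analytics_raw;
```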

Breign