Can I improve performance of my GCE small instance?

Question

I'm using cloud VPS instances to host very small private game servers. On Amazon EC2, I get good performance on their micro instance (1 vCPU [single hyperthread on a 2.5GHz Intel Xeon], 1GB memory).

I want to use Google Compute Engine though, because I'm more comfortable with their UX and billing. I'm testing out their small instance (1 vCPU [single hyperthread on a 2.6GHz Intel Xeon], 1.7GB memory).

The issue is that even when I configure near-identical instances with the same game using the same settings, the AWS EC2 instances perform much better than the GCE ones. To give you an idea, while the game isn't Minecraft I'll use that as an example. On the AWS EC2 instances, succeeding world chunks load perfectly fine as players approach the edge of a chunk. On the GCE instances, even on more powerful machine types, chunks fail to load after players travel a certain distance, and they must disconnect from the server and log back in to continue playing.

I can provide more information if necessary, but I'm not sure what is relevant. Any advice would be appreciated.

I'm hoping Google sees it first, so they can address the issue (if any, for all I know this is working as designed by Google). — JP Esteban, Dec 25 '14 at 10:50
You'd better contact Google support for help. They know their environment better than anyone else. — BMW, Dec 25 '14 at 22:42
What are the other parameters of your instance, e.g., size of the disk, standard or SSD, etc.? On GCE, I/O performance is [proportional to disk size](http://stackoverflow.com/a/25496798/3618671), which may be affecting your test. — Misha Brukman, Dec 26 '14 at 20:39
Since you've upgraded to a more powerful machine and the issue still exists, I think it could be a connection timeout issue. Can you try modifying the TCP keep-alive settings and let me know if that resolves the problem? Check out this link for the command: https://cloud.google.com/compute/docs/troubleshooting#communicatewithinternet — Kamran, Dec 27 '14 at 00:08
@MishaBrukman Thanks for pointing that out. I tried a quick experiment by starting an instance with a 30GB SSD instead of the default 8GB, upping my I/O performance 10x the previous setting. It didn't seem to have any noticeable effect, but I will take this into account for all my future instances so thank you again for showing this to me. — JP Esteban, Dec 29 '14 at 03:16
@Cloud I've just finished testing various keep-alive parameters (60, 300, 600 seconds) and none seem to have had any effect on the outcome. Thank you for the suggestion though. — JP Esteban, Dec 29 '14 at 04:21

score 0 · Accepted Answer · answered Dec 25 '14 at 18:47

Diagnostic protocols to evaluate this scenario may be more complex than you want to deal with. My first thought is that this shared-core machine type might have some limitations in consistency. Here are a couple of strategies:

1) Try backing into the smaller instance: start on a larger machine type and work your way down. Since you only pay for 10 minutes, you could see if the performance is better on higher-level machines. If you have consistent performance problems no matter what the size of the box, then I'm guessing it's something to do with the nature of your application and the nature of their virtualization technology.

2) Try measuring the consistency of the performance. I get that it is unacceptable, but does that depend on how long it's been running? The nature of the workload? Time of day? If the performance is sometimes good but sometimes bad, then it's probably once again related to the type of your workload and their virtualization strategy.

Something Amazon is famous for is consistency. They work very hard to manage the consistency of the performance. It shouldn't spike up or down.
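One rough way to try strategy 2 is to time the same fixed piece of CPU work repeatedly and look at the spread. The sketch below is only an illustration: the function names, the loop used as the workload, and the sample counts are all assumptions, and it measures raw CPU timing rather than the game server itself. A large gap between the fastest and slowest sample would hint at inconsistent CPU availability, while a tight spread would point elsewhere.

```python
import statistics
import time


def time_fixed_work(iterations=200_000):
    """Return the seconds taken to run a fixed CPU-bound loop."""
    start = time.perf_counter()
    total = 0
    for i in range(iterations):
        total += i * i
    return time.perf_counter() - start


def sample_consistency(samples=20, pause=0.0, iterations=200_000):
    """Time the same work repeatedly and summarize the spread.

    samples must be at least 2 so a standard deviation can be computed.
    """
    durations = []
    for _ in range(samples):
        durations.append(time_fixed_work(iterations))
        time.sleep(pause)  # spread the samples out over time
    return {
        "samples": samples,
        "mean_s": statistics.mean(durations),
        "min_s": min(durations),
        "max_s": max(durations),
        "stdev_s": statistics.stdev(durations),
        "max_over_min": max(durations) / min(durations),
    }


if __name__ == "__main__":
    # Hypothetical settings: 20 samples, half a second apart.
    summary = sample_consistency(samples=20, pause=0.5)
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Running the same script on both providers, at different times of day and after the instance has been up for a while, gives numbers to compare instead of impressions.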

Thank you for the strategies. 1.) I've done some quick tests with various instance sizes, both smaller and larger than my original instance, and the problem is indeed consistent - even with instances with twice the available CPU and memory. The problem may indeed lie within the server application, but I'm leaning more towards the nature of Google's virtualization technology being the culprit as the exact same program runs perfectly fine on Amazon's. 2.) The performance is consistent in that it is unacceptable from the moment I begin running the application minutes after creating an instance. — JP Esteban, Dec 29 '14 at 03:30

score 0 · Answer 2 · answered Jan 05 '15 at 19:23


My best guess here without all the details is you are using a very small disk. GCE throttles disk performance based on the size. You have two options ... attach a larger disk or use PD-SSD.

See here for details on GCE Disk Performance - https://cloud.google.com/compute/docs/disks
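To check whether the disk is actually the bottleneck, a quick before-and-after measurement can help. The following is a minimal sketch, not an official benchmark: the function name, the 64 MB total, and the 1 MB block size are assumptions chosen for illustration, and it only covers sequential write throughput, which is one of several dimensions (IOPS and read throughput are others) that may matter for a game server.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_mb=64, block_kb=1024):
    """Write roughly total_mb of data, fsync it, and return MB/s."""
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(blocks):
                handle.write(block)
            handle.flush()
            # Force the data to the device so the page cache
            # does not hide the disk's real speed.
            os.fsync(handle.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    written_mb = blocks * block_kb / 1024
    return written_mb / elapsed


if __name__ == "__main__":
    # Point this at a directory on the disk you want to test.
    rate = measure_write_throughput(tempfile.gettempdir())
    print(f"sequential write: {rate:.1f} MB/s")
```

If the figure rises noticeably after moving to a larger disk or PD-SSD but the game still misbehaves, the disk is probably not the cause.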

Please post back if this helps.

Anthony F. Voellm (aka Tony the #p3rfguy) Google Cloud Performance Team


Hi Anthony, thanks for taking the time to respond. Another user - @MishaBrukman - pointed that out to me above. I [responded](http://stackoverflow.com/questions/27645700/can-i-improve-performance-of-my-gce-small-instance/27786383?iemail=1&noredirect=1#comment43780526_27645700) with the tests I ran using larger disks, and essentially nothing helped alleviate the issue. I have not yet given up on this, though, so if you need other details to offer more advice, please do let me know what information would be helpful to you. – JP Esteban Jan 08 '15 at 08:42
