
Here is my task: every second, a backgrounder task should generate JSON based on some data. The operation is not CPU-intensive (it is mostly network-bound) and produces 5-10 KB of JSON content. Each run takes about 200 ms.

I also have about 1,000 clients asking for this content once every few seconds, which works out to roughly 200 requests/sec.

The server should just output the current JSON.

I already have a Rails 4 + nginx + Passenger server on Debian doing other jobs related to this work.

Being a student, I want to build my server in the most cost-effective way, with the ability to scale easily in these directions:

  1. Adding a few more backgrounder jobs that generate more JSON documents
  2. Increasing the number of requests to 10,000 per second

Currently I have a Linode 2048 SSD with 2 CPU cores. My questions are:

  1. What gem/solution should I use for my backgrounder tasks? (They are currently written in Ruby.)
  2. How do I efficiently store the current JSON and pass it from the backgrounder(s) to Rails/nginx?
  3. How do I make serving the JSON as fast as possible?
  • Do clients ask for similar types of content? What are the background workers busy with? Is the content cacheable? What kind of data structure and storage do you have? – Anatoly May 17 '15 at 21:13
  • Clients ask for a copy of the last JSON created by the backgrounder. You may think of it as: the backgrounder creates a file => the client downloads the current file. The content is cacheable, but every second the backgrounder creates a new file. The backgrounder asks external servers for data and concatenates it. I don't have any storage right now. – Yegor Razumovsky May 18 '15 at 10:56
  • Is that something like the RSS approach? Does the URL include any date-range params or a user_id? Just trying to figure out which pattern you can apply. – Anatoly May 18 '15 at 12:02
  • It's even simpler than RSS. It's like a file server: clients ask for files and background jobs update the files. – Yegor Razumovsky May 18 '15 at 13:37
  • There's no need to store file history; only the current version is required. – Yegor Razumovsky May 18 '15 at 14:00

1 Answer


You mentioned that the server should just output the current JSON. Generation is unlikely to become a bottleneck, because you can cache the result in Memcached and serve it from Memcached directly:

1) Periodically: background process -> dump the data to Memcached (even gzipped, to speed it up)

2) User -> nginx -> Memcached
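
A minimal sketch of step 1 using the dalli gem (which, per the comments below, is what the asker ended up using). The key name current.json, the one-second cadence, and the stub data method are assumptions, not part of the answer; raw: true stores the plain string so nginx's memcached module can serve the bytes unchanged:

```ruby
require 'dalli'
require 'json'

# Stub standing in for the real work: fetch from external servers and merge.
def fetch_data_from_external_servers
  { generated_at: Time.now.to_i, items: [] }
end

cache = Dalli::Client.new('localhost:11211')

loop do
  started = Time.now
  json = JSON.generate(fetch_data_from_external_servers)
  # raw: true skips Ruby marshalling, so the stored value is the plain JSON
  # string that nginx's memcached module can return verbatim.
  cache.set('current.json', json, nil, raw: true)
  # Aim for roughly one refresh per second (generation itself takes ~200 ms).
  sleep [1 - (Time.now - started), 0].max
end
```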

See nginx's built-in memcached module: http://nginx.org/en/docs/http/ngx_http_memcached_module.html
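
A minimal location block under the same assumptions (the current.json key from the sketch above, memcached on 127.0.0.1:11211, and a hypothetical fallback to the Rails app if the key is missing):

```nginx
location /current.json {
    set            $memcached_key "current.json";
    memcached_pass 127.0.0.1:11211;
    default_type   application/json;
    # If the worker hasn't populated the key yet, hand off to Rails.
    error_page 404 502 504 = @rails_fallback;
}

location @rails_fallback {
    proxy_pass http://127.0.0.1:8080;  # hypothetical Rails/Passenger upstream
}
```

With this setup nginx answers /current.json without ever touching Ruby, which is what makes the 10K requests/sec target plausible.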

The bottleneck is any backend with blocking mechanisms: the GIL, IO locks, and so on. Try to avoid these problems by splitting the request/response cycle with an intermediate Memcached data point.

  • Memcached with the dalli gem and the loops gem made my day. Everything is working smoothly on my VM. Thank you. – Yegor Razumovsky May 22 '15 at 09:48
  • @YegorRazumovsky Sounds great! To reach 10K connections, just bring up more nginx servers. An alternative option is Rails response caching, but sometimes a centralised cache (i.e. Memcached) suits better. – Anatoly May 22 '15 at 13:23
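
For reference, one way to sketch the Rails-side alternative mentioned in this last comment: the controller reads the same memcached key the worker writes and returns it as-is. The controller name and key are assumptions carried over from the sketches above:

```ruby
require 'dalli'

class FeedsController < ApplicationController
  CACHE = Dalli::Client.new('localhost:11211')

  def current
    # The worker stored the value raw, so get returns a plain JSON string.
    json = CACHE.get('current.json')
    if json
      render json: json # a String passed to `render json:` is sent through unchanged
    else
      head :service_unavailable
    end
  end
end
```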