What is the best approach to build a system with a high volume of data communication?

Question

Hello

I have a cache server (written in Java with the Lucene framework) that keeps a large amount of data and serves it in response to request queries.

It basically works like this:

On startup, it connects to the DB and loads all tables into RAM.
It listens for requests and returns the matching data as array lists (about 1,000 to 20,000 rows).
When a user visits the web page, the page connects to the cache server, sends a request, and shows the server's response.

I plan to run the web and cache applications in separate instances because of memory issues. The cache server runs as a service and the web application runs on Tomcat.

What do you suggest for the communication between the web side and the cache server?

I need to pass large amounts of data as array lists from one instance to the other. Should I consider web services (XML communication), NIO socket communication (maybe Apache MINA), or solutions like CORBA?

Thanks.

score 1 · Answer 1 · answered Nov 22 '12 at 15:33


It really depends very much on considerations you have not specified.

What are the clients? For example, if your clients are JavaScript making AJAX calls, something over HTTP is obviously more useful than a proprietary UDP solution.
What network is it running on? Local networks behave differently from the internet, and mobile internet is quite different from both.
How much use can you make of caching? With HTTP you get fairly good control (through HTTP headers) over both client caches and network caches, and a plethora of existing software can make use of both.

There are many other considerations to take into account, and there are many existing implementations of systems matching the more common needs. From the (not very detailed) description you gave, I would recommend having a look at Redis.


onon15


There is a JSP that shows some categories and sub-items. The page calls some methods, and those methods return category and content data during page load. No Ajax. It is simply: jsp->(search query)->Java Interface->Cache Server, THEN Cache Server->(result data)->Java Interface->jsp. And yes, it is a local network. Both of them will run on the same machine. – anL Nov 22 '12 at 18:35
Is there any significant processing on the JSP side? Does separating the cache server and the JSP server actually divide the processing between the two, or is it just to take the burden of handling many connections off the cache server? – onon15 Nov 22 '12 at 18:43
The JSP side has to handle very high traffic. At that traffic level, even the result sets become a memory cost for the JVM. So it was decided to separate them because of memory issues. Therefore I need a communication protocol between the two modules. – anL Nov 22 '12 at 20:47
If the reason the JSP is "heavy" is that it receives lightweight data from the cache and does a lot of processing on it, I would separate them and use an HTTP-based protocol between them (as I expect the overhead of HTTP processing to be *relatively* low). I'd implement that as a Servlet of some sort rather than use MINA (which is awesome, but programming a MINA server is overkill for this need). And for data serialization, I'd use JSON rather than CORBA or XML, as it's lightweight and easier to work with. – onon15 Nov 22 '12 at 21:02
If, on the other hand, the JSP processing is mere formatting of the cache's data, I would put it on the cache server and put a large-memory HTTP cache (such as squid or trafficserver) in front of it. Separating them in that case would make no sense, as formatting the data to CORBA or JSON on the cache server is not expected to be much different from what the JSP does. – onon15 Nov 22 '12 at 21:03
Actually, I need a proper way to send an object or an array of objects between two different instances. For example, sending an ArrayList consisting of Content objects with 10 fields (id, name, intime, priority, etc.) from the application on 127.0.0.1:8881 to the application on 127.0.0.1:8882. You suggest JSON, but how would you send the large JSON strings to the other application? This is why I'm a little bit suspicious about using a socket-based solution for this problem. – anL Nov 23 '12 at 07:47
Obviously, anything socket-based would require resources for serialization/deserialization. Doing that with a MINA-like NIO solution would reduce the resources wasted on context switching, but the serialization and socket overheads would remain. What would be the alternative? Shared memory is risky because it can't scale if you later need to put the two on separate machines (and the cost of implementation is high). – onon15 Nov 23 '12 at 08:14
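
For illustration, here is a minimal sketch of the HTTP-plus-JSON approach discussed in the comments above. It is not the poster's actual code. To stay dependency-free it uses the JDK's built-in `com.sun.net.httpserver.HttpServer` instead of a Servlet, and a hand-rolled JSON encoder instead of a JSON library. The class name `CacheTransferSketch`, the `/contents` path, the trimmed-down `Content` type (three fields instead of ten), and the choice of port 8881 for the cache side are all assumptions made for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CacheTransferSketch {

    // Hypothetical row type: the thread mentions about 10 fields, only three are shown.
    static final class Content {
        final int id;
        final String name;
        final int priority;

        Content(int id, String name, int priority) {
            this.id = id;
            this.name = name;
            this.priority = priority;
        }
    }

    // Minimal escaping (backslash and double quote only); a real JSON library
    // would also handle control characters.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Encodes the rows as a JSON array of objects.
    static String toJson(List<Content> rows) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < rows.size(); i++) {
            Content c = rows.get(i);
            if (i > 0) sb.append(',');
            sb.append("{\"id\":").append(c.id)
              .append(",\"name\":\"").append(escape(c.name))
              .append("\",\"priority\":").append(c.priority).append('}');
        }
        return sb.append(']').toString();
    }

    // Cache side: serves the rows as JSON over HTTP on the loopback interface.
    static HttpServer startCacheServer(int port, List<Content> rows) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/contents", exchange -> {
            byte[] body = toJson(rows).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Web side: fetches the JSON payload from the cache server as a string.
    static String fetch(int port) throws IOException {
        URL url = new URL("http://127.0.0.1:" + port + "/contents");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        List<Content> rows = new ArrayList<>();
        rows.add(new Content(1, "news", 5));
        rows.add(new Content(2, "sports", 3));
        HttpServer server = startCacheServer(8881, rows);
        try {
            System.out.println(fetch(8881));
        } finally {
            server.stop(0);
        }
    }
}
```

The payload size is not a special case here: the response is written to and read from a stream, so a list of 20,000 rows travels the same way as a list of two. The serialization and socket overheads mentioned above still apply on both sides.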
