
In one of the microservices in our application, we query the DB, get the result (100k rows), and generate an Excel file using Apache POI. A couple of other services do the same thing (fetch DB rows and generate Excel). Since the Excel generation step is common, is it the right design to split it out into a separate microservice and use it from all the other services? The challenge is passing the data (100k rows) between microservices over HTTP. How can we achieve that?

adorearun

2 Answers


You need to ask what defines a service. Does reading chunks of data from a file count as a service? When I think about separating services, I think along several lines: What does this module need to do? Who will be using it? What dependencies does it have? How will I need to scale it in the future? And above all, which business team will own it? I tend to divide modules based on the answers to these questions.

In your case I see this as less of a service and more of a utility function that can be packaged in a jar and shared. A real service would be something more like a reporting service that reads legacy Excel files to create reports, or a migration service that uses a utility to read Excel.

Also, there is no final answer; you need to keep questioning your design until you are happy with it.

Anunay
  • We also thought of using a jar, but that sounds like an older approach :). My lead wants me to come up with some new, fancy approach (meaning jargon), like caching the data and writing to Excel from there, or saving the data in a big data store like MongoDB and using that. So I just want to learn from the experts. – adorearun Feb 27 '18 at 03:35

I personally never put the export feature in a separate service.

For table-based data like this, I provide a paged table view of the data, plus an export function that streams the full result as an octet stream with no paging limit. The export can be treated as just another type of view.

I've used the Apache POI library for report rendering, but previously only for small pages with complex shapes. POI also provides streaming versions of the workbook classes, such as SXSSFWorkbook.

To be a microservice, it should have a proper reason to be an external system. If the system only exports something, my answer is negative: it's too simple, and a separate service is overkill. If you're considering adding versioning, permissions, distribution, folder zipping, or storage management, then it could be an option.

By the way, when exporting that much data to a file, note that Excel has a maximum of about 1M rows per sheet (1,048,576 in .xlsx), so you may hit the limit if your data grows. Why not just use CSV? Easy to use, easy to jump, easy to process.
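To illustrate the CSV suggestion, here is a minimal sketch that writes rows one at a time to a `Writer`, so memory use does not grow with the row count. The class name `CsvExport` and its methods are hypothetical, and the quoting rule is a simple RFC 4180 style assumption, not something taken from your code base.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper: streams rows to CSV without buffering the whole result set. */
final class CsvExport {

    /** Quotes a field only if it contains a comma, a double quote, or a line break. */
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Embedded double quotes are doubled, then the whole field is wrapped in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Writes the header, then each row as it is pulled from the iterator. Returns the data row count. */
    static long write(Writer out, List<String> header, Iterator<List<String>> rows) {
        try {
            writeLine(out, header);
            long count = 0;
            while (rows.hasNext()) {
                writeLine(out, rows.next());
                count++;
            }
            out.flush();
            return count;
        } catch (IOException e) {
            // Rethrown unchecked so callers are not forced to declare IOException.
            throw new UncheckedIOException(e);
        }
    }

    private static void writeLine(Writer out, List<String> fields) throws IOException {
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                out.write(',');
            }
            out.write(escape(fields.get(i)));
        }
        out.write("\r\n");
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("1", "Smith, John"),
                Arrays.asList("2", "plain"));
        long written = write(out, Arrays.asList("id", "name"), rows.iterator());
        System.out.print(out);
        System.out.println(written + " rows written");
    }
}
```

In a real service the `Writer` would wrap the HTTP response stream and the iterator would wrap a DB cursor, so the 100k rows never sit in memory at once.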

tsohr
  • We are using the same streaming API. My question is whether separating the logic into one microservice that fetches the rows and another that only writes them to Excel is the right design. If it is not, we end up repeating the same Excel generation code across multiple applications, and any performance improvement or bug fix has to be applied in every one of them. That is where the design confusion lies. – adorearun Feb 27 '18 at 03:31
  • You could extract the column definitions somewhere else and write an abstract function to reuse the export procedure. As I noted, I'm against a separate service. If you're suffering from performance issues, note that POI has a lot of internal complexity, so its overall performance is quite bad; even if you decouple it, the problem will follow you. You could write simple test code like this: https://stackoverflow.com/a/8728267/592817 – tsohr Feb 27 '18 at 03:45
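
One way to read the "extract the column definitions" suggestion is sketched below, assuming a shared library (jar) rather than a service. All names here (`Column`, `RowSink`, `TableExporter`) are hypothetical; a POI-backed `RowSink` would create one sheet row per call, but it is left out so the sketch runs on the plain JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical column definition: a header plus a function that extracts the cell value from a row object. */
final class Column<T> {
    final String header;
    final Function<T, Object> value;

    Column(String header, Function<T, Object> value) {
        this.header = header;
        this.value = value;
    }
}

/** Hypothetical output abstraction; each service plugs in its own writer (Excel, CSV, ...). */
interface RowSink {
    void writeRow(List<Object> cells);
}

/** Shared export procedure: the only part that differs per service is the column list. */
final class TableExporter {

    /** Writes the header row, then one row per item. Returns the number of data rows written. */
    static <T> int export(List<Column<T>> columns, Iterable<T> rows, RowSink sink) {
        List<Object> header = new ArrayList<>(columns.size());
        for (Column<T> column : columns) {
            header.add(column.header);
        }
        sink.writeRow(header);

        int count = 0;
        for (T row : rows) {
            List<Object> cells = new ArrayList<>(columns.size());
            for (Column<T> column : columns) {
                cells.add(column.value.apply(row));
            }
            sink.writeRow(cells);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Demo rows are plain strings; a real service would pass its own row type.
        List<Column<String>> columns = Arrays.asList(
                new Column<String>("text", s -> s),
                new Column<String>("length", s -> s.length()));
        int written = export(columns, Arrays.asList("ab", "cde"), cells -> System.out.println(cells));
        System.out.println(written + " rows written");
    }
}
```

With this shape, a performance or bug fix in the export loop is made once in the shared jar, and each service only maintains its own column list.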