I'm building an app to serve large amounts of data via a REST API, and I'm looking for input on how to architect it. I'm using .NET (C# 4.0), ASP.NET MVC and SQL Server 2008.
Right now I have about 400k rows in a relational database, roughly 5% of which are updated throughout the day by an internal app that writes directly to the database. I need to serve this data via a REST API that returns a custom XML format. However, the data needs to be processed before I can output it. The good news is that I can pre-process it ahead of time if needed.
I wrote a small POC that fetches the data, processes it and caches it into a local XML file. Because of the processing, this takes about an hour for all 400k rows. Once the cache is built, I just return the physical file for every request.
Now I need to be able to pick up changes as they happen in the source and update my cache accordingly, so I don't have to regenerate everything every time a single row gets updated.
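One way to sketch the incremental-update idea above (language-agnostic, shown in Python for brevity; all names are hypothetical) is to cache the processed output per row, keyed by row id, so a single source update only re-processes that one row instead of all 400k:

```python
# Per-row fragment cache: row_id -> processed XML fragment.
# A single source update replaces one fragment; the full document
# is reassembled from cached fragments on demand.
fragments = {}

def process_row(row):
    # Stand-in for the real (expensive) per-row processing step.
    return "<item id='%s'>%s</item>" % (row["id"], row["value"])

def on_row_updated(row):
    # Called whenever the internal app changes a row in the source.
    fragments[row["id"]] = process_row(row)

def render_document():
    # Rebuild the full XML payload from the cached fragments.
    return "<items>%s</items>" % "".join(
        fragments[k] for k in sorted(fragments))
```

The trade-off is that reassembly happens on read, but each write only costs one row's worth of processing rather than the full hour-long rebuild.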
I'm thinking about using AppFabric for an in-memory cache, with the physical file as a backstop so that if the memory cache is lost I don't have to start from scratch. As soon as a row gets updated in the source, I would update the memory cache and rewrite the physical file to make sure it's up to date.
So my primary source would be the AppFabric cache, then the physical cache file, and as a last resort regenerating everything from the database, which would take about an hour and make the file unavailable to whoever requests it.
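To make the read path concrete, here's a minimal sketch of the three-tier fallback described above (again language-agnostic, in Python; `TieredCache` and all other names are made up for illustration, with a plain in-process variable standing in for AppFabric):

```python
import os

class TieredCache:
    """Tier 1: memory (AppFabric in the real app); Tier 2: XML file
    on disk; Tier 3: full regeneration from the database (~1 hour)."""

    def __init__(self, file_path, regenerate):
        self.memory = None            # stand-in for the AppFabric cache
        self.file_path = file_path    # physical XML fallback
        self.regenerate = regenerate  # expensive rebuild from the database

    def get(self):
        # Tier 1: serve from memory if available.
        if self.memory is not None:
            return self.memory
        # Tier 2: warm memory from the physical file.
        if os.path.exists(self.file_path):
            with open(self.file_path) as f:
                self.memory = f.read()
            return self.memory
        # Tier 3: last resort, the slow full rebuild; persist the result.
        self.memory = self.regenerate()
        with open(self.file_path, "w") as f:
            f.write(self.memory)
        return self.memory

    def update(self, new_xml):
        # Write-through: keep memory and the file in sync on every update.
        self.memory = new_xml
        with open(self.file_path, "w") as f:
            f.write(new_xml)
```

Writing through to both tiers on every update is what keeps the file a valid recovery point if the memory cache evaporates.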
I'm not very happy with this, but it is what I got. Any suggestions?
Thanks a lot!