6

I need to store daily stock closing prices as well as tick data in MongoDB. How would you design such a schema? For daily prices I would be tempted to have one document for each stock symbol, e.g.

{
    symbol: "AAPL",
    quotes: {
        {
           date: '2014-01-01',
           values: { open: 1, high: 1, low: 1, close: 1, volume: 100 }
        },
        {
           date: '2014-01-02',
           values: { open: 1, high: 1, low: 1, close: 1, volume: 100 }
        }, ...
    }
}

For tick data I could do something like the above with one subdocument per hour with an array of ticks.

However, considering the maximum document size is only 16MB I believe the limited would be reached very fast, especially for tick data.

I am aware of this approach http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb. Would that be a good approach? I.e. one document per symbol per day?

So, how would you design the schema for daily prices and tick data, respectively?

Morten
  • 1,819
  • 5
  • 28
  • 37
  • 1
    Hi can you please tell me the scheme that you finally used. – Karthik May 10 '17 at 17:19
  • 1
    I decided to use kdb+ instead. I don't think MongoDB is a good choice for tick data. – Morten May 11 '17 at 00:51
  • Can you please help me with the database schema you used. I wont be storing the entire days data instead. i ll just be storing the closing stock price . So for example AAPL will have only one record for a day .Thanks a lot for replying back – Karthik May 11 '17 at 06:51
  • Sorry, I didn't proceed with a an implementation. If you're just storing daily prices I assume { symbol: "AAPL", prices: [100, 101, 102] } would do. – Morten May 17 '17 at 12:21

1 Answers1

5

I think you are on the right track.

  • Having one document for each stock symbol will give you a good overview of all the symbols in the collection. And each document will have a fairly maintainable size.
  • In my opinion, if you are even close to 16MB on a single document, the schema-design is far from good enough. It's not easily readable or maintainable. You also have to fetch a whole lot of data each time you want anything from the document.
  • You mention "one docuement per symbol per day". To me that sounds like a sensible way to structure the data. Although i'm not familiar with the details in tick data from stocks, I supposed this will give you a good foundation for the schema design. You split it by each day, and can easily get all ticks for a given day/hour.
  • Remember, there is no absolute solution to schema-design, as long as you think through it thoroughly. (there is definitely a right/wrong way though) ;)
aludvigsen
  • 5,893
  • 3
  • 26
  • 37
  • Thanks. Say I'm monitoring 100 symbols and each symbol receive ~5000 ticks per day - would that be too much to store in a single document, assuming I'm using the one doc per symbol per day approach? However, if I later add options data, the volume would be much bigger. – Morten Apr 21 '14 at 13:31
  • 1
    It's difficult for me to say yes or no when I don't know the size of your objects. I think that if you stay way below the 16MB limit, you will be fine. But remember that if you want to interact with the data, very large documents will take longer to parse. – aludvigsen Apr 21 '14 at 13:39