3

I need to query financial data from mongo and generate a daily Candle Stick graph.

collection schema is something like:

{
  symbol: 'GOOG',
  amount: 1000,
  rate: 123,
  created_at: ISODate('some point in time')
}

Each entry in the graph (i.e per a given day) should consist of 4 values:

  1. high (max(rate))
  2. low (min(rate))
  3. open (first rate in a given day)
  4. close (last rate in a given day)

Any idea how to build the query?

shaharsol
  • 991
  • 2
  • 10
  • 31

2 Answers2

2

OK, got this figured after @David Peleg sent me this repo:

db.collection.aggregate([
      {$match:{
        symbol:'GOOG'
      }},
      {$project:{
        "day":{
          "y": {"$year":"$created_at"},
          "m": {"$month":"$created_at"},
          "d": {"$dayOfMonth": "$created_at"}
        },
        created_at: 1,
        rate: 1
      }},
      {"$sort":{"created_at":1}},
      {"$group":{
        "_id": "$day",
        "created_at": {"$first":"$created_at"},
        "open": {"$first":"$rate"},
        "close": {"$last":"$rate"},
        "high": {"$max":"$rate"},
        "low": {"$min":"$rate"},
      }},
      {$project:{
        "_id": "$_id",
        "rates":{
          "open": "$open",
          "close": "$close",
          "high": "$high",
          "low": "$low"
        }
      }},
      {"$sort":{"_id":1}}
    ])
shaharsol
  • 991
  • 2
  • 10
  • 31
  • I have more or less the same query, but is this correct? The open price should the the close price of the previous candle, not the first occurrence within that time frame ? – redigaffi Jul 01 '22 at 04:30
1

I have been working on a similar query on a project I am developing and am glad to see I came to similar conclusions.

My incoming data is from Robinhood's API; it looks similar to yours, but gives me datestamps in strings, so I added a $toDate and $toString to flip string to Date objects. I may opt to convert the incoming datestamps into a Time object in Ruby and then remove that extra work here.

I am still refining my grouping since I'm not sure if I actually need two. My incoming stream sometimes duplicates data (per second, due to the string datestamp situation), so the first group is to clean that up to not falsely inflate the { "$sum": "$volume" } in the second group.

My $project returns my 1m candlestick of data within a 59 second range of the previous minute being requested. The keys match Robinhood's API historicals, so I can treat them as the same object in code.

var date_now   = new Date(new ISODate("2019-10-21T20:00:00.000Z"))
var date_end   = new Date(date_now.getTime() - 1000 * 1)
var date_begin = new Date(date_now.getTime() - 1000 * 60 * 10)
db.quotes.aggregate([
{ "$addFields": { "date": { "$toDate": "$time_pulled" } } },
{ $match: { symbol: "NUGT",
            date: { "$gte": date_begin,
                    "$lt":  date_end } } },
{ $group: { _id: "$date", // group by date to eliminate dups
    "symbol": { "$first": "$symbol" },
    "price":  { "$max": "$last_trade_price" },
    "volume": { "$max": "$volume" } } },
{ $sort:  { _id: 1 } },    // sort by date to get correct first/last prices
{ $group: { _id: "$symbol",
    //"date_begin":  { "$first": { "$toString": "$_id" } },
    //"date_end":    { "$last": { "$toString": "$_id" } },
    "high_price":  { "$max": "$price" },
    "low_price":   { "$min": "$price" },
    "open_price":  { "$first": "$price" },
    "close_price": { "$last": "$price" },
    "volume":      { "$sum": "$volume" } } },
{ $project: { _id: 1,
    begins_at: { "$toString": date_begin },
    volume: 1,
    high_price: 1,
    low_price: 1,
    open_price: 1,
    close_price: 1 } }
])

Result:

/* 1 */
{
    "_id" : "NUGT",
    "high_price" : "26.860000",
    "low_price" : "26.740000",
    "open_price" : "26.834200",
    "close_price" : "26.820000",
    "volume" : 392086.0,
    "begins_at" : "2019-10-21T19:50:00.000Z"
}
danarchy85
  • 11
  • 1
  • 2