
By position I mean:

let position : int = positionForKey frame key
let row =
  frame
  |> Frame.take position
  |> Frame.takeLast 1

Then, row should be a Frame with only one row, whose key is key.

What I don't know is how to implement positionForKey. One idea that should work, though I don't know if it's the best way, is to create another Series via Series.scanValues whose values are the positions. I think there ought to be a more elegant way of doing it.

The implementation via Series.scanValues would be:

let positionForKey (frame:Frame<'K,_>) (key:'K) =
  let positions = Series.scanValues (fun pos _ -> pos + 1) 0 (frame.GetColumnAt 0)
  positions.[key]

... index beginning from 1

Example

Say you have a Frame f like this:

03/01/01,  4 , ...
04/01/01,  3 , ...
05/01/01,  6 , ...
   ...  , ..., ...

Then positionForKey f 04/01/01 = 2, positionForKey f 05/01/01 = 3, and so on (supposing that 04/01/01 is a valid DateTime).

Lay González

2 Answers


Deedle actually has built-in functions for doing this, but they are not very well documented (mostly because this area changed quite a bit while we were adding support for "virtual frames").

However, consider a sample data frame:

let ts = series [ for i in 0 .. 365 -> DateTime(2017, 1, 1).AddDays(float i) => float i]
let df = frame ["Sample" => ts ]

The data frame has a row index which represents how the lookup using indices is performed. Using the RowIndex, you can locate the key and then translate the returned address to an index:

let addr = df.RowIndex.Locate(DateTime(2017, 5, 1))
let idx = df.RowIndex.AddressOperations.OffsetOf(addr)

And then you can get a frame with just this row:

df.GetRowsAt([| int idx |])

The address addr is just the index when you are working with in-memory data frames, but in virtual data frames it would be a number that encodes where the row is stored and so it would not directly map to an offset. That's why I added the OffsetOf call, which maps the address to an actual index. Though in case of in-memory frames, you do not need to worry about this.

If the key is not found, the addr value will be -1L (though in principle, you should use Addressing.Address.invalid when checking for this).

Tomas Petricek
    ah, this is very nice, I saw the `Address` in the code and was wondering what it is and how `.RowIndex` could be used! – s952163 Jan 27 '17 at 14:05

You can extract the position of the key in several ways, for example using .RowIndex. But the simplest way is probably just to get the keys and find the index. You might want to use Seq.tryFindIndex instead, so that a missing key gives None rather than an exception. Here df is a data frame indexed by DateTime:

df.RowKeys |> Seq.findIndex(fun x -> x = DateTime(2017,5,6))
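As a minimal sketch of the tryFindIndex variant, the helper below returns the 1-based position the question asks for, or None when the key is absent. The name tryPositionForKey and the three sample keys are made up for illustration; with Deedle you would presumably pass df.RowKeys as the key sequence.

```fsharp
open System

// Hypothetical helper (not part of Deedle): 1-based position of `key`
// in a sequence of keys, or None when the key does not occur.
let tryPositionForKey (keys: seq<'K>) (key: 'K) : int option =
    keys
    |> Seq.tryFindIndex (fun k -> k = key)   // 0-based index, or None
    |> Option.map (fun i -> i + 1)           // the question counts from 1

// Sample keys standing in for df.RowKeys: 1, 2 and 3 January 2017.
let keys = [ for i in 0 .. 2 -> DateTime(2017, 1, 1).AddDays(float i) ]

let found = tryPositionForKey keys (DateTime(2017, 1, 2))
let missing = tryPositionForKey keys (DateTime(2018, 1, 1))

printfn "%A" found     // Some 2
printfn "%A" missing   // None
```

This is still a linear scan over the keys, so for large frames the RowIndex lookup from the other answer is likely the better choice.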

If you just want to return the row for a given key, there is an extension method for that. Here are some ways to get at the row by its key:

(Frame.getRow (DateTime(2017,5,6)) df):Series<string,string>

or

df.Rows.[(DateTime(2017,5,6))]

If you want to do something fancier, you should certainly consult the Deedle and Frame docs.

s952163
    the findIndex method is more efficient, I'll change my definition for positionForKey, yet I still believe that there ought to be a better way. I skimmed through the docs and I didn't find something better, maybe I missed it. Thanks! – Lay González Jan 27 '17 at 05:38