4

I want to map over the values of the Title column of my dataframe. The solution I came up with is the following:

df.Columns.[ [ "Title"; "Amount" ] ]
|> Frame.mapCols(fun k s -> 
    if k = "Title" 
    then s |> Series.mapValues (string >> someModif >> box) 
    else s.Observations |> Series)

Since s is of type ObjectSeries<_>, I have to convert each value to a string, modify it, and then box it again.

Is there a recommended way to map over the values of a single column?

Kimserey

3 Answers

3

Another option would be to add a TitleMapped column with:

df?TitleMapped <- df?Title |> Series.mapValues (...your mapping fn...)

...and then throw away the Title column with df |> Frame.dropCol "Title" (or not bother if you don't care whether it stays or not).

Or, if you don't like the "imperativeness" of <-, you can do something like:

df?Title 
|> Series.mapValues (...your mapping fn...)
|> fun x -> Frame( ["Title"], [x] ) 
|> Frame.join JoinKind.Left (df |> Frame.dropCol "Title") 
mt99
  • A good option. Although this operator does not always work as expected. – FoggyFinder Apr 23 '16 at 22:12
  • For example, this code: http://pastebin.com/0kjKmXwz gives me an exception. Although I can't understand why. – FoggyFinder Apr 23 '16 at 22:32
  • Gah! This is annoying... Well, it appears the `?` infix operator only works for numerical-value columns. To make a string column work, using your example from pastebin (shortened a bit; the crucial part is the `.["Title"]`):

        type nPerson = { Title:string; Amount:decimal }
        let npeopleRecds = [
            { Title = "some"; Amount = 51M }
            { Title = "some1"; Amount = 28.9M }
            { Title = "some3"; Amount = 20M; } ]
        (Frame.ofRecords npeopleRecds |> Frame.cols ).["Title"]
        |> Series.mapValues(string >> (fun s -> s.ToUpper()))
        |> fun series -> series.Format()
        |> printfn "%A"

    – mt99 Apr 23 '16 at 23:21
  • OK, all editing attempts of the above comment were in vain. The point I was trying to make is that for string-containing columns like `Title` we need to use `(df |> Frame.cols ).["Title"]`, or basically, your original answer. I live and learn. – mt99 Apr 23 '16 at 23:26
  • Thanks @mt99. Adding a column is what I ended up doing. Instead of using the `?` operator which is meant for `float` types, I used `frame.AddColumn("Label", frame |> Frame.getCol "Title" |> Series.mapValues someModif) ` – Kimserey Apr 25 '16 at 07:27
  • @Kimserey, the `?` operator can add columns of any type, not only numeric ones, so `df?Label <- ...` works with any type. – FoggyFinder Apr 25 '16 at 12:49
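
To summarise the comment thread, here is a minimal sketch of the read/write asymmetry, assuming an F# script that loads Deedle from NuGet. The `Person` record and its sample values are hypothetical and only mirror the shape of the pastebin example. Reading uses `GetColumn<string>` rather than `df?Title`, while writing still goes through `?<-`:

```fsharp
#r "nuget: Deedle"
open Deedle

// Hypothetical sample data with one string column and one numeric column.
type Person = { Title: string; Amount: decimal }

let df =
    Frame.ofRecords
        [ { Title = "some"; Amount = 51M }
          { Title = "some1"; Amount = 28.9M } ]

// Reading: ask for the column with an explicit element type,
// because `df?Title` would try to read the values as floats.
let upper =
    df.GetColumn<string>("Title")
    |> Series.mapValues (fun s -> s.ToUpper())

// Writing: `?<-` accepts a series of any element type,
// so the mapped string column can be added directly.
df?TitleMapped <- upper
```

After this, `df` should contain the original `Title` and `Amount` columns plus the new `TitleMapped` column.
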
1

You can use GetColumn:

df.GetColumn<string>("Title")
|> Series.mapValues(someModif)

Or, in a more idiomatic F# style:

df
|> Frame.getCol "Title"
|> Series.mapValues(string >> someModif)
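
Both snippets return a series rather than a frame. If the mapped values should end up back in the frame, one possible way is to pass the series to `Frame.replaceCol`, as in the sketch below. It assumes an F# script that loads Deedle from NuGet; the `Person` record, the sample values and the body of `someModif` are hypothetical stand-ins:

```fsharp
#r "nuget: Deedle"
open Deedle

// Hypothetical sample data and mapping function.
type Person = { Title: string; Amount: decimal }
let someModif (s: string) = s.Trim().ToUpper()

let df =
    Frame.ofRecords
        [ { Title = " mr "; Amount = 51M }
          { Title = " ms "; Amount = 28.9M } ]

// Map over the values of the single column; the element type (string)
// is inferred from the signature of someModif.
let mapped =
    df
    |> Frame.getCol "Title"
    |> Series.mapValues someModif

// Put the mapped series back under the same column key.
let updated = df |> Frame.replaceCol "Title" mapped
```
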
FoggyFinder
0

In some cases, you may want to map over values of a specific column and keep that mapped column in the frame. Supposing we have a frame called someFrame with 2 columns (Col1 and Col2) and we want to transform Col1 (for example, Col1 + Col2), what I usually do is:

someFrame
|> Frame.replaceCol "Col1"
    (Frame.mapRowValues (fun row ->
        row.GetAs<float>("Col1") + row.GetAs<float>("Col2"))
    someFrame)

If you want to create a new column instead of replacing the existing one, replace "replaceCol" with "addCol" and choose a new column name instead of "Col1". I don't know whether this is the most efficient way, but it has worked for me so far.
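
A minimal sketch of that `addCol` variant, assuming an F# script that loads Deedle from NuGet. The `Row` record, the sample values and the column name "Sum" are made up for illustration, and the row-wise series is bound to a name first purely for readability:

```fsharp
#r "nuget: Deedle"
open Deedle

// Hypothetical two-column frame.
type Row = { Col1: float; Col2: float }

let someFrame =
    Frame.ofRecords
        [ { Col1 = 1.0; Col2 = 2.0 }
          { Col1 = 3.0; Col2 = 4.0 } ]

// Compute one value per row from the two existing columns.
let sum =
    someFrame
    |> Frame.mapRowValues (fun row ->
        row.GetAs<float>("Col1") + row.GetAs<float>("Col2"))

// Add the result under a new key; Col1 and Col2 are left as they are.
let withSum = someFrame |> Frame.addCol "Sum" sum
```
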

FRocha