I'm struggling to get my head around using the CSV type provider in F# for simple data analysis tasks. I have done some googling around the Seq module and the CSV type provider as a whole but can't find resources relevant to my issue, so any help is appreciated.
I'm attempting to use F# to create metrics on horse racing data (for each runner within a race). My data is in a CSV file with a structure similar to this: raceId, runnerId, name, finishingPosition, startingPrice, etc.
What I want to do initially is group the CSV rows by raceId and create extra 'insights' on each race (an example would be 'positionInBetting', derived from 'startingPrice' for each runner within the race).
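To make the goal concrete, here is a rough sketch of the kind of calculation I mean, written against a plain record rather than the type provider's rows. The SampleRunner type, the withPositionInBetting function and the sample data are all made up for illustration, and it assumes the shortest starting price should rank first, with ties broken by input order:

```fsharp
// Hypothetical stand-in for one CSV row; the real rows come from the type provider.
type SampleRunner =
    { RaceId: int
      Name: string
      StartingPrice: decimal }

// Rank runners within each race by starting price (shortest price = position 1).
// Ties keep their input order because Seq.sortBy is a stable sort.
let withPositionInBetting (runners: seq<SampleRunner>) =
    runners
    |> Seq.groupBy (fun r -> r.RaceId)
    |> Seq.collect (fun (_, raceRunners) ->
        raceRunners
        |> Seq.sortBy (fun r -> r.StartingPrice)
        |> Seq.mapi (fun i r -> r, i + 1))

// Made-up data purely for illustration.
let sample =
    [ { RaceId = 1; Name = "A"; StartingPrice = 5.0m }
      { RaceId = 1; Name = "B"; StartingPrice = 2.5m }
      { RaceId = 2; Name = "C"; StartingPrice = 10.0m } ]

withPositionInBetting sample
|> Seq.iter (fun (r, pos) ->
    printfn "race: %i runner: %s positionInBetting: %i" r.RaceId r.Name pos)
```

That is the shape of result I am after, just computed from the CSV rows instead of a hand-written record.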
This is what I have:
open FSharp.Data
type Runner = CsvProvider<Sample="runners.csv",AssumeMissingValues=true>
let dataset = Runner.Load("runners.csv")
let racesSince2010 =
    dataset.Rows
    |> Seq.filter (fun r -> r.Meeting_date.IsSome && r.Meeting_date.Value > System.DateTime(2010, 1, 1))
    |> Seq.groupBy (fun r -> r.Race_id)
This achieves the first part, grouping runners by race, and gives me a seq of tuples where the key is the raceId and the value is a seq of Runners (or so I assumed; VS tells me it is actually a seq<CsvProvider<...>.Row>).
Then I expected this to work:
let raceDetails (raceId, runnersList:seq<Runner>) = runnersList |> Seq.iter ( fun r -> printfn "race: %i runner: %s" raceId r.)
but r.name isn't available in VS IntelliSense. I know I'm failing to understand why the output of my grouping function is typed as seq<CsvProvider<...>.Row> instead of seq<Runner>, but I can't find anything that explains it, or how to attack the problem I am having.
Alex