One of F#'s claims is that it allows for interactive scripting and data manipulation / exploration. I've been playing around with F#, trying to get a sense of how it compares with Matlab and R for data analysis work. Obviously F# does not have all the practical functionality of these ecosystems, but I am more interested in the general advantages / disadvantages of the underlying language.
For me the biggest change, even more than the functional style, is that F# is statically typed. This has some appeal, but it also often feels like a straitjacket. For instance, I have not found a convenient way to deal with heterogeneous rectangular data -- think dataframe in R. Assume I'm reading a CSV file with names (string) and weights (float). Typically I load the data, perform some transformations, add variables, etc., and then run the analysis. In R, the first part might look like:
df <- read.csv('weights.csv')
df$logweight <- log(df$weight)
In F#, it's not clear what structure I should use to do this. As far as I can tell I have two options: 1) I can define a class first that is strongly typed (Expert F# 9.10) or 2) I can use a heterogeneous container such as ArrayList. A statically typed class doesn't seem feasible because I need to add variables in an ad-hoc manner (logweight) after loading the data. A heterogeneous container is also inconvenient because every time I access a variable I will need to unbox it. In F#:
let df = readCsv("weights.csv")
df.["logweight"] <- log (unbox<float> df.["weight"])
If this happened once or twice it might be okay, but specifying a type every time I use a variable doesn't seem reasonable. I often deal with surveys with hundreds of variables that are added or dropped, split into new subsets, and merged with other dataframes.
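To make the second option concrete, here is a minimal sketch of what I mean. It assumes a hypothetical dataframe stand-in built from a Dictionary that maps column names to arrays of boxed values; the rows are made up for illustration and are not from a real file, and none of this is an existing library API.

```fsharp
open System.Collections.Generic

// Hypothetical stand-in for a dataframe: column name -> boxed column values.
let df = Dictionary<string, obj[]>()

// Made-up illustrative rows standing in for the CSV contents.
df.["name"] <- [| box "a"; box "b"; box "c" |]
df.["weight"] <- [| box 1.0; box 10.0; box 100.0 |]

// Adding a derived column: every value must be unboxed to float,
// transformed, and then boxed again before it can be stored.
df.["logweight"] <-
    df.["weight"] |> Array.map (fun w -> box (log (unbox<float> w)))

// Reading the new column back requires yet another unbox.
let logweights = df.["logweight"] |> Array.map unbox<float>
```

Every read and every write goes through box/unbox, and a wrong type only shows up at run time as an InvalidCastException, so I get the verbosity of static typing without its safety.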
Am I missing some obvious third choice? Is there some fun and light way to interact with and manipulate heterogeneous data? If I need to do data analysis on .NET, my current sense is that I should use IronPython for all the data exploration / transformation / interaction work, and only use F#/C# for the numerically intensive parts. Is F# inherently the wrong tool for quick and dirty heterogeneous data work?