2

Given a DataFrame and an expression, I would like to subset the DataFrame using this expression. I would also like to get the index vector telling me which rows satisfy the conditions. Here is an example:

df = DataFrame(x1 = 1:3, x2 = [2, 1, 2], x3 = [22, 21, 20])
ex = :((x3 .< 22) & (x2 .== 2))

df1 = df[(df[:x3] .< 22) & (df[:x2] .== 2), :]
idx = (df[:x2] .== 2) & (df[:x3] .< 22)

Is it possible to get df1 and idx using the expression ex? I think the "with" function of DataFrames once did this: idx = with(df, ex). Now there is DataFramesMeta (https://github.com/JuliaStats/DataFramesMeta.jl), but I cannot find the right function there.

Thanks

user2546346
  • According to this discussion, https://groups.google.com/forum/#!msg/julia-users/GU3Bxr0zM3g/GhW4iMt7m0gJ, the approach `df[:((x3 .< 22) & (x2 .== 2)), :]` should work (or worked at some point). Also, a subset function is mentioned here: https://github.com/JuliaLang/julia-tutorial/blob/master/DataFrames/slides.md. However, it does not seem to work on my installation, and it does not return the index vector, which I need. – user2546346 Feb 27 '15 at 15:28
  • I tried a few things with DataFramesMeta, but it does not work to my satisfaction. The last statement below produces an error. I also do not know how to get the index vector of the rows that match the expression. `exx=:((:x3.<22) & (:x2.==2)); out=@where(df,(:x3.<22) & (:x2.==2)); out2=@where(df,$(exx))$;` – user2546346 Mar 02 '15 at 08:27

1 Answer

0

The which() function returns the indices of the data frame rows that satisfy the conditions.

Example:
sel_ids <- which(x1 > 2 | x2 < 4)
filtered_df <- df[sel_ids, ]
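
Note that `which()` and `<-` are R syntax. A rough Julia counterpart might look like the sketch below. It assumes a recent DataFrames.jl release (column access via `df.x3`, elementwise `.&`), which differs from the older `df[:x3]` syntax used in the question. It does not evaluate the stored expression `ex`; it only shows how one Boolean mask can yield both the subset and the index vector. The names `mask`, `idx`, and `df1` are illustrative.

```julia
using DataFrames

# Same data as in the question
df = DataFrame(x1 = 1:3, x2 = [2, 1, 2], x3 = [22, 21, 20])

# Boolean mask of the rows where both conditions hold
# (assumption: modern DataFrames.jl column access and broadcasted .&)
mask = (df.x3 .< 22) .& (df.x2 .== 2)

# Integer row indices, analogous to R's which()
idx = findall(mask)

# Subset of the DataFrame using the same mask
df1 = df[mask, :]
```

Keeping the mask in its own variable avoids evaluating the conditions twice when both the subset and the indices are needed.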

zyduss