Given a dataframe df with columns :a and :b, how can I get all elements in column :a that are in a row with e.g. b = 0.5?
Can this be done with DataFrames alone, or is a meta package needed?
1 Answer
df[df.b .== 5, :]
Example
julia> df = DataFrame(a=11:17, b=vcat([5,5],1:5))
7×2 DataFrame
│ Row │ a │ b │
│ │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1 │ 11 │ 5 │
│ 2 │ 12 │ 5 │
│ 3 │ 13 │ 1 │
│ 4 │ 14 │ 2 │
│ 5 │ 15 │ 3 │
│ 6 │ 16 │ 4 │
│ 7 │ 17 │ 5 │
julia> df[df.b .== 5, :]
3×2 DataFrame
│ Row │ a │ b │
│ │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1 │ 11 │ 5 │
│ 2 │ 12 │ 5 │
│ 3 │ 17 │ 5 │
If you want just the column a:
julia> df[df.b .== 5, :].a
3-element Array{Int64,1}:
11
12
17
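As a small sketch of the same idea, the row mask and the column can also be given in a single indexing call, which avoids building the intermediate DataFrame. The second table (dff) and its values are made up here purely to illustrate the floating-point case from the question (b = 0.5); isapprox is from Base Julia, and whether a tolerance is appropriate depends on your data.

```julia
using DataFrames

df = DataFrame(a = 11:17, b = vcat([5, 5], 1:5))

# Select rows by Boolean mask and column :a in one step; this returns a Vector.
a_vals = df[df.b .== 5, :a]

# Equivalent: index the column vector directly with the same mask.
a_vals2 = df.a[df.b .== 5]

# Hypothetical floating-point column, mirroring the question's b = 0.5.
# Exact == works for a literal such as 0.5, but can miss values that were
# produced by arithmetic; isapprox is a tolerance-based alternative.
dff = DataFrame(a = 1:3, b = [0.5, 0.25, 0.5])
a_half = dff[isapprox.(dff.b, 0.5), :a]
```

All three results are plain vectors, so they can be passed on without going through a DataFrame again.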
Yet another option is to use filter with a lambda function (this is slightly faster and uses less memory):
julia> filter(row -> row[:b] == 5, df)
3×2 DataFrame
│ Row │ a │ b │
│ │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1 │ 11 │ 5 │
│ 2 │ 12 │ 5 │
│ 3 │ 17 │ 5 │
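A related sketch, assuming a reasonably recent DataFrames.jl release: filter also accepts a column => predicate pair, so the predicate receives the values of :b directly instead of a whole row. This form is generally described as the faster one on large tables, because the row object is not type stable, but it is worth benchmarking on your own data. subset with ByRow is shown as a further alternative and assumes DataFrames 1.0 or later.

```julia
using DataFrames

df = DataFrame(a = 11:17, b = vcat([5, 5], 1:5))

# Column => predicate form: ==(5) is applied to each value of :b.
rows = filter(:b => ==(5), df)

# Just column a from the filtered rows.
a_vals = rows.a

# Alternative (assumes DataFrames 1.0+): subset with a row-wise predicate.
rows2 = subset(df, :b => ByRow(==(5)))
```

Both calls return a new DataFrame and leave df unchanged.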

Przemyslaw Szufel
- Nice answer. Could you please clarify the comment about ``filter`` being slightly faster? In a related answer, Bogumił Kamiński stated something a bit different (perhaps the context is different): https://stackoverflow.com/questions/58220143/julia-dataframe-select-rows-based-values-of-one-column-belonging-to-a-set – PatrickT Jun 03 '22 at 03:00
- My answer is without external packages. There are many ways to do the same thing in any language. – Przemyslaw Szufel Jun 03 '22 at 10:18