I am working with a CSV file and want to check for missing values in each column using `showcols`, but the Julia REPL throws an error. Do I need a package for that?
- `DataFrames` maybe? – Oscar Smith Dec 07 '20 at 05:38
- I have already included the DataFrames package. – uma_8331 Dec 07 '20 at 05:55
1 Answer
It seems that `showcols` used to be a function in `DataFrames` a very long time ago (I can find a mention of it in the docs for DataFrames v0.11; the current release is v0.22).
Assuming that your data is indeed in a `DataFrame`, you can use `describe` to get summary statistics, including the number of missing values.
    julia> using DataFrames
    julia> df = DataFrame(rand(2, 3), :auto);
    julia> describe(df)
    3×7 DataFrame
     Row │ variable  mean      min       median    max       nmissing  eltype
         │ Symbol    Float64   Float64   Float64   Float64   Int64     DataType
    ─────┼──────────────────────────────────────────────────────────────────────
       1 │ x1        0.614285  0.301365  0.614285  0.927204         0  Float64
       2 │ x2        0.635276  0.588937  0.635276  0.681614         0  Float64
       3 │ x3        0.235452  0.231867  0.235452  0.239037         0  Float64
Alternatively, for a `DataFrame` (as for many other table types) you can iterate over the columns and count the missing values in each one, for example:
    julia> (sum ∘ (x -> ismissing.(x))).(eachcol(df))
    3-element Vector{Int64}:
     0
     0
     0
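Since the random data above contains no missing values, here is a minimal sketch of the same column-wise count on a small table that does contain some. The data frame and its column names (`a`, `b`, `c`) are made up for illustration; the sketch assumes a reasonably recent DataFrames release.

```julia
using DataFrames

# Hypothetical example data; names and values are invented for illustration.
df = DataFrame(a = [1, missing, 3],
               b = ["x", missing, missing],
               c = [1.0, 2.0, 3.0])

# Count missing entries per column; count(ismissing, col) avoids
# building a temporary Bool vector for each column.
nmissing = [count(ismissing, col) for col in eachcol(df)]

# Pair each column name with its count so the result is easier to read.
missing_by_column = Dict(names(df) .=> nmissing)

# describe can also be restricted to the missing-value count only.
summary_table = describe(df, :nmissing)
```

Here `nmissing` holds one count per column, in column order, and `missing_by_column` maps each column name (as a `String`) to its count.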

Nils Gudat