I have a column in a DataFrame like this:

df = DataFrame(:num=>rand(0:10,20))

From df I want to make two other DataFrames:

df1 = counter(df[!, :num])

This gives the frequency of each integer from 0 to 10, but I need the values sorted from 0 to 10:

0=>2
1=>3
2=>7

and so on.

Then I want a new dataframe df2 where:

column_p = sum of occurrences of 9 and 10
column_n = sum of occurrences of 7 and 8
column_d = sum of occurrences of 0 to 6

I managed to get the first part, even though the result is not sorted, but this last DataFrame has been a challenge for my Julia skills (I'm still learning).

UPDATE 1

I managed to write this function:

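using DataFramesMeta  # provides @eachrow

function f(dff)
    # Label each row by its :num value (expects an existing :class column)
    @eachrow dff begin
        if :num >= 9
            :class = "Positive"
        elseif :num >= 7
            :class = "Neutral"
        elseif :num < 7
            :class = "Negative"
        end
    end
end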

This function does half of what I want, and it fails if there is no :class column in the DataFrame.

Now I want to count the positives, neutrals, and negatives in order to compute:

(positives - negatives) / (negatives + neutrals + positives)
pouchewar

1 Answer

The first part is:

julia> using DataFrames, Random

julia> Random.seed!(1234);

julia> df = DataFrame(:num=>rand(0:10,20));

julia> df1 = combine(groupby(df, :num, sort=true), nrow)
10×2 DataFrame
 Row │ num    nrow
     │ Int64  Int64
─────┼──────────────
   1 │     0      1
   2 │     2      2
   3 │     3      2
   4 │     4      2
   5 │     5      1
   6 │     6      2
   7 │     7      2
   8 │     8      4
   9 │     9      1
  10 │    10      3
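
Note that a value that never occurs (1 in this sample) gets no row in the grouped result. If you need every integer from 0 to 10 listed, including zero counts, one possible sketch is to count each value explicitly. The fixed sample data, the name `df_all`, and the column name `n` below are illustrative assumptions, not taken from the question:

```julia
using DataFrames

# Small fixed sample (assumed for illustration) so the result is easy to check by hand
df = DataFrame(num = [0, 2, 2, 5, 9, 10, 10])

# Count every value in 0:10 explicitly, so values that are absent get a count of 0
df_all = DataFrame(num = 0:10, n = [count(==(i), df.num) for i in 0:10])
```

This scans the column once per value, which is fine for eleven values but would not be the approach to pick for a large set of keys.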

I was not sure what you wanted in the second step, but here are two ways to achieve the third step using either df1 or df:

julia> (sum(df1.nrow[df1.num .>= 9]) - sum(df1.nrow[df1.num .<= 6])) / sum(df1.nrow)
-0.3

julia> (count(>=(9), df.num) - count(<=(6), df.num)) / nrow(df)
-0.3
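
If the second step means a one-row DataFrame holding the three grouped counts, a minimal sketch could look like the following. It assumes the column names from the question (`column_p`, `column_n`, `column_d`) and uses a small fixed sample rather than the random data above, so the numbers can be verified by hand:

```julia
using DataFrames

# Fixed sample (assumed for illustration): three values in 0:6, three in 7:8, four in 9:10
df = DataFrame(num = [0, 3, 6, 7, 8, 8, 9, 10, 10, 10])

# One-row DataFrame with the three grouped counts
df2 = DataFrame(
    column_p = count(>=(9), df.num),    # occurrences of 9 and 10
    column_n = count(in(7:8), df.num),  # occurrences of 7 and 8
    column_d = count(<=(6), df.num),    # occurrences of 0 to 6
)

# (positives - negatives) / total, computed from df2
score = (df2.column_p[1] - df2.column_d[1]) / nrow(df)
```

Because the three ranges cover 0 to 10 without overlap, the three counts add up to `nrow(df)`, so the denominator can be taken from either place.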
Bogumił Kamiński