
I'm new to Julia.

Using the DataFrames package, I created an empty DataFrame.

using DataFrames

dt = DataFrame(x=Real[], y=Real[], w=Real[], z=Real[])
0×4 DataFrame
 Row │ x     y     w     z
     │ Real  Real  Real  Real
─────┴─────────────────────────

Now I want to fill this DataFrame with zeros and then change the values in a `for` loop. So I tried

n = 10000
for i in 1:n
  push!(dt[i,:], [0])
end

but I get

ERROR: BoundsError: attempt to access 0×4 DataFrame at index [1, :]

How can I fill the dataframe with zeros?

Thanks in advance

LastBorn
  • uh I don't think this does what you think it does. When i equals 1, what do you expect to happen when running `dt[i,:]`? – Mark Aug 28 '23 at 14:38
  • When the row is 1, it adds 0 to x, y, w, z. Am I wrong? – LastBorn Aug 28 '23 at 14:42
  • hmm I don't think so...it's accessing the 1st element of a thing with 0 elements, so it throws an error – Mark Aug 28 '23 at 14:43
  • if the dataframe already had 10000 rows, you could maybe do similar to that, but not if it has no rows – Mark Aug 28 '23 at 14:44
  • So how do I fill the DataFrame? Doing it manually is impossible because n is a variable – LastBorn Aug 28 '23 at 14:47
  • I think you're looking for something like `DataFrame(zeros(10000,4), [:x, :y, :w, :z])` – Mark Aug 28 '23 at 14:49
  • Now if I use `if pts > 1 i = 2 while i <= pts push!(df[i,:], [rand(Uniform(0, 10000)), rand(Uniform(0, 10000)), rand(Uniform(-2*pi, 2*pi)), 1]) i = i+1 end end` I get `ERROR: MethodError: Cannot convert an object of type Vector{Float64} to an object of type DataFrameRow{DataFrame, DataFrames.Index}` – LastBorn Aug 28 '23 at 15:17
  • Have a look at [julia create an empty dataframe and append rows to it](https://stackoverflow.com/questions/26201005) and [Constructing Row by Row](https://dataframes.juliadata.org/stable/man/getting_started/#Constructing-Row-by-Row) . – GKi Aug 29 '23 at 06:42

2 Answers


You can push rows to the DataFrame as follows:

julia> df = DataFrame(a=Float64[], b=Float64[])
0×2 DataFrame
 Row │ a        b       
     │ Float64  Float64 
─────┴──────────────────

julia> push!(df, (; a=1, b=2))
1×2 DataFrame
 Row │ a        b       
     │ Float64  Float64 
─────┼──────────────────
   1 │     1.0      2.0

julia> push!(df, [3, 4])
2×2 DataFrame
 Row │ a        b       
     │ Float64  Float64 
─────┼──────────────────
   1 │     1.0      2.0
   2 │     3.0      4.0

But much easier, and almost certainly more performant, is to just construct the DataFrame with zeros from the start:

julia> DataFrame(zeros(10000, 4), [:a, :b, :c, :d])
10000×4 DataFrame
   Row │ a        b        c        d       
       │ Float64  Float64  Float64  Float64 
───────┼────────────────────────────────────
     1 │     0.0      0.0      0.0      0.0
     2 │     0.0      0.0      0.0      0.0
     3 │     0.0      0.0      0.0      0.0
     4 │     0.0      0.0      0.0      0.0
     5 │     0.0      0.0      0.0      0.0
     6 │     0.0      0.0      0.0      0.0
     7 │     0.0      0.0      0.0      0.0
     8 │     0.0      0.0      0.0      0.0
   ⋮   │    ⋮        ⋮        ⋮        ⋮
  9994 │     0.0      0.0      0.0      0.0
  9995 │     0.0      0.0      0.0      0.0
  9996 │     0.0      0.0      0.0      0.0
  9997 │     0.0      0.0      0.0      0.0
  9998 │     0.0      0.0      0.0      0.0
  9999 │     0.0      0.0      0.0      0.0
 10000 │     0.0      0.0      0.0      0.0
                           9985 rows omitted
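
For use inside a `for` loop, the two approaches above could be sketched as follows. This is a minimal illustration, assuming DataFrames 1.x; the names `df_pushed` and `df_prealloc`, the small `n`, and the values written in the second loop are made up for the example.

```julia
using DataFrames

n = 5  # illustrative row count; the question uses 10000

# Option 1: start empty and push one whole row per iteration.
df_pushed = DataFrame(x=Float64[], y=Float64[], w=Float64[], z=Float64[])
for i in 1:n
    push!(df_pushed, (0.0, 0.0, 0.0, 0.0))  # a row needs one value per column
end

# Option 2: preallocate with zeros, then overwrite each row in place.
df_prealloc = DataFrame(zeros(n, 4), [:x, :y, :w, :z])
for i in 1:n
    df_prealloc[i, :] = [i, 2i, 3i, 1]  # placeholder values, converted to Float64
end
```

In both cases the loop works on a whole row at a time. The error in the question comes from indexing `dt[i, :]` on a DataFrame that has no rows yet.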
BallpointBen
  • using tuples also works: `push!(df, (3, 4))` :-) – Mark Aug 28 '23 at 14:51
  • Is it possible to do this with a `for` statement? – LastBorn Aug 28 '23 at 14:53
  • Yes, you can make the `push!` calls from inside a `for` loop; the point is that you have to push a whole row instead of just a single value. Also, if you know the number of rows beforehand, you should use the `zeros` method mentioned and create the DataFrame with that many rows right away, and then within the `for` loop use assignment (`df[!, i] = ...`) instead of `push!`. That would run much faster. – Sundar R Aug 28 '23 at 15:48
  • @SundarR Now if I use `if pts > 1 i = 2 while i <= pts push!(df[i,:], [rand(Uniform(0, 10000)), rand(Uniform(0, 10000)), rand(Uniform(-2*pi, 2*pi)), 1]) i = i+1 end end` I get `ERROR: MethodError: Cannot convert an object of type Vector{Float64} to an object of type DataFrameRow{DataFrame, DataFrames.Index}` – LastBorn Aug 28 '23 at 15:50
  • If you've preallocated using `zeros`, then don't use `push!`. Instead overwrite those zeros with `while i <= pts df[i,:] .= (rand(Uniform(0, 10000)), rand(Uniform(0, 10000)), rand(Uniform(-2*pi, 2*pi)), 1) i = i+1 end`. (By the way, ignore the `df[!, i]` in my previous comment, `df[i, :] = ` is what you want here as shown in the loop above.) – Sundar R Aug 28 '23 at 17:12

From the comments, it seems like you're trying to accomplish something like this:

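using DataFrames, Distributions  # Uniform comes from Distributions

# pts is assumed to be defined earlier in your code
if pts == 1
  # all columns are scalars, so this builds a single-row DataFrame
  df = DataFrame(x = rand(Uniform(0, 10000)), 
                 y = rand(Uniform(0, 10000)), 
                 w = rand(Uniform(-2*pi, 2*pi)), 
                 z = 1)
elseif pts > 1
  n = pts - 1
  # each rand call returns a vector of length n; the scalar z = 1 is repeated for every row
  df = DataFrame(x = rand(Uniform(0, 10000), n), 
                 y = rand(Uniform(0, 10000), n), 
                 w = rand(Uniform(-2*pi, 2*pi), n), 
                 z = 1)
end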

This way, you can initialize your DataFrame right away with the values you want, instead of initializing it with zeros and then overwriting them later.
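
As a quick check of how the scalar `z = 1` is expanded to match the vector columns, here is a small sketch that should run with DataFrames alone, assuming DataFrames 1.x. Base `rand(n)` is rescaled as a stand-in for `rand(Uniform(...), n)`, and `n = 3` is an arbitrary illustrative value.

```julia
using DataFrames

n = 3  # hypothetical number of rows

# Base rand(n) draws from [0, 1); rescaling stands in for Uniform(a, b) from Distributions.
df = DataFrame(x = 10000 .* rand(n),
               y = 10000 .* rand(n),
               w = 4pi .* rand(n) .- 2pi,
               z = 1)  # the scalar is repeated for each of the n rows

println(size(df))
```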

Sundar R