15

How do I append one data frame to another, akin to SQL's union or R's rbind?

Say I have data frames A and B defined as follows.

using DataFrames

A = DataFrame(x = [1, 2, 3], y = [4, 5, 6])
B = DataFrame(x = [4, 5, 6], y = [7, 8, 9])

One way to approach this would be as follows:

C = deepcopy(A)

for i = 1:size(B, 1)
    push!(C, Array(B[i,:]))
end

While this works, it feels a little hacky to me. Is there a better or more idiomatic way to do this?

Alex A.

3 Answers

8

Array concatenation, [A;B], is the simplest way to add the rows of one DataFrame to another. It is shorthand for vcat(A, B) and returns a new DataFrame, leaving A and B unchanged:

julia> A = DataFrame(x = [1, 2, 3], y = [4, 5, 6]);
julia> B = DataFrame(x = [4, 5, 6], y = [7, 8, 9]);
julia> [A;B]
6x2 DataFrames.DataFrame
| Row | x | y |
|-----|---|---|
| 1   | 1 | 4 |
| 2   | 2 | 5 |
| 3   | 3 | 6 |
| 4   | 4 | 7 |
| 5   | 5 | 8 |
| 6   | 6 | 9 | 
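
As a minimal sketch of what the bracket syntax does, the snippet below checks that [A; B] and vcat(A, B) give the same result. It also shows the cols keyword of vcat for frames whose columns do not match; the frame D is a hypothetical example that is not part of the question, and the keyword assumes a reasonably recent DataFrames.jl.

```julia
using DataFrames

A = DataFrame(x = [1, 2, 3], y = [4, 5, 6])
B = DataFrame(x = [4, 5, 6], y = [7, 8, 9])

# [A; B] is syntax for vcat(A, B), so both build the same 6x2 data frame.
stacked = vcat(A, B)
same = [A; B] == stacked

# Hypothetical frame D has no column y. With cols = :union, vcat keeps
# every column and fills the gaps with `missing`.
D = DataFrame(x = [7, 8])
widened = vcat(A, D; cols = :union)
```

Without cols = :union, vcat expects both frames to have the same column names.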
Reza Afzalan
  • This is perfect, thanks! Out of curiosity, do you know if this is documented somewhere? It works great but I can't seem to find it in the DataFrame docs. – Alex A. Dec 18 '15 at 04:20
  • Entering `?DataFrame` in the REPL prints a list of useful methods for `DataFrame`. – Reza Afzalan Dec 18 '15 at 04:34
4

I had the same question. It turns out the append! function is a more efficient way to do this:

append!(A,B)

This modifies the original dataframe A. If you want to create a new dataframe, you can do:

C = deepcopy(A)
append!(C,B)

Note that this solution is more efficient than C = vcat(A,B). Run the following code to compare the memory allocations.

A = DataFrame(x = [1, 2, 3], y = [4, 5, 6])
B = DataFrame(x = [4, 5, 6], y = [7, 8, 9])

## method 1: deepcopy append!
@time let
    C = deepcopy(A)
    append!(C, B)
end

## method 2: vcat
@time vcat(A,B)

## method 3: modifies A
@time append!(A,B)

For methods 1, 2 and 3 I get, respectively, (27 allocations: 2.063 KiB), (78 allocations: 5.750 KiB) and (8 allocations: 352 bytes).
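
The first @time call on a method also counts compilation, so the numbers can vary between sessions. A rough sketch of a steadier comparison is below: it warms each approach up once and then reads the allocated bytes with Base's @allocated. The helper names deepcopy_append and copy_append are made up for this example, and the copy(A) variant is an extra option not measured above.

```julia
using DataFrames

A = DataFrame(x = [1, 2, 3], y = [4, 5, 6])
B = DataFrame(x = [4, 5, 6], y = [7, 8, 9])

# Hypothetical helpers, so each approach is compiled once before measuring.
deepcopy_append(a, b) = append!(deepcopy(a), b)
copy_append(a, b) = append!(copy(a), b)  # copy(a) also copies the column vectors

# Warm-up calls: the first call of each method includes compilation.
deepcopy_append(A, B)
copy_append(A, B)
vcat(A, B)

# Bytes allocated by each non-mutating approach; A and B stay unchanged.
bytes_deepcopy = @allocated deepcopy_append(A, B)
bytes_copy = @allocated copy_append(A, B)
bytes_vcat = @allocated vcat(A, B)

println("deepcopy + append!: ", bytes_deepcopy, " bytes")
println("copy + append!:     ", bytes_copy, " bytes")
println("vcat:               ", bytes_vcat, " bytes")
```

The exact figures depend on the Julia and DataFrames.jl versions, so treat them as relative rather than absolute.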

GuiWil
1

You can also use vcat(A,B) to concatenate two data frames.

If your data frames are stored in an array AB, the splat operator (...) also works: vcat(AB...).
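
A small sketch of the array case, assuming AB is a vector of data frames with the same columns (the contents here are made up for illustration). It also shows reduce(vcat, AB), which gives the same result without splatting and may be preferable when the array is long.

```julia
using DataFrames

# AB is a hypothetical vector of data frames with matching columns.
AB = [DataFrame(x = [1, 2], y = [3, 4]),
      DataFrame(x = [5], y = [6]),
      DataFrame(x = [7, 8], y = [9, 10])]

splatted = vcat(AB...)      # passes each data frame as a separate argument
reduced = reduce(vcat, AB)  # same rows, built without splatting
```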

xiaodai