
Currently what I do is:

    import pandas as pd

    # get_columns holds the names of the related columns to merge
    toConcat = []
    for cname in get_columns:
        toConcat.append(df[cname])

    # stack the pieces end-to-end into one Series, then drop missing values
    res = pd.concat(toConcat, axis=0, ignore_index=True)
    res = res.dropna()

While this works, I wonder whether there are other, faster, built-in ways of handling this case. The reason I do this is that different datasets have different numbers of columns holding related information, and I want to merge them into a single column so that I can run frequency/mean calculations on them.
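
For comparison with the answers below, one built-in route that collapses the loop, the concat, and the dropna into a single step is DataFrame.stack. This is only a sketch, assuming get_columns is a plain list of column labels as in the question's code:

    import pandas as pd

    # stack() pivots the selected columns into one long Series and drops
    # NaN values by default; resetting the index gives clean 0..n-1 labels
    res = df[get_columns].stack().reset_index(drop=True)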

Again, thanks for all the support!

Cenoc
  • This seems fine to me as you are concatenating all your dfs in one go; if you did a join or merge you would end up repeatedly joining/merging, and each time you allocate space for the additional rows/columns. I don't know, for instance, if you could directly assign new columns to a master df, something like `df['new_col'], df['another_col'] ... = other_df['new_col'], another_df['another_col']....` etc.; however, this approach would require that the indices align, which may not be true. In any case, I think concat is appropriate – EdChum Oct 21 '14 at 15:10
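
To illustrate the alignment caveat in this comment, here is a hypothetical sketch with two made-up frames a and b: column assignment aligns on index labels, not position, so mismatched indices silently produce NaN:

    import pandas as pd

    # two frames whose indices share no labels
    a = pd.DataFrame({'x': [1, 2, 3]}, index=[0, 1, 2])
    b = pd.DataFrame({'y': [10, 20, 30]}, index=[5, 6, 7])

    # assignment aligns on the index, so none of b's rows line up with a's
    a['y'] = b['y']
    print(a['y'].isna().all())  # True: the new column is entirely NaN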

2 Answers


How about

    pd.Series(df[get_columns].values.flatten())
bjonen
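
One caveat on the flatten approach: unlike the concat-plus-dropna version in the question, flatten() keeps any NaN values. A sketch that yields the same cleaned Series, again assuming get_columns is the question's list of column names:

    import pandas as pd

    # flatten() walks the 2-D value array row by row; NaNs survive the
    # trip, so drop them explicitly to match the question's result
    res = pd.Series(df[get_columns].values.flatten()).dropna()

The values come out row by row rather than column by column, which makes no difference for frequency or mean calculations.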

Without the flattening of @bjonen's answer, something like this gives you one string per row instead:

    pd.Series(map(str, df[get_columns].values))
acushner
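
Since each element of this Series is the printed form of an entire row, it suits frequency counts over combinations of columns rather than numeric work such as means. A hypothetical illustration with a made-up two-column frame:

    import pandas as pd

    df = pd.DataFrame({'a': [1, 1], 'b': [2, 3]})

    # each element is the string form of one whole row of values
    s = pd.Series(map(str, df[['a', 'b']].values))
    print(s.tolist())  # ['[1 2]', '[1 3]']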