
Currently what I do is:

    import pandas as pd

    # get_columns holds the names of the related columns to merge
    toConcat = []
    for cname in get_columns:
        toConcat.append(df[cname])

    # stack the pieces end-to-end into one Series, then drop missing values
    res = pd.concat(toConcat, axis=0, ignore_index=True)
    res = res.dropna()

While this works, I wonder whether there are other, faster, built-in ways of handling this case. The reason I do this is that different datasets have different numbers of columns holding related information, and I want to merge them into a single column so that I can run frequency/mean calculations on them.
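
For comparison with the answers below, one built-in route that collapses the loop, the concat, and the dropna into a single step is DataFrame.stack. This is only a sketch, assuming get_columns is a plain list of column labels as in the question's code:

    import pandas as pd

    # stack() pivots the selected columns into one long Series and drops
    # NaN values by default; resetting the index gives clean 0..n-1 labels
    res = df[get_columns].stack().reset_index(drop=True)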

Again, thanks for all the support!

Cenoc
  • This seems fine to me as you are concatenating all your dfs in one go; if you did a join or merge you would end up repeatedly joining/merging, and each time you allocate space for the additional rows/columns. I don't know, for instance, if you could directly assign new columns to a master df, something like `df['new_col'], df['another_col'] ... = other_df['new_col'], another_df['another_col']....` etc.; however, this approach would require that the indices align, which may not be true. In any case, I think concat is appropriate – EdChum Oct 21 '14 at 15:10
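
To illustrate the alignment caveat in this comment, here is a hypothetical sketch with two made-up frames a and b: column assignment aligns on index labels, not position, so mismatched indices silently produce NaN:

    import pandas as pd

    # two frames whose indices share no labels
    a = pd.DataFrame({'x': [1, 2, 3]}, index=[0, 1, 2])
    b = pd.DataFrame({'y': [10, 20, 30]}, index=[5, 6, 7])

    # assignment aligns on the index, so none of b's rows line up with a's
    a['y'] = b['y']
    print(a['y'].isna().all())  # True: the new column is entirely NaN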

2 Answers


How about

    pd.Series(df[get_columns].values.flatten())
bjonen
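
One caveat on the flatten approach: unlike the concat-plus-dropna version in the question, flatten() keeps any NaN values. A sketch that yields the same cleaned Series, again assuming get_columns is the question's list of column names:

    import pandas as pd

    # flatten() walks the 2-D value array row by row; NaNs survive the
    # trip, so drop them explicitly to match the question's result
    res = pd.Series(df[get_columns].values.flatten()).dropna()

The values come out row by row rather than column by column, which makes no difference for frequency or mean calculations.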

Without the flattening of @bjonen's answer, something like this gives you one string per row instead:

    pd.Series(map(str, df[get_columns].values))
acushner
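
Since each element of this Series is the printed form of an entire row, it suits frequency counts over combinations of columns rather than numeric work such as means. A hypothetical illustration with a made-up two-column frame:

    import pandas as pd

    df = pd.DataFrame({'a': [1, 1], 'b': [2, 3]})

    # each element is the string form of one whole row of values
    s = pd.Series(map(str, df[['a', 'b']].values))
    print(s.tolist())  # ['[1 2]', '[1 3]']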