I have a pandas dataframe with 10 rows and 5 columns and a numpy matrix of zeros np.zeros((10,3)).

I want to concatenate the numpy matrix to the pandas dataframe, but first I want to delete the last column of the dataframe.

So I will end up with a matrix of 10 rows and 5 - 1 + 3 = 7 columns.

I guess I could use

new_dataframe = pd.concat([
    original_dataframe,
    pd.DataFrame(np.zeros((10, 3)), dtype=int)  # np.int was removed from NumPy; use the builtin int
], axis=1, ignore_index=True)

where original_dataframe has 10 rows and 5 columns.

How do I delete the last column from original_dataframe before concatenating the numpy array? And how do I make sure I preserve all the data types?

  • You can slice the original df: `new_dataframe = pd.concat([original_dataframe.ix[:, :-1], pd.DataFrame(np.zeros((10, 3)), dtype=np.int)], axis=1, ignore_index=True)`. With regard to your last question, aren't the data types preserved anyway? – EdChum Sep 26 '16 at 08:56
  • `ix` is deprecated now, so consider using `iloc` or `loc`. See [my answer](https://stackoverflow.com/a/53821216/4909087) below. – cs95 Dec 18 '18 at 04:34

1 Answer

Setup

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.choice(10, (3, 3)), columns=list('ABC'))
df

   A  B  C
0  5  0  3
1  3  7  9
2  3  5  2

np.column_stack / np.hstack

pd.DataFrame(np.column_stack([df, np.zeros((df.shape[0], 3), dtype=int)]))
    
   0  1  2  3  4  5
0  5  0  3  0  0  0
1  3  7  9  0  0  0
2  3  5  2  0  0  0

This is fast, but the result does not retain the column names from df. To drop the last column before stacking, slice it off with iloc:

pd.DataFrame(np.column_stack([
    df.iloc[:, :-1], np.zeros((df.shape[0], 3), dtype=int)]))

   0  1  2  3  4
0  5  0  0  0  0
1  3  7  0  0  0
2  3  5  0  0  0
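
If the column names matter, they can be passed back in by hand. The following is only a sketch: the new column names D, E and F are assumed for illustration and do not come from the question. Keep in mind that np.column_stack produces a single array with one dtype, so a frame with mixed column types would not keep its per-column dtypes on this route.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.choice(10, (3, 3)), columns=list('ABC'))

# Drop the last column by position.
kept = df.iloc[:, :-1]

# Zero block with one row per row of df.
zeros = np.zeros((len(df), 3), dtype=int)

# Hypothetical names for the appended columns.
new_names = list('DEF')

# Stack the raw values side by side, then restore labels and index.
stacked = pd.DataFrame(
    np.column_stack([kept.to_numpy(), zeros]),
    columns=list(kept.columns) + new_names,
    index=df.index,
)
print(stacked)
```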

pd.concat

pd.concat only accepts pandas objects, so you will need to convert the array to a DataFrame first.

df2 = pd.DataFrame(np.zeros((df.shape[0], 3), dtype=int), columns=list('DEF'))
pd.concat([df, df2], axis=1)
 
   A  B  C  D  E  F
0  5  0  3  0  0  0
1  3  7  9  0  0  0
2  3  5  2  0  0  0
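
To answer the question as asked (drop the last column, then append the zeros, keeping dtypes), the same pd.concat approach can be combined with iloc. This is a minimal sketch on a made-up frame with mixed dtypes; the column names and values are assumptions for illustration. Giving the zero frame the same index as the original is what keeps concat from aligning on mismatched labels and introducing NaN (which would also change the dtypes).

```python
import numpy as np
import pandas as pd

# Hypothetical frame with mixed dtypes, to make dtype preservation visible.
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [0.5, 1.5, 2.5],
    'C': ['x', 'y', 'z'],
    'last': [True, False, True],
})

# Drop the last column by position.
trimmed = df.iloc[:, :-1]

# Wrap the zeros in a DataFrame that shares df's index.
zeros = pd.DataFrame(
    np.zeros((len(df), 3), dtype=int),
    columns=list('DEF'),
    index=df.index,
)

result = pd.concat([trimmed, zeros], axis=1)
print(result.dtypes)
```

Each column of trimmed keeps its own dtype in result, because concat along axis=1 places the columns next to each other rather than converting them to a common array.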

DataFrame.assign

If you are only adding columns filled with a constant value, you can use assign:

df.assign(**dict.fromkeys(list('DEF'), 0))

   A  B  C  D  E  F
0  5  0  3  0  0  0
1  3  7  9  0  0  0
2  3  5  2  0  0  0
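
The assign route also chains with iloc if the last column should go first. A small sketch, with the frame values typed in directly and the names D, E, F again assumed for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 3, 3], 'B': [0, 7, 5], 'C': [3, 9, 2]})

# dict.fromkeys builds {'D': 0, 'E': 0, 'F': 0}; assign broadcasts each scalar down its column.
new_cols = dict.fromkeys(list('DEF'), 0)

# assign returns a new DataFrame, so df itself is left unchanged.
out = df.iloc[:, :-1].assign(**new_cols)
print(out)
```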