I have a pandas DataFrame with a column named values that looks like this:
0 16 0
1 7 1 2 0
2 5
3 1
4 18
What I want is to create another column, modified_values, that contains a list of all the numbers I get after splitting each value. The new column would look like this:
0 [16, 0]
1 [7, 1, 2, 0]
2 [5]
3 [1]
4 [18]
Note that the values in each list should be int, not strings.
Things that I am aware of:
1) I can split the column in a vectorized way with df['values'].str.split(" "). This gives me the lists, but the objects inside them are strings. I can add another operation on top of that, like df['values'].str.split(" ").apply(func to convert values to int), but that wouldn't be vectorized.
2) I can do it directly with df['modified_values'] = df['values'].apply(func that splits as well as converts to int).
The second one will be much slower than the first for sure, but I am wondering if the same thing can be achieved in a vectorized way.
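
For concreteness, here is a minimal sketch of the two approaches listed above. It assumes the column holds space-separated digit strings exactly as in the sample data; the lambda bodies and the name split_then_convert are illustrative stand-ins for the unspecified "func", not part of any existing code.

```python
import pandas as pd

# Sample data from the question: each cell is a space-separated string.
df = pd.DataFrame({"values": ["16 0", "7 1 2 0", "5", "1", "18"]})

# Approach 1: vectorized string split, followed by a per-row conversion to int.
# The split goes through the .str accessor; the int conversion still runs row by row.
split_then_convert = (
    df["values"].str.split(" ").apply(lambda parts: [int(p) for p in parts])
)

# Approach 2: a single apply that both splits and converts in one pass.
df["modified_values"] = df["values"].apply(
    lambda s: [int(p) for p in s.split(" ")]
)

print(df["modified_values"].tolist())
```

Both variants produce lists of Python int objects; they differ only in whether the split happens through the .str accessor or inside the applied function.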