I have a dataframe that looks like this:
Col1 | Col2 | Col1 | Col3 | Col1 | Col4
a | d | | h | a | p
b | e | b | i | b | l
| l | a | l | | a
l | r | l | a | l | x
a | i | a | w | | i
| c | | i | r | c
d | o | d | e | d | o
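For reproducibility, here is a minimal sketch that builds the dataframe above. It assumes pandas (the library is not stated above) and that the blank cells are missing values (None/NaN) rather than empty strings.

```python
import pandas as pd

# Assumption: blank cells in the table are missing values, not empty strings.
columns = ["Col1", "Col2", "Col1", "Col3", "Col1", "Col4"]
rows = [
    ["a",  "d", None, "h", "a",  "p"],
    ["b",  "e", "b",  "i", "b",  "l"],
    [None, "l", "a",  "l", None, "a"],
    ["l",  "r", "l",  "a", "l",  "x"],
    ["a",  "i", "a",  "w", None, "i"],
    [None, "c", None, "i", "r",  "c"],
    ["d",  "o", "d",  "e", "d",  "o"],
]

# Repeated labels are accepted when passed through `columns=`.
df = pd.DataFrame(rows, columns=columns)
print(df)
```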
Col1 is repeated multiple times in the dataframe. In each Col1, there is missing information. I need to create a new column that has all of the information from each Col1 occurrence.
How can I create a column with the complete information and then delete the previous duplicate columns?
A value may be missing from more than one of the Col1 columns. The script also needs to keep working in the future, when there could be one, three, five, or any number of duplicated Col1 columns.
The desired output looks like this:
Col2 | Col3 | Col4 | Col5
d | h | p | a
e | i | l | b
l | l | a | a
r | a | x | l
i | w | i | a
c | i | c | r
o | e | o | d
I have been looking over this question, but it is not clear to me how I could keep the desired Col1 with complete values. I could delete multiple columns of the same name, but I first need to create a column with the complete information.
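To make the goal concrete, this is a rough sketch of the kind of helper I am after, not a confirmed solution. It assumes pandas and that missing cells are None/NaN. The function name `coalesce_duplicates` and the small `demo` frame are hypothetical, made up for illustration. The idea is to select every column with the repeated label, back-fill across each row so the first copy holds the first non-missing value, keep that copy under the new name, and drop the originals.

```python
import pandas as pd


def coalesce_duplicates(df, name, new_name):
    """Merge every column labelled `name` into one column called `new_name`."""
    # A boolean mask always yields a DataFrame, whether there are one, three,
    # or any number of copies; df[name] would give a Series for a single copy.
    mask = df.columns == name
    dupes = df.loc[:, mask]

    # Back-fill across each row so the first copy holds the first
    # non-missing value in that row, then keep only that first copy.
    merged = dupes.bfill(axis=1).iloc[:, 0]

    # Drop all copies of `name` and append the merged column at the end.
    out = df.loc[:, ~mask].copy()
    out[new_name] = merged
    return out


# Tiny hypothetical frame with two copies of Col1.
demo = pd.DataFrame(
    [["a", "d", None], [None, "e", "b"]],
    columns=["Col1", "Col2", "Col1"],
)
result = coalesce_duplicates(demo, "Col1", "Col5")
print(result)
```

If the copies ever disagree on a row, this sketch silently keeps the leftmost non-missing value, which may or may not be what is wanted.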