I have a data such as below:
x | x | y | y |
---|---|---|---|
0 | 1 | 2 | 3 |
I want to merge duplicate columns one below another row wise like this
x | y |
---|---|
0 | 2 |
1 | 3 |
Is there a way I can achieve this in pandas dataframe?
I have a data such as below:
x | x | y | y |
---|---|---|---|
0 | 1 | 2 | 3 |
I want to merge duplicate columns one below another row wise like this
x | y |
---|---|
0 | 2 |
1 | 3 |
Is there a way I can achieve this in pandas dataframe?
If the data is the result of a groupby / pivot on an existing dataframe @Pygirl comment is corrent and I refer to her/his link as Link
However it does not work if the data is to be read from file, as pandas does not allow for duplicate named columns and modifies them as name, name.1, name.n
.
My answer refers to that case with dummy input test_import
X,X,Y
1,2,1
3,4,2
5,6,3
It basically finds all associated columns x -> (X, X.1,..) Creates a dict with unique keys, and concatinates the associated columns together.
import pandas as pd
import re
df = pd.read_csv("test_import", sep=",")
pattern = r"([^\.]*)\.{0,1}.*"
unique_header = set(re.match(pattern, i).groups()[0] for i in df.columns)
result = dict()
for header in unique_header:
result[header] = []
for column in df.columns:
if column[:len(header)] == header:
result[header].append(df[column])
result[header] = pd.concat(result[header])
final = pd.DataFrame()
for key, value in result.items():
final[key] = value
print(final)
I do admit, that this is not a sexy answer and I am happy for input for a leaner way, but I didn't come up with one.
X Y
0 1 1
1 3 2
2 5 3
0 2 1
1 4 2
2 6 3