I am trying to do pd.wide_to_long and I want to separate based off of what is inside a parenthesis. I looked up the pandas manual here and the closes example is below. I changed the column names to match something to what I have.
np.random.seed(0)
df = pd.DataFrame({'A(weekly_2010)': np.random.rand(3),
'A(weekly_2011)': np.random.rand(3),
'B(weekly_2010)': np.random.rand(3),
'B(weekly_2011)': np.random.rand(3),
'X' : np.random.randint(3, size=3)})
df['id'] = df.index
I would do
df1=pd.wide_to_long(df, stubnames=['A', 'B'], i=['id'], j='year', sep='()', suffix='\w+')
This returns an empty dataframe. I looked at this example and switched the i and J but it returns a KeyError: "None of [Index(['year'], dtype='object')] are in the [columns]" I have tried the following for sep= and still no luck. Does someone have an idea of how to get the following dataframe.
df1=pd.wide_to_long(df, stubnames=['A', 'B'], i=['id'], j='year', sep='(*)', suffix='\w+')
df1=pd.wide_to_long(df, stubnames=['A', 'B'], i=['id'], j='year', sep='\\()', suffix='\w+')
df1=pd.wide_to_long(df, stubnames=['A', 'B'], i=['id'], j='year', sep=r'\\()', suffix='\w+')
df1=pd.wide_to_long(df11, stubnames=['A', 'B'], i=['id'], j='year', sep=r'\(*\)', suffix='\w+')
The desired dataframe would be
desiredDF= pd.DataFrame({'year':["(weekly_2010)","(weekly_2010)","(weekly_2010)","(weekly_2011)","(weekly_2011)","(weekly_2011)"],
'A': np.random.rand(6),
'B': np.random.rand(6),
'X' : np.random.randint(6, size=6),
'id':np.random.randint(6, size=6)})
obviously with the same numbers, I just wanted to show the column that matters which is the year column. I need to do pivot because my dataframe is more complicated and doing other formats, messes up the numbers. If anyone knows how to write in the sep, I would greatly appreciate it!