1

I need some help to solve this problem. I have a panda dataframe that has 101 unique columns[Column A to Column CW)]. I need to unpivot them into row-based output.

Original Data frame;

 HSP    CAT     AMK      COL   OPK      ROI     GIO     DOL     values
   0      0      0        0     0        0       0       0      0.365
   0      0      0        0     0        0       0       0      0.8454
   0      0      0        0     0        0       0       0      0.74654
   0      0      0        0     0        0       0       0      0.74654 

Output should be;

Industry   LGA   values
HSP        OPK   0.365
CAT        ROI   0.8454
AMK        GOI   0.74654
COL        DOL   0.74654

I have tested a function called melt() to unpivot 3 columns. I have tested it for the first time. I am not sure how to provide all column names dynamically within the same function. so that I don't have to inject every column name in that function. Here is my sample code;

df1= pd.melt(df, id_vars=['values'], value_vars= ['HSP', 'CAT','AMK', 'COL']) << same for another var called 'LGA'

Unfortunately its not working based on my requirement. I have columns (A to U) which should be captured in "Industry" and Column (v to CV) should be captured in "LGA" column. Not sure how to create two "value_bars" parameters. I know that there will be another column called "value" with 1 in each row. I can drop that column later. I am just trying to find a way provide all columns(A:CW) into two "value_vars" parameter section. Not sure how!

Tried this to impute the column to rows dynamically;

X_testP = ({"Industry": X_test.iloc[A:U].columns,
          "LGA": X_test.iloc[V:CV].columns,
          "values": X_test.iloc['values']})

Unable to get the desired output.

Any help would be appreciated.

Thanks in advance

adey27
  • 439
  • 3
  • 19
  • You could use list comprehension to generate a list of columns with whatever constraints you need. – Chris Jun 08 '21 at 01:36
  • Hi @Chris, thanks for your advice. I am unable to replicate that in my notebook. could you pls elaborate more? thanks – adey27 Jun 08 '21 at 01:56
  • Kindly add example dataframe with expected output. If you can, add code for input and output dataframe – sammywemmy Jun 08 '21 at 02:13
  • @adey27, apologies. I'm not sure I get it, what is the original data frame? it doesn't look like you shared the original data frame (the data frame before you melt). You don't have to share the entire columns from A to Z, a subset that accurately captures the whole would suffice – sammywemmy Jun 08 '21 at 04:14
  • @sammywemmy, I have added my revised sample code and df info. I am a newbie to this sort of transformations. hence unable to find out a solution for this. I am going to update my post again with an original dataframe. appreciate your help – adey27 Jun 08 '21 at 04:18

1 Answers1

2
>>> pd.DataFrame({"Industry": df.filter(like="Ind").columns,
                  "LGA": df.filter(like="LGA").columns,
                  "values": df["values"]})

  Industry   LGA   values
0    Ind 1  LGA1  0.36500
1    Ind 2  LGA2  0.84540
2    Ind 3  LGA3  0.74654
3    Ind 4  LGA4  0.74654
Corralien
  • 109,409
  • 8
  • 28
  • 52
  • thanks for your advice. your suggestion will work based on my previous data frame which I have corrected now. My actual column names are unique and cant be filter them by their name. As stated earlier, I have 105 unique columns. against which I need to divide them into three columns. I tried the above code. but still not working. – adey27 Jun 09 '21 at 03:34
  • What is the shape of the final output? How many columns, how many rows? – Corralien Jun 09 '21 at 04:58
  • 105 columns and 1700 rows – adey27 Jun 09 '21 at 05:14
  • I mean the expected output: 3 columns (Industry, LGA, values)? Give an example with 4 rows and 10+1 columns, please. I want to know the value of the last row. – Corralien Jun 09 '21 at 05:21
  • Apologies. I'm not sure I get it, I have 105 columns in my expected output. values mentioned in my question is an example one – adey27 Jun 09 '21 at 05:40