
I have a data set that has the following columns.

data.columns[1:]
Index(['Fraud (i.e. fabricated or falsified results)',
       'Pressure to publish for career advancement',
       'Insufficient oversight/mentoring by lab principal investigator (e.g. reviewing raw data)',
       'Insufficient peer review of research',
       'Selective reporting of results',
       'Original findings not robust enough because not replicated enough in the lab publishing the work',
       'Original findings obtained with low statistical power/poor statistical analysis',
       'Mistakes or inadequate expertise in reproduction efforts',
       'Raw data not available from original lab',
       'Protocols, computer code or reagent information insufficient or not available from original lab',
       "Methods need 'green fingers' – particular technical expertise that is difficult for others to reproduce",
       'Variability of standard reagents', 'Poor experimental design',
       'Bad luck'],
      dtype='object')

I want to melt these columns with pd.melt, so I wrote the following code.

data_melt = pd.melt(data, id_vars =['respid'], value_vars =['Fraud (i.e. fabricated or falsified results)',
 'Pressure to publish for career advancement',
 'Insufficient oversight/mentoring by lab principal investigator (e.g. reviewing raw data)',
 'Insufficient peer review of research',
 'Selective reporting of results',
 'Original findings not robust enough because not replicated enough in the lab publishing the work',
 'Original findings obtained with low statistical power/poor statistical analysis',
 'Mistakes or inadequate expertise in reproduction efforts',
 'Raw data not available from original lab',
 'Protocols, computer code or reagent information insufficient or not available from original lab',
 "Methods need 'green fingers' – particular technical expertise that is difficult for others to reproduce",
 'Variability of standard reagents',
 'Poor experimental design','Bad luck'],var_name = 'factor', value_name = 'rate')

Basically, I just pasted the column names into value_vars.

Is it possible to write code that achieves the same thing without listing every column by hand?

For example, something like the line below (I know it is wrong):

data_melt = pd.melt(data, id_vars =['respid'], value_vars = data.columns(), ,var_name = 'factor', value_name = 'rate')

Thanks!

Luke

2 Answers


Here's a solution:

import pandas as pd

# Create a dummy dataframe with columns similar to yours.
df = pd.DataFrame({"respid": range(5),
                   "Fraud (i.e. fabricated or falsified results)": range(5,10), 
                   'Pressure to publish for career advancement': range(10, 15), 
                   'Insufficient oversight/mentoring by lab principal investigator (e.g. reviewing raw data)': range(15,20), 
                   'Insufficient peer review of research': range(20,25)
                  })

pd.melt(df, id_vars =['respid'], value_vars=set(df.columns).difference(["respid"]))

The result is:

    respid                                           variable  value
0        0       Fraud (i.e. fabricated or falsified results)      5
1        1       Fraud (i.e. fabricated or falsified results)      6
2        2       Fraud (i.e. fabricated or falsified results)      7
3        3       Fraud (i.e. fabricated or falsified results)      8
4        4       Fraud (i.e. fabricated or falsified results)      9
5        0               Insufficient peer review of research     20
6        1               Insufficient peer review of research     21
7        2               Insufficient peer review of research     22
8        3               Insufficient peer review of research     23
...
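
Note that a Python set has no guaranteed iteration order, so the melted blocks may not come out in the original column order (as in the result above). A minimal sketch that keeps the column order, assuming a small dummy frame whose values are made up for illustration, and reusing the var_name and value_name from the question:

```python
import pandas as pd

# Hypothetical dummy data; only the column names come from the question.
df = pd.DataFrame({
    "respid": range(5),
    "Fraud (i.e. fabricated or falsified results)": range(5, 10),
    "Pressure to publish for career advancement": range(10, 15),
})

# Every column except the id column, in the original column order.
value_cols = [c for c in df.columns if c != "respid"]

melted = pd.melt(df, id_vars=["respid"], value_vars=value_cols,
                 var_name="factor", value_name="rate")
print(melted)
```

A list comprehension is used here instead of a set so that the output order is the same on every run.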
Roy2012

If data.columns[1:] contains the value_vars you need, you can pass it directly as the argument:

data_melt = pd.melt(data, id_vars=['respid'], value_vars=data.columns[1:], var_name='factor', value_name='rate')
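
As a quick check, here is a small sketch on a hypothetical two-row frame (the values are invented; only the column names come from the question). It also shows that when every non-id column should be melted, value_vars can be left out, since pd.melt then uses all columns not listed in id_vars:

```python
import pandas as pd

# Hypothetical stand-in for the real survey data, with respid as the first column.
data = pd.DataFrame({
    "respid": [1, 2],
    "Poor experimental design": [3, 4],
    "Bad luck": [1, 2],
})

# Pass the column Index slice directly as value_vars.
explicit = pd.melt(data, id_vars=["respid"], value_vars=data.columns[1:],
                   var_name="factor", value_name="rate")

# Omit value_vars: all columns other than id_vars are melted.
implicit = pd.melt(data, id_vars=["respid"],
                   var_name="factor", value_name="rate")

print(explicit.equals(implicit))
```

The explicit slice only works if respid really is the first column; omitting value_vars avoids that assumption.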
manu190466