1

I have a dataframe with the following columns:

Name, Year, V1, V2, V5, V10, V12...

This Table contains about 40 Vx Variables. The values of these variables can be 1-5. I want to recode them so

1-3 = 0 and
4-5 = 1

I know how to replace data for one column like this

Table['V1_F'] = Table['V1'].apply(lambda x: 0 if x <4 else 1)

But I do not know how to apply this on multiple columns efficiently or is there now way around writing this replace code for each column? Best would be something like 'Do it for all columns except Name and Year.

Any help is welcome.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
ruedi
  • 5,365
  • 15
  • 52
  • 88

2 Answers2

3

Get all columns names to variable and compare for boolean mask, then convert True/False to 1/0 by casting to integers:

cols = Table.columns.difference(['Name','Year'])
Table[cols] = (Table[cols] >= 4).astype(int)

Or by numpy.where:

Table[cols] = np.where(Table[cols] < 4, 0, 1)
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
1

Two possible solutions presented below

  • applymap if more sophisticated function required
  • your logic is binary, binary truth matrix and change back to integer representation
df = pd.DataFrame({**{"Name":np.random.choice(["this","that","other"],15),"Year":np.random.choice(range(1990,2021),15)},
             **{f"V{i}":np.random.randint(1,5,15) for i in range(10)}})

df2 = df.copy()
# solution 1
df.loc[:,[c for c in df.columns if c.startswith("V")]] = df.loc[:,[c for c in df.columns if c.startswith("V")]].applymap(lambda v: 0 if v<=3 else 1)
# solution 2
df2.loc[:,[c for c in df2.columns if c.startswith("V")]] = (df2.loc[:,[c for c in df2.columns if c.startswith("V")]]<=3).astype(int)

Name Year V0 V1 V2 V3 V4 V5 V6 V7 V8 V9
this 1998 0 1 0 0 1 0 0 0 0 0
that 2010 1 0 0 0 0 1 0 0 1 0
this 2004 0 0 0 0 1 0 0 1 0 0
this 1992 0 1 1 0 0 1 0 0 1 1
this 1990 0 0 1 0 0 0 0 0 0 1
this 2020 0 0 1 1 0 1 0 1 0 1
this 2016 0 1 0 0 0 0 1 0 1 0
other 1997 1 0 0 0 1 1 0 0 1 0
that 2000 1 0 1 0 0 1 1 0 0 0
that 2020 0 0 1 0 1 0 0 0 0 1
that 1991 0 0 0 0 0 0 1 0 0 1
other 2015 0 0 0 0 0 0 1 1 0 0
this 2020 0 0 0 1 0 0 0 0 0 0
other 2005 1 0 0 0 1 0 1 0 0 0
other 2008 1 0 0 0 0 0 1 0 0 0
Rob Raymond
  • 29,118
  • 3
  • 14
  • 30