Is there a simple way to change a column of yes/no to 1/0 in a Pandas dataframe?

Question

I read a CSV file into a pandas dataframe and would like to convert the columns with binary answers from the strings yes/no to the integers 1/0. Below is one such column ("sampleDF" is the pandas dataframe).

In [13]: sampleDF.housing[0:10]
Out[13]:
0     no
1     no
2    yes
3     no
4     no
5     no
6     no
7     no
8    yes
9    yes
Name: housing, dtype: object

Help is much appreciated!

`sampleDF.housing.replace(('yes', 'no'), (1, 0), inplace=True)` — AChampion, Dec 01 '16 at 02:45
**Note:** Python has bools, and so does NumPy. Use them, not `0`/`1`, or `'0'`/`'1'`. — AMC, Feb 16 '20 at 21:16

score 123 · Answer 1 · answered Dec 01 '16 at 04:34

method 1

sample.housing.eq('yes').mul(1)

method 2

pd.Series(np.where(sample.housing.values == 'yes', 1, 0),
          sample.index)

method 3

sample.housing.map(dict(yes=1, no=0))

method 4

pd.Series(map(lambda x: dict(yes=1, no=0)[x],
              sample.housing.values.tolist()), sample.index)

method 5

pd.Series(np.searchsorted(['no', 'yes'], sample.housing.values), sample.index)

All yield the same result: a Series of 1/0 integers.

[Figure: timing, given sample]

[Figure: timing, long sample]
sample = pd.DataFrame(dict(housing=np.random.choice(('yes', 'no'), size=100000)))
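
As a rough check that the five methods agree, here is a minimal sketch on a small hand-made column. The five-row `sample` and the names `results`, `expected` and `all_match` are illustrative assumptions, and the values are assumed to be exactly 'yes' or 'no'.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's column (assumed to hold only 'yes'/'no').
sample = pd.DataFrame({"housing": ["no", "no", "yes", "no", "yes"]})

results = {
    "method 1": sample.housing.eq("yes").mul(1),
    "method 2": pd.Series(np.where(sample.housing.values == "yes", 1, 0), sample.index),
    "method 3": sample.housing.map(dict(yes=1, no=0)),
    # list(...) materialises the map iterator before building the Series
    "method 4": pd.Series(
        list(map(lambda x: dict(yes=1, no=0)[x], sample.housing.values.tolist())),
        sample.index,
    ),
    "method 5": pd.Series(
        np.searchsorted(["no", "yes"], sample.housing.values), sample.index
    ),
}

expected = [0, 0, 1, 0, 1]
all_match = all(r.astype(int).tolist() == expected for r in results.values())
print(all_match)
```

Method 5 relies on 'no' sorting before 'yes', so it only works when the lookup list is sorted and every value appears in it.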

piRSquared

This is a great in depth answer. I wouldn't have even thought of some of these. – gold_cy Dec 01 '16 at 04:58
Can you do an entire dataframe full of yes and nos? I'm looking at this congressional vote dataset: http://archive.ics.uci.edu/ml/datasets/Congressional+Voting+Records – mLstudent33 Mar 13 '20 at 00:46

score 40 · Answer 2 · answered Dec 01 '16 at 02:46

Try this:

sampleDF['housing'] = sampleDF['housing'].map({'yes': 1, 'no': 0})
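
One thing to watch with `map` and a dict: any value that is not a key comes back as NaN, which also turns the column into floats. A small sketch, assuming a hypothetical column that contains an unexpected value and a missing one:

```python
import pandas as pd

# Hypothetical column with one unexpected value and one missing value.
housing = pd.Series(["yes", "no", "unknown", None, "yes"], name="housing")

mapped = housing.map({"yes": 1, "no": 0})
# Unmapped or missing entries become NaN, so the dtype is float64, not int.
n_unmapped = int(mapped.isna().sum())

# The nullable integer dtype keeps 1/0 as integers and missing entries as <NA>.
nullable = mapped.astype("Int64")
print(n_unmapped)
print(nullable.dtype)
```

Counting the NaN values after mapping is a cheap way to notice answers other than yes/no before they cause trouble downstream.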

gold_cy

3novak · Answer 3 · 2016-12-01T02:58:56.237

16

# produces True/False
sampleDF['housing'] = sampleDF['housing'] == 'yes'

The above returns True/False values, which behave like 1/0 in arithmetic: booleans support sum() and similar aggregations. If you really need 1/0 values, you can use the following.

housing_map = {'yes': 1, 'no': 0}
sampleDF['housing'] = sampleDF['housing'].map(housing_map)
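
To illustrate the point about booleans, a short sketch on a made-up five-row column (the data and the names `is_yes`, `n_yes`, `share_yes` and `as_int` are assumptions for the example):

```python
import pandas as pd

sampleDF = pd.DataFrame({"housing": ["no", "no", "yes", "no", "yes"]})
is_yes = sampleDF["housing"] == "yes"  # boolean Series

n_yes = int(is_yes.sum())          # True counts as 1
share_yes = float(is_yes.mean())   # fraction of 'yes' answers
as_int = is_yes.astype(int)        # explicit 1/0 only where integers are required
print(n_yes, share_yes)
```

Keeping the column boolean is usually enough for counting and filtering; the `astype(int)` step is only needed when a consumer insists on integers.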

edited Dec 01 '16 at 02:58

answered Dec 01 '16 at 02:45

3novak


score 8 · Answer 4 · answered Feb 23 '18 at 17:51

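%timeit sampleDF['housing'] = sampleDF['housing'].apply(lambda x: 0 if x == 'no' else 1)

1.84 ms ± 56.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Replaces 'no' with 0 and any other value (here, 'yes') with 1 in the specified column. Note that %timeit runs the statement repeatedly, so after the first run the column no longer contains 'no' and every value becomes 1; run the assignment once on its own to convert the data.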

SriramKRaju

Freek Nortier · Answer 5 · 2019-03-19T21:18:55.190

5

Use sklearn's LabelEncoder:

from sklearn.preprocessing import LabelEncoder

lb = LabelEncoder()
# classes are sorted, so 'no' -> 0 and 'yes' -> 1
sampleDF['housing'] = lb.fit_transform(sampleDF['housing'])

edited Mar 19 '19 at 21:18

answered Mar 19 '19 at 20:59

Freek Nortier


score 4 · Answer 6 · answered Jan 08 '20 at 09:35

Yes, there is. You can change the yes/no values of your column to 1/0 with the following snippet:

sampleDF = sampleDF.replace(to_replace=['yes', 'no'], value=[1, 0])
sampleDF

The first line replaces the values with the integers 1/0 throughout the dataframe; the second line displays the result.

score 3 · Answer 7 · answered Aug 06 '18 at 08:23

Generic way, for any Series of strings (here called string_data):

import pandas as pd

string_data = string_data.astype('category')
numbers_data = string_data.cat.codes  # one integer code per category

reference: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.astype.html
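
A sketch of how the category codes are assigned, using made-up data. By default the categories are sorted, so the code for 'yes' depends on which values happen to be present; passing explicit categories pins the mapping. The names `default_codes`, `only_yes_codes`, `yes_no` and `pinned_codes` are illustrative.

```python
import pandas as pd

string_data = pd.Series(["no", "yes", "yes", "no"])

# Default: categories are sorted, so 'no' -> 0 and 'yes' -> 1.
default_codes = string_data.astype("category").cat.codes

# With only 'yes' present, 'yes' is the first category and gets code 0.
only_yes_codes = pd.Series(["yes", "yes"]).astype("category").cat.codes

# Explicit categories pin the mapping regardless of which values are present.
yes_no = pd.CategoricalDtype(categories=["no", "yes"])
pinned_codes = pd.Series(["yes", "yes"]).astype(yes_no).cat.codes

print(default_codes.tolist(), only_yes_codes.tolist(), pinned_codes.tolist())
```

Pinning the categories matters when the same encoding is applied to several files or to a subset of the rows.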

Siddaram H

score 3 · Answer 8 · edited Aug 10 '21 at 09:55

For a dataset named data and a column named Paid:

data = data.replace({'Paid': {'yes': 1, 'no': 0}})

Every yes in that column becomes 1 and every no becomes 0.

edited Aug 10 '21 at 09:55

Neuron


answered Aug 10 '21 at 08:19

muli


score 1 · Answer 9 · answered Jan 20 '19 at 02:21

You can convert a series from Boolean to integer explicitly:

sampleDF['housing'] = sampleDF['housing'].eq('yes').astype(int)

jpp

score 1 · Answer 10 · answered Feb 17 '19 at 04:44

An easy way to do this with pandas:

housing = pd.get_dummies(sampleDF['housing'], drop_first=True)

After that, drop the original field from the main dataframe:

sampleDF.drop('housing', axis=1, inplace=True)

Now merge the new column into your dataframe:

sampleDF = pd.concat([sampleDF, housing], axis=1)
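
The three steps above can also be collapsed into one call. This is a sketch under assumptions: the two-column `sampleDF` is made up, and `dtype=int` is passed so the result is 1/0 rather than whatever default dtype the installed pandas version uses for dummies.

```python
import pandas as pd

sampleDF = pd.DataFrame({"housing": ["no", "yes", "no"], "age": [30, 41, 25]})

# columns= encodes the named column in place of the original;
# drop_first=True keeps only the 'yes' indicator.
encoded = pd.get_dummies(sampleDF, columns=["housing"], drop_first=True, dtype=int)
print(list(encoded.columns))
```

The encoded column is named with the original column as a prefix (here `housing_yes`), so it may need renaming if the old name is expected elsewhere.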

score 1 · Answer 11 · answered May 28 '19 at 15:09

A simple and intuitive way to convert the whole dataframe to 0's and 1's might be:

sampleDF = sampleDF.replace(to_replace = "yes", value = 1)
sampleDF = sampleDF.replace(to_replace = "no", value = 0)

Josmy

You can do this in a single line as well : `sampleDF = sampleDF.replace( to_replace = {"no" : 0, "yes" : 1})` – Shivam Shah Oct 18 '20 at 07:24
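
A whole-dataframe replace changes every cell equal to 'yes' or 'no', including cells in columns that are not purely binary. A more conservative sketch, assuming a made-up dataframe with a mixed `job` column, converts only the columns whose values are all 'yes' or 'no' (`mapping` and `binary_cols` are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({
    "housing": ["no", "yes", "no"],
    "loan": ["yes", "yes", "no"],
    "job": ["admin", "yes", "services"],  # not purely yes/no, left untouched
})

mapping = {"yes": 1, "no": 0}

# Only convert columns whose every value is a key of the mapping.
binary_cols = [c for c in df.columns if df[c].isin(list(mapping)).all()]
for c in binary_cols:
    df[c] = df[c].map(mapping)

print(binary_cols)
```

Columns with missing values are skipped by this check, since NaN is not a key of the mapping; whether that is desirable depends on the dataset.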

score 1 · Answer 12 · answered Aug 09 '20 at 19:23

sampleDF['housing'] = sampleDF['housing'].map(lambda x: 1 if x == 'yes' else 0)
sampleDF['housing'] = sampleDF['housing'].astype(int)

This will work.

Benny06

score 1 · Answer 13 · answered Sep 28 '20 at 06:37

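Try this. The result is assigned back to the column, because calling replace with inplace=True on sampleDF.housing may act on a temporary copy and leave the dataframe unchanged in recent pandas versions.

sampleDF['housing'] = sampleDF['housing'].replace(['no', 'yes'], [0, 1])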

Nija I Pillai

score 0 · Answer 14 · answered Sep 12 '18 at 14:06

Try the following:

sampleDF['housing'] = sampleDF['housing'].str.lower().replace({'yes': 1, 'no': 0})
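
A variation on the same idea, sketched on made-up messy input: strip whitespace as well as lowercasing, and use `map` instead of `replace` so that anything other than yes/no shows up as NaN rather than passing through unchanged. The names `raw` and `cleaned` are illustrative.

```python
import pandas as pd

# Hypothetical raw answers with inconsistent case and stray spaces.
raw = pd.Series(["Yes", " no", "YES ", "No"], name="housing")

cleaned = raw.str.strip().str.lower().map({"yes": 1, "no": 0})
print(cleaned.tolist())
```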

Sazzad

Turkey · Answer 15 · 2020-02-16T16:25:31.463

0

I used the preprocessing module from sklearn. First you create an encoder.

from sklearn import preprocessing

le = preprocessing.LabelEncoder()

Then, for each attribute or characteristic in the data, use the label encoder to transform it into integer values:

size = le.fit_transform(list(data["size"]))
color = le.fit_transform(list(data["color"]))

This takes the list of all the "size" or "color" values and converts it into a list of their corresponding integer values. To put all of this into one list, use the zip function.

It is not going to be in the same format as the csv file; it will be a giant list of everything.

data = list(zip(size, color))

Hopefully I explained that somewhat clearly.

edited Feb 16 '20 at 16:25

answered Feb 15 '20 at 00:09

Turkey


Isn't this a bit risky, like many uses of `string.replace()`, since it will replace the values **everywhere**? – AMC Feb 15 '20 at 00:51
Yeah, I found a better solution so I put that. I don't think I explained it very well though. – Turkey Feb 16 '20 at 16:26

score 0 · Answer 16 · answered Apr 07 '20 at 08:39

You can also try:

sampleDF["housing"] = (sampleDF["housing"] == "yes") * 1

Şenol Kurt

DeepBlue · Answer 17 · 2020-06-05T09:59:56.283

0

This is just a bool to int.

Try this.

sampleDF.housing = (sampleDF.housing == 'yes').astype(int)

edited Jun 05 '20 at 09:59

answered Jun 04 '20 at 22:43

DeepBlue


score 0 · Answer 18 · answered Nov 06 '20 at 11:27

Use pandas.Series.map on the column:

sampleDF['housing'] = sampleDF['housing'].map({'yes': 1, 'no': 0})

Manikiran Bodepudi

score 0 · Answer 19 · answered Jun 17 '21 at 14:42

List comprehension:

sampleDF['housing'] = [int(v == 'yes') for v in sampleDF['housing']]

Tolstoïevski