-1

I'm working with panel data and I'm stuck.

I want to create a numeric ID (NumID) using Country as the reference, so that all rows for the same country share one number. Can someone help me? Many thanks!

Marco Daniel
  • So many different ways: `pd.factorize`, scikit-learn's `LabelEncoder`, etc. – rafaelc Jul 16 '19 at 00:46
  • *Please* don't post data / code as images. You should paste it directly into your question then use code formatting (the `{}` button). Also, you haven't included what you've already tried, and where you're stuck. – SiHa Jul 16 '19 at 07:40

2 Answers

1

A few options:

groupby & ngroup

df['NumID_1'] = df.groupby('Country').ngroup() + 1

factorize

df['NumID_2'] = df['Country'].factorize()[0] + 1
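
One difference between the two options so far is worth knowing: `groupby(...).ngroup()` numbers groups in sorted key order by default, while `factorize()` numbers values in order of first appearance. On data that is already sorted by country they agree, but not otherwise. A minimal sketch with a made-up, deliberately unsorted frame (the data and the `by_*` column names are assumptions for illustration, not the asker's):

```python
import pandas as pd

# Illustrative data (an assumption): Chile appears before Brazil.
df = pd.DataFrame({"Country": ["Chile", "Chile", "Brazil", "Brazil"]})

# ngroup() numbers groups in sorted key order (groupby sorts by default).
df["by_ngroup"] = df.groupby("Country").ngroup() + 1

# factorize() numbers values in order of first appearance.
df["by_factorize"] = df["Country"].factorize()[0] + 1

# sort=True makes factorize follow the sorted order as well.
df["by_factorize_sorted"] = df["Country"].factorize(sort=True)[0] + 1

print(df)
```

Here Brazil gets ID 1 under `ngroup()` but ID 2 under plain `factorize()`, so pick whichever ordering suits your panel.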

Categorical

Depending on your needs, you may also look into using pandas' Categorical datatype:

df['NumID_3'] = df['Country'].astype('category')
  Country  Year Var1 Var2 Var3  NumID  NumID_1  NumID_2 NumID_3
0  Brazil  2000    A    B    C      1        1        1  Brazil
1  Brazil  2001    X    Y    Z      1        1        1  Brazil
2  Brazil  2002    F    F    H      1        1        1  Brazil
3  Brazil  2003    P    3    K      1        1        1  Brazil
4   Chile  2000    A    B    C      2        2        2   Chile
5   Chile  2001    X    Y    Z      2        2        2   Chile
6   Chile  2002    F    F    H      2        2        2   Chile
7   Chile  2003    P    3    K      2        2        2   Chile
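
Note that `NumID_3` in the output above still displays the country labels, because a categorical column shows its categories rather than the underlying integer codes. If an actual number is needed, the codes should be reachable through the `.cat.codes` accessor. A small sketch on a reduced version of the frame (the `NumID_3_code` column name is made up for illustration; codes start at 0 and follow the sorted category order, hence the `+ 1`):

```python
import pandas as pd

# Reduced, illustrative version of the panel (an assumption, not the full data).
df = pd.DataFrame({
    "Country": ["Brazil", "Brazil", "Chile", "Chile"],
    "Year": [2000, 2001, 2000, 2001],
})

# The categorical column still prints as 'Brazil' / 'Chile'.
df["NumID_3"] = df["Country"].astype("category")

# .cat.codes exposes the integer code behind each category (0-based).
df["NumID_3_code"] = df["NumID_3"].cat.codes + 1

print(df)
```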
Brendan
0

Try this to create a numeric ID from the country:

import pandas as pd

# factorize returns integer labels (0-based) and the unique values they refer to
labels, uniques = pd.factorize(["Brazil","Brazil","Brazil","Brazil","Chile","Chile","Chile","Chile"])

print("Numeric Representation : \n", labels)
print("Unique Values : \n", uniques)
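
To apply the same idea to a DataFrame column and keep a lookup from ID back to country, something along these lines should work. This is only a sketch: the frame below and the `id_to_country` name are assumptions for illustration.

```python
import pandas as pd

# Illustrative panel (an assumption): two countries, four years each.
df = pd.DataFrame({
    "Country": ["Brazil"] * 4 + ["Chile"] * 4,
    "Year": [2000, 2001, 2002, 2003] * 2,
})

# Labels are 0-based, so add 1 to start the IDs at 1.
labels, uniques = pd.factorize(df["Country"])
df["NumID"] = labels + 1

# Map each ID back to its country name.
id_to_country = dict(enumerate(uniques, start=1))

print(df)
print(id_to_country)
```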