Can anyone help me extract the car model names from the following sample dataframe?

index,Make,Model,Price,Year,Kilometer,Fuel Type,Transmission,Location,Color,Owner,Seller Type
0,Honda,Amaze 1.2 VX i-VTEC,505000,2017,87150,Petrol,Manual,Pune,Grey,First,Corporate
1,Maruti Suzuki,Swift DZire VDI,450000,2014,75000,Diesel,Manual,Ludhiana,White,Second,Individual
2,Hyundai,i10 Magna 1.2 Kappa2,220000,2011,67000,Petrol,Manual,Lucknow,Maroon,First,Individual
3,Toyota,Glanza G,799000,2019,37500,Petrol,Manual,Mangalore,Red,First,Individual

I have used this code:

model_name = df['Model'].str.extract(r'(\w+)')

However, I'm unable to get the model names that contain a space or a hyphen, such as WR-V or CR-V.

This is the link to the full dataset: https://www.kaggle.com/datasets/nehalbirla/vehicle-dataset-from-cardekho?select=car+details+v4.csv
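To make the problem concrete: `\w` matches only letters, digits and the underscore, so `(\w+)` stops at the first hyphen. The following is a minimal sketch; the model strings are hypothetical examples in the style of the dataset, not rows copied from it.

```python
import pandas as pd

# Hypothetical model strings; only the hyphenated first word matters here.
models = pd.Series(["WR-V S MT Petrol", "CR-V 2.0 CVT", "S-Presso VXI"], name="Model")

# \w+ stops at the hyphen because "-" is not a word character.
first_word = models.str.extract(r'(\w+)', expand=False)
print(first_word.tolist())  # ['WR', 'CR', 'S']
```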

Desired output should be:

index,0
0,Amaze
1,Swift
2,i10
3,Glanza
4,Innova
5,Ciaz
6,CLA
7,X1 xDrive20d
8,Octavia
9,Terrano
10,Elite
11,Kwid
12,Ciaz
13,Harrier
14,Polo
15,Celerio
16,Alto
17,Baleno
18,Wagon
19,Creta
20,S-Presso
21,Vento
22,Santro
23,Venue
24,Alto
25,Ritz
26,Creta
27,Brio
28,Elite
29,WR-V
30,Venue

Please help me!!

mozway
Gaurav

1 Answer

The exact logic is unclear, but assuming you want the first word (including special characters such as hyphens), or the first two words when the first word has only one or two characters:

df['Model'].str.extract(r'(\S{3,}|\S{1,2}\s+\S+)', expand=False)

Output:

0            Amaze
1            Swift
2              i10
3           Glanza
4           Innova
5             Ciaz
6              CLA
7     X1 xDrive20d
8          Octavia
9          Terrano
10           Elite
11            Kwid
12            Ciaz
13         Harrier
14            Polo
15         Celerio
16            Alto
17          Baleno
18           Wagon
19           Creta
20        S-Presso
21           Vento
22          Santro
23           Venue
24            Alto
25            Ritz
26           Creta
27            Brio
28           Elite
29            WR-V
...            ...
Name: Model, dtype: object
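For a self-contained check, the same pattern can be run on the four sample rows from the question plus a few of the harder names from the desired output. This is only a sketch: the trailing words after `X1 xDrive20d`, `S-Presso` and `WR-V` and the `model_name` column are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Model": [
        "Amaze 1.2 VX i-VTEC",
        "Swift DZire VDI",
        "i10 Magna 1.2 Kappa2",
        "Glanza G",
        "X1 xDrive20d xLine",  # trailing word is hypothetical
        "S-Presso VXI",        # trailing word is hypothetical
        "WR-V VX",             # trailing word is hypothetical
    ]
})

# \S{3,}         : a first word of 3+ non-space characters (hyphens included)
# \S{1,2}\s+\S+  : otherwise a 1-2 character word plus the word that follows it
pattern = r'(\S{3,}|\S{1,2}\s+\S+)'

# expand=False returns a Series instead of a one-column DataFrame
df["model_name"] = df["Model"].str.extract(pattern, expand=False)
print(df["model_name"].tolist())
# ['Amaze', 'Swift', 'i10', 'Glanza', 'X1 xDrive20d', 'S-Presso', 'WR-V']
```

Rows where the pattern finds no match come back as NaN, so it may be worth checking `df["model_name"].isna().sum()` on the full dataset.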
mozway