How to split a string into text and number using re library or any other method in an Excel sheet?

Question

I need to convert the first column of an Excel sheet to integer values by removing the text prefix (for example, for LP001005, remove LP and keep the rest of the number).

I can do this for a single variable, but I need to do it for the whole sheet: read the Excel file into a pandas DataFrame, extract the Loan_ID column, do the conversion (remove LP from LP001005), and then use the resulting DataFrame.

>>> import re
>>> test_str = "Geeks4321"
>>> print("The original string is : " + str(test_str))
The original string is : Geeks4321
>>> res = [re.findall(r'(\d+)', test_str)[0] ]
>>> print("The tuple after the split of string and number : " + str(res))
The tuple after the split of string and number : ['4321']
>>>

The Excel sheet looks like below:

LoanID Name
LP1401 Shubhra
LP1102 Ankit
LP1203 Sowmya

score 0 · Answer 1 · answered Aug 17 '20 at 23:06

You can use the .extract() method to pull out the numeric part of the Loan ID:

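import pandas as pd

df = pd.DataFrame({'LoanID': 'LP1401 LP2102 LP3203'.split(),
                   'Name': 'Shubhra Ankit Sowmya'.split()})

# Match the letter prefix, capture only the digits, then convert to int.
# ([A-Za-z]+ instead of \w, because \w also matches digits and would
# swallow the first digit of an ID that has no letter prefix.)
df['LoanID'] = df['LoanID'].str.extract(r'[A-Za-z]+(\d+)', expand=False).astype(int)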

print(df)

   LoanID     Name
0    1401  Shubhra
1    2102    Ankit
2    3203   Sowmya
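
If you need both parts (the title asks for text and number), a pattern with two capture groups gives one column per group. This is only a minimal sketch, assuming every ID is a run of letters followed by a run of digits; the column names Prefix and Number are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'LoanID': ['LP1401', 'LP2102', 'LP3203']})

# Two capture groups -> str.extract returns a DataFrame with two columns
parts = df['LoanID'].str.extract(r'^([A-Za-z]+)(\d+)$')
parts.columns = ['Prefix', 'Number']  # hypothetical column names

df['Prefix'] = parts['Prefix']
df['Number'] = parts['Number'].astype(int)
```

Rows that do not match the pattern come back as NaN in both columns, so astype(int) would raise on them; check for missing values first if your data is not uniform.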

score 0 · Answer 2 · edited Mar 18 '21 at 21:33

Try this in Jupyter:

import pandas as pd
# open your Excel file with the pandas read_excel method:
f = pd.read_excel('Book1.xlsx', sheet_name='Sheet1')
# you may check the content of the first column:
for i in f.iloc[:, 0]:
    print(i)
# check the exact header names (watch for stray spaces, e.g. 'LoanID '):
list(f.columns)
# finally extract the numbers from the first column (raw string for the regex):
f['LoanID '].str.extract(r'(\d+)')
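
Note that str.extract returns strings, so the column still needs converting if you want integers, and converting drops leading zeros (LP001005 becomes 1005). A rough sketch of that last step, assuming some cells might not contain digits at all; the helper name loan_id_to_int is hypothetical, and the small DataFrame stands in for whatever pd.read_excel returns:

```python
import pandas as pd

def loan_id_to_int(ids: pd.Series) -> pd.Series:
    """Return the first run of digits in each ID as a nullable integer."""
    digits = ids.astype(str).str.extract(r'(\d+)', expand=False)
    # Cells without digits become <NA> instead of raising an error
    return pd.to_numeric(digits, errors='coerce').astype('Int64')

# Stand-in for the frame returned by pd.read_excel(...)
f = pd.DataFrame({'LoanID': ['LP001005', 'LP1102', 'n/a'],
                  'Name': ['Shubhra', 'Ankit', 'Sowmya']})
f['LoanID'] = loan_id_to_int(f['LoanID'])
```

If the leading zeros matter, keep the extracted strings instead of converting them.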

answered Aug 18 '20 at 00:42 by kaMOON · edited by marc_s