
I have a dataframe in which one column holds temperature ranges like those shown below:

'35-40',
 '35-40',
 '40-45',
 '40-45',
 '45-50',
 '40-45',
 '40-45',
 nan,
 '40-45',
 nan,
 '40-45',
 '40-45',
 '35-40',

I am trying to create new columns separating the minimum and maximum temperatures. In the rows filled with 'nan', I want the values after ',' to also be 'nan'. How should I do this? I have tried the code below, but it didn't work.

train["Maximum Temperature"] = train["Cellar Temperature"].apply(lambda x: np.nan if train["Cellar Temperature"][0]==np.nan else (str(x).split("-")[1]))

Whenever I run the above code, I get the following error:

IndexError: list index out of range

Please help me.

  • Please provide the entire error message, as well as a [mcve]. What do you mean by _I want the values after ‘,’ to also be ‘nan’_ ? – AMC Mar 01 '20 at 14:38

3 Answers


Try:

train[["Minimum Temperature", "Maximum Temperature"]]=train["Cellar Temperature"].str.split("-", expand=True, n=1)

str.split() splits each string on the given delimiter, "-" in this case. expand=True spreads the resulting pieces across separate columns, one element per column. n=1 limits the number of splits to one; without it, a cell containing more than one hyphen would produce more than two columns, and assigning them to two column names would raise an error.
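
As a minimal sketch of how this behaves on missing values, the snippet below uses a small hypothetical sample (the four values are made up to mirror the question's column, not taken from the real data). Rows that are NaN should come out as NaN in both new columns, which seems to be what the question asks for.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the question's column; not the asker's real data.
train = pd.DataFrame({"Cellar Temperature": ["35-40", "40-45", np.nan, "45-50"]})

# Split once on "-" and spread the two pieces over two columns.
parts = train["Cellar Temperature"].str.split("-", n=1, expand=True)
train[["Minimum Temperature", "Maximum Temperature"]] = parts

print(train)
```

Note that both new columns hold strings, so a numeric conversion (for example with pd.to_numeric) would still be needed before doing arithmetic on them.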

Grzegorz Skibinski

You can use extract to get both:

df['temp'].str.extract(r'(?P<minimum>\d+)-(?P<maximum>\d+)')

Output:

   minimum maximum
0       35      40
1       35      40
2       40      45
3       40      45
4       45      50
5       40      45
6       40      45
7      NaN     NaN
8       40      45
9      NaN     NaN
10      40      45
11      40      45
12      35      40
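
The extracted values are strings. If numbers are needed, one option is to cast the result, sketched here on a hypothetical three-row frame (the column name temp follows the snippet above; the data is made up). Casting to float keeps the NaN rows intact.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the column name 'temp' follows the answer's snippet.
df = pd.DataFrame({"temp": ["35-40", np.nan, "45-50"]})

# Named groups become the column names; rows that do not match stay NaN.
bounds = df["temp"].str.extract(r"(?P<minimum>\d+)-(?P<maximum>\d+)").astype(float)

print(bounds)
```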
Quang Hoang

To directly correct your code, try

train["Maximum Temperature"] = train["Cellar Temperature"].apply(lambda x: np.nan if pd.isnull(x) else x.split("-")[1])
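
For context, here is a short sketch of why the original attempt likely raised the IndexError. NaN never compares equal to anything, including itself, so the == np.nan test is always False and every row falls through to the split branch; for a missing value, str(x) is 'nan', which contains no hyphen. The helper name upper_bound is made up for illustration.

```python
import numpy as np
import pandas as pd

# NaN is not equal to itself, so an "== np.nan" check can never be True.
nan_equals_itself = np.nan == np.nan

# str(np.nan) is 'nan'; splitting on "-" gives a one-element list,
# so indexing it with [1] raises "IndexError: list index out of range".
pieces = str(np.nan).split("-")

# Hypothetical helper equivalent to the corrected lambda above.
def upper_bound(x):
    return np.nan if pd.isnull(x) else x.split("-")[1]

print(nan_equals_itself, pieces, upper_bound("35-40"), upper_bound(np.nan))
```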
CameLion