
I have a DataFrame with the following columns:

Col_1       Col_2

'0'         '-33#90#'
'-1#65#'    '0'
'90'        '-22#-44#90#250'

Each cell holds one or more integers separated by '#'. For every row, I want the minimum and maximum of the values in each column, but I don't know how to do that.

My output should look like this:

Col_1_min   Col_1_max   Col_2_min   Col_2_max
0           null        -33         90
-1          65          0           null
90          null        -44         250
halfer
venkat

1 Answer

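You can loop over the columns in a list and use Series.str.extractall to pull out every negative or positive integer, reshape the matches with Series.unstack, and convert them to floats so they are numeric. Then take the row-wise minimum and maximum, and use Series.mask to set the maximum to a missing value when it equals the minimum (that is, when the cell holds a single value):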

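cols = ['Col_1', 'Col_2']
for c in cols:
    # one column per extracted integer, padded with NaN, converted to float
    df1 = df[c].str.extractall(r'(-?\d+)')[0].unstack().astype(float)
    min1 = df1.min(axis=1)
    max1 = df1.max(axis=1)

    df[f'{c}_min'] = min1
    # set the max to NaN where it equals the min
    df[f'{c}_max'] = max1.mask(max1 == min1)
print(df)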
      Col_1             Col_2  Col_1_min  Col_1_max  Col_2_min  Col_2_max
0       '0'         '-33#90#'        0.0        NaN      -33.0       90.0
1  '-1#65#'               '0'       -1.0       65.0        0.0        NaN
2      '90'  '-22#-44#90#250'       90.0        NaN      -44.0      250.0
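
To see what the chained call does step by step, here is a minimal self-contained sketch for a single column. It rebuilds the sample frame on the assumption that the cells contain the literal single quotes shown in the printed output; the variable names (matches, wide, row_min, row_max) are only illustrative.

```python
import pandas as pd

# Assumed sample data: the values include the literal quotes from the question.
df = pd.DataFrame({
    "Col_1": ["'0'", "'-1#65#'", "'90'"],
    "Col_2": ["'-33#90#'", "'0'", "'-22#-44#90#250'"],
})

# extractall returns one row per regex match, indexed by (original row, match number).
matches = df["Col_2"].str.extractall(r"(-?\d+)")[0]

# unstack pivots the match number into columns: one row per original row,
# padded with NaN where a row has fewer matches than the longest one.
wide = matches.unstack().astype(float)

row_min = wide.min(axis=1)
row_max = wide.max(axis=1)

# mask replaces the max with NaN wherever it equals the min.
row_max = row_max.mask(row_max == row_min)

print(wide)
print(row_min)
print(row_max)
```

Note that extractall only produces rows for cells with at least one match, so a cell without any digits would end up as NaN in both new columns once the result is aligned back to df.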

If you also need to remove the original columns, use DataFrame.pop:

cols = ['Col_1', 'Col_2']
for c in cols:
    # pop returns the column and removes it from df in place
    df1 = df.pop(c).str.extractall(r'(-?\d+)')[0].unstack().astype(float)
    min1 = df1.min(axis=1)
    max1 = df1.max(axis=1)
    df[f'{c}_min'] = min1
    df[f'{c}_max'] = max1.mask(max1 == min1)
print(df)
   Col_1_min  Col_1_max  Col_2_min  Col_2_max
0        0.0        NaN      -33.0       90.0
1       -1.0       65.0        0.0        NaN
2       90.0        NaN      -44.0      250.0

EDIT:

Another solution, using Series.str.split on the original DataFrame:

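import pandas as pd

cols = ['Col_1', 'Col_2']
for c in cols:
    # drop the surrounding quotes, then split on '#' into separate columns
    df1 = df.pop(c).str.strip("'").str.split('#', expand=True)
    # empty strings and missing pieces become NaN
    df1 = df1.apply(pd.to_numeric, errors='coerce')
    min1 = df1.min(axis=1)
    max1 = df1.max(axis=1)
    df[f'{c}_min'] = min1
    df[f'{c}_max'] = max1.mask(max1 == min1)
print(df)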
   Col_1_min  Col_1_max  Col_2_min  Col_2_max
0        0.0        NaN      -33.0       90.0
1       -1.0       65.0        0.0        NaN
2       90.0        NaN      -44.0      250.0
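
Because DataFrame.pop modifies df in place, the loops above cannot be rerun on the same frame. As a rough sketch, the split approach can be wrapped in a function that leaves the input untouched and returns a new frame. The helper name min_max_columns and its sep parameter are hypothetical, and the sample data again assumes the cells contain literal single quotes.

```python
import pandas as pd


def min_max_columns(df, cols, sep="#"):
    """Hypothetical helper: per-row min/max of delimited integers in each column."""
    out = pd.DataFrame(index=df.index)
    for c in cols:
        # Strip the surrounding quotes and split into one column per piece.
        parts = df[c].str.strip("'").str.split(sep, expand=True)
        # Empty strings (from a trailing separator) and padding become NaN.
        nums = parts.apply(pd.to_numeric, errors="coerce")
        lo = nums.min(axis=1)
        hi = nums.max(axis=1)
        out[f"{c}_min"] = lo
        # Set the max to NaN where it equals the min.
        out[f"{c}_max"] = hi.mask(hi == lo)
    return out


# Assumed sample data, matching the question.
df = pd.DataFrame({
    "Col_1": ["'0'", "'-1#65#'", "'90'"],
    "Col_2": ["'-33#90#'", "'0'", "'-22#-44#90#250'"],
})

result = min_max_columns(df, ["Col_1", "Col_2"])
print(result)
```

One design consequence worth knowing: comparing max to min also blanks the maximum for a cell with repeated equal values such as '5#5', not only for single-value cells. If that distinction matters, counting the non-missing values per row would be a safer test.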
jezrael