1

I have a pandas Series of lists, for example:

import pandas as pd
a = pd.Series([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 3, 334],
    [333, 4, 5, 3, 4]
])

I want to find the largest element across all of the lists, which here is 334. What is an easy way to do it?

Scott Boston
william007

4 Answers

2

Option 1
Only works if the elements are actually lists, because sum concatenates lists. This is also likely to be very slow.

max(a.sum())

334
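To see why this works and why it tends to be slow: adding two lists with + concatenates them, so summing a sequence of lists builds one flat list and copies the accumulated result at every step. A minimal sketch, assuming plain Python lists stand in for the Series elements and using the builtin sum with an empty-list start value (the name rows is only for illustration):

```python
# Plain nested lists standing in for the Series elements (illustrative data).
rows = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 3, 334],
    [333, 4, 5, 3, 4],
]

# sum() starts from [] and applies +, which concatenates lists.
# Each + creates a new list, so the total work grows roughly quadratically.
flat = sum(rows, [])

largest = max(flat)
print(flat)
print(largest)
```

The intermediate flat list is the whole cost here; the max call itself is a single pass.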

Option 2
Minimal two-tiered application of max: take the max of each list, then the max of those.

max(map(max, a))

334

Option 3
Only works if all lists are the same length, so that they form a rectangular array.

np.max(a.tolist())

334
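If the lists may differ in length, the element-wise approaches do not depend on a rectangular shape. A small sketch with made-up ragged data, again using plain lists as a stand-in for the Series (the variable names are hypothetical). It also shows one edge case: the two-tiered max fails on an empty inner list, while a single max over a flattened generator does not.

```python
# Made-up ragged data: inner lists of different lengths.
ragged = [[1, 2], [334], [3, 4, 5]]

# Two-tiered max works regardless of the inner lengths.
largest = max(map(max, ragged))

# With an empty inner list, max([]) raises ValueError ...
ragged_with_empty = [[1, 2], [], [334]]
try:
    max(map(max, ragged_with_empty))
    two_tier_failed = False
except ValueError:
    two_tier_failed = True

# ... while a single max over a flattened generator simply skips it.
largest_flat = max(x for row in ragged_with_empty for x in row)

print(largest, two_tier_failed, largest_flat)
```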

Option 4
One application of max on an unwound generator

max(x for l in a for x in l)

334
piRSquared
1

This is one way:

max(max(i) for i in a)

Functional variant:

max(map(max, a))

Alternative method which only calculates one max:

from toolz import concat

max(concat(a))
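If toolz is not installed, the standard library offers a similar lazy flattening via itertools.chain.from_iterable. This is a swapped-in alternative rather than the method above, shown on plain lists for brevity; it only needs an iterable of iterables, so a Series of lists should iterate the same way.

```python
from itertools import chain

# Plain nested lists standing in for the Series elements (illustrative data).
rows = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 3, 334],
    [333, 4, 5, 3, 4],
]

# chain.from_iterable lazily yields every element of every inner list,
# so max runs once over a single stream without building a flat list.
largest = max(chain.from_iterable(rows))
print(largest)
```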

For the fun of it, here is some benchmarking. The lazy concat function and the optimised map / generator-expression versions do best, followed by the numpy functions; the pandas methods are usually slower, and the clever sum applications come last.

import numpy as np
from toolz import concat
import pandas as pd

a = pd.Series([list(np.random.randint(0, 10, 100)) for i in range(1000)])

# times in ms
5.92  max(concat(a))
6.29  max(map(max, a))
6.67  max(max(i) for i in a)
17.4  max(x for l in a for x in l)
19.2  np.max(a.tolist())
20.4  np.concatenate(a.values).max()
64.6  pd.DataFrame(a.values.tolist()).max().max()
373   np.max(a.apply(pd.Series).values)
672   max(sum(a,[]))
696   max(a.sum())
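For anyone who wants to rerun a comparison like this, here is a rough harness using the standard-library timeit module. It is only a sketch under stated assumptions: plain nested lists replace the Series, only the pure-Python candidates are included, and itertools.chain.from_iterable stands in for toolz concat. Absolute numbers will differ from the table above.

```python
import random
import timeit
from itertools import chain

random.seed(0)
# 1000 lists of 100 small integers, mirroring the shape used above.
rows = [[random.randint(0, 9) for _ in range(100)] for _ in range(1000)]

# Candidate expressions wrapped as zero-argument callables.
candidates = {
    "max(chain.from_iterable(rows))": lambda: max(chain.from_iterable(rows)),
    "max(map(max, rows))": lambda: max(map(max, rows)),
    "max(x for row in rows for x in row)": lambda: max(x for row in rows for x in row),
}

repeats = 20
timings = {}
for label, func in candidates.items():
    # Average wall-clock time per call, converted to milliseconds.
    timings[label] = timeit.timeit(func, number=repeats) / repeats * 1000

# Print fastest first.
for label, ms in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{ms:8.3f} ms  {label}")
```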
jpp
1

Convert to a DataFrame and take the max twice:

pd.DataFrame(a.values.tolist()).max().max()
Out[200]: 334

Or use numpy.concatenate:

np.concatenate(a.values).max()
Out[201]: 334

Or flatten with sum:

max(sum(a,[]))
Out[205]: 334
BENY
0

Yet another answer using np.max:

import numpy as np
np.max(a.apply(pd.Series).values)
Out[175]: 334
Allen Qin