I'd like to use pivot_table to show an arbitrary value of a column in each cell. For example, given a DataFrame like this:

df = pd.DataFrame({'x': ['x1', 'x1', 'x2'],
                   'y': ['a', 'b', 'c']})

To count the values of y for each value of x:

df.pivot_table(index='x', values='y', aggfunc=len)
    y
x   
x1  2
x2  1

So in place of [2, 1], I'd like to get ['a', 'c'] or ['b', 'c'].

I tried these approaches, but all produce errors (notebook):

df.pivot_table(index='x', values='y', aggfunc=sample)
df.pivot_table(index='x', values='y', aggfunc=head)
df.pivot_table(index='x', values='y', aggfunc=lambda x: x[0])

Per https://stackoverflow.com/a/38982172/1840471, an alternative is using groupby and agg, and this produces the desired result in this case:

df.groupby(['x']).y.agg('head')

However, I'm looking to use pivot_table because my full use case involves getting values in rows and columns.

Max Ghenis

1 Answer

How about passing 'first' as the aggfunc, which keeps the first non-null value of y in each group:

df.pivot_table(index='x', values='y', aggfunc='first')

Out[67]:
    y
x
x1  a
x2  c
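
Since the full use case involves values in both rows and columns, here is a minimal sketch of the same idea with a columns argument. The 'z' column and its values are made up for illustration and are not part of the question's data. The second call shows a positional variant of the lambda attempt: x[0] looks up the label 0, which likely fails for any group whose index does not contain 0 (such as the x2 group here), whereas .iloc[0] picks the first row by position.

```python
import pandas as pd

# Hypothetical data: 'z' is an assumed extra column, not from the question.
df = pd.DataFrame({'x': ['x1', 'x1', 'x2'],
                   'z': ['z1', 'z1', 'z2'],
                   'y': ['a', 'b', 'c']})

# 'first' keeps the first non-null y in each (x, z) cell;
# combinations that never occur are left as NaN.
wide = df.pivot_table(index='x', columns='z', values='y', aggfunc='first')
print(wide)

# Positional variant of the lambda attempt: .iloc[0] selects by position,
# so it does not depend on the label 0 being present in the group.
by_pos = df.pivot_table(index='x', values='y', aggfunc=lambda s: s.iloc[0])
print(by_pos)
```

Both calls return the first value per group in row order, so sort or shuffle df beforehand if a different representative is wanted.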
Andy L.