
This is a follow-up to my previous question.

I can plot the heat map with Seaborn, and with the suggestions I received I can now annotate it. But I have run into a new problem.

Input File

Nos,Place,Way,Name,00:00:00,12:00:00
123,London,Air,Apollo,342,972
123,London,Rail,Beta,2352,342
123,Paris,Bus,Beta,545,353
345,Paris,Bus,Rava,652,974
345,Rome,Bus,Rava,2325,56
345,London,Air,Rava,2532,9853
567,Paris,Air,Apollo,545,544
567,Rome,Rail,Apollo,5454,5
876,Japan,Rail,Apollo,644,54
876,Japan,Bus,Beta,45,57
876,Japan,Bus,Beta,40,57
876,Japan,Bus,Beta,40,57

Program:

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
sns.set()

df = pd.read_csv('heat_map_data.csv')

df3 = df.copy()
for c in ['Place','Name']:
    df3[c] = df3[c].astype('category')

# Pivot to a Place x Name table and draw it as an annotated heat map
sns.heatmap(df3.pivot_table(index='Place', columns='Name', values='00:00:00'),
            annot=True, fmt='.1f')

plt.show()
  1. If I use fmt='d' I get an error because the values are floats, so I changed it to fmt='.1f', and now the counts from the desired column are shown.

But when the same Place/Name pair repeats, the counts from the desired column are not added together. Is there a solution for that?

As seen in the input file:

876,Japan,Bus,Beta,45,57
876,Japan,Bus,Beta,40,57
876,Japan,Bus,Beta,40,57

These three rows repeat the same Place and Name, so their values should be summed: the cell for Japan and Beta should be annotated with 125, but instead it shows 41.7. How do I achieve that? Also, is it possible to show two values as the annotation?

  2. My second question: in the pivot I am passing values='00:00:00', but I need it to dynamically read the last column from the file.
  • 1
    What do you mean by *dynamically read the last column from the file*? The values column is the second-to-last column, and you do read it in – Padraic Cunningham Jun 27 '15 at 12:39
  • 1
    The file is updated at every time interval: a new column of counts is generated (by the earlier data-processing step), and that new column has to be plotted after each interval's processing. –  Jun 27 '15 at 12:42
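A minimal sketch of one way to pick up the newest column, assuming the most recent interval is always written as the rightmost column of the CSV (the thread does not confirm the file layout). The inline `csv_text` is a shortened stand-in for heat_map_data.csv, and `last_col` is only an illustrative name:

```python
import io
import pandas as pd

# Shortened stand-in for heat_map_data.csv (same header, a few rows from the question)
csv_text = """Nos,Place,Way,Name,00:00:00,12:00:00
123,London,Air,Apollo,342,972
876,Japan,Rail,Apollo,644,54
876,Japan,Bus,Beta,45,57
876,Japan,Bus,Beta,40,57
876,Japan,Bus,Beta,40,57
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: the newest interval is the last column in the file
last_col = df.columns[-1]

# Pivot on whatever that column is called instead of hard-coding '00:00:00'
table = df.pivot_table(index='Place', columns='Name', values=last_col, aggfunc='sum')
print(last_col)
print(table)
```

The same `last_col` value can be passed as `values=` in the `pivot_table` call from the question, so the plot follows the file as new columns are appended.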

1 Answer

You can use the aggfunc keyword passing in a dict:

aggfunc :

function, default numpy.mean, or list of functions. If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves)

sns.heatmap(df3.pivot_table(index='Place', columns='Name',
                            values='00:00:00', aggfunc={'00:00:00': np.sum}),
            annot=True, fmt='.1f')

Which outputs:

[Image: the resulting annotated heat map]
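To check the numbers without plotting, here is a small sketch that rebuilds the input from the question in memory and compares the default aggregation with a sum. It passes `aggfunc='sum'` as a string, which is an alternative spelling to the dict shown above and not the exact call from this answer; `csv_text`, `mean_table` and `sum_table` are illustrative names:

```python
import io
import pandas as pd

# Input data copied from the question
csv_text = """Nos,Place,Way,Name,00:00:00,12:00:00
123,London,Air,Apollo,342,972
123,London,Rail,Beta,2352,342
123,Paris,Bus,Beta,545,353
345,Paris,Bus,Rava,652,974
345,Rome,Bus,Rava,2325,56
345,London,Air,Rava,2532,9853
567,Paris,Air,Apollo,545,544
567,Rome,Rail,Apollo,5454,5
876,Japan,Rail,Apollo,644,54
876,Japan,Bus,Beta,45,57
876,Japan,Bus,Beta,40,57
876,Japan,Bus,Beta,40,57
"""
df = pd.read_csv(io.StringIO(csv_text))

# Default aggregation is the mean, which explains the 41.7 in the question
mean_table = df.pivot_table(index='Place', columns='Name', values='00:00:00')

# Summing instead adds up repeated (Place, Name) rows
sum_table = df.pivot_table(index='Place', columns='Name', values='00:00:00',
                           aggfunc='sum')

print(mean_table.loc['Japan', 'Beta'])  # (45 + 40 + 40) / 3
print(sum_table.loc['Japan', 'Beta'])   # 45 + 40 + 40
```

Note that `sum_table` still contains NaN for pairs that never occur (for example Japan/Rava), so the table is float-valued and fmt='d' would still raise; fmt='.1f' or fmt='.0f' avoids that.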

Padraic Cunningham
  • 1
    @PadraicCunningham Could you suggest a suitable solution for this: http://stackoverflow.com/questions/31073889/operations-on-columns-multiple-files-pandas – Phani.lav Jun 27 '15 at 12:49