I have the following data in an Excel sheet: [image]

I need to count the number of times a given elevation occurs for a given cover_type. For example, elevation=1905 occurs twice for cover_type=6 and once for cover_type=3. I need to do the same for Aspect, Slope, Horizontal_Distance_To_Hydrology, Vertical_Distance_To_Hydrology, Horizontal_Distance_To_Roadways, Hillshade_9am, Hillshade_Noon, Hillshade_3pm, Horizontal_Distance_To_Fire_Points, Soil, and Wilderness_Area.

I will use these counts to calculate the entropy of each column. I need to evaluate this formula: [image]
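
For reference, the step from counts to entropy could be sketched as follows. The formula image is not reproduced here, so this assumes it is the Shannon entropy of cover_type within each value of a column, weighted by how often that value occurs. The toy data and the helper name `conditional_entropy` are hypothetical and only mirror the elevation=1905 example above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data mirroring the example in the question.
df = pd.DataFrame({
    "elevation":  [1905, 1905, 1905, 2000, 2000],
    "cover_type": [6, 6, 3, 3, 3],
})

def conditional_entropy(frame, feature, target="cover_type"):
    """Weighted Shannon entropy (in bits) of `target` within each value of `feature`.

    Assumption: this is the quantity the formula in the question describes.
    """
    # Rows are feature values, columns are target classes, cells are counts.
    counts = pd.crosstab(frame[feature], frame[target])
    row_totals = counts.sum(axis=1)
    probs = counts.div(row_totals, axis=0).to_numpy()
    # Replace zeros by 1 inside the log so that 0 * log2(0) contributes 0.
    safe = np.where(probs > 0, probs, 1.0)
    per_value = -(probs * np.log2(safe)).sum(axis=1)
    # Weight each feature value by its share of all rows.
    weights = (row_totals / row_totals.sum()).to_numpy()
    return float((weights * per_value).sum())

counts = pd.crosstab(df["elevation"], df["cover_type"])
h = conditional_entropy(df, "elevation")
print(counts)
print(h)
```

The same helper could be called once per column (Aspect, Slope, and so on) by passing a different `feature` name.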

Has QUIT--Anony-Mousse
novice_dev

1 Answer

You can do the following:

import pandas as pd

df = pd.read_csv('train_data.csv')
# Count the rows for each (elevation, cover_type) pair
grouped = df[['elevation', 'cover_type']].groupby(['elevation', 'cover_type'], as_index=False, sort=False)['cover_type'].count()
Mostafa Mahmoud
I tried your code and hit a snag :( `pandas.core.groupby.DataError: No numeric types to aggregate`. I googled it but could not find an exact solution. Could you help? – novice_dev Mar 07 '15 at 18:06
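
One possible way around this error, whatever its exact cause in the asker's data, is to count rows per group with `size()` instead of aggregating a column with `count()`. `size()` only counts rows, so it does not depend on the column dtypes. The sketch below uses a small hypothetical frame in place of `train_data.csv`, with values stored as strings to stand in for non-numeric columns.

```python
import pandas as pd

# Hypothetical stand-in for train_data.csv; values are strings on purpose.
df = pd.DataFrame({
    "elevation":  ["1905", "1905", "1905", "2000"],
    "cover_type": ["6", "6", "3", "3"],
})

# size() counts the rows in each (elevation, cover_type) group.
counts = (
    df.groupby(["elevation", "cover_type"], sort=False)
      .size()
      .reset_index(name="count")
)
print(counts)
```

For the other columns, the same call could be repeated with `elevation` replaced by `Aspect`, `Slope`, and so on.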