2

Let's say you have a data frame:

import pandas as pd
df = pd.DataFrame(columns=['item'], index=['datetime'])

You can add an item on a specific date index:

df.loc[pd.Timestamp(2015, 1, 15)] = 23

Is there any way I can add/append new items on the same index?

Disclaimer: I understand that the index is supposed to be unique and that what I'm asking is not very panda-stic. But for some applications, especially with multiple indexes, it provides an easy way to select chunks of data.

EDIT: Meanwhile, I've found the append() function, and it seems to do exactly that, although it's a bit cumbersome. See also here.

Community
sfotiadis

3 Answers

3

You could try:

df.groupby(df.index).sum()

This would group the rows with duplicate indices and then sum them up.
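As a rough illustration of what this aggregation does, here is a minimal sketch. The sample values and dates are made up for the example and are not from the question:

```python
import pandas as pd

# Hypothetical sample data: two rows share the index label 2015-01-15.
df = pd.DataFrame(
    {"item": [23, 7, 5]},
    index=pd.to_datetime(["2015-01-15", "2015-01-15", "2015-01-16"]),
)

# Group rows by index label and sum each group.
# Rows with the same label collapse into a single row.
summed = df.groupby(df.index).sum()
print(summed)
```

Note that the result has a unique index, so the individual rows that shared a label are no longer available afterwards.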

alacy
  • Thank you. I was referring to appending, not adding by aggregation. I've changed the question to reflect this. – sfotiadis Jan 15 '15 at 17:20
  • Are you trying to append the new item into a new column or as a new row with the same index? – alacy Jan 15 '15 at 17:27
  • As a new row with the same index. – sfotiadis Jan 15 '15 at 17:35
  • I believe using `append()` is your best bet. It's not the most efficient, however, since a new data frame is constructed each time. Still, it will give you the desired result and allow you to query the samples with the same index as chunks of data, as you described. – alacy Jan 15 '15 at 17:51
0

Meanwhile, I've found the append() function, and it seems to do exactly that, although it's a bit cumbersome. See also here.
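For readers on newer pandas: `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so the sketch below uses `pd.concat` to get the same effect. The values 23 and 42 and the variable names are only illustrative assumptions:

```python
import pandas as pd

date = pd.Timestamp(2015, 1, 15)

# Start with one row on the given date (illustrative value).
df = pd.DataFrame({"item": [23]}, index=[date])

# Build the new row as its own one-row frame with the same index label.
new_row = pd.DataFrame({"item": [42]}, index=[date])

# concat keeps both rows, so the index label now appears twice.
df = pd.concat([df, new_row])

# Selecting by the duplicated label returns every matching row as a chunk.
chunk = df.loc[date]
print(chunk)
```

Because the index is no longer unique, `df.loc[date]` returns a DataFrame with all matching rows rather than a single row.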

sfotiadis
0

I have tried many ways to do this, and the easiest and least error-prone is to create a second DataFrame with the same columns as the one you already have, then use pandas.concat([maindata, add_data]) to push "add_data" into "maindata". Even if the indexes are duplicated, it will still add the new row from "add_data" to your main DataFrame "maindata". Try the code below.

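import pandas as pd

# Existing data, indexed by serial number
maindata = pd.DataFrame([[12, 13, 15], [200, 300, 400]], index=['serial1', 'serial2'], columns=['HP No', 'Company', 'Name'])
# New row that reuses the existing index label 'serial1'
add_data = pd.DataFrame([[5000, 6000, 7000]], index=['serial1'], columns=['HP No', 'Company', 'Name'])
# concat keeps both 'serial1' rows instead of overwriting
maindata = pd.concat([maindata, add_data])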

I hope that solves the issue. If you want a cleaner way to keep rows with duplicated indexes next to each other, you can read about sort_index(inplace=True). Good luck.
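As a small sketch of the sorting step, reusing the same sample frames: here `sort_index` is called without `inplace=True` and with `kind="mergesort"`, which is my own choice so that rows sharing a label keep their original relative order.

```python
import pandas as pd

columns = ["HP No", "Company", "Name"]
maindata = pd.DataFrame(
    [[12, 13, 15], [200, 300, 400]],
    index=["serial1", "serial2"],
    columns=columns,
)
add_data = pd.DataFrame([[5000, 6000, 7000]], index=["serial1"], columns=columns)

# Append the new row, then sort by index so duplicate labels sit together.
# mergesort is stable, so the original 'serial1' row stays ahead of the new one.
combined = pd.concat([maindata, add_data]).sort_index(kind="mergesort")
print(combined)

# All rows for one label can then be pulled out as a chunk.
serial1_rows = combined.loc["serial1"]
```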

Ali Taheri