This may be a dumb question but I am a tad stuck with this. Say I have this dataframe:

   amount  amount_str  buy_order_id        id   midprice       price  
0  0.01545000  0.01545000     915692220  53424450  0.1286495  0.12947460   
0  1.65956203  1.65956203     915692330  53424458        NaN  0.12947460   
0  0.68427900  0.68427900     915692581  53424487        NaN  0.12947460   
0  0.22306417  0.22306417     915692632  53424491        NaN  0.12808629   
0  0.22306396  0.22306396     915692964  53424530        NaN  0.12808646   
0  2.31474046  2.31474046     915693081  53424535        NaN  0.12947460   
0  0.16808924  0.16808924     915694097  53424600        NaN  0.12808675   
0  5.30166589  5.30166589     915694819  53424629        NaN  0.12808710   
price_str  sell_order_id   timestamp  type  
0  0.12947460      915690988  1518045004     0  
0  0.12947460      915690988  1518045006     0  
0  0.12947460      915690988  1518045010     0  
0  0.12808629      915692647  1518045010     1  
0  0.12808646      915693012  1518045016     1  
0  0.12947460      915690988  1518045017     0  
0  0.12808675      915694117  1518045031     1  
0  0.12808710      915694862  1518045041     1

Here is my issue: each time the program gets a new order from Bitstamp, it appends the values to this dataframe as a new row and then needs to add the current midprice to that row.

I could normally do that with `df['midprice'] = value`; however, this sets it for the entire column and not just that entry.

How would I make it so it adds it per line and not to the entire column?

Thanks!

cmaher
xxen0nxx
    "I could normally do that with `df['midprice'] = value`" this should never work as a method for appending to a dataframe. – roganjosh Feb 07 '18 at 23:18

1 Answer

Pandas specialises in vectorised calculations.

So `df['midprice'] = value` sets an entire series (or column) to a fixed value.
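To illustrate the broadcasting, here is a minimal sketch on a made-up three-row frame. The column names come from the question; the values and the `0.1275` midprice are only assumptions for the example.

```python
import numpy as np
import pandas as pd

# Toy frame loosely modelled on the question's data (values are illustrative).
df = pd.DataFrame(
    {
        "price": [0.12947460, 0.12947460, 0.12808629],
        "midprice": [0.1286495, np.nan, np.nan],
    },
    index=[0, 0, 0],  # every row is labelled 0, as in the question's printout
)

# Assigning a scalar to a column broadcasts it to every row,
# overwriting the midprice that was already stored in the first row.
df["midprice"] = 0.1275

print(df["midprice"].tolist())  # [0.1275, 0.1275, 0.1275]
```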

It seems like you want to amend the last midprice value only, straight after a row is added by another process. This is one way you can achieve this:

df.iloc[-1, df.columns.get_loc('midprice')] = value
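As a rough, self-contained sketch of that line in action (the frame and the `new_midprice` value are made up for illustration, not taken from your program):

```python
import numpy as np
import pandas as pd

# Toy frame loosely modelled on the question's data (values are illustrative).
df = pd.DataFrame(
    {
        "price": [0.12947460, 0.12947460, 0.12808629],
        "midprice": [0.1286495, np.nan, np.nan],
    },
    index=[0, 0, 0],  # every row is labelled 0, as in the question's printout
)

new_midprice = 0.1275  # hypothetical value for the most recent order

# iloc works on positions: -1 is the last row, and get_loc turns the
# column name into its integer position.
df.iloc[-1, df.columns.get_loc("midprice")] = new_midprice

print(df)
```

Only the last row changes; the earlier midprice and the NaN in the middle row are left alone. Positional indexing matters here because every row in the question's frame carries the index label 0, so a label-based `df.loc[0, 'midprice'] = value` would write to all of them.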
jpp
  • Something like this was exactly what I was looking for, let me see if this works. I knew it had to do with one of iloc, loc, iat, or at. – xxen0nxx Feb 07 '18 at 23:24
  • Is this inline btw? – xxen0nxx Feb 07 '18 at 23:26
  • @xxen0nxx, not sure what you mean by inline – jpp Feb 07 '18 at 23:33
  • Hmm, I'm getting an error: `AttributeError: 'Series' object has no attribute 'columns'` – xxen0nxx Feb 07 '18 at 23:34
  • convention is `df` refers to dataframe. try it with the correct variable. – jpp Feb 07 '18 at 23:35
  • can you print `type(df)`? – jpp Feb 07 '18 at 23:40
  • sell_order_id timestamp type midprice 0 915882866 1518047205 0 0.12662148 amount amount_str buy_order_id id price price_str \ 0 0.5 0.50000000 915882917 53435169 0.12689309 0.12689309 – xxen0nxx Feb 07 '18 at 23:48
  • and the one after is – xxen0nxx Feb 07 '18 at 23:48
  • 915882866 1518047205 0 0.12662148 amount amount_str buy_order_id id midprice price \ 0 0.50000000 0.50000000 915882917 53435169 0.12662148 0.12689309 0 0.61658571 0.61658571 915883324 53435190 NaN 0.12570006 price_str sell_order_id timestamp type 0 0.12689309 915882866 1518047205 0 0 0.12570006 915884248 1518047221 1 – xxen0nxx Feb 07 '18 at 23:48
  • nope, that's not `type(df)`. if you can't print this, it's hard to help! – jpp Feb 07 '18 at 23:49
  • is the type – xxen0nxx Feb 08 '18 at 00:06
  • and now what do you get when you print `df.columns`? – jpp Feb 08 '18 at 00:07
  • This is what I get. – xxen0nxx Feb 08 '18 at 00:16
  • Index([ u'amount', u'amount_str', u'buy_order_id', u'id', u'midprice', u'price', u'price_str', u'sell_order_id', u'timestamp', u'type'], dtype='object') – xxen0nxx Feb 08 '18 at 00:16
  • so your error "AttributeError: 'Series' object has no attribute 'columns'" cannot be correct for the line of code i provided - pls post full error & traceback on your question. – jpp Feb 08 '18 at 00:26
  • Hmm, it won't let me paste it, too large. Here it is on pastebin. https://pastebin.com/cjTgzcXU – xxen0nxx Feb 08 '18 at 00:55
  • could be related: https://stackoverflow.com/questions/40366120/python-pandas-attributeerror-series-object-has-no-attribute-columns – jpp Feb 08 '18 at 00:58
  • Hmm, weird. What does not make sense to me is why is it saying series object when it is a dataframe. very strange. – xxen0nxx Feb 08 '18 at 01:12
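
For anyone landing here with the same message: this is not a diagnosis of the traceback above, just a sketch of one common way a `Series` ends up where a `DataFrame` was expected. Selecting a single row or a single column of a frame returns a `Series`, which has no `columns` attribute. The frame below is made up for illustration.

```python
import pandas as pd

# Toy frame; the values are illustrative only.
df = pd.DataFrame(
    {"midprice": [0.1286495, None], "price": [0.12947460, 0.12808629]}
)

last_row = df.iloc[-1]  # selecting one row gives back a Series

print(type(df).__name__)             # DataFrame
print(type(last_row).__name__)       # Series
print(hasattr(df, "columns"))        # True
print(hasattr(last_row, "columns"))  # False
```

Checking `type(...)` on the exact object named in the failing line of the traceback, rather than on `df` elsewhere in the program, is usually the quickest way to spot this.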