2

exxeleron/qPython module allows to send the pandas DataFrame to kdb+/q's table.

Let's prepare the data:

import pandas.io.data as web
import datetime
import numpy

start = datetime.datetime(2010, 1, 1)
end = datetime.datetime(2015, 2, 6)
f = web.DataReader(["F","MSFT"], 'yahoo', start, end) # download stock data from Yahoo Finance
f = f.to_frame().reset_index() # flatten the MultiIndex to have a sym column, see below
f = f[["Date","minor","Close"]]
f.columns = ["dt","sym","val"] # just give comfortable names

The DataFrame object to be passed then looks like:

f.head()
# Out: 
#           dt   sym    val
# 0 2010-01-04     F  10.28
# 1 2010-01-04  MSFT  30.95
# 2 2010-01-05     F  10.96
# 3 2010-01-05  MSFT  30.96
# 4 2010-01-06     F  11.37

f.dtypes
# Out: 
# dt     datetime64[ns]
# sym            object
# val           float64    

When I try to send it to kdb+/q, I get the following error:

import qpython.qconnection as qconnection
q = qconnection.QConnection(host = 'localhost', port = 5000, pandas = True)
q.open()
q('set', numpy.string_('tbl'), f)

# File "G:\Anaconda\lib\site-packages\qpython\_pandas.py", line 159, in _write_pandas_series
#   data = data.fillna(QNULLMAP[-abs(qtype)][1])
# KeyError: -10
Daniel Krizian
  • 4,586
  • 4
  • 38
  • 75

1 Answers1

4

The sym column in the DataFrame is ambiguous for the qpython and it cannot determinate the default serialization properly. In such case, you have to provide type hinting for columns conversion, by setting the meta attribute:

from qpython import MetaData 
from qpython.qtype import QSYMBOL_LIST

f.meta = MetaData(sym = QSYMBOL_LIST)
q('set', numpy.string_('tbl'), f)

This instructs qpython to serialize sym column as q symbol list:

q)meta tbl                              
c  | t f a                              
---| -----                              
dt | p                                  
sym| s                                  
val| f 

q)tbl                                   
dt                            sym  val  
----------------------------------------
2010.01.04D00:00:00.000000000 F    10.28
2010.01.04D00:00:00.000000000 MSFT 30.95
2010.01.05D00:00:00.000000000 F    10.96
..

Alternatively, you can represent sym column as a q generic list containing strings. You can also apply type conversion to other columns:

from qpython import MetaData 
from qpython.qtype import QSTRING_LIST, QINT_LIST, QDATETIME_LIST

f.meta = MetaData(sym = QSTRING_LIST, val = QINT_LIST, dt = QDATETIME_LIST)
q('set', numpy.string_('tbl'), f)

which results in:

q)meta tbl                          
c  | t f a                          
---| -----                          
dt | z                              
sym|                                
val| i  

q)tbl                               
dt                      sym    val  
----------------------------------  
2010.01.04T00:00:00.000 "F"    10   
2010.01.04T00:00:00.000 "MSFT" 30   
..
Maciej Lach
  • 1,622
  • 3
  • 20
  • 27
  • Thank you for explanation, works like a charm. I also had to add `from qpython import MetaData` `from qpython.qtype import QSYMBOL_LIST` as per your previous post http://stackoverflow.com/questions/28385137/pandas-dataframe-drops-index-when-passing-to-kdb-using-qpython-api – Daniel Krizian Apr 09 '15 at 09:01
  • 1
    I've added missing imports to the snippet - thanks for spotting. In addition, I extended the docs with a type hinting information: http://qpython.readthedocs.org/en/latest/pandas.html#type-hinting – Maciej Lach Apr 09 '15 at 10:51