2

I have some CSV data like

2011.12.08,22:45,1.33434,1.33465,1.33415,1.33419,265
2011.12.08,23:00,1.33419,1.33542,1.33419,1.33472,391
2011.12.08,23:15,1.33470,1.33483,1.33383,1.33411,420
2011.12.08,23:30,1.33413,1.33451,1.33389,1.33400,285

coming from Metatrader 4 in a file named EURUSD15.csv

I would like to import this file with Python using Pandas library and read_csv function...

So I did this :

#!/usr/bin/env python
from pandas import *
df = read_csv('data/EURUSD15.csv', header=None)
df.columns = ['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume']
print(df)

I would like now to have date/time parsed...

so I changed

df = read_csv('data/EURUSD15.csv', header=None)

to

df = read_csv('data/EURUSD15.csv', header=None, parse_dates=[[1, 2]])

But I get this error message

Exception: Length mismatch (7 vs 6)

How can I parse date and time columns and have the 2 columns considered as 1 "datetime" column.

piRSquared
  • 285,575
  • 57
  • 475
  • 624
Femto Trader
  • 1,932
  • 2
  • 20
  • 25

4 Answers4

1

The columns are zero indexed, so you need to do parse_dates=[[0,1]]
This is on latest version of pandas but should work with 0.8.0+:

In [74]: data = """\
2011.12.08,22:45,1.33434,1.33465,1.33415,1.33419,265
2011.12.08,23:00,1.33419,1.33542,1.33419,1.33472,391
2011.12.08,23:15,1.33470,1.33483,1.33383,1.33411,420
2011.12.08,23:30,1.33413,1.33451,1.33389,1.33400,285
"""

In [75]: pd.read_csv(StringIO(data), 
                     names=['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume'], 
                     index_col='Date_Time', parse_dates=[[0, 1]])
Out[75]: 
                        Open     High      Low    Close  Volume
Date_Time                                                      
2011-12-08 22:45:00  1.33434  1.33465  1.33415  1.33419     265
2011-12-08 23:00:00  1.33419  1.33542  1.33419  1.33472     391
2011-12-08 23:15:00  1.33470  1.33483  1.33383  1.33411     420
2011-12-08 23:30:00  1.33413  1.33451  1.33389  1.33400     285

Note the index_col=0 will also work. Complex date parsing prepends resulting columns so parse_dates will refer to pre-date processing column indices (i.e., 0 is Date and 1 is Time) and index_col refers to post-date processing column indices. Thus, using column names are recommended since it allows you to not have to think about pre-vs-post processing columns indices.

user3666197
  • 1
  • 6
  • 50
  • 92
Chang She
  • 16,692
  • 8
  • 40
  • 25
0

parse_dates doesn't take the index values.

Try something like:

pd.read_csv('data/EURUSD15.csv',  parse_dates = [['YYYY.MM.DD', 'HH:MM']], index_col = 0, 
        date_parser=parse)
user1523170
  • 393
  • 4
  • 9
  • Sorry but it doesn't work ! df = read_csv('data/EURUSD15.csv', header=None, parse_dates=[0]) works for the first column and df = read_csv('data/EURUSD15.csv', header=None, parse_dates=[1]) works for the second column but not df = read_csv('data/EURUSD15.csv', header=None, parse_dates=[[0, 1]]) – Femto Trader Jul 26 '12 at 17:01
  • when I try your code df = read_csv('data/EURUSD15.csv', header=None, parse_dates = [['YYYY.MM.DD', 'HH:MM']], index_col = 0, date_parser=parse) I get this error message NameError: name 'parse' is not defined So I tryed df = read_csv('data/EURUSD15.csv', header=None, parse_dates = [['YYYY.MM.DD', 'HH:MM']], index_col = 0, date_parser=True) and get this error message ValueError: could not broadcast input array from shape (15719) into shape (0) – Femto Trader Jul 26 '12 at 17:01
0
parse = lambda x: datetime.strptime(x, '%d-%m-%Y %H:%M')
df = pd.read_csv('data/EURUSD15.csv', parse_dates=[[0, 1]], date_parser=parse, index_col=[0],   header=None)
keys = ['Open', 'High', 'Low', 'Close','Volume']
df.columns = [x for x in keys]
yuran
  • 3
  • 3
0

easier method is using the drag and drop solution called pytrader, check it out on github.