
I pull date data from an Excel file on my computer. Each date string, e.g. "10/1/10", is stored in an array dData, and the numerical version of the date, e.g. 734046, is stored in nData. So dData[0] returns "10/1/10" and nData[0] returns 734046.

However, when I look up "10/1/10" in the dictionary built by the code in bold, it returns 735536. That is not the matching numerical date (734046), so the key-value pairs are not lined up chronologically the way they should be.

import numpy as np
import pandas as pd
import xlrd
import csv
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime
import time
import random
import statistics
from numpy.random import normal
from scipy import stats

dData = []  # Date in string format - Month/Day/Year
pData = []  # Price in float format - Value.Decimals
nData = []  # Data in Dates in int - Formatted Date Data for plotting in Matpl

def loadData(dates, prices, numDates):
    dateDictionary = {}  # empty dictionary that will contain string dates to number dates
    numDateToPrice = {}  # empty dictionary that will contain number dates to prices
    nestedDictionary = {}  # empty dictionary that will contain a nested dictionary str date : {numbertodate: price}

    with open('/Users/dvalentin/Code/IndividualResearch/CrudeOilFuturesAll.csv', 'r') as csvfile:  # This is where I pull data from an excel file on my comp
        reader = csv.reader(csvfile, delimiter=',')
        for row in reader:
            dates.append(row[0])
            numDates.append(row[1])
            prices.append(row[2])

    # ** start of the code in bold **
    for x in dates:
        for y in numDates:
            dateDictionary[x] = y

    print(dateDictionary)
    # ** end of the code in bold **

    for x in numDates:
        for y in prices:
            numDateToPrice[x] = y

    plt.plot_date(x=numDates, y=prices, fmt="r-")
    plt.plot()
    plt.title("Crude Oil Futures")
    plt.ylabel("Closing Price")
    plt.grid(True)
    plt.show()

loadData(dData, pData, nData)
BhushanK

1 Answer

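import pandas as pd

dates = ['10/1/10', '10/2/10', '11/3/10', '1/4/11']
prices = [12, 15, 13, 18]

df = pd.DataFrame({'dates': dates, 'prices': prices})
# Parse the month/day/year strings and use them as the row index
df = df.set_index(pd.DatetimeIndex(df['dates']))
df = df.drop('dates', axis=1)

# .ix has been removed from pandas; .loc accepts the same date strings
print(df.loc['20101002'])             # a single day
print(df.loc['20101001':'20101002'])  # a range of days
print(df.loc['2010'])                 # a whole year
print(df.loc['2010-10'])              # a whole month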

This seems like a better way to organize your data than working with the numerical code for the date. You can still use the DatetimeIndex when plotting and style the output however you want, and a DatetimeIndex is much easier to work with than a set of dictionaries. More info on datetime indices: http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DatetimeIndex.html. Hope this helps!
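If you do want to keep the dictionaries, the symptom in the question is consistent with the nested loops: the inner loop runs to completion for every date, so each key ends up holding the last numerical date. Here is a minimal sketch of pairing the two lists by position with zip instead. The sample dates are made up, and the numerical dates are computed with `toordinal()`, which gives 734046 for 10/1/10 just like the value in the question. It assumes the two columns are the same length and in the same order.

```python
from datetime import datetime

# Hypothetical sample rows standing in for the first CSV column
dates = ["10/1/10", "10/2/10", "11/3/10"]

# Day ordinals (days since 0001-01-01); 10/1/10 maps to 734046
num_dates = [datetime.strptime(d, "%m/%d/%y").toordinal() for d in dates]

# Nested loops: every key is overwritten until it holds the LAST numeric date
nested = {}
for d in dates:
    for n in num_dates:
        nested[d] = n

# zip walks both lists in step, so each date gets its own numeric date
paired = dict(zip(dates, num_dates))

print(nested["10/1/10"])  # the last numeric date in the list
print(paired["10/1/10"])  # the numeric date that belongs to 10/1/10
```

The same pattern, `dict(zip(numDates, prices))`, would build the numeric-date-to-price dictionary.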

  • But would I still be able to graph this data? I believe matplotlib needs dates converted to a numerical value before they can be plotted. – pythonUser Feb 06 '15 at 16:16
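On the plotting question: as far as I know, matplotlib does the numerical conversion internally when it is handed datetime values, so a DatetimeIndex can be passed straight to the plot call. Below is a rough sketch under that assumption, reusing the sample data from the answer. The output file name is made up, and the `Agg` backend line is only there so the script runs without a display. `date2num` is included to show the numbers matplotlib works with. Note that recent matplotlib versions count days from 1970-01-01 by default, so those numbers will not match values like 734046 produced with the older epoch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a window with plt.show()
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# Sample data from the answer above
dates = ["10/1/10", "10/2/10", "11/3/10", "1/4/11"]
prices = [12, 15, 13, 18]

# Parse the month/day/year strings into a DatetimeIndex
index = pd.to_datetime(dates, format="%m/%d/%y")
series = pd.Series(prices, index=index, name="prices")

fig, ax = plt.subplots()
ax.plot(series.index, series.values, "r-")  # datetimes are converted by matplotlib
ax.set_title("Crude Oil Futures")
ax.set_ylabel("Closing Price")
ax.grid(True)
fig.autofmt_xdate()                   # tilt the date labels so they do not overlap
fig.savefig("crude_oil_futures.png")  # hypothetical output file name

# The explicit conversion, if the numeric values are needed elsewhere
numeric = mdates.date2num(series.index.to_pydatetime())
print(numeric)
```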