
I am trying to write a program that reads multiple csv files in a directory. The files have been downloaded from http://www.nasdaqomxnordic.com/aktier/historiskakurser

The first row is sep= and it is skipped. The separator is ';'

The problem is that even though I get the data printed from all csv files, I get only blank plots.

The idea is to show a plot of the data in column 6 with the date (column 0) as x-axis, one csv file at a time, until every file in the given directory has been processed.

I would prefer the name of the csv file (paper) only as title. Now I get the directory/csv name.

It seems as if matplotlib does not understand the csv file correctly, even though the data is printed.

My code looks like this:

import pandas as pd
#import csv
import glob
import matplotlib.pyplot as plt
#from matplotlib.dates import date2num
import pylab
#import numpy as np
#from matplotlib import style


ferms = glob.glob("OMX-C20_ScrapeData_Long_Name/*.csv")

for ferm in ferms:
    print(ferm)

# define the dataframe

    data = pd.read_csv(ferm, skiprows=[0], encoding='utf-8', sep=';',  header=0)

    print(data)
    data.head()

    pylab.rcParams['figure.figsize'] = (25, 20)
    plt.rcParams['figure.dpi'] = 80
    plt.rcParams['legend.fontsize'] = 'medium'
    plt.rcParams['figure.titlesize'] = 'large'
    plt.rcParams['figure.autolayout'] = 'true'
    plt.rcParams['xtick.minor.visible'] = 'true'

    plt.xlabel('Date')
    plt.ylabel('Closing price')
    plt.title(ferm)
    plt.show()

I have tried some other ways to open the csv files but the result is the same. No curves. Hope one of you experienced guys can give a hint.

marc_s
Hennng

1 Answer


I have made a few additions to your code. I downloaded a single file from the page you linked and ran the code below. Change your ferms and add the for loop back again. One reason why you weren't getting anything is that you haven't plotted the data anywhere. You have changed the aesthetics, but nowhere in your code are you telling Python that you want to plot this data.

Secondly, even if you add the plotting command it still wouldn't plot, because neither Date nor Closing price is in numeric format. I changed the Date column to datetime format. Your Closing price is a string containing a comma, which could be either a thousands separator or a decimal mark. I have assumed it is a decimal mark, although it is more likely to be a thousands separator. I converted it to numeric with a self-defined function called to_num, applied with the pandas apply method. It strips any whitespace and replaces the comma with a decimal point.

import pandas as pd
#import csv
import glob
import matplotlib.pyplot as plt
#from matplotlib.dates import date2num
import pylab
#import numpy as np
#from matplotlib import style

ferm = glob.glob("Downloads/trial/*.csv")[0]

def to_num(inpt_string):
    nums = [x.strip() for x in inpt_string.split()]
    return float(''.join(nums).replace(',', '.'))

# print(ferm)
data = pd.read_csv(ferm, skiprows=[0], encoding='utf-8', sep=';',  header=0)
data['Date'] = pd.to_datetime(data['Date'])
data['Closing price'] = data['Closing price'].apply(to_num)

# print(data)
# data.head()
pylab.rcParams['figure.figsize'] = (25, 20)
plt.rcParams['figure.dpi'] = 80
plt.rcParams['legend.fontsize'] = 'medium'
plt.rcParams['figure.titlesize'] = 'large'
plt.rcParams['figure.autolayout'] = 'true'
plt.rcParams['xtick.minor.visible'] = 'true'
plt.xlabel('Date')
plt.ylabel('Closing price')
plt.title(ferm)
plt.plot(data.loc[:,'Date'], data.loc[:,'Closing price']) # this line plots the data
plt.show()
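As an alternative to the to_num helper above, pandas may be able to do both conversions while reading the file. The following is only a sketch under the same assumption that the comma is a decimal mark; the sample text is made up for illustration and is not taken from a real Nasdaq download. It uses the `decimal` and `parse_dates` parameters of `pd.read_csv`.

```python
import io

import pandas as pd

# Hypothetical sample in the same layout: a "sep=" line, then a header row,
# then ';'-separated rows with a decimal comma in the price column.
sample = (
    "sep=;\n"
    "Date;Closing price\n"
    "2017-08-01;101,50\n"
    "2017-08-02;102,25\n"
)

data = pd.read_csv(
    io.StringIO(sample),   # replace with the path to a real csv file
    skiprows=[0],          # skip the "sep=" line
    sep=';',
    header=0,
    decimal=',',           # assumption: the comma is a decimal mark
    parse_dates=['Date'],  # convert the Date column to datetime
)

print(data.dtypes)
```

If the real files also contain spaces inside the numbers, this shortcut may not be enough and the to_num approach is the safer choice.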

EDIT

Maintaining the same code structure as yours:

import pandas as pd
#import csv
import glob
import matplotlib.pyplot as plt
#from matplotlib.dates import date2num
import pylab
#import numpy as np
#from matplotlib import style

ferms = glob.glob("OMX-C20_ScrapeData_Long_Name/*.csv")

def to_num(inpt_string):
    nums = [x.strip() for x in inpt_string.split()]
    return float(''.join(nums).replace(',', '.'))

for ferm in ferms:
    data = pd.read_csv(ferm, skiprows=[0], encoding='utf-8', sep=';',  header=0)
    data['Date'] = pd.to_datetime(data['Date'])
    data['Closing price'] = data['Closing price'].apply(to_num) # change to numeric

    # print(data)
    # data.head()
    pylab.rcParams['figure.figsize'] = (25, 20)
    plt.rcParams['figure.dpi'] = 80
    plt.rcParams['legend.fontsize'] = 'medium'
    plt.rcParams['figure.titlesize'] = 'large'
    plt.rcParams['figure.autolayout'] = 'true'
    plt.rcParams['xtick.minor.visible'] = 'true'
    plt.xlabel('Date')
    plt.ylabel('Closing price')
    plt.title(ferm)
    plt.plot(data.loc[:,'Date'], data.loc[:,'Closing price'])
    plt.show()
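The question also asked for the title to show only the name of the csv file rather than directory/csv name. One possible way, sketched here with the standard library's pathlib, is to take the stem of the path. The helper name title_from_path and the file name EXAMPLE.csv are hypothetical and used only for illustration.

```python
from pathlib import Path


def title_from_path(csv_path):
    """Return the file name without its directory or extension."""
    return Path(csv_path).stem


# Hypothetical file name, used only to show the result.
print(title_from_path("OMX-C20_ScrapeData_Long_Name/EXAMPLE.csv"))
```

Inside the loop this would be used as plt.title(title_from_path(ferm)) in place of plt.title(ferm).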
Clock Slave
  • Thanks a lot. Now it is time to have some fun. – Hennng Aug 04 '17 at 11:41
  • @Hennng are you working on stock predictions or something related? – Clock Slave Aug 04 '17 at 11:42
  • I want to write some code to be able to follow various stocks. Now that Yahoo seems to be off, I am trying Google and Nasdaq. I found that the OMX-20 was easier to find with Nasdaq. My idea is to make stock curves from a list of interesting stocks and papers, so that I can decide from the curves whether I am interested in selling or buying. My bank had a pretty good program from SIX, but they dropped it as only a few people were able to use it. Later I will try to play a little with some math as well, but at the moment I think the eye is not too bad. :-) – Hennng Aug 04 '17 at 19:41
  • @Hennng Sounds interesting. I have been looking to do something similar with stock prices but using statistical models for forecasting using stock data. – Clock Slave Aug 04 '17 at 19:48