I am trying to read the following CSV file with pandas, but I cannot get it to parse. The file contains a lot of information at the top, before the header row of the data. I have tried skiprows, and content, to skip the text at the top of the file, but neither works.
Could someone suggest how to read in this file?
Current program
import urllib
import datetime
import sys

import pandas as pd

# Pick the StringIO class that matches the running Python version
if sys.version_info[0] < 3:
    from StringIO import StringIO as stio
else:
    from io import StringIO as stio

myfile = []
dls = "http://www.spdrgoldshares.com/assets/dynamic/GLD/GLD_US_archive_EN.csv"
f = urllib.urlopen(dls)
# Read a line from the response and wrap the result for read_csv
myfile += f.readline()
TESTDATA = stio(myfile)

# Skip the text at the top of the file and supply the column names by hand
daily_prices = pd.read_csv(
    TESTDATA, sep=",", header=None, skiprows=13,
    names=["Date", "GLD Close", "LBMA Gold Price", "NAV per GLD in Gold",
           "NAV/share at 10.30 a.m. NYT",
           "Indicative Price of GLD at 4.15 p.m. NYT",
           "Mid point of bid/ask spread at 4.15 p.m. NYT",
           "Premium/Discount of GLD mid point v Indicative Value of GLD at 4.15 p.m. NYT",
           "Daily Share Volume",
           "Total Net Asset Value Ounces in the Trust as at 4.15 p.m. NYT",
           "Total Net Asset Value Tonnes in the Trust as at 4.15 p.m. NYT",
           "Total Net Asset Value in the Trust"])
Before the table header, the CSV file contains the following information. I tried using skiprows, and content, but neither worked.
SPDR Gold Shares (New York Stock Exchange Arca),
"The "SPDR" trademark is used under license from The McGraw-Hill Companies, Inc. ("McGraw-Hill"). No financial product offered by SPDR" Gold Trust, or its affiliates is sponsored, endorsed, sold or promoted by McGraw-Hill."
"Note: This document is for information purposes only and is subject to change without notice. No part of this document may be reproduced in any manner without the written permission of SPDR Gold Shares spdrgoldshares@ssga.com. Under no circumstances should it be used or considered as an offer to sell or a solicitation of any offer to buy the securities or other instruments mentioned in it"
"Note: SPDR Gold Shares does not represent that this information is accurate or complete and it should not be relied upon as such. SPDR Gold Shares is not responsible for any loss, damage, expense or claim, howsoever arising, suffered as a result of reliance on the data contained within this file."
"Note: On dates where the LBMA Gold Price is not published the most recently available LBMA Gold Price is used."
"*Note: Since March 20, 2015, the Trust has been using the LBMA Gold Price PM as the price of gold in determining the value of the Trust's gold. Before that date, the Trust used the London PM Fix, which was discontinued on March 19, 2015. All references to LBMA gold price have been provided for informational purposes only. ICE benchmark administration limited accepts no liability or responsibility for the accuracy of the prices or the underlying product to which the prices may be referenced."
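Given preamble lines like those above, one possible way to avoid counting rows for skiprows is to locate the header line by its content and parse only from there. The following is a minimal Python 3 sketch, not a tested solution for the real file. It assumes the header row is the first line that starts with "Date" (taken from the column names in the program above; the actual file may differ). The helper name `strip_preamble` and the sample text are hypothetical and made up for illustration.

```python
import csv
import io


def strip_preamble(text, header_start="Date"):
    """Return text from the first line starting with header_start onward.

    header_start="Date" is an assumption about the file's header row.
    """
    lines = text.splitlines(True)  # keep the line endings
    for i, line in enumerate(lines):
        if line.lstrip().startswith(header_start):
            return "".join(lines[i:])
    raise ValueError("no line starting with %r found" % header_start)


# Made-up sample that mimics the layout described above (not real data).
sample = (
    'SPDR Gold Shares (New York Stock Exchange Arca),\n'
    '"Note: This document is for information purposes only."\n'
    '\n'
    'Date,GLD Close,LBMA Gold Price\n'
    '01-Jan-2000,100.0,1000.0\n'
    '02-Jan-2000,101.5,1010.0\n'
)

body = strip_preamble(sample)

# Parse the remaining text; the first remaining line supplies the column names.
rows = list(csv.DictReader(io.StringIO(body)))
print(rows[0]["GLD Close"])  # prints 100.0
```

The sketch uses the standard-library csv module so that it runs without extra dependencies. The same `body` string could presumably be handed to pandas instead, for example `pd.read_csv(io.StringIO(body))`, provided the whole download (not only its first line) has been read into a string first.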