I am new to pandas and python and am trying to reshape data that was provided in a .csv. The data is structured such that the tickers with respective closes and dates are sequential in the same column.
For example:
TIC CLOSE
DATE
1984-01-03 0223B 25.37500
1984-01-04 0223B 25.75000
1984-01-05 0223B 25.75000
1983-12-30 0485B 21.37500
1984-01-03 0485B 21.37500
1984-01-04 0485B 22.50000
1983-12-30 0491B 17.75000
1984-01-03 0491B 17.50000
1984-01-04 0491B 17.62500
1983-12-30 3614B 74.25000
1984-01-03 3614B 73.25000
1984-01-04 3614B 76.00000
1993-07-01 3615B 47.25000
1993-07-02 3615B 47.25000
1993-07-06 3615B 46.40625
1983-12-30 3ABNKQ 4.75000
1984-01-03 3ABNKQ 5.00000
1984-01-04 3ABNKQ 5.62500
1983-12-30 3ACKH 55.25000
1984-01-03 3ACKH 54.50000
1984-01-04 3ACKH 55.25000
I want to reshape the data so that I'll have a pandas dataframe where each column will be a ticker with its respective closes and if there is no data for the date row key it will have NaN ('left' join).
I have tried something like this:
sp = pd.read_csv('D:\Stocks.csv')
sp = pd.DataFrame(sp)
sp.columns = ['TIC', 'DATE', 'CLOSE']
sp.index = pd.to_datetime(sp['DATE'])
sp = sp[['TIC', 'CLOSE']]
unique_tickers = sp['TIC'].unique()
s0 = sp[sp['TIC'] == unique_tickers[0]]
s0 = pd.DataFrame(s0['CLOSE'])
s1 = sp[sp['TIC'] == unique_tickers[1]]
s1 = pd.DataFrame(s1['CLOSE'])
s0s1 = pd.concat([s0, s1], axis = 1)
s0s1.columns = unique_tickers[0:2]
for i in range(len(unique_tickers)):
sx = sp[sp['TIC'] == unique_tickers[i]]
sx = pd.DataFrame(sx['CLOSE'])
s0s1 = pd.concat([s0s1, sx], axis = 1)
I guess I could hack the above code to get it to work, but I assume there is a much more elegant solution. Any ideas?
Thanks!
I got the solution to the original issues which was (thanks to BrenBarn):
sp = sp.reset_index().pivot(index='DATE', columns="TIC", values="CLOSE")
But when I ran it on a larger .csv I ran into the following bug,
ValueError: Index contains duplicate entries, cannot reshape
I tried to find a solution by trying to sp.groupby('TIC') then take all unique 'Date' row keys but it can figure out the syntax.. again any help would be appreciated!