How to sort a structured list of stock data for later access in Python?

Question

I am very new to Python and want to build a black box stock trading program that finds various correlations between stocks' rates of return and gives me a response such as buy, sell, hold, etc. I found a neat, easy-to-use Python module for retrieving stock data called ystockquote that pulls information from Yahoo! Finance. The module can be found at http://www.goldb.org/ystockquote.html.

One of its abilities is to output historical prices for a stock in the form ['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Clos']. I can give it a date range, and it returns a nested list containing one list of the above information for each day.

My question is how to organize each of these separate data points (Date, Open, High, Low, etc.) into a structure that I can call upon later in my script and sort. I need this process to be easy to automate. What sorts of algorithms or data structures might I find useful?

It is easy to sort a list or a dictionary in python. But you have to be more specific, so we can be more specific. You are not asking about how to sort a list, are you? — pajton, Jun 16 '11 at 21:39
You need to write yourself a `cmp` function which is an optional parameter in the sort function, which you can see if you do `help([].sort)` — inspectorG4dget, Jun 16 '11 at 21:41
I'm sorry, I'm pretty new to this, so I didn't think to mention that this all needs to be automated until my recent edit. To clarify, I need code or a method that will take the output data from the ystockquote module and automatically sort it into a dictionary-like configuration so that I can call on each value (for example, a specific date's close price) from each day and receive a purely numeric response. So if on May 1st 2011 a stock is at $20.87, when I call on that I can get $20.87 as a response. — BlackBoxTrader, Jun 17 '11 at 00:25
Good clarification, but I think you can figure out how to do that bit on your own. I've got you 99% of the way... see below. — machine yearning, Jun 17 '11 at 01:08

machine yearning · Accepted Answer · 2011-06-17T21:44:10.357

You might be looking for a dictionary structure rather than a list:

>>> prices = dict()
>>> prices['2011-01-02'] = {'Open':20.00, 'High':30.00, 'Low':10.00, 'Close':21.00, 'Volume':14.00, 'Adj Clos':120}
>>> prices['2010-11-09'] = {'Open':22.00, 'High':50.00, 'Low':20.00, 'Close':42.00, 'Volume':10.00, 'Adj Clos':666}
>>> prices
{'2011-01-02': {'Volume': 14.0, 'Adj Clos': 120, 'High': 30.0, 'Low': 10.0, 'Close': 21.0, 'Open': 20.0}, '2010-11-09': {'Volume': 10.0, 'Adj Clos': 666, 'High': 50.0, 'Low': 20.0, 'Close': 42.0, 'Open': 22.0}}

Here I've nested a dictionary within each entry of the main "prices" dictionary. The first level of the dictionary takes the date as its key, and maps to a dictionary containing the price information for that date.

>>> prices['2011-01-02']
{'Volume': 14.0, 'Adj Clos': 120, 'High': 30.0, 'Low': 10.0, 'Close': 21.0, 'Open': 20.0}

The second level of the dictionary uses the attribute names as keys, and maps to the attribute values themselves.

>>> prices['2010-11-09']['Open']
22.0
>>> prices['2010-11-09']['Close']
42.0
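
Since the question also asks about sorting: date keys in ISO `YYYY-MM-DD` form sort chronologically as plain strings, so calling `sorted()` on the dictionary is enough to walk the days in order. A minimal sketch, reusing the made-up numbers above (the names `dates_in_order` and `by_close` are only illustrative):

```python
# Same illustrative numbers as in the example above.
prices = {
    '2011-01-02': {'Open': 20.00, 'High': 30.00, 'Low': 10.00,
                   'Close': 21.00, 'Volume': 14.00, 'Adj Clos': 120},
    '2010-11-09': {'Open': 22.00, 'High': 50.00, 'Low': 20.00,
                   'Close': 42.00, 'Volume': 10.00, 'Adj Clos': 666},
}

# Iterating a dict yields its keys; ISO dates sort chronologically as strings.
dates_in_order = sorted(prices)

# Sort the dates by one attribute instead, here the closing price (highest first).
by_close = sorted(prices, key=lambda d: prices[d]['Close'], reverse=True)

for d in dates_in_order:
    print(d, prices[d]['Close'])
```

The dictionary itself stays unsorted; you sort the keys whenever you need an ordered view.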

It seems that, for the get_historical_prices function you refer to, each day is output as an entry of the form [Date, Open, High, Low, Close, Volume, Adj_Clos]. If you want to construct a dictionary for a list of these entries, you're gonna need to do three things:

First, you'll need to index each entry to separate out the Date from the other elements, since that is what you'll be using to index the first dimension of your dict. You can get the first element with entry[0] and the remaining elements with entry[1:].

>>> entry = ['2011-01-02', 20.00, 30.00, 10.00, 21.00, 14.00, 120]
>>> date = entry[0]
>>> date
'2011-01-02'
>>> values = entry[1:]
>>> values
[20.0, 30.0, 10.0, 21.0, 14.0, 120]

Second, since you want to associate each of the other elements with a specific key, you should make a list of those keys in the same order as the data elements are given to you. Using the zip() function you can combine two lists p and q, taking the ith element from each so that list(zip(p, q))[i] == (p[i], q[i]). (In Python 3, zip() returns an iterator, so wrap it in list() if you want to display or index the pairs.) In this way you create a list of (key, value) pairs that you can pass to a dictionary constructor:

>>> keys = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Clos']
>>> pairs = list(zip(keys, entry[1:]))
>>> pairs
[('Open', 20.0), ('High', 30.0), ('Low', 10.0), ('Close', 21.0), ('Volume', 14.0), ('Adj Clos', 120)]

Finally, you want to construct your dictionary and store it under its date in the overall history:

>>> stockdict = dict(pairs)
>>> stockdict
{'Volume': 14.0, 'Adj Clos': 120, 'High': 30.0, 'Low': 10.0, 'Close': 21.0, 'Open': 20.0}
>>> histodict = dict()
>>> histodict[date] = stockdict

You can iterate through your nested history list to construct your dictionary in two ways. The first is a traditional for loop:

keys = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Clos']
histodict = dict()
for item in history:
    date = item[0]
    values = item[1:]
    histodict[date] = dict(zip(keys, values))

Or, if you want to play around with a slightly more advanced Python technique, try passing a generator expression to dict(), with a nested dict() call inside it:

keys = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Clos']
histodict = dict((item[0], dict(zip(keys, item[1:]))) for item in history)

That last one's a doozy if you're new to programming, but I encourage you to read up on generator expressions; remember, when programming in Python, Google is your friend. I hope I've given you sufficient keywords and ideas here to get started learning, and I'll leave the rest up to you.
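
To tie the steps together for real output, here is a minimal sketch. It assumes (this is not confirmed anywhere above) that the nested list starts with a header row and that every field arrives as a string. `build_history` and the sample rows are hypothetical and not part of ystockquote:

```python
KEYS = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Clos']

def build_history(history):
    """Turn rows of [Date, Open, High, Low, Close, Volume, Adj Clos] into
    a dict keyed by date. Hypothetical helper, not part of ystockquote."""
    histodict = {}
    for row in history:
        if row[0] == 'Date':   # assumption: skip a header row if one is present
            continue
        # Assumption: fields arrive as strings, so convert them to floats.
        values = [float(v) for v in row[1:]]
        histodict[row[0]] = dict(zip(KEYS, values))
    return histodict

# Made-up sample shaped like the output described in the question.
sample = [
    ['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Clos'],
    ['2011-05-02', '20.50', '21.10', '20.30', '20.95', '150000', '20.95'],
    ['2011-05-01', '20.10', '20.90', '19.80', '20.87', '120000', '20.87'],
]

histodict = build_history(sample)
print(histodict['2011-05-01']['Close'])
```

Converting to float up front means a lookup such as `histodict['2011-05-01']['Close']` gives back a plain number you can compare or do arithmetic on.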

This is what I need but how do I automate it with my output data from ystockquote so I don't have to configure an entry manually for every single date? — BlackBoxTrader, Jun 17 '11 at 00:26
After some tinkering I figured out how to use this code the way I wanted. Thanks so much! You were a huge help and you made a great contribution to my learning. — BlackBoxTrader, Jun 17 '11 at 03:03

score 1 · Answer 2 · answered Jun 16 '11 at 21:59

Given a list of lists of equal length, it's very easy to sort by any "column":

>>> l = [[1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [0, 0, 0, 0, 0], [6, 6, 6, 6, 6]]
>>> l.sort(key=lambda row: row[1])
>>> l
[[0, 0, 0, 0, 0], [1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [6, 6, 6, 6, 6]]
>>> l.sort(key=lambda row: row[4])
>>> l
[[0, 0, 0, 0, 0], [5, 4, 3, 2, 1], [1, 2, 3, 4, 5], [6, 6, 6, 6, 6]]

The key keyword argument takes a function that, given an item in the list, returns a value which is used as the sort key.
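
Applied to rows shaped like the stock data in the question, the same idea might look like this. It is a sketch with made-up numbers; `operator.itemgetter(4)` is a ready-made equivalent of `lambda row: row[4]`, and `CLOSE` is only an illustrative name:

```python
from operator import itemgetter

# Made-up rows in the order [Date, Open, High, Low, Close, Volume, Adj Clos].
history = [
    ['2011-01-04', 21.0, 23.0, 20.5, 22.5, 900, 22.5],
    ['2011-01-03', 20.0, 22.0, 19.5, 21.0, 1400, 21.0],
    ['2011-01-05', 22.5, 24.0, 22.0, 23.5, 1100, 23.5],
]

CLOSE = 4  # column index of the closing price

# sorted() returns a new list and leaves `history` untouched.
by_date = sorted(history, key=itemgetter(0))
by_close_desc = sorted(history, key=itemgetter(CLOSE), reverse=True)

print([row[0] for row in by_date])
print([row[0] for row in by_close_desc])
```

Use `sorted()` when you want to keep the original order around, and `list.sort()` when sorting in place is fine.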

But if you want to do more interesting things, you're probably better off using a database. Conveniently (for me), the sqlite3 docs use a table of stocks as an example, which I have appropriated and modified gently:

import sqlite3
conn = sqlite3.connect('/tmp/example')   # use ':memory:' for an in-memory db
c = conn.cursor()

# Create table
c.execute('''create table stocks
(date text, trans text, symbol text,
 qty real, price real)''')

# Insert a row of data
c.execute("""insert into stocks
          values ('2006-01-05','BUY','RHAT',100,35.14)""")

# Save (commit) the changes
conn.commit()

# Insert another row of data
c.execute("""insert into stocks
          values ('2006-01-07','SELL','RHAT',100,2.11)""")

# Select rows of data from table in an order
rows_by_date = c.execute("""select * from stocks order by date""")
for row in rows_by_date:
    print(row)

# In a different order
rows_by_price = c.execute("""select * from stocks order by price""")
for row in rows_by_price:
    print(row)

# Save the second insert as well, then close the connection
conn.commit()
conn.close()
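
To load price history rather than hand-written rows, a parameterized `executemany` avoids building SQL strings by hand. This is a sketch that assumes a hypothetical `prices` table and made-up rows; the in-memory database keeps it from touching disk:

```python
import sqlite3

# Made-up rows: (date, open, high, low, close, volume, adj_close).
rows = [
    ('2011-01-04', 21.0, 23.0, 20.5, 22.5, 900, 22.5),
    ('2011-01-03', 20.0, 22.0, 19.5, 21.0, 1400, 21.0),
]

conn = sqlite3.connect(':memory:')
c = conn.cursor()

# Hypothetical schema for illustration only.
c.execute('''create table prices
             (date text primary key, open_price real, high real, low real,
              close_price real, volume real, adj_close real)''')

# The ? placeholders let sqlite3 bind each value safely.
c.executemany('insert into prices values (?, ?, ?, ?, ?, ?, ?)', rows)
conn.commit()

# Look up one day's close, then list the dates ordered by close (highest first).
close = c.execute('select close_price from prices where date = ?',
                  ('2011-01-03',)).fetchone()[0]
ordered = [r[0] for r in
           c.execute('select date from prices order by close_price desc')]
print(close, ordered)
conn.close()
```

Any column can go in the `order by` clause, so the sorting is left to the database.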

There is a [step-by-step](https://www.quantstart.com/articles/Securities-Master-Database-with-MySQL-and-Python) tutorial written by Michael Halls-Moore on how to set up a securities master database using MySQL, [Pandas](http://pandas.pydata.org/) and [SQLAlchemy](http://www.sqlalchemy.org/). I found it extremely clear and simple. — Enrico Pirani, Aug 13 '15 at 12:42
