Looping through a list of object/values, remove duplicates, and return unique value in View (python)

Question

I am trying to better understand and put into action some programming concepts, specifically looping and recursion (or iteration??) through a list that contains values. I have an api that retrieves a list from the database and prints a table entity and its elements (e.g. [<Assessment(name='teaching', text='something', microseries=3, subseries='3a', etc...)>, <Assessment(name='learning', text='foo', microseries=3, subseries='3b', etc...)>]). There are 1-5 microseries in this list.

Problem: remove duplicate microseries so the output (return) is only a single microseries (e.g. only one microseries= 3 and not all the microseries under 3 which is specified under the name subseries: 3a, 3b, 3c, 3d). The goal here is to display a single microseries (it's only one so don't let the plural name here confuse you) in a template so that a user can click on it and go into an expanded microseries (subseries) view (3a, 3b, 3c ....).

Table arrangement:
Assessment
|--name
|--text
|--microseries
|--subseries

I am sure there might be a better approach to this and I am all ears to some recommendations. I am a newbie and could use some wise direction on how to tackle this problem.

I am currently using code suggested in previous Stack questions (1, 2). Please help :) I would like to better understand the concepts behind iterating over a list, removing duplicates, and returning only a unique value per object in the list. Please excuse my lack of tech speak.

I am using Python 2.7, SQLAlchemy and Pyramid (for the web framework)

view.py (view front-end code)

@view_config(route_name='assessments', request_method='GET', renderer='templates/unique_assessments.jinja2')
def view_unique_assessments(request):
    # other code
    all_assessments = api.retrieve_assessments()
    #print 'all assessments', all_assessments

    new_list = list(set(all_assessments)) #removes duplicates
    new_list.sort() #sorts items in list
    print 'new list', new_list # sorted and unique list

    for x in new_list:
        print 'test', x #prints test <Assessment(name='Foo', text='this is 1A', microseries='1', etc...)>    

        micro = set([x.microseries]) #doesn't work 
        print 'test micro single print', micro #doesn't iterate over the list and print out each unique microseries -- only prints one
        #prints: test micro single print set([3]) instead of 1,2,3,4,5

    return {'logged_in': logged_in_userid, 'unique_microseries': micro}

Database Table:

class Assessment(Base):
    __tablename__ = 'assessments'

    id = Column(Integer, primary_key=True)
    name = Column(String(50), unique=True)
    text = Column(String(2000))
    microseries = Column(Integer)
    subseries = Column(String(50))
    created_on = Column(DateTime, default=datetime.utcnow)

API:

def retrieve_assessments(self):
    assessments = self.session.query(Assessment).order_by(Assessment.id).all()
    return assessments

Can you add model code to the question as otherwise it's a bit guess work what is happening and what should be happening? — Mikko Ohtamaa, Feb 16 '16 at 22:39
It sounds like easiest approach is to have a dictionary which contains `{microseries.id: microseries instance}` pairs and then just return `dict.values()` — Mikko Ohtamaa, Feb 16 '16 at 22:40
Hi @MikkoOhtamaa. I added more code that I believe might help everyone. — thesayhey, Feb 16 '16 at 23:52

score 2 · Accepted Answer · edited Feb 17 '16 at 19:45

The usual approach, as @Mikko suggests, is to use a dict (or sometimes a set) to keep track of which items you have already seen during the iteration: if an item's key is already in the dict, you skip it and go to the next one. Afterwards, the dict's .values() method gives you the items you kept, one per key.

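def view_unique_assessments(request):
    # logged_in_userid is set by the "other code" elided in the question
    all_assessments = api.retrieve_assessments()
    assessments_by_microseries = {}

    for x in all_assessments:
        if x.microseries in assessments_by_microseries:
            print("Already seen this microseries: %s" % x.microseries)
        else:
            # first assessment seen for this microseries; keep it
            assessments_by_microseries[x.microseries] = x

    # model instances have no natural ordering, so sort on microseries explicitly
    unique_assessments = sorted(assessments_by_microseries.values(), key=lambda a: a.microseries)
    return {'logged_in': logged_in_userid, 'unique_assessments': unique_assessments}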
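
To try the idea without a database, here is a minimal, self-contained sketch of the same dict-based approach. The `Assessment` namedtuple, the `first_per_microseries` helper, and the sample rows are hypothetical stand-ins for illustration only; they are not part of the question's model or API.

```python
from collections import namedtuple

# Hypothetical stand-in for the SQLAlchemy Assessment model (illustration only).
Assessment = namedtuple('Assessment', ['name', 'text', 'microseries', 'subseries'])


def first_per_microseries(assessments):
    """Return one assessment per microseries, ordered by microseries."""
    seen = {}
    for a in assessments:
        # Keep only the first assessment encountered for each microseries.
        if a.microseries not in seen:
            seen[a.microseries] = a
    # Sort on the microseries number rather than on the objects themselves.
    return sorted(seen.values(), key=lambda a: a.microseries)


sample = [
    Assessment('teaching', 'something', 3, '3a'),
    Assessment('learning', 'foo', 3, '3b'),
    Assessment('planning', 'bar', 1, '1a'),
]
unique = first_per_microseries(sample)
```

With this sample, `unique` holds the '1a' row followed by the '3a' row; the '3b' row is skipped because microseries 3 was already seen.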

answered Feb 17 '16 at 06:17 by Sergey · edited Feb 17 '16 at 19:45 by thesayhey

Instead of a `dict` you can also use the more efficient `set` if the contained objects support it; see `set.add()`. I am not sure whether SQLAlchemy models support this. – Mikko Ohtamaa Feb 17 '16 at 11:45
@MikkoOhtamaa set() does work and is fast. How would I use `set()` instead of a `dict[]`? Last thing: is using a dict or set in View code poor practice? – thesayhey Feb 17 '16 at 12:55
@thesayhey: You can probably construct an SQLAlchemy query which returns directly what you want with one line of code. However this is another question and not specific to Pyramid or their views. – Mikko Ohtamaa Feb 17 '16 at 16:03
Traceback Error: `_requestonly_view response = view(request) File "/Users/ack/code/venv/WEB/web/views/default.py", line 274, in view_unique_assessments if x['microseries'] in assessments_by_mircoseries: TypeError: 'Assessment' object has no attribute '__getitem__' ` – thesayhey Feb 17 '16 at 17:00
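
On the `set` question and the traceback in the comments above: model instances are read with attribute access (`x.microseries`), not subscripting (`x['microseries']`), which is what raises the `__getitem__` error. The loop in the question also rebinds `micro` to a new one-element set on every pass, so only the last value survives. If only the microseries numbers are needed, a rough sketch is to build one set from all rows. The `Assessment` namedtuple and the `unique_microseries` helper below are hypothetical stand-ins, not the real model or API.

```python
from collections import namedtuple

# Hypothetical stand-in for the SQLAlchemy model; only the fields used here.
Assessment = namedtuple('Assessment', ['name', 'microseries', 'subseries'])


def unique_microseries(assessments):
    """Collect every distinct microseries number into one sorted list."""
    # One set built from all rows, instead of a new set per loop iteration.
    return sorted({a.microseries for a in assessments})


rows = [
    Assessment('teaching', 3, '3a'),
    Assessment('learning', 3, '3b'),
    Assessment('reading', 1, '1a'),
    Assessment('writing', 2, '2a'),
]
micro = unique_microseries(rows)
```

The resulting list can be returned to the template in place of the `micro` value from the question's view.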
