
I'm very new to Python, so dumbed-down explanations will be greatly appreciated. I have data that I have read in from a csv file that I have manipulated to be in the form: [(t1,n1,v1),(t2,n2,v2),(t3,n3,v3),...]

What I'm trying to do is, given a non-zero value in v, find the position of the next occurrence of n that has a zero value for v, and determine the difference in t values. Here is my code so far:

d=[]
for i,x in enumerate(list):
    if x[2]!=0:
        for j,y in enumerate(list):
            if x[1]==y[1] and j>i and y[2]==0:
                d.append(y[0]-x[0])
    else: d.append(0)

print d

I did this in Excel using the MATCH and OFFSET functions, but I'm a bit lost translating that to index and enumerate here.

My first problem is that the nested for loop doesn't stop when it finds the first match, and so it keeps appending t value differences for every matching n value. I'd only like to find the first match.

My second query is if there's a better way to do this, so the nested for loop isn't always starting at the beginning of my list, and instead starts at the index i. I'm dealing with quite large data sets.

EDIT: I managed to make it work, though it's quite inelegant (times and notes are lists of the 1st and 2nd elements of each tuple in list):

d=[]
for i,x in enumerate(list):
    if x[2]!=0:
        d.append(notes[i+1:].index(x[1]))
    else: d.append("NA")


dur=[]
for i,j in enumerate(d):
    if j!="NA":
        dur.append(times[i+j+1]-times[i])
    else: dur.append(0)

I'd appreciate any ideas on a cleaner solution.

Jonas
adrianportell
  • Could you please provide some code to clearly show what parts you are having trouble with? – idjaw Sep 30 '15 at 01:23
  • Welcome to StackOverflow. We're not a code-writing or design service: you're supposed to show us the code you have and the results you got; then we help you work toward the results you want. Python's index finds the next location of an item in a list. You can extract a slice of your list with something like this: `middle = [x[1] for x in data_list]`. Does this help you move ahead? – Prune Sep 30 '15 at 01:26
  • Sorry about the very poorly phrased question. I've now clarified. – adrianportell Sep 30 '15 at 06:11

1 Answer


First, avoid naming a list `list`, since that shadows Python's built-in type; I use `lst` below. I wasn't 100% clear on what you were looking for, but I think this works.

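d = []
for index, tup in enumerate(lst):
    if tup[2] != 0:
        # Search only the entries after this one for the first zero-v match on n.
        for nxt in lst[index + 1:]:
            if nxt[2] == 0 and tup[1] == nxt[1]:
                d.append(nxt[0] - tup[0])
                break  # stop at the first match
        # No match found: d is still one short, so pad to keep the indexes aligned.
        if len(d) - 1 != index:
            d.append('NA')
    else:
        d.append('NA')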

For example:

input: lst = [(1,3,0),(1,5,6),(1,2,4),(3,4,1),(4,2,0),(7,5,0),(8,4,0)]

output: d = ['NA', 6, 3, 5, 'NA', 'NA', 'NA']

If you only need the times and don't need the arrays to line up, just remove any conditional that appends 'NA'.
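Since you mentioned large data sets: slicing and rescanning the rest of the list for every entry can get slow. Here is a rough single-pass sketch that walks the list backwards and remembers, for each n, the time of the nearest later entry with v equal to 0. The function name `durations` and the `missing` parameter are my own choices for illustration, and it assumes every entry is a `(t, n, v)` tuple with hashable n.

```python
def durations(events, missing="NA"):
    """For each (t, n, v) with v != 0, give the time gap to the next
    entry with the same n and v == 0, or `missing` if there is none."""
    next_zero_time = {}  # n -> t of the nearest later entry whose v is 0
    result = [missing] * len(events)
    # Walk from the end so the dict always describes entries after position i.
    for i in range(len(events) - 1, -1, -1):
        t, n, v = events[i]
        if v == 0:
            next_zero_time[n] = t
        elif n in next_zero_time:
            result[i] = next_zero_time[n] - t
    return result


lst = [(1, 3, 0), (1, 5, 6), (1, 2, 4), (3, 4, 1), (4, 2, 0), (7, 5, 0), (8, 4, 0)]
print(durations(lst))
```

On the example input this should give the same list as the nested-loop version, but each entry is visited only once.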

Community
Hayley Guillou
  • That's great, thanks! Can I ask: I understand that the `if len(d)-1!=index:` makes sure that there is a match for `tup[1]` somewhere down the line, but how exactly does it achieve this? – adrianportell Oct 01 '15 at 05:36
  • @adrianportell That's just for lining up the lists. If you didn't find a match in the rest of the list, add the filler value to the list so the indexes line up. – Hayley Guillou Oct 01 '15 at 05:41
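To make the "no match found" case from the comments above more explicit, one option is Python's `for ... else`: the `else` branch runs only when the inner loop finishes without hitting `break`. This is just a sketch of the same logic as the answer; the function name `first_zero_gaps` and the `filler` parameter are hypothetical names added for illustration.

```python
def first_zero_gaps(lst, filler="NA"):
    """Same result as the nested-loop answer, using for/else for alignment."""
    d = []
    for index, tup in enumerate(lst):
        if tup[2] == 0:
            d.append(filler)
            continue
        for later in lst[index + 1:]:
            if later[2] == 0 and later[1] == tup[1]:
                d.append(later[0] - tup[0])
                break
        else:
            # Reached only if the inner loop ended without a break,
            # i.e. no later zero-v entry shares this n.
            d.append(filler)
    return d


lst = [(1, 3, 0), (1, 5, 6), (1, 2, 4), (3, 4, 1), (4, 2, 0), (7, 5, 0), (8, 4, 0)]
print(first_zero_gaps(lst))
```

Either way, exactly one value is appended per entry, which is what keeps `d` lined up with `lst`.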