
I'm new to Python. I wrote a simple script that opens a file and passes it to a function, which collects some of the lines and yields them from a generator. I then use this generator to compute a diff against another file read the same way. I got the following error:

unhashable type: 'list'.

I'm using difflib to compute the diff.

Could you please explain why I get this error? I've seen difflib used with f.readlines(), but I don't understand why that works, because f.readlines() also returns a list.

#! /usr/bin/python

import difflib

def lineExtractor(file):
    lines = []
    for line in file:
        if line.startswith('g'):
            if lines:
                yield lines
                lines = []
        else:
            lines.append( line )
    if lines:
        yield lines

with open('testfile1.txt') as file1:
    lines1 = lineExtractor(file1)
    with open('testfile2.txt') as file2:
        lines2 = lineExtractor(file2)
        for line in difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm='', n=0):
            print line

Thanks

i.n.n.m
Minee
  • "returns a list": a list of what? What does lineExtractor do? – Josh Lee Aug 23 '17 at 20:26
  • It goes through the input file line by line. If the line does not start with 'g', it appends the line to the 'lines' object. So basically it filters out every line starting with 'g'. – Minee Aug 23 '17 at 20:28
  • Right now `lineExtractor` returns something like a list of lists of strings. That's more than just filtering out lines starting with 'g'. Do you simply want it to return a list of strings? – Alex Hall Aug 23 '17 at 20:31
  • Yes, I want it to return only a list of strings. – Minee Aug 23 '17 at 20:41

1 Answer


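The problem is that unified_diff needs lists of strings as input. You're giving it generators instead, and those generators don't even yield strings: they yield lists of strings. difflib has to hash every element it compares, and lists aren't hashable, hence the error. f.readlines() works because it returns a list of strings, and strings are hashable.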
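To see where the message comes from: unified_diff is built on difflib.SequenceMatcher, which indexes the elements of the second sequence in a dictionary, so every element has to be hashable. Here is a minimal sketch that reproduces the message with SequenceMatcher directly. It assumes Python 3 syntax, and the sample data is made up for illustration. On Python 3, unified_diff itself may report a differently worded TypeError, because it validates its inputs first.

```python
import difflib

# Made-up sample data: each element is a list of lines, like the blocks
# that the original lineExtractor yields.
blocks_a = [["a\n", "b\n"], ["c\n"]]
blocks_b = [["a\n", "b\n"], ["d\n"]]

try:
    difflib.SequenceMatcher(None, blocks_a, blocks_b)
    message = None
except TypeError as exc:
    # The elements are lists, which cannot be used as dictionary keys.
    message = str(exc)

print(message)

# A list of plain strings (what f.readlines() returns) is accepted.
matcher = difflib.SequenceMatcher(None, ["a\n", "c\n"], ["a\n", "d\n"])
print(matcher.ratio())
```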

So we need to make two changes:

First,

lines1 = lineExtractor(file1)

becomes

lines1 = list(lineExtractor(file1))

Obviously, the same thing needs to be done with lines2.

Secondly, both occurrences of

yield lines

become

yield ''.join(lines)
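Putting both changes together, the script might look like the sketch below. It is written for Python 3 (print as a function), and it uses io.StringIO objects with made-up contents as stand-ins for testfile1.txt and testfile2.txt so that it runs on its own. With real files, the with open(...) blocks from the question would stay as they are.

```python
import difflib
import io


def lineExtractor(file):
    """Yield each block of lines between 'g' lines as a single string."""
    lines = []
    for line in file:
        if line.startswith('g'):
            if lines:
                yield ''.join(lines)  # one string per block, not a list
                lines = []
        else:
            lines.append(line)
    if lines:
        yield ''.join(lines)


# Made-up stand-ins for the two input files.
file1 = io.StringIO("g1\na\nb\ng2\nc\n")
file2 = io.StringIO("g1\na\nb\ng2\nd\n")

# list(...) turns each generator into a list of strings.
lines1 = list(lineExtractor(file1))
lines2 = list(lineExtractor(file2))

diff = list(difflib.unified_diff(lines1, lines2, fromfile='file1',
                                 tofile='file2', lineterm='', n=0))
for line in diff:
    print(line)
```

Note that each element being compared is now a whole block of lines, so a block that differs in a single line shows up in the diff as removed and re-added in full.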

Aran-Fey
  • So by changing the yield, it will actually generate a list of strings instead of a list of lists of strings? – Minee Aug 23 '17 at 21:03
  • @M.Manuel Yes, `''.join(lines)` turns the list of strings `lines` into a string. And then `list(lineExtractor(file1))` turns the generator into a list. So the result is a list of strings, which we can pass to `unified_diff`. – Aran-Fey Aug 23 '17 at 21:05
  • Thank you very much for the explanation. So since I convert the generator into a list at the end, it does not really make sense to use a generator at all, because the memory will end up being reserved for the list anyway. Am I right? – Minee Aug 23 '17 at 21:12
  • @M.Manuel Yes, the generator doesn't reduce your memory footprint in this case. But at the same time, using a generator isn't a bad thing. If you're thinking of rewriting it, don't. It's fine the way it is. – Aran-Fey Aug 23 '17 at 21:21
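The memory point from the last two comments can be illustrated with a small sketch. The helper filter_lines is hypothetical: it is the plain per-line filter described in the comments under the question (drop every line starting with 'g'), not the block-joining version from the answer, and the sample lines are made up.

```python
def filter_lines(lines):
    # Hypothetical helper: yield every line that does not start with 'g'.
    for line in lines:
        if not line.startswith('g'):
            yield line


source = ["g1\n", "a\n", "b\n", "g2\n", "c\n"]

gen = filter_lines(source)
first = next(gen)      # runs only far enough to produce one line
rest = list(gen)       # pulls everything that is left into memory
leftover = list(gen)   # the generator is exhausted after one pass

print(first, rest, leftover)
```

Calling list(filter_lines(f)) on an open file f would give a list of strings that unified_diff accepts. The list holds every line at once, so the generator saves no memory there, but the same function can still be consumed lazily elsewhere.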