3

Problem / what I tried

I downloaded the textmining 1.0 library and tried to run it, but this gave me import errors because it is a Python 2 library. After searching on Stack Overflow I found out that I had to convert it with 2to3.py, and the imports now work. However, when I do this:

def buildMatrix(self, document_list):
    print("building matrix...")
    tdm = textmining.TermDocumentMatrix()
    for doc in document_list:
        tdm.add_doc(doc)
    tdm.write_csv(r'path\matrix.csv', cutoff=2)

(document_list is just a list of strings) I get the following error:

  File "C:\Users\RICK\Anaconda\lib\site-packages\textmining\__init__.py", line 335, in write_csv
    f.writerow(row)

TypeError: a bytes-like object is required, not 'str'

From inspecting the textmining 1.0 source, I'm pretty sure the row should be a string, so I edited the source code to print the row:

f = csv.writer(open(filename, 'wb'))
for row in self.rows(cutoff=cutoff):
    print(row)
    f.writerow(row)

However, even now I get the same TypeError, this time reported on the print line:

  File "C:\Users\RICK\Anaconda\lib\site-packages\textmining\__init__.py", line 335, in write_csv
    print(row)

TypeError: a bytes-like object is required, not 'str'

Following advice I found on Stack Overflow, I replaced 'wb' with 'w', but this still gives me the TypeError.

Questions

  • How can I fix the code so that it can write the row?
  • Why does even the print statement cause a TypeError?

Edit based on comment(s):
Claudio's suggestion still gave me the TypeError:

  File "C:\Users\RICK\Anaconda\lib\site-packages\textmining\__init__.py", line 335, in write_csv
    f.write(row)

TypeError: a bytes-like object is required, not 'str'

Suggestion of Tony:
Code inspection:

for article in articles:
    abstract = searcher.getArticleAbstract(article)
    print(type(abstract))  # --> returns <class 'str'>
    all_abstracts.append(abstract)
txtSearcher.buildMatrix(all_abstracts)

The library source now contains these open() calls:

f = open(os.path.join(data_dir, 'stopwords.txt'),"r")
f = open(os.path.join(data_dir, 'dictionary.txt'),"r")
f = csv.writer(open(filename, 'w'))

Some strange things are going on

[Screenshot not preserved.] This will take me to:

def write_csv(self, filename, cutoff=2):
    print("This really makes me sad!")

    """
    Write term-document matrix to a CSV file.

    filename is the name of the output file (e.g. 'mymatrix.csv').
    cutoff is an integer that specifies only words which appear in
    'cutoff' or more documents should be written out as columns in
    the matrix.

    """
    print(self.rows)
    f = csv.writer(open(filename, 'w'))
    for row in self.rows(cutoff=cutoff):
        f.writerow(row)

It does print "building matrix..." (so buildMatrix is called), but it never prints "This really makes me sad!".

CodeNoob
  • In Python 3 you should use text mode with csv writer, along with `newline=''`. As to why `print()` gives the same error, has it possibly been shadowed by something? – Ilja Everilä Apr 24 '17 at 08:34
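
A minimal sketch of what the comment above describes, assuming Python 3: open the file in text mode ('w') with newline='' and hand it to csv.writer. The rows and the output path are made up for illustration; they only stand in for what textmining would produce.

```python
import csv
import os
import tempfile

# Hypothetical rows standing in for what TermDocumentMatrix.rows() yields.
rows = [["apple", "banana"], [1, 0], [2, 3]]

# Hypothetical output path; any writable location works.
csv_path = os.path.join(tempfile.mkdtemp(), "matrix.csv")

# Python 3: text mode plus newline='' so the csv module controls line endings.
# Opening the file with 'wb' instead is what raises
# "TypeError: a bytes-like object is required, not 'str'".
with open(csv_path, "w", newline="") as handle:
    writer = csv.writer(handle)
    for row in rows:
        writer.writerow(row)

# Read the file back to check what was written (csv returns strings).
with open(csv_path, newline="") as handle:
    written = list(csv.reader(handle))
```

Using a with block also closes the file, which the original one-liner csv.writer(open(...)) never does explicitly.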

2 Answers

0

As far as I can tell, the actual cause of the weird program behavior described in the question is that the question I asked in the comments:

Are you sure that you are getting the error from the code you are editing?

was not treated as relevant, even though it is the only answer that explains all of the observed issues.

None of the other suggestions, for example:

**RENAME** def write_csv(...) to, for example, def my_write_csv(...)

together with the explanations and hints that came with them, such as:

If you define your own function with the same name as a function in a library, you run into trouble with local/global scopes and can no longer tell WHICH ONE was actually executed: the one from the library or the one you defined. The fact that the print("This really makes me sad!") you inserted was not printed indicates that not THIS function was executed, but the library one instead ...

Check the entire code, including the file to read or excerpts able to reproduce the error. There is surely a very simple explanation for this weird behavior.

Look for an unclosed parenthesis, string quotation, list ], etc. in the code preceding the line in which the error is indicated.

could lead to success under these circumstances.
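
One way to check the "are you sure" question is to ask Python where an object was loaded from. The sketch below uses the standard library's csv module as a stand-in so that it runs without textmining installed; the same two calls could presumably be pointed at the textmining module and its TermDocumentMatrix class instead.

```python
import csv
import inspect

# File the imported module was loaded from.
module_file = csv.__file__

# Source file and first line number of a class defined in that module.
source_file = inspect.getsourcefile(csv.DictWriter)
_, first_line = inspect.getsourcelines(csv.DictWriter)

print("module loaded from:", module_file)
print("DictWriter defined in:", source_file, "starting at line", first_line)
```

If the printed path is not the file open in your editor, your edits can never show up in the running program.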

Claudio
  • It still gives the same error (see my question edit) – CodeNoob Apr 24 '17 at 08:25
  • Post or make available the entire code ... The cause of the trouble isn't where you look at it ... Maybe you are running the wrong file??? – Claudio Apr 24 '17 at 08:26
  • Does a csv writer object even have a method `write()`? – Ilja Everilä Apr 24 '17 at 08:38
  • print(dir(row)) too? Are you **sure** that you are getting the error from the code you are editing? – Claudio Apr 24 '17 at 08:47
  • If I change it to my_write_csv and use the "Go to definition" it will bring me to the edited function, however running the code gives me "'TermDocumentMatrix' object has no attribute 'my_write_csv'" – CodeNoob Apr 24 '17 at 09:16
  • Thank you, I solved it: I used imp.reload() and edited the code as described by Tony – CodeNoob Apr 24 '17 at 09:51
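
A likely explanation for the confusing traceback: a session that has already imported a module keeps running the old code in memory, while tracebacks show source lines read from the file on disk, so the edited print(row) line can appear next to an error raised by the old writerow call. The sketch below reproduces the stale-module effect with a throwaway module (the name demo_reload_mod is hypothetical). It uses importlib.reload, since imp is deprecated and was removed in Python 3.12.

```python
import importlib
import inspect
import os
import sys
import tempfile

# Keep bytecode caches out of this quick demo.
sys.dont_write_bytecode = True

# Create a throwaway module (hypothetical name) in a temp directory.
tmp_dir = tempfile.mkdtemp()
module_path = os.path.join(tmp_dir, "demo_reload_mod.py")
with open(module_path, "w") as fh:
    fh.write("def greet():\n    return 'old'\n")

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()
import demo_reload_mod

# Confirm which file the interpreter is really using.
loaded_from = inspect.getsourcefile(demo_reload_mod)
before = demo_reload_mod.greet()

# Edit the source on disk, as one would edit textmining/__init__.py.
with open(module_path, "w") as fh:
    fh.write("def greet():\n    return 'edited version'\n")

# The running session has not noticed the edit yet.
still_old = demo_reload_mod.greet()

# Reloading re-executes the module from the edited file.
importlib.reload(demo_reload_mod)
after = demo_reload_mod.greet()

print(before, still_old, after)
```

Restarting the interpreter (or the IDE console) has the same effect as the reload and also refreshes objects created before the edit.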
0

Use the updated textmining3 package instead of textmining 1.0:

https://pypi.org/project/textmining3/

This should resolve the issue above.