Problem / what I tried
I downloaded the textmining 1.0 library and tried to run it, but it gave me import errors because it is a Python 2 library. After searching on Stack Overflow I found that I had to convert it with 2to3.py, and now it imports fine. However, when I do this:
def buildMatrix(self, document_list):
    print("building matrix...")
    tdm = textmining.TermDocumentMatrix()
    for doc in document_list:
        tdm.add_doc(doc)
    tdm.write_csv(r'path\matrix.csv', cutoff=2)
(document_list is just a list of strings.)
I get the following error:
  File "C:\Users\RICK\Anaconda\lib\site-packages\textmining\__init__.py", line 335, in write_csv
    f.writerow(row)
TypeError: a bytes-like object is required, not 'str'
From inspecting the code of textmining 1.0, I'm pretty sure the row should be a string. So I tried to print the row by editing the source code:
f = csv.writer(open(filename, 'wb'))
for row in self.rows(cutoff=cutoff):
    print(row)
    f.writerow(row)
However, even now I get the same TypeError:
  File "C:\Users\RICK\Anaconda\lib\site-packages\textmining\__init__.py", line 335, in write_csv
    print(row)
TypeError: a bytes-like object is required, not 'str'
I searched on Stack Overflow and tried to solve this by replacing the 'wb' with 'w', but this still gives me the TypeError.
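For reference, here is a minimal sketch of how the csv module expects the file to be opened under Python 3: in text mode, with newline=''. The helper names write_rows and try_binary_mode and the demo rows are made up for illustration and are not part of textmining. The second helper only reproduces the message that the Python 2 style 'wb' mode produces.

```python
import csv
import os
import tempfile


def write_rows(filename, rows):
    """Write rows to a CSV file the Python 3 way (text mode)."""
    # newline='' lets the csv module control line endings itself,
    # which avoids blank lines between rows on Windows.
    with open(filename, "w", newline="") as handle:
        writer = csv.writer(handle)
        for row in rows:
            writer.writerow(row)


def try_binary_mode(filename, rows):
    """Return the error message raised when the file is opened with 'wb'."""
    try:
        with open(filename, "wb") as handle:
            csv.writer(handle).writerow(rows[0])
    except TypeError as exc:
        return str(exc)
    return None


# Hypothetical stand-in for the rows of a term-document matrix.
demo_rows = [["term_a", "term_b"], [1, 0], [2, 3]]
demo_path = os.path.join(tempfile.mkdtemp(), "matrix.csv")

write_rows(demo_path, demo_rows)
with open(demo_path, newline="") as handle:
    written_back = list(csv.reader(handle))

binary_error = try_binary_mode(demo_path + ".bin", demo_rows)
print(written_back)
print(binary_error)
```

If the text-mode version works on its own but the library call still fails, that would suggest the edited write_csv is not the code that is actually being executed.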
Questions
- How can I fix the code so that it is able to write the row?
- Why does even the print statement cause a TypeError?
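One possible explanation for the second question, stated as an assumption and not verified for this setup: Python tracebacks show source lines read from the file on disk, so if a module is edited after it has already been imported into a running session, the traceback can display the new line (here print(row)) while the old code is still what executes. The sketch below illustrates that effect with a throwaway module. The module name demo_textlib and the helper write_module are hypothetical and have nothing to do with textmining itself.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # keep the demo free of cached .pyc files

module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "demo_textlib.py")  # hypothetical module


def write_module(body):
    """(Re)write the throwaway module so describe() returns body."""
    with open(module_path, "w") as handle:
        handle.write("def describe():\n    return %r\n" % body)


write_module("old code")
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
import demo_textlib  # noqa: E402  (imported after the file is created)

before_edit = demo_textlib.describe()

# Edit the source on disk while the module is already loaded.
write_module("edited code")
without_reload = demo_textlib.describe()  # still runs the code loaded at import

# Reloading (or restarting the interpreter) picks up the edited source.
importlib.reload(demo_textlib)
after_reload = demo_textlib.describe()

print(before_edit, "|", without_reload, "|", after_reload)
```

If this is what is happening, restarting the interpreter (or the notebook kernel) after editing the library, and checking textmining.__file__ to confirm which copy is imported, should make the edits take effect.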
Edit based on comment(s):
Claudio's suggestion still gave me the TypeError:
  File "C:\Users\RICK\Anaconda\lib\site-packages\textmining\__init__.py", line 335, in write_csv
    f.write(row)
TypeError: a bytes-like object is required, not 'str'
Suggestion of Tony:
Code inspection:
for article in articles:
    abstract = searcher.getArticleAbstract(article)
    print(type(abstract))  # --> returns <class 'str'>
    all_abstracts.append(abstract)
txtSearcher.buildMatrix(all_abstracts)
These are the open() calls I now have in the library:
f = open(os.path.join(data_dir, 'stopwords.txt'),"r")
f = open(os.path.join(data_dir, 'dictionary.txt'),"r")
f = csv.writer(open(filename, 'w'))
Some strange things are going on. This is the current write_csv, with two print calls added for debugging:
def write_csv(self, filename, cutoff=2):
    print("This really makes me sad!")
    """
    Write term-document matrix to a CSV file.
    filename is the name of the output file (e.g. 'mymatrix.csv').
    cutoff is an integer that specifies only words which appear in
    'cutoff' or more documents should be written out as columns in
    the matrix.
    """
    print(self.rows)
    f = csv.writer(open(filename, 'w'))
    for row in self.rows(cutoff=cutoff):
        f.writerow(row)