I'm a newbie programmer posting here for the first time. Any suggestions or advice would be greatly appreciated! I am working on a project that compares the contents of two files, say test.csv and ref.csv (each a single column of strings of 3-4 words), and assigns a score to each string from test.csv based on its similarity to the most similar string in ref.csv. I am using the fuzzywuzzy string matching module to assign the similarity score.
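To make the goal concrete, here is a minimal sketch of the scoring step. It uses difflib from the standard library as a stand-in for fuzzywuzzy, so the numbers will not match fuzzywuzzy's exactly. The function name best_score and the sample strings are made up for illustration.

```python
from difflib import SequenceMatcher


def best_score(candidate, references):
    """Return (score, match): the highest 0-100 similarity found
    and the reference string that produced it."""
    best = (0, None)
    for reference in references:
        # ratio() is a float in [0, 1]; scale it to a 0-100 integer score
        ratio = SequenceMatcher(None, candidate.lower(), reference.lower()).ratio()
        score = int(round(100 * ratio))
        if score > best[0]:
            best = (score, reference)
    return best


# Hypothetical sample data standing in for the contents of ref.csv and test.csv
ref_strings = ["red delicious apple", "ripe yellow banana"]
test_strings = ["red delicous apple", "green seedless grape"]

# One (test string, score, closest reference) tuple per test string
scores = [(text,) + best_score(text, ref_strings) for text in test_strings]
for text, score, match in scores:
    print("%s -> %s (%d)" % (text, match, score))
```

With fuzzywuzzy, the inner loop could presumably be replaced by a single call to process.extractOne, which is what I intend to use once the import problem below is sorted out.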
The following code snippet takes the two input files, converts them into arrays, and prints out the arrays:
import csv
# Load text matching module
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
# Import reference and test lists and assign to variables
ref_doc = csv.reader(open('ref.csv', 'rb'), delimiter=",", quotechar='|')
test_doc = csv.reader(open('test.csv', 'rb'), delimiter=",", quotechar='|')
# Define the arrays into which to load these lists
ref = []
test = []
# Assign the reference and test docs to arrays
for row in ref_doc:
    ref.append(row)
for row in test_doc:
    test.append(row)
# Print the arrays to make sure this all worked properly
# before we proceed to run matching operations on the arrays
print ref, "\n\n\n", test
The problem is that this script works as expected when I run it in IDLE, but returns the following error when I call it from bash:
['one', 'two']
Traceback (most recent call last):
  File "csvimport_orig.py", line 4, in <module>
    from fuzzywuzzy import fuzz
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fuzzywuzzy/fuzz.py", line 32, in <module>
    import utils
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fuzzywuzzy/utils.py", line 6, in <module>
    table_from=string.punctuation+string.ascii_uppercase
AttributeError: 'module' object has no attribute 'punctuation'
Is there something I need to configure in bash for this to work properly? Or is there something fundamentally wrong that IDLE is not catching? For simplicity's sake, I don't call any fuzzywuzzy functions in this snippet, but the matching itself works as expected in IDLE.
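In case it helps narrow things down, here is a small diagnostic that could be run both from IDLE and from bash to compare the two environments. It is only a sketch: the helper name describe_environment is made up, and it rests on the assumption that the two environments resolve the string module differently, for example because a local file named string.py sits on the search path ahead of the standard library.

```python
import sys
import string


def describe_environment():
    """Collect details that may differ between IDLE and a bash session."""
    return {
        # Which interpreter binary is actually running
        "executable": sys.executable,
        "version": sys.version_info[:3],
        # Where the imported 'string' module was loaded from
        "string_module_file": getattr(string, "__file__", "(built-in)"),
        # The attribute that the traceback says is missing
        "has_punctuation": hasattr(string, "punctuation"),
        # First search-path entry; usually the script's directory or ''
        "first_path_entry": sys.path[0] if sys.path else None,
    }


info = describe_environment()
for key in sorted(info):
    print("%s: %s" % (key, info[key]))
```

If string_module_file points somewhere other than the standard library under bash, that would suggest the problem is module shadowing rather than anything in bash's configuration.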
Eventually, I'd like to use pylevenshtein, but I want to see whether this script is useful for my purposes before I put extra time into making that work.
Thanks in advance.