2

Here's the code:

import os
import sys

sys.path.append("../tools/")
from parse_out_email_text import parseOutText  # (it's just another .py file that has a function I wrote)

from_sara  = open("from_sara.txt", "r")
from_chris = open("from_chris.txt", "r")

from_data = []
word_data = []

temp_counter = 0

for name, from_person in [("sara", from_sara), ("chris", from_chris)]:
    for path in from_person:
        ### only look at first 200 emails when developing
        ### once everything is working, remove this line to run over full dataset
        temp_counter += 1
        if temp_counter < 200:
            path = os.path.join('..', path[:-1])  # (THIS IS THE PART I CAN'T GET MY HEAD AROUND)
            print path
            email = open(path, "r")

            email.close()

print "emails processed"
from_sara.close()
from_chris.close()

When I run this, it gives me an error as shown below:

Traceback (most recent call last):
..\maildir/bailey-s/deleted_items/101.
File "C:/Users/AmitSingh/Desktop/Data/Udacity/Naya_attempt/vectorize_text.py", line 47, in <module>
email = open(path, "r")
IOError: [Errno 2] No such file or directory: '..\\maildir/bailey-s/deleted_items/101.'

I don't even have the path '..\maildir/bailey-s/deleted_items/101.' on my laptop. I tried replacing the '..' in the line below with the actual path to the folder where I keep all the files, but nothing changed.

path = os.path.join('..', path[:-1])

This code is part of an online course on machine learning and I have been stuck at this point for 3 hours now. Any help would be really appreciated.

(P.S. This is not a homework question and there are no grades attached to it; it's a free online course.)

Amit Singh Parihar

4 Answers

1

Your test data is not there, so the script cannot find it. Run the start-up code again and make sure all the necessary maildir folders are present.
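
To confirm this, it helps to print where the relative path actually points. The sketch below is illustrative only; the helper name resolve is made up here and is not part of the course code. In the question's code, path[:-1] only drops the trailing newline from each line of from_sara.txt, and '..' is resolved against the current working directory, so the absolute path shows exactly which folder Python looked in.

```python
import os

def resolve(line, base=".."):
    """Return (absolute_path, exists) for one line read from from_sara.txt."""
    # Like path[:-1], but also safe when the last line has no newline
    relative = os.path.join(base, line.rstrip("\n"))
    # abspath resolves ".." against the current working directory
    absolute = os.path.abspath(relative)
    return absolute, os.path.exists(absolute)

absolute, exists = resolve("maildir/bailey-s/deleted_items/101.\n")
print("looking for: %s" % absolute)
print("exists: %s" % exists)
```

If the second line prints False, the maildir folder is not where the script expects it, either because it was never downloaded or because it sits in a different directory.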

Alice Sung
1

Go to tools inside your Udacity project directory and run startup.py. The download is about 400 MB, so sit back and relax!

Akash Jain
0

I know this is extremely late, but I found this post after having the exact same problem.

All the answers I found here and on other sites, even the issue reports on the original GitHub repo, just said "run startup.py". I had already done that, but it was still telling me:

Traceback (most recent call last):

  File "K:\documents\Udacity\Mini-Projects\ud120-projects\text_learning\vectorize_text.py", line 48, in <module>
    email = open(path, "r")

FileNotFoundError: [Errno 2] No such file or directory: '..\\maildir/bailey-s/deleted_items/101.'

Just like yours. I then found where this file was located, and it was indeed on my computer.

I added 'tools' to the os.path.join() line, as you can see here:

for name, from_person in [("sara", from_sara), ("chris", from_chris)]:
    for path in from_person:
        ### only look at first 200 emails when developing
        ### once everything is working, remove this line to run over full dataset
        temp_counter += 1
        if temp_counter < 200:
            #path = os.path.join('..', path[:-1])  <---original
            path = os.path.join('..','tools', path[:-1])
            print(path)
            email = open(path, "r")

This finally worked for me, so I hope it helps anyone else who stumbles on this problem in the future.

Also, I noticed that in some other repos of the lessons, the 'tools' folder is named 'utils'.

Here is an example: a repo that someone tweaked to run the lessons in Jupyter notebooks. So use whichever name your copy has.
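
Since the folder holding maildir can differ between copies of the project, one option is to try several base directories instead of hard-coding one. This is only a sketch: find_email and the list of candidate folders are assumptions for illustration, so adjust the list to match your own checkout.

```python
import os

# Candidate locations of maildir relative to text_learning/ (assumed layout)
CANDIDATE_BASES = (
    "..",
    os.path.join("..", "tools"),
    os.path.join("..", "utils"),
)

def find_email(line, bases=CANDIDATE_BASES):
    """Return the first existing file path for this line, or None."""
    relative = line.rstrip("\n")  # drop the newline read from the list file
    for base in bases:
        candidate = os.path.join(base, relative)
        if os.path.isfile(candidate):
            return candidate
    return None

path = find_email("maildir/bailey-s/deleted_items/101.\n")
if path is None:
    print("email not found under any candidate base; is maildir downloaded?")
else:
    print("found: %s" % path)
```

Returning None instead of raising makes it easy to print a clear message before open() fails with a less helpful error.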

0
  • In your Udacity course folder, go to the tools directory and check that the maildir folder is present and has subfolders in it. If it does, go back to text_learning/vectorize_text.py, find the line path = os.path.join('..', path[:-1]) and change it to path = os.path.join('../tools/', path[:-1]).

  • In a terminal, cd text_learning, then run python vectorize_text.py. This should solve the issue.

  • If it does not, go to tools inside your Udacity project directory and run startup.py. Wait until the process is complete.

  • Repeat step 1.
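
The check in the first step can also be done from Python. A minimal sketch, assuming the tools/maildir layout described above; maildir_ready is a hypothetical helper name, not something the course provides.

```python
import os

def maildir_ready(tools_dir):
    """True if tools_dir/maildir exists and contains at least one subfolder."""
    maildir = os.path.join(tools_dir, "maildir")
    if not os.path.isdir(maildir):
        return False
    return any(
        os.path.isdir(os.path.join(maildir, name))
        for name in os.listdir(maildir)
    )

# Run from inside text_learning/, so tools is one level up
if maildir_ready(os.path.join("..", "tools")):
    print("maildir found; point os.path.join at ../tools/")
else:
    print("maildir missing or empty; run startup.py in tools first")
```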