Problem:
I have one folder (json_folder_large) that holds more than 200,000 JSON files, and another folder (json_folder_small) that holds 10,000 JSON files.
import os
lst_file = os.listdir("tmp/json_folder_large") # this raises an OSError
OSError: [Errno 5] Input/output error: 'tmp/json_folder_large'
I get an OSError when I call os.listdir on this directory path. I am sure the path itself is not the problem, because the same call works on the other folder without raising an OSError.
lst_file = os.listdir("tmp/json_folder_small") # no error with this
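To make the comparison between the two folders easier to reproduce, a small helper along these lines could be used. This is only a sketch: try_listing is a name made up for illustration, and it just reports the symbolic errno name instead of raising. Errno 5 corresponds to EIO on Linux, so that is what I would expect it to report for the large folder under Docker.

```python
import errno
import os


def try_listing(path):
    """Return (entry_count, None) on success, or (None, errno_name) on OSError.

    try_listing is a hypothetical helper, not part of any library.
    """
    try:
        # os.scandir streams entries instead of building one big list
        with os.scandir(path) as entries:
            count = sum(1 for _ in entries)
        return count, None
    except OSError as exc:
        # Map the numeric errno (e.g. 5) to its symbolic name (e.g. "EIO")
        return None, errno.errorcode.get(exc.errno, str(exc.errno))
```

Calling try_listing("tmp/json_folder_large") and try_listing("tmp/json_folder_small") from both interpreters would show side by side where the listing fails.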
Env:
The problem above occurs when a Docker image is used as the PyCharm interpreter.
When the interpreter is a conda env, there are no errors.
The only difference I can see is that under Docker > Preferences > Resources > Advanced, I set 4 CPUs (max is 6) and 32 GB of memory (max is 64).
What I tried (under Docker):
1. With pathlib
import pathlib
pathlib.Path('tmp/json_folder_large').iterdir() # this returns a generator <generator object Path.iterdir at 0x7fae4df499a8>
for x in pathlib.Path('tmp/json_folder_large').iterdir():
    print("hi")
    break
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/usr/local/lib/python3.7/pathlib.py", line 1074, in iterdir
    for name in self._accessor.listdir(self):
OSError: [Errno 5] Input/output error: 'tmp/json_folder_large'
2. With os.scandir
os.scandir("tmp/json_folder_large") # this returns an iterator <posix.ScandirIterator object at 0x7fae4c48f510>
for x in os.scandir("tmp/json_folder_large"):
    print("hi")
    break
Traceback (most recent call last):
  File "<input>", line 1, in <module>
OSError: [Errno 5] Input/output error: 'tmp/json_folder_large'
3. Connect the PyCharm terminal to the Docker container, then run ls
docker exec -it 21aa095da3b0 bash
cd json_folder_large
ls
Then I get an error (when the terminal is not connected to the Docker container, the same commands raise no error):
ls: reading directory '.': Input/output error
Questions:
- Is this really caused by a memory issue?
- Is it possible to solve this error while keeping all the files in the same directory? (I see that we could split the files into different directories.)
- Why does my code raise the error under Docker but not under the conda env?
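For reference, the splitting workaround mentioned in the second question could look roughly like the sketch below. It is an illustration under assumptions, not a confirmed fix: shard_directory, the destination layout, and the bucket count of 100 are all made up here, and it would have to be run from an environment where listing the folder still works (the conda env in my case).

```python
import os
import shutil
import zlib


def shard_directory(src_dir, dst_dir, num_buckets=100):
    """Move every regular file in src_dir into dst_dir/<bucket>/.

    Hypothetical helper: the bucket is derived from a CRC32 of the file
    name, so the same name always lands in the same subdirectory.
    Returns the number of files moved.
    """
    # Collect the names first so the directory is not modified while iterating
    with os.scandir(src_dir) as entries:
        names = [entry.name for entry in entries if entry.is_file()]

    moved = 0
    for name in names:
        bucket = zlib.crc32(name.encode("utf-8")) % num_buckets
        bucket_dir = os.path.join(dst_dir, f"{bucket:03d}")
        os.makedirs(bucket_dir, exist_ok=True)
        shutil.move(os.path.join(src_dir, name), os.path.join(bucket_dir, name))
        moved += 1
    return moved
```

With something like shard_directory("tmp/json_folder_large", "tmp/json_folder_sharded"), 200,000 files would end up at roughly 2,000 per subdirectory, but I would prefer to avoid this if the single directory can be made to work.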
Thanks in advance.