I'm writing a Python script that processes thousands of files, loads the data from each file into a data frame, and appends each data frame to a list. Afterwards, it concatenates this list so that the end result is one matrix containing the data from all the data frames.
Here is the code to illustrate:
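import json
import os

import pandas as pd

books = []
for root, dirs, filenames in os.walk(folder_name):
    for f in filenames:
        if f == '.DS_Store':
            continue
        # Join with root (not folder_name) so files in subfolders resolve
        fullpath = os.path.join(root, f)
        with open(fullpath, 'r') as book:
            # Each file holds a JSON list of [key, value] pairs
            data = {u[0]: u[1] for u in json.load(book)}
        books.append(pd.DataFrame(data=[data], index=[f]))
# One row per file; keys missing from a file become 0
df = pd.concat(books, axis=0).fillna(0).sort_index()
M = df.values  # as_matrix() was removed in pandas 1.0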
The processing part causes no problems; the for loop works perfectly. However, when I try to concatenate, the code keeps running for about 20 minutes and then the script stops with "exit code -9". Any idea what that could mean and/or how this could be fixed?
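For context on the exit code: on POSIX systems a negative exit status reported by Python's subprocess machinery usually means the process was terminated by a signal, so -9 would correspond to signal 9 (SIGKILL). Whether that is what happened here, and whether the kill came from the operating system running out of memory, is an assumption that the source does not confirm. A minimal sketch that reproduces the same status with a throwaway child process (POSIX only):

```python
import signal
import subprocess
import sys

# Child program that sends SIGKILL to itself, standing in for a process
# that is killed from outside (for example by an out-of-memory killer).
child_code = "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"

result = subprocess.run([sys.executable, "-c", child_code])

# On POSIX, subprocess reports "terminated by signal N" as returncode -N.
killed_by = signal.Signals(-result.returncode)
print(result.returncode, killed_by.name)
```

If the real script shows the same status, watching its memory use during the concatenation step would be a reasonable first check.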
Any suggestions would be much appreciated!
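One direction that might be worth trying, assuming the failure is caused by memory (an assumption, not something established above): instead of building thousands of one-row data frames and a dense matrix padded with zeros, store only the values that actually occur as (row, column, value) triplets. The sketch below uses only the standard library; `collect_triplets`, `iter_records`, and the demo data are hypothetical names and values introduced for illustration.

```python
import json
import os


def collect_triplets(records):
    """Turn (name, {key: value}) records into sparse-style triplets.

    Only entries present in a record are stored; absent keys are implied zeros.
    """
    names, vocab = [], {}
    rows, cols, vals = [], [], []
    for row, (name, data) in enumerate(records):
        names.append(name)
        for key, value in data.items():
            # Assign each new key the next free column index.
            col = vocab.setdefault(key, len(vocab))
            rows.append(row)
            cols.append(col)
            vals.append(value)
    return names, vocab, rows, cols, vals


def iter_records(folder_name):
    """Yield (filename, dict) pairs, reading one file at a time."""
    for root, dirs, filenames in os.walk(folder_name):
        for f in filenames:
            if f == '.DS_Store':
                continue
            with open(os.path.join(root, f), 'r') as book:
                yield f, {u[0]: u[1] for u in json.load(book)}


# Tiny in-memory demo with made-up data instead of real files.
demo = [("a.json", {"x": 1, "y": 2}), ("b.json", {"y": 5, "z": 7})]
names, vocab, rows, cols, vals = collect_triplets(demo)
print(vocab)
print(list(zip(rows, cols, vals)))
```

With real files, `collect_triplets(iter_records(folder_name))` would take the place of the loop and the concatenation. If SciPy is available, `scipy.sparse.coo_matrix((vals, (rows, cols)), shape=(len(names), len(vocab)))` builds a sparse matrix from these triplets, with missing entries treated as zero much like `fillna(0)`. Note that rows follow the order in which files are visited and columns follow first appearance, so nothing is sorted here.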