I'm writing Python code that processes thousands of files, loads the data from each file into a data frame, and appends each data frame to a list. It then concatenates that list so that the end result is one matrix containing the data from all the data frames.

Here is the code to illustrate:

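import json
import os

import pandas as pd

books = []
for root, dirs, filenames in os.walk(folder_name):
    for f in filenames:
        # Skip macOS Finder metadata files
        if f == '.DS_Store':
            continue
        # Join with root, not folder_name, so files in subfolders resolve correctly
        fullpath = os.path.join(root, f)
        with open(fullpath, 'r') as book:
            # Build a {word: count} dictionary from the pairs stored in the file
            data = {u[0]: u[1] for u in json.load(book)}
        books.append(pd.DataFrame(data=[data], index=[f]))

# One row per file, one column per word; words absent from a file become 0
df = pd.concat(books, axis=0).fillna(0).sort_index()
M = df.to_numpy()  # as_matrix() was removed in pandas 1.0; to_numpy() needs pandas >= 0.24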

The processing part causes no issues; the for loop works perfectly. However, when I try to concatenate, the code keeps running for 20 minutes or so, and then the script stops with "exit code -9". Any idea what that could mean and/or how this could be fixed?

Any suggestions would be much appreciated!

tripleee
Lynn Bou Nassif
  • What does `data` look like? My suggestion is combine your dictionaries first, *then* build a dataframe from a single dictionary. There is significant overhead to concatenating a large number of dataframes. – jpp Mar 11 '18 at 11:42
  • Negative exit codes are only possible on Windows I think, and could reflect a platform issue. Please [edit] your question to identify the precise environment. – tripleee Mar 11 '18 at 12:11
  • @jpp data is basically a dictionary in which the keys are words and the values are the number of occurrences of that word. I see what you're saying, but the purpose of this code is to create a similarity matrix for every file, so combining the dictionaries first would defeat the purpose wouldn't it? (since doing so would mean that the similarity matrix would apply to one dictionary, which might as well be understood as a single file in this case) – Lynn Bou Nassif Mar 11 '18 at 12:44
  • @tripleee I'm working with iOS though... – Lynn Bou Nassif Mar 11 '18 at 12:46
  • Possible duplicate of [sudden exit with status of -9](https://stackoverflow.com/questions/18529452/sudden-exit-with-status-of-9) – Davis Herring Mar 11 '18 at 18:33
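
One way to read jpp's suggestion while keeping one row per file is to collect the per-file dictionaries in a nested dictionary keyed by file, and build the data frame once at the end. The sketch below is only an illustration under assumptions: the helper names `load_word_counts` and `build_count_matrix` are hypothetical, each file is assumed to hold a JSON list of `[word, count]` pairs, and rows are keyed by the path relative to `folder_name` (rather than the bare file name) so that files in different subfolders cannot collide.

```python
import json
import os

import pandas as pd


def load_word_counts(folder_name):
    """Return {relative file path: {word: count}} for every file under folder_name.

    Hypothetical helper: assumes each file contains a JSON list of [word, count] pairs.
    """
    counts_by_file = {}
    for root, _dirs, filenames in os.walk(folder_name):
        for f in filenames:
            # Skip macOS Finder metadata files
            if f == '.DS_Store':
                continue
            fullpath = os.path.join(root, f)
            with open(fullpath, 'r') as book:
                pairs = json.load(book)
            key = os.path.relpath(fullpath, folder_name)
            counts_by_file[key] = {u[0]: u[1] for u in pairs}
    return counts_by_file


def build_count_matrix(counts_by_file):
    """Build one data frame (rows = files, columns = words) in a single step."""
    # orient='index' turns each outer key into a row label
    df = pd.DataFrame.from_dict(counts_by_file, orient='index')
    # Words that do not occur in a file come out as NaN; treat them as 0
    return df.fillna(0).sort_index()
```

With these helpers, `df = build_count_matrix(load_word_counts(folder_name))` followed by `M = df.to_numpy()` would stand in for the `pd.concat` step. Each file still gets its own row, so a per-file similarity matrix remains possible. Note that this only avoids the overhead of concatenating thousands of one-row data frames; the final dense matrix is the same size, so if the vocabulary is very large the process may still run out of memory.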

0 Answers