Apologies if this has been asked before; I could not find anything.

I am trying to define a function that takes an arbitrary number of .txt files that look like this:

[image: example of the input .txt files]

reads them, concatenates all their rows, and saves the result in a single NumPy array. This works for one .txt file. As soon as I use two files, I get array([nan, nan]) as the return value; with three files, array([nan, nan, nan]); and so on.

import numpy as np

def readInSpectra(*files):
    raw = np.genfromtxt(files[0], skip_header=0, delimiter='\t')
    for i in range(1, len(files)):
        raw_i = np.genfromtxt(files[i], skip_header=0, delimiter='\t')
        raw = np.vstack((raw, raw_i))
    return raw

files = ('./file1.txt', './file2.txt', './file3.txt')

test = readInSpectra(files)
Wulfram
  • Why are you unpacking your input tuple? Python will consider each element of your tuple as a different argument. – obchardon Sep 30 '20 at 13:21
  • just use `def readInSpectra(files)` instead of `def readInSpectra(*files)`. – obchardon Sep 30 '20 at 13:29
  • True, that works. I googled for "python function with unknown number of arguments" and found that I should pass a tuple as the argument and write `function(*argument)`. I am sure it is still very much "not the way to go" and inelegant, but it now does what I wanted. – Wulfram Sep 30 '20 at 13:35
  • @Wulfram You can still use your original version, just call it in another way (see my edit) – Timus Sep 30 '20 at 13:56
  • @Wulfram you do not have an unknown number of arguments; you have one argument with multiple elements in it, in this case one tuple of strings. – obchardon Sep 30 '20 at 14:11
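
The point made in these comments can be checked without NumPy. Below is a minimal sketch; `count_args` is a hypothetical helper introduced only for illustration, and the file names are the ones from the question. It shows that a function declared with `*args` receives a tuple as a single argument unless the caller unpacks it with `*`.

```python
def count_args(*args):
    """Return how many positional arguments were received."""
    return len(args)


files = ('./file1.txt', './file2.txt', './file3.txt')

# The whole tuple arrives as ONE argument: args == (files,)
n_packed = count_args(files)

# Unpacking at the call site: each path is its own argument.
n_unpacked = count_args(*files)

print(n_packed, n_unpacked)  # prints "1 3"
```

In the question's code, `files[0]` inside the function is therefore the whole tuple of names, not the first path.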

2 Answers

1

I'm not completely sure, but I think the repeated vstack may be a problem because the shape of the array changes on each iteration. Have you tried:

def readInSpectra(*files):
    
    stacking = tuple(np.genfromtxt(file, skip_header=0, delimiter='\t')
                     for file in files)

    return np.vstack(stacking)

EDIT: I think you should call the function this way:

test = readInSpectra(*files)

or

test = readInSpectra('./file1.txt', './file2.txt', './file3.txt')
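
To see the difference between the two call styles, here is a self-contained sketch rather than the asker's actual setup. It assumes small two-column, tab-separated files, which are made up here and written to a temporary directory; `read_in_spectra`, `a.txt`, and `b.txt` are illustrative names. When the tuple is passed without unpacking, `np.genfromtxt` receives a tuple of strings and treats each file name as a line of data, which cannot be parsed as a number and so becomes `nan`.

```python
import os
import tempfile

import numpy as np


def read_in_spectra(*files):
    """Stack the rows of every tab-separated file into one 2-D array."""
    return np.vstack([np.genfromtxt(f, delimiter='\t') for f in files])


# Hypothetical sample data, not the files from the question.
samples = (('a.txt', '1\t2\n3\t4\n'), ('b.txt', '5\t6\n'))

with tempfile.TemporaryDirectory() as tmpdir:
    paths = []
    for name, text in samples:
        path = os.path.join(tmpdir, name)
        with open(path, 'w') as fh:
            fh.write(text)
        paths.append(path)
    paths = tuple(paths)

    # Tuple passed as ONE argument: genfromtxt parses the file
    # names themselves as data lines, so every value is nan.
    wrong = read_in_spectra(paths)

    # Tuple unpacked: each path is opened and read as a file.
    right = read_in_spectra(*paths)

print(wrong)
print(right)
```

Note that a file with a single row comes back from `np.genfromtxt` as a 1-D array; `np.vstack` promotes it to one row, so the stacking still works.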
Timus
1

Both versions should work; I suggest using the second, as suggested by @obchardon.

import numpy as np


def readInSpectra_0(*files):
    # The tuple was passed as a single argument, so unwrap it first.
    files = files[0]
    raw = np.genfromtxt(files[0], skip_header=0, delimiter='\t')
    for i in range(1, len(files)):
        raw_i = np.genfromtxt(files[i], skip_header=0, delimiter='\t')
        raw = np.vstack((raw, raw_i))
    return raw

def readInSpectra_1(files):
    
    stacking = tuple(np.genfromtxt(file, skip_header=0, delimiter='\t')
                     for file in files)

    return np.vstack(stacking)

#files = ('file1.txt', 'file2.txt')
files = ('./file1.txt', './file2.txt', './file3.txt')


test = readInSpectra_1(files)
Jiadong