0

I have character-delimited files, each containing several sets of data, with a header line marking the start of each set.

File

@Set 1  
0,1,2,3  
2,3,4,5  
.  
.  
@Set 2  
3,4,5,6  
4,5,6,7  
.  
.

I want to build an array from the data in each set, and I also need to know which set each array came from. I am using:

with open('File', 'r') as f:
    data = {}
    numbers = []
    for line in f:
        ln = line.strip()
        if '@Set' in ln:
            numbers = []        # start a new list for this set
            data[ln] = numbers  # rows appended below are stored under this header
        elif ln:
            numbers.append([float(n) for n in ln.split(',')])

I can see data['@Set 1'], but I am not able to select specific columns from it. I would like to use numpy.genfromtxt, because I need arrays whose columns I can access.

Niall Cosgrove
Cheesebread

2 Answers

1
with open('File', 'r') as f:
    data = {}
    numbers = []
    for line in f:
        ln = line.strip()
        if '@Set' in ln:
            numbers = []        # start a new list for this set
            data[ln] = numbers  # rows appended below are stored under this header
        elif ln:
            numbers.append([float(n) for n in ln.split(',')])

Each value in data should be a list of lists of floats. Passing each one through np.array should convert it to a 2d array.

import numpy as np

for k, v in data.items():
    data[k] = np.array(v)  # list of rows -> 2d array
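
As a rough end-to-end check of the conversion described above, here is a minimal sketch. It assumes the sample rows from the question, uses an in-memory io.StringIO in place of the real file, and assumes every data row follows a header line. The name arrays is only illustrative.

```python
import io
import numpy as np

# Stand-in for the real file; the rows are the sample data from the question.
sample = io.StringIO("@Set 1\n0,1,2,3\n2,3,4,5\n@Set 2\n3,4,5,6\n4,5,6,7\n")

data = {}
numbers = None
for line in sample:
    ln = line.strip()
    if ln.startswith('@Set'):
        numbers = []        # fresh list for this set
        data[ln] = numbers  # rows appended below land under this header
    elif ln:
        numbers.append([float(n) for n in ln.split(',')])

# Convert each list of rows to a 2d array, keyed by its header line.
arrays = {k: np.array(v) for k, v in data.items()}
print(arrays['@Set 1'].shape)  # (2, 4)
print(arrays['@Set 2'][:, 0])  # [3. 4.]
```

Building a new dict keeps the original lists available, which can help when checking that every set has rows of equal length.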

To use genfromtxt, take advantage of the fact that it works with any input that feeds it lines, such as a list of strings:

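with open('File', 'r') as f:
    data = {}
    numbers = []
    for line in f:
        ln = line.strip()
        if '@Set' in ln:
            numbers = []        # start a new list for this set
            data[ln] = numbers  # lines appended below are stored under this header
        elif ln:
            numbers.append(ln)  # keep the raw line; genfromtxt will parse it

for k, v in data.items():
    data[k] = np.genfromtxt(v, ...)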

There are other ways you could feed a set of lines to genfromtxt, but this was the simplest that I could write without glaring errors or the need for testing. In Python 3 you may have to open the file in 'rb' mode.
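
A self-contained sketch of the genfromtxt approach, under some assumptions: the sample rows from the question are held in a list instead of being read from the file, every data row follows a header line, and delimiter=',' is the only option passed. The names raw, current and sets are hypothetical.

```python
import numpy as np

# Sample content from the question, used in place of the real file.
lines = ["@Set 1", "0,1,2,3", "2,3,4,5", "", "@Set 2", "3,4,5,6", "4,5,6,7"]

raw = {}
current = None
for ln in lines:
    ln = ln.strip()
    if ln.startswith('@Set'):
        current = raw.setdefault(ln, [])  # list that collects this set's lines
    elif ln:
        current.append(ln)                # keep the unparsed text line

# genfromtxt accepts a list of strings, so each set is parsed on its own.
sets = {k: np.genfromtxt(v, delimiter=',') for k, v in raw.items()}
print(sets['@Set 2'][:, 1])  # [4. 5.]
```

Parsing each set separately keeps the header as the dictionary key, so the origin of every array stays known.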

I often test answers with code like:

import numpy as np

txt = b"""1.23,2,3
4.34,5,6
""".splitlines()
data = np.genfromtxt(txt, delimiter=',', dtype=None)
hpaulj
0

You cannot use np.genfromtxt directly on a file in this format. Once you have the list numbers, you can convert it into an np.array with:

import numpy as np
numbers_array = np.asarray(numbers)

so you can use specific columns as you want.
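
For illustration, column selection on the converted array might look like the sketch below. It assumes numbers holds the two sample rows of "@Set 1" from the question. The names first_col and two_cols are hypothetical.

```python
import numpy as np

# Rows of "@Set 1" from the question, as the parsing loop would collect them.
numbers = [[0.0, 1.0, 2.0, 3.0], [2.0, 3.0, 4.0, 5.0]]
numbers_array = np.asarray(numbers)

first_col = numbers_array[:, 0]      # column 0 as a 1-D array
two_cols = numbers_array[:, [1, 3]]  # columns 1 and 3 as a 2-D array
print(first_col)       # [0. 2.]
print(two_cols.shape)  # (2, 2)
```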

Francesco Nazzaro