
I have a file that contains numeric as well as string/text data. I want all the numeric data to be stored in an array, ignoring the non-numeric data. If the file were purely numeric I could do this with numpy.loadtxt, but that is not the case. An example file is shown here:

BEGIN FILE

SECTION1-TEXTINFO
 ------------------------------------------------------
           2.768000     0.000001     0.000001
           0.000001     2.644491    -0.000018
           0.000001    -0.000018     2.572420
 ------------------------------------------------------
SECTION2
 ------------------------------------------------------
           2.768000     0.000001     0.000001
           0.000001     2.644491    -0.000018
           0.000001    -0.000018     2.572420
 ------------------------------------------------------
 SECTION3
 ------------------------------------------------------
           0.000343    -0.000000    -0.000000
          -0.000000     0.039522    -0.000000
          -0.000000    -0.000000     0.029825
 ------------------------------------------------------
END FILE

So at the end of the day, I want to store all numeric data in a 9*3 array.

Thank you very much for your help in advance

eyllanesc
  • Can you show the code you have tried? – eyllanesc Dec 14 '16 at 15:37
  • Well, until now I was copying the specific lines containing numeric data to a temporary file and then using numpy.loadtxt("temp-file") – Anand Chandra Dec 14 '16 at 15:40
  • For example: lines = open('POSCAR').readlines(); open('POSCAR-temp', 'w').writelines(lines[2:5]); open('POSCAR-temp', 'a').writelines(lines[8:]); data = np.loadtxt("POSCAR-temp") – Anand Chandra Dec 14 '16 at 15:40
  • Sorry about the messy comment. It's my first time posting on this website. Basically, I am looking for a way to remove all non-numeric data from a file and store the remaining numeric data in an array. – Anand Chandra Dec 14 '16 at 15:42
  • There was a recent question about reading blocks of a file. http://stackoverflow.com/q/41091659/901925 – hpaulj Dec 14 '16 at 15:42

1 Answer

import numpy as np


def isFloat(element):
    """Return True if element can be parsed as a float."""
    try:
        float(element)
        return True
    except ValueError:
        return False

with open('data.txt', 'r') as infile:
    matrix_numpy = []
    matrix = []
    for line in infile:
        # Keep only the tokens on this line that parse as floats.
        data = [float(s) for s in line.split() if isFloat(s)]
        if data:
            matrix.append(data)
    # Group the numeric rows into consecutive 3x3 blocks.
    for i in range(len(matrix) // 3):
        matrix_numpy.append(np.asarray(matrix[3*i:3*i+3]))

    print(matrix_numpy)

Input:

BEGIN FILE

SECTION1-TEXTINFO
 ------------------------------------------------------
           2.768000     0.000001     0.000001
           0.000001     2.644491    -0.000018
           0.000001    -0.000018     2.572420
 ------------------------------------------------------
SECTION2
 ------------------------------------------------------
           2.768000     0.000001     0.000001
           0.000001     2.644491    -0.000018
           0.000001    -0.000018     2.572420
 ------------------------------------------------------
 SECTION3
 ------------------------------------------------------
           0.000343    -0.000000    -0.000000
          -0.000000     0.039522    -0.000000
          -0.000000    -0.000000     0.029825
 ------------------------------------------------------
END FILE

Output:

[array([[  2.76800000e+00,   1.00000000e-06,   1.00000000e-06],
       [  1.00000000e-06,   2.64449100e+00,  -1.80000000e-05],
       [  1.00000000e-06,  -1.80000000e-05,   2.57242000e+00]]), 
 array([[  2.76800000e+00,   1.00000000e-06,   1.00000000e-06],
       [  1.00000000e-06,   2.64449100e+00,  -1.80000000e-05],
       [  1.00000000e-06,  -1.80000000e-05,   2.57242000e+00]]),    
 array([[ 0.000343, -0.      , -0.      ],
       [-0.      ,  0.039522, -0.      ],
       [-0.      , -0.      ,  0.029825]])]
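
The question asks for a single 9×3 array rather than a list of 3×3 blocks. Here is a minimal sketch of one way to get that. It assumes every data row has exactly three whitespace-separated numbers. The helper name parse_numeric_rows and its ncols parameter are made up for this illustration and do not come from NumPy.

```python
import numpy as np


def parse_numeric_rows(lines, ncols=3):
    """Collect lines made of exactly `ncols` float tokens into one 2-D array.

    Hypothetical helper for illustration; lines with a different number of
    tokens, or with any token that float() rejects, are skipped.
    """
    rows = []
    for line in lines:
        tokens = line.split()
        if len(tokens) != ncols:
            continue  # headers, separators, blank lines
        try:
            rows.append([float(t) for t in tokens])
        except ValueError:
            continue  # right token count, but not all numeric
    # reshape keeps the result 2-D even when no rows were found
    return np.array(rows, dtype=float).reshape(-1, ncols)


sample = """BEGIN FILE

SECTION1-TEXTINFO
 ------------------------------------------------------
           2.768000     0.000001     0.000001
           0.000001     2.644491    -0.000018
           0.000001    -0.000018     2.572420
 ------------------------------------------------------
SECTION2
 ------------------------------------------------------
           2.768000     0.000001     0.000001
           0.000001     2.644491    -0.000018
           0.000001    -0.000018     2.572420
 ------------------------------------------------------
 SECTION3
 ------------------------------------------------------
           0.000343    -0.000000    -0.000000
          -0.000000     0.039522    -0.000000
          -0.000000    -0.000000     0.029825
 ------------------------------------------------------
END FILE
"""

data = parse_numeric_rows(sample.splitlines())
print(data.shape)  # (9, 3)
```

To read from disk instead, pass the open file object, e.g. `with open('data.txt') as f: data = parse_numeric_rows(f)`. Note that float() also accepts tokens such as "nan" and "inf", so a text line consisting of three such words would be kept; add a stricter check if your files can contain them.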
eyllanesc