0

I have a log file from a mathematical simulation. I tried to parse it in Python, but I am not satisfied with the result. Is there an "elegant" way to loop over the lines and filter them, keeping only the lines with physical values and discarding the rest?

The goal is to perform various analyses using numpy. Since the lines I need contain only numerical values, is there a way to "tell" Python to keep only the lines with numerical values and discard every line that contains a string? Thank you for your help. A sample of the log file follows.

 5 Host 1 -- hnode146 -- Ranks 20-39
 6 Host 2 -- hnode147 -- Ranks 40-59
 7 Host 3 -- hnode148 -- Ranks 60-79
 8 Process rank 0 hnode145 36210
 9 Total number of processes : 80
10
11 STAR-CCM+ 12.02.011 (linux-x86_64-2.5/gnu4.8-r8)
12 License build date: 10 February 2015
13 This version of the code requires license version 2017.02 or greater.
14 Checking license file:
15 Checking license file:
16 Unable to list features for license file
17 1 copy of ccmppower checked out from
18 Feature ccmppower expires in
19 Thu Apr 19 17:22:54 2018
20
21 Server::start -host h
22 Loading object database:
23 Loading module: StarMeshing
24 Loading module: MeshingSurfaceRepair
25 Loading module: CadModeler
26 Started Parasolid modeler version 29.01.131
27 Loading module: StarResurfacer
28 Loading module: StarTrimmer
29 Loading module: SegregatedFlowModel
30 Loading module: KwTurbModel
31 Loading module: StarDualMesher
32 Loading module: StarBodyFittedMesher
33 Simulation database saved by:
34   STAR-CCM+ 12.02.011 (linux-x86_64-2.5/gnu4.8-r8) Fri Mar 10 20:03:37 UTC 2017 Serial
35 Loading into:
36   STAR-CCM+ 12.02.011 (linux-x86_64-2.5/gnu4.8-r8) Fri Mar 10 20:03:37 UTC 2017 Np=80
37 Object database load completed.

39 A Zeit und Datum : 2018.04.19 at 17:23:11
40
41 Startzeit: 1524151391534
42
43 Loading/configuring connectivity (old|new partitions: 1|80)
44   Domain (index 1): 1889922 cells, 5614862 faces, 1990686 verts.
45 Configuring finished
46 Reading material property database "/sw/apps/cd-adapco/12.02.011-R8/STAR-CCM+12.02.011-R8/star/props.mdb"...
47 Re-partitioning
48      Iteration     Continuity     X-momentum     Y-momentum     Z-momentum            Tke            Sdr Shear+Pressure (N)   Pressure (N)      Shear (N)
49           2001   1.076589e-01   9.570364e-01   2.588931e-01   1.984590e-01   4.028215e-03   3.964344e+01      -6.468809e+00  -1.253867e+00  -5.214942e+00
50           2002   5.987195e-02   4.004615e-01   2.597862e-01   1.808196e-01   2.819456e-03   2.537490e+01      -5.154729e+00  -1.228644e+00  -3.926085e+00
51           2003   4.824863e-02   2.048600e-01   1.359121e-01   1.103614e-01   1.384044e-03   1.623916e+01      -4.277053e+00  -1.216038e+00  -3.061015e+00
52           2004   3.684017e-02   1.322581e-01   1.350187e-01   8.827220e-02   9.023783e-04   1.039251e+01      -3.914011e+00  -1.213340e+00  -2.700671e+00
53           2005   3.224797e-02   1.093365e-01   1.059148e-01   7.461911e-02   6.307195e-04   6.650742e+00      -3.745949e+00  -1.217353e+00  -2.528596e+00
54           2006   2.788050e-02   9.180507e-02   8.311817e-02   6.417279e-02   4.603072e-04   4.256107e+00      -3.658613e+00  -1.224046e+00  -2.434567e+00
55           2007   2.332397e-02   7.688239e-02   6.222694e-02   4.860232e-02   3.534658e-04   2.723686e+00      -3.608431e+00  -1.231574e+00  -2.376857e+00
56           2008   1.916130e-02   6.201947e-02   4.645780e-02   3.654489e-02   2.833177e-04   1.743055e+00      -3.575486e+00  -1.237352e+00  -2.338134e+00
57           2009   1.600865e-02   4.780234e-02   3.909247e-02   2.959689e-02   2.370245e-04   1.115506e+00      -3.548365e+00  -1.240938e+00  -2.307427e+00
58           2010   1.389765e-02   3.570659e-02   3.492423e-02   2.537285e-02   2.055279e-04   7.138997e-01      -3.527530e+00  -1.242749e+00  -2.284781e+00
59      Iteration     Continuity     X-momentum     Y-momentum     Z-momentum            Tke            Sdr Shear+Pressure (N)   Pressure (N)      Shear (N)
60           2011   1.253570e-02   2.591702e-02   3.089287e-02   2.209728e-02   1.814997e-04   4.568718e-01      -3.511034e+00  -1.242906e+00  -2.268128e+00
61           2012   1.141436e-02   1.992464e-02   2.745902e-02   1.922942e-02   1.636478e-04   2.923702e-01      -3.498876e+00  -1.243006e+00  -2.255870e+00
62           2013   1.024511e-02   1.621655e-02   2.544053e-02   1.687660e-02   1.492828e-04   1.870937e-01      -3.489288e+00  -1.242425e+00  -2.246863e+00
63           2014   9.067693e-03   1.359007e-02   2.320886e-02   1.481687e-02   1.371763e-04   1.197299e-01      -3.482323e+00  -1.242027e+00  -2.240295e+00
64           2015   7.906450e-03   1.159567e-02   2.073906e-02   1.306014e-02   1.265825e-04   7.662597e-02      -3.479134e+00  -1.243537e+00  -2.235597e+00
65           2016   6.889290e-03   1.010569e-02   1.787383e-02   1.258395e-02   1.171344e-04   4.903984e-02      -3.479042e+00  -1.246677e+00  -2.232364e+00
66           2017   5.982303e-03   8.872579e-03   1.576665e-02   1.141871e-02   1.086443e-04   3.138620e-02      -3.480301e+00  -1.249988e+00  -2.230313e+00
67           2018   5.191895e-03   7.958489e-03   1.446382e-02   9.796685e-03   1.009937e-04   2.009149e-02      -3.482459e+00  -1.253255e+00  -2.229204e+00
68           2019   4.614927e-03   7.193031e-03   1.279295e-02   8.818100e-03   9.411761e-05   1.286594e-02      -3.484886e+00  -1.256002e+00  -2.228885e+00
69           2020   4.159939e-03   6.571088e-03   1.146195e-02   7.756150e-03   8.794392e-05   8.241197e-03      -3.487597e+00  -1.258382e+00  -2.229214e+00
70      Iteration     Continuity     X-momentum     Y-momentum     Z-momentum            Tke            Sdr Shear+Pressure (N)   Pressure (N)      Shear (N)
71           2021   3.779168e-03   5.961164e-03   1.034847e-02   6.969454e-03   8.240903e-05   5.278791e-03      -3.490138e+00  -1.260061e+00  -2.230078e+00
72           2022   3.414811e-03   5.350398e-03   9.329119e-03   6.398522e-03   7.743586e-05   3.381806e-03      -3.491624e+00  -1.260241e+00  -2.231384e+00
martineau
zentafun

3 Answers

1

Read each line, split it on whitespace, and attempt to convert each field to a float. If any conversion fails, the line is discarded. There is certainly a way to do this with a regex, but this should work, off the top of my head.

lines_to_keep = []
# `f` is assumed to be an already-open text file, e.g. `with open(path) as f:`
for line in f:
    fields = line.split()
    try:
        # Raises ValueError if any field can't be converted to float
        [float(x) for x in fields]
    except ValueError:
        continue
    # An empty line converts without error, so skip it explicitly
    if fields:
        lines_to_keep.append(line)
Zachary Cross
  • You'd probably need to expand this to ignore 'blank' lines that only have a line number, if those exist in your file. – Zachary Cross May 16 '18 at 15:27
  • 1
    Seems like [The Simplest Thing That Could Possibly Work](http://wiki.c2.com/?DoTheSimplestThingThatCouldPossiblyWork) which is a "good thing" in my opinion—i.e. The fields are valid (in a Python syntax sense) if `float()` can convert them successfully. Easy to debug, too. – martineau May 16 '18 at 16:02
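Following up on the comment about lines that hold only a line number: one possible extension is to also require a fixed number of fields per row. The sketch below is only an illustration. The helper name `numeric_rows` and its `n_columns` parameter are made up for this example, and the shortened sample rows are taken from the log in the question.

```python
def numeric_rows(lines, n_columns):
    """Return rows (lists of floats) that have exactly n_columns numeric fields."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) != n_columns:
            continue  # drops blank lines and lines holding only a line number
        try:
            rows.append([float(x) for x in fields])
        except ValueError:
            continue  # at least one field is not numeric (e.g. the header row)
    return rows


# Shortened sample: a line-number-only line, a header, and two data rows.
sample = [
    "10\n",
    "48      Iteration     Continuity     X-momentum\n",
    "49           2001   1.076589e-01   9.570364e-01\n",
    "50           2002   5.987195e-02   4.004615e-01\n",
]
rows = numeric_rows(sample, n_columns=4)
print(len(rows))  # 2
```

Because every kept row has the same length, the result should be convertible to a 2-D array with `np.array(rows)` for the numpy analysis mentioned in the question.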
0

If you'd like a regex: this one matches runs of digits and spaces separated by numeric symbols such as `+`, `-`, `.` and `e`.

import re

# Accept lines made up only of digits, spaces and the symbols e . - +
# Note: this also accepts lines that contain nothing but a line number.
r = re.compile(r'([0-9 ]+[e.\-+]*)+\n')
with open('a.log') as log:
    lines = [line for line in log if r.fullmatch(line)]

# all the useful lines are ...
# 49           2001   1.076589e-01   9.570364e-01   2.588931e-01   1.984590e-01   4.028215e-03   3.964344e+01      -6.468809e+00  -1.253867e+00  -5.214942e+00
# 50           2002   5.987195e-02   4.004615e-01   2.597862e-01   1.808196e-01   2.819456e-03   2.537490e+01      -5.154729e+00  -1.228644e+00  -3.926085e+00
# 51           2003   4.824863e-02   2.048600e-01   1.359121e-01   1.103614e-01   1.384044e-03   1.623916e+01      -4.277053e+00  -1.216038e+00  -3.061015e+00
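The character-class pattern above also accepts strings that are not numbers, such as `1e-+.`. If that matters, a stricter variant can spell out what a single number looks like and require the whole line to be a whitespace-separated sequence of them. This is a sketch, not part of the original answer; the names `NUMBER`, `NUMERIC_LINE` and `is_numeric_line` are illustrative.

```python
import re

# One number: optional sign, digits, optional fraction, optional exponent.
NUMBER = r'[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?'
# A whole line: one or more numbers separated by whitespace.
NUMERIC_LINE = re.compile(rf'\s*{NUMBER}(?:\s+{NUMBER})*\s*')


def is_numeric_line(line):
    """True if the line consists only of whitespace-separated numbers."""
    return NUMERIC_LINE.fullmatch(line) is not None


print(is_numeric_line(" 49   2001   1.076589e-01  -6.468809e+00\n"))  # True
print(is_numeric_line(" 48      Iteration     Continuity\n"))         # False
print(is_numeric_line("1e-+.\n"))                                     # False
```

Like the original pattern, this still accepts a line that holds only a line number (for example `10`), so a check on the number of fields may be needed on top of it.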
Blownhither Ma
0
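import re

list_to_keep = []
# Heuristic: a data line starts with digits/spaces followed by one of e . - +
pattern = re.compile(r'[0-9 ]+[e.\-+][0-9]*', re.IGNORECASE)
# `f` is the path to the log file
with open(f) as logfile:
    for row in logfile:
        # match() only anchors at the start of the line, so this is a prefix test
        if pattern.match(row):
            list_to_keep.append(row)

You can use a regex to find the rows you want and keep them in a list.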

mad_