I hope I can get some help with this problem:

I have two text files of about 10,000 rows each (let's say File1 and File2) coming from a FEM analysis. The structure of the files is:

File1

        ....
     Element           Facet            Node  CNORMF.Magnitude     CNORMF.CNF1     CNORMF.CNF2     CNORMF.CNF3          CPRESS         CSHEAR1         CSHEAR2  CSHEARF.Magnitude    CSHEARF.CSF1    CSHEARF.CSF2    CSHEARF.CSF3

         881               3            6619              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
         881               3            6648              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
         881               3            6653              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
         930               3            6452              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
         930               3            6483              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
         930               3            6488              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        1244               2            7722              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        1244               2            7724              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        1244               2            7754              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        2380               2            3757     304.326E-06    -123.097E-06    -203.689E-06    -189.663E-06     564.697E-06    -281.448E-06     22.5357E-06     152.710E-06     144.843E-06    -26.7177E-06    -40.3387E-06
        2380               2            3826     226.603E-06    -85.9859E-06    -161.270E-06    -133.967E-06     270.594E-06    -134.865E-06     10.7988E-06     117.700E-06     116.217E-06    -4.67318E-06    -18.0298E-06
        2380               2            3848     10.4740E-03    -2.01174E-03    -6.63900E-03    -7.84743E-03     771.739E-06    -384.638E-06     30.7983E-06     5.24148E-03     5.12795E-03    -541.446E-06    -940.251E-06
        2894               2            8253              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        2894               2            8255              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        2894               2            8270              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3372               2            5920              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3372               2            5961     52.7705E-03     12.2948E-03    -40.8019E-03    -31.1251E-03     7.36309E-03    -2.56505E-03    -502.055E-06     18.8167E-03     17.9038E-03     2.12060E-03     5.38774E-03
        3372               2            5996              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3936               3            6782              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3936               3            6852              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3936               3            6857              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3937               4            6410              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3937               4            6452              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3937               4            6488              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3955               2            6940              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3955               2            6941              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        3955               2            6993              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        4024               2            8027              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.
        4024               2            8050              0.              0.              0.              0.              0.              0.              0.              0.              0.              0.              0. 
        ....

File2

        ....
        Node  COORD.Magnitude     COORD.COOR1     COORD.COOR2     COORD.COOR3     U.Magnitude            U.U1            U.U2            U.U3
           1         131.691         14.5010        -92.2190        -92.8868         1.93638     188.252E-03        -1.64949    -996.662E-03
           2         131.336         10.9038        -92.2281        -92.8663         1.93341     188.250E-03        -1.64672    -995.468E-03
           3         132.130         18.7534        -92.4681        -92.5002         1.93968     188.190E-03        -1.65258    -997.959E-03
           4         130.769         1.97638        -92.5186        -92.3953         1.92580     188.179E-03        -1.63965    -992.387E-03
           5         130.560        -4.04517        -93.1433        -91.3993         1.92030     188.026E-03        -1.63459    -990.122E-03
           6         132.422         24.0768        -93.9662        -90.1454         1.94282     187.819E-03        -1.65564    -999.062E-03
           7         130.377        -8.39503        -94.1640        -89.7827         1.91586     187.774E-03        -1.63054    -988.235E-03
           8         126.321         13.6556        -88.0641        -89.5278         1.93579     192.554E-03        -1.64736    -998.202E-03
           9         125.963         4.31065        -88.6558        -89.3771         1.92786     192.145E-03        -1.64012    -994.852E-03
          10         130.037         3.02359        -94.4877        -89.2894         1.92501     187.692E-03        -1.63909    -991.871E-03
          11         126.692         18.5888        -88.1164        -89.1107         1.93970     192.653E-03        -1.65097    -999.810E-03
          12         125.751        -1.96189        -89.1238        -88.6928         1.92231     192.010E-03        -1.63500    -992.572E-03
          13         125.719        -3.46723        -89.2798        -88.4437         1.92094     191.971E-03        -1.63373    -992.005E-03
          14         130.026         7.42596        -95.0372        -88.4289         1.92818     187.556E-03        -1.64210    -993.086E-03
          15         130.736         16.3557        -95.3755        -87.9092         1.93527     187.472E-03        -1.64873    -995.891E-03
          16         130.251        -12.8122        -95.5572        -87.5783         1.91105     187.430E-03        -1.62618    -986.163E-03
          17         130.250         12.8770        -95.6602        -87.4548         1.93216     187.401E-03        -1.64586    -994.616E-03
          18         125.609        -7.73838        -90.1949        -87.0785         1.91668     191.718E-03        -1.62985    -990.191E-03
          19         124.466        -6.21492        -88.8834        -86.9075         1.91827     192.783E-03        -1.63095    -991.270E-03
          20         126.958         23.9470        -89.5421        -86.7584         1.94289     192.337E-03        -1.65406        -1.00096
          21         121.210         6.64491        -84.7929        -86.3587         1.92993     196.112E-03        -1.64059    -997.316E-03
          22         121.369         12.5781        -84.3620        -86.3434         1.93495     196.450E-03        -1.64514    -999.468E-03 
        ....

I want to do the following steps:

  1. remove the first two columns from File1
  2. compare the node labels of the two files
  3. write an output text file in "rpt" format containing the rows that have the same node label, side by side

Here is the code I have used. It seems to work for small files, but for large files it takes a huge amount of time.

nodEl = open("P:/File1.rpt", "r")
uniNod = open("P:/File2.rpt", "r")

row_nodEl  = nodEl.readlines()
row_uniNod = uniNod.readlines()

nodEl.close()
uniNod.close()

output = open("P:/output.rpt", "w")

for index, line in enumerate(row_nodEl):
    if index > 23081 and index < 40572 and index != 23083 and index != 23084:
        temp  = line.strip()
        temp2 = " ".join(temp.split())
        var   = temp2.split(" ", 3)  # [element, facet, node, remaining columns]
        for index2, line2 in enumerate(row_uniNod):
            if index2 > 11412 and index2 < 21258 and index2 != 11414 and index2 != 11415:
                temp3 = line2.strip()  # read the File2 line, not the File1 line
                temp4 = " ".join(temp3.split())
                var2  = temp4.split(" ", 1)  # [node, remaining columns]
                if var[2] == var2[0]:
                    output.write("%s %s %s\n" % (var[2], var[3], var2[1]))

output.close()

Any suggestion is more than welcome!

drSlump

1 Answer

You are comparing each line of one file (with m lines) to each line of another file (with n lines). This gives a time complexity of O(m*n): two files of 10,000 lines each produce 100,000,000 comparisons.
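To make the growth concrete, here is a small sketch that counts the comparisons a nested loop performs. The function name and the label lists are made up for illustration; they are not taken from the question's files.

```python
def count_nested_comparisons(labels_1, labels_2):
    """Compare every label in labels_1 with every label in labels_2."""
    comparisons = 0
    matches = []
    for a in labels_1:
        for b in labels_2:
            comparisons += 1  # one equality test per (a, b) pair
            if a == b:
                matches.append(a)
    return comparisons, matches


# Hypothetical node labels: 200 in each list, 100 of them shared.
labels_1 = [str(n) for n in range(0, 200)]
labels_2 = [str(n) for n in range(100, 300)]

comparisons, matches = count_nested_comparisons(labels_1, labels_2)
print(comparisons)   # 200 * 200 = 40000
print(len(matches))  # 100
```

Doubling the length of both lists quadruples the number of comparisons, which is why the run time grows so quickly with file size.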

You could speed up your code by changing how you read the values. Consider reading each file into a dictionary instead of a list. Each key in the dictionary would be a node number and each value would be the complete line.

Using this approach, you could do the following:

  1. Load the first file into a dictionary
  2. Load the second file into a dictionary
  3. For each node from the first dictionary, find the corresponding node in the second dictionary

Using Python, it would look similar to this:

# load_file is a helper (not shown here) that returns a dictionary
# mapping each node label to its complete line
file_contents_1 = load_file("P:/File1.rpt")
file_contents_2 = load_file("P:/File2.rpt")

for node_label in file_contents_1:
    # Skip nodes that have no corresponding values in the second file
    if node_label not in file_contents_2:
        continue
    # Do something

The benefit of this approach is that you load the files separately, so the time complexity becomes linear, O(m+n). Looking up the corresponding node in the second file takes constant time on average, because dictionaries are implemented as hash tables.
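The `load_file` helper is not defined above, so here is one possible sketch of it, together with the merge step. Everything in it is an assumption about the data: columns are whitespace-separated, the node label is the third column in File1 and the first column in File2, and any line whose key column is not an integer (headers, `....`, blank lines) can be skipped. The names `parse_rows`, `load_file` and `merge_rows` and the sample rows are illustrative only.

```python
def parse_rows(lines, key_column):
    """Map node label -> list of the columns that follow the label.

    Lines whose key column is missing or not an integer are skipped.
    If a label occurs more than once, the last row wins.
    """
    rows = {}
    for line in lines:
        fields = line.split()
        if len(fields) <= key_column or not fields[key_column].isdigit():
            continue
        rows[fields[key_column]] = fields[key_column + 1:]
    return rows


def load_file(path, key_column):
    """Read a report file from disk into a dictionary keyed by node label."""
    with open(path) as handle:
        return parse_rows(handle, key_column)


def merge_rows(rows_1, rows_2):
    """Yield 'label values_1 values_2' for labels present in both dictionaries."""
    for label, values_1 in rows_1.items():
        values_2 = rows_2.get(label)
        if values_2 is None:
            continue  # no matching node in the second file
        yield " ".join([label] + values_1 + values_2)


# Made-up, shortened rows in the same layout as File1 and File2.
file_1 = ["Element Facet Node CPRESS", "881 3 6619 0.", "2380 2 3757 564.697E-06"]
file_2 = ["Node U.U1", "3757 188.252E-03", "1 188.250E-03"]

rows_1 = parse_rows(file_1, key_column=2)  # also drops Element and Facet
rows_2 = parse_rows(file_2, key_column=0)
merged = list(merge_rows(rows_1, rows_2))
print(merged)
```

With real files you would call `load_file("P:/File1.rpt", 2)` and `load_file("P:/File2.rpt", 0)` and write each merged line to the output file. Two caveats: a real report may contain other tables whose rows also start with integers, in which case you still need to restrict the line range as in the question, and in the File1 excerpt some nodes appear under more than one element (6452 and 6488, for example), so a plain dictionary would keep only the last of those rows.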

This should make your code a lot faster.

hgazibara
  • You might load the first file as a dictionary and loop through the second sequentially as well. Not sure how that would impact performance. – agentp Aug 04 '16 at 14:52
  • @agentp That would work, too. It would probably be less memory intensive. – hgazibara Aug 04 '16 at 18:49
  • Thanks for the answer. I have loaded both files as dictionaries and then compared them using your approach. Everything works great now. @agentp do you mean loading only one file as a dictionary and then comparing each row of the second text file against it? – drSlump Aug 08 '16 at 08:53
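
The single-dictionary variant suggested in the comments could look roughly like the sketch below: index one input by node label, then scan the other input once and look each row up. As before, the function name, the column positions (node label first in the indexed input, third in the streamed one) and the sample rows are assumptions for illustration.

```python
def join_streaming(lookup_lines, stream_lines, lookup_key=0, stream_key=2):
    """Index lookup_lines by node label, then scan stream_lines once."""
    lookup = {}
    for line in lookup_lines:
        fields = line.split()
        if len(fields) > lookup_key and fields[lookup_key].isdigit():
            lookup[fields[lookup_key]] = fields[lookup_key + 1:]

    for line in stream_lines:
        fields = line.split()
        if len(fields) <= stream_key or not fields[stream_key].isdigit():
            continue  # header, separator or blank line
        label = fields[stream_key]
        if label in lookup:
            # streamed row's values first, then the indexed row's values
            yield " ".join([label] + fields[stream_key + 1:] + lookup[label])


# Made-up, shortened rows: node 6452 belongs to two elements.
nodes = ["Node U.U1", "6452 0.5", "6488 0.7"]
facets = ["Element Facet Node CPRESS", "930 3 6452 0.", "3937 4 6452 0.", "3937 4 6410 0."]

result = list(join_streaming(nodes, facets))
print(result)
```

Only one file is held in memory, and because the File1-style rows are streamed rather than stored in a dictionary, a node that appears under several elements produces one output line per occurrence. Both inputs can be open file objects instead of lists.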