
I am learning analytics with some friends, and I was recently given a problem that I am struggling with a lot. One of them gave me a large VB.net script (around 17,000 lines; he works with this language), and I am supposed to pair each sub with the hashtags used inside it.

A sample of the code is presented below:

Sub NewEPU(bLog)

arrList= Glo_arrMR_E_Base
arrList= Filter(arrList,"E#[None]",FALSE,0)
arrList2= Glo_arrMR_A_Base
'deactivated: incorrect  SB21042016'
'arrList2= Filter(arrList2,"A#[Owned]",FALSE,0)
'arrList2= Filter(arrList2,"A#[Outstanding]",FALSE,0)

For each strMR_E in arrList
    For each strMR_A in arrList2

            HS.NoInput "E#" & strMR_E & ".A#" & strMR_A & ".V#[None]"
    Next
Next
End Sub

So basically, my new code should go through this sub (NewEPU) and report that it contains E#, A#, and V#. The pseudo-script I had in mind was:

  <read files> 
  <search for '*#'>
     <If found>
          <Search 'sub' before & read name>
          <Search 'sub' after & read name> 
     < If not found> 
          <Do nothing> 

I was thinking of using Python, but NLTK splits the subs apart and does not help me build the logic above. Does anyone know how to solve this? Is there perhaps a better tool or a better language for it?

  • The code you showed isn't .NET VB. It may be VBScript or it may be VBA. (The giveaway is the lack of parens around the arguments to the sub in the middle of the nested loops.) – Craig Aug 13 '19 at 12:53
  • Sorry about that. I thought it was vb.net because he works with it. Either way, the problem is still the same: read a text file and return pairs. – Rafael Castelo Branco Aug 13 '19 at 13:01

1 Answer


I found a solution to the problem.

First, I collect the line numbers of the closing statements with enumerate (script is the list of lines read from the file): End = [i for i, s in enumerate(script) if 'End Sub' in s]

Next, I find the lines that open a sub by looking for the word 'Sub' in the split line, since an opening line has 'Sub' followed by the sub's name. Note that 'End Sub'.split() also contains 'Sub', so the closing lines have to be excluded explicitly: id_Sub = [i for i, s in enumerate(script) if 'Sub' in s.split() and 'End Sub' not in s]
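The two index lists can then be paired into one range per sub. Below is a minimal sketch of that step, not the exact code I ran: the function name find_sub_ranges and the sample lines are made up for illustration, and it assumes subs are never nested and that every opener has a matching 'End Sub'. It also skips comment lines (leading apostrophe) and lines such as 'Exit Sub', where 'Sub' is not followed by a name.

```python
def find_sub_ranges(script):
    """Return (name, start, end) line-index triples for each Sub in `script`."""

    def is_end(line):
        # A closing line starts with the two tokens 'End' and 'Sub'.
        return line.split()[:2] == ["End", "Sub"]

    def is_start(line):
        tokens = line.split()
        # Ignore comment lines and closing lines.
        if line.lstrip().startswith("'") or is_end(line):
            return False
        # 'Sub' must be followed by at least one more token (the name).
        return "Sub" in tokens[:-1]

    ends = [i for i, s in enumerate(script) if is_end(s)]
    starts = [i for i, s in enumerate(script) if is_start(s)]

    ranges = []
    # Assumption: subs are not nested, so the n-th opener matches the n-th closer.
    for start, end in zip(starts, ends):
        tokens = script[start].split()
        name = tokens[tokens.index("Sub") + 1].split("(")[0]
        ranges.append((name, start, end))
    return ranges


# Hypothetical sample input, one string per line of the script.
sample = [
    "Sub NewEPU(bLog)",
    '    HS.NoInput "E#" & strMR_E',
    "End Sub",
    "",
    "Sub Other()",
    "    Exit Sub",
    "End Sub",
]
print(find_sub_ranges(sample))
```

Each triple gives the sub name and the first and last line index of its body, which is what the later comparison against the # lines needs.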

From there I search for # and retrieve the line numbers where it occurs. After that, it is a simple comparison on a DataFrame: each # line belongs to the sub whose start and end line numbers enclose it.
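For anyone who wants the whole thing in one pass, here is a rough sketch that swaps the line-number comparison for a simple "current sub" variable. This is a variant, not the DataFrame code I used: the function name tags_per_sub and the regular expression are my own assumptions, a tag is taken to be any run of word characters directly before a #, and whole-line comments (leading apostrophe) are skipped.

```python
import re

# Assumption: a tag is one or more word characters directly followed by '#'.
TAG = re.compile(r"\b(\w+)#")


def tags_per_sub(script):
    """Map each Sub name to the distinct tags (e.g. 'E#') found in its body."""
    result = {}
    current = None  # name of the sub we are currently inside, if any
    for line in script:
        stripped = line.strip()
        if stripped.startswith("'"):
            continue  # skip whole-line comments
        tokens = stripped.split()
        if tokens[:2] == ["End", "Sub"]:
            current = None
        elif "Sub" in tokens[:-1]:
            # Opening line: the token after 'Sub' is the name, up to '('.
            current = tokens[tokens.index("Sub") + 1].split("(")[0]
            result[current] = []
        elif current is not None:
            for tag in TAG.findall(stripped):
                label = tag + "#"
                if label not in result[current]:
                    result[current].append(label)
    return result


# The sub from the question, one string per line.
sample = """Sub NewEPU(bLog)

arrList= Glo_arrMR_E_Base
arrList= Filter(arrList,"E#[None]",FALSE,0)
arrList2= Glo_arrMR_A_Base
'deactivated: incorrect  SB21042016'
'arrList2= Filter(arrList2,"A#[Owned]",FALSE,0)

For each strMR_E in arrList
    For each strMR_A in arrList2
            HS.NoInput "E#" & strMR_E & ".A#" & strMR_A & ".V#[None]"
    Next
Next
End Sub
""".splitlines()

print(tags_per_sub(sample))  # {'NewEPU': ['E#', 'A#', 'V#']}
```

The resulting dictionary can be flattened into (sub, tag) rows and loaded into a DataFrame if a tabular view is needed.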