
I'm trying to analyse a data set to find how some variables correlate.

I need a loop that, on each pass, adds another logical test to the if condition.

Edit: take this data frame as an example:

In [11]: df                                                                                                                                                                          
Out[11]: 
      INPUT1     INPUT2    INPUT3  ... OUTPUT
0      8           5          6    ...    1        
1      3           2          5    ...    0
2      3           1          5    ...    1
3      1           2          5    ...    0
4      4           3          5    ...    0

I'm testing pairwise combinations of inputs to check how well they match the output:

import pandas as pd

def greater_than(a, b):
    return a > b

def greater_equal_than(a, b):
    return a >= b

def lower_equal_than(a, b):
    return a <= b

def lower_than(a, b):
    return a < b

def equal(a, b):
    return a == b

operation = {'>': greater_than, '>=': greater_equal_than, '<=': lower_equal_than, '<': lower_than}

# names holds the input column names, e.g. ['INPUT1', 'INPUT2', 'INPUT3', ...]
escenario = pd.DataFrame(columns=['esc', 'pf'])
for i in range(len(names)):
    for j in names[i+1:]:
        for op in operation:
            # compare the row values of the two columns, not the column names themselves
            escenario['esc'] = df.apply(lambda x: 1 if operation[op](x[names[i]], x[j]) else 0, axis=1)
            escenario['pf'] = df['OUTPUT']
            # a match is a row where both the output and the scenario are 1
            match = escenario.apply(lambda x: 1 if x['pf'] == 1 and x['pf'] == x['esc'] else 0, axis=1)
            percent_match = (100 * match.sum()) / escenario['pf'].sum()
            percent_no_match = (100 * (escenario['esc'].sum() - match.sum())) / escenario['esc'].sum()
            print(f"{names[i]} {op} {j} -> {percent_match} / {percent_no_match}")
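As a side note, the two percentages can probably be computed on whole columns instead of row by row. The following is only a sketch: the data frame contains just the three input columns visible in the example (the elided ones are left out), and `OPS` and `score` are hypothetical names that do not appear in the code above.

```python
import operator
import pandas as pd

# Only the columns visible in the example; the elided ones are omitted.
df = pd.DataFrame({
    "INPUT1": [8, 3, 3, 1, 4],
    "INPUT2": [5, 2, 1, 2, 3],
    "INPUT3": [6, 5, 5, 5, 5],
    "OUTPUT": [1, 0, 1, 0, 0],
})

# Hypothetical lookup table, equivalent to the `operation` dict.
OPS = {">": operator.gt, ">=": operator.ge, "<=": operator.le, "<": operator.lt}

def score(mask, output):
    """Return (percent_match, percent_no_match) for a boolean mask."""
    hits = (mask & (output == 1)).sum()      # rows where scenario and output are both 1
    selected = mask.sum()                    # rows where the scenario is 1
    positives = (output == 1).sum()          # rows where the output is 1
    # Assumption: report 0.0 instead of dividing by zero.
    percent_match = 100 * hits / positives if positives else 0.0
    percent_no_match = 100 * (selected - hits) / selected if selected else 0.0
    return percent_match, percent_no_match

mask = OPS["<"](df["INPUT2"], df["INPUT3"])  # element-wise comparison of two columns
result = score(mask, df["OUTPUT"])
print(result)
```

Returning 0.0 when a scenario selects no rows is a choice made for this sketch; the original loop would divide by zero in that case.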

I need to find the combination of input comparisons that keeps percent_match as close as possible to 100% and percent_no_match as close as possible to 0%.

For example:

First iteration:
INPUT2 < INPUT3

Second iteration:
INPUT2 < INPUT3 and INPUT1 > INPUT2

Right now I run the code, sort the printed results, pick the pair whose match is closest to 100, and then modify the code by hand to add that condition. Example:

On the first run the best result is INPUT2 < INPUT3.

Then I modify this line:

escenario['esc'] = df.apply(lambda x: 1 if operation[op](x[names[i]], x[j]) else 0, axis=1)

to add the first output, like:

escenario['esc'] = df.apply(lambda x: 1 if x['INPUT2'] < x['INPUT3'] and operation[op](x[names[i]], x[j]) else 0, axis=1)

and check again... This last part is the one I want to automate through a loop. Thanks
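One way this manual step might be automated is a greedy loop that keeps the conditions chosen so far as a boolean mask and, on each pass, tries every remaining comparison combined with that mask. This is a sketch under assumptions, not a definitive method: `greedy_search`, `score`, `OPS` and `max_terms` are hypothetical names, the data frame holds only the columns visible in the example, and ranking candidates by highest percent_match first and lowest percent_no_match second is just one possible reading of "closest to 100% / closest to 0%".

```python
import itertools
import operator
import pandas as pd

# Only the columns visible in the example; the elided ones are omitted.
df = pd.DataFrame({
    "INPUT1": [8, 3, 3, 1, 4],
    "INPUT2": [5, 2, 1, 2, 3],
    "INPUT3": [6, 5, 5, 5, 5],
    "OUTPUT": [1, 0, 1, 0, 0],
})

OPS = {">": operator.gt, ">=": operator.ge, "<=": operator.le, "<": operator.lt}

def score(mask, output):
    """Return (percent_match, percent_no_match) for a boolean mask."""
    hits = (mask & (output == 1)).sum()
    selected = mask.sum()
    positives = (output == 1).sum()
    percent_match = 100 * hits / positives if positives else 0.0
    percent_no_match = 100 * (selected - hits) / selected if selected else 0.0
    return percent_match, percent_no_match

def greedy_search(df, names, output_col="OUTPUT", max_terms=3):
    """Add one comparison per pass, keeping the best-scoring one each time."""
    output = df[output_col]
    combined = pd.Series(True, index=df.index)  # start with no restriction
    chosen, best_key, best_score = [], None, None
    for _ in range(max_terms):
        candidates = []
        for a, b in itertools.combinations(names, 2):
            for symbol, fn in OPS.items():
                term = f"{a} {symbol} {b}"
                if term in chosen:
                    continue
                mask = combined & fn(df[a], df[b])  # earlier terms AND the new one
                pm, pnm = score(mask, output)
                # Assumed ranking: higher match first, then lower no-match.
                candidates.append(((pm, -pnm), term, mask, (pm, pnm)))
        if not candidates:
            break
        key, term, mask, current = max(candidates, key=lambda c: c[0])
        if best_key is not None and key <= best_key:
            break  # adding another term no longer improves the score
        chosen.append(term)
        combined, best_key, best_score = mask, key, current
    return chosen, best_score

names = ["INPUT1", "INPUT2", "INPUT3"]
chosen, best = greedy_search(df, names)
print(" and ".join(chosen), "->", best)
```

A greedy search like this only looks one term ahead, so it can miss combinations where two individually weak comparisons work well together; an exhaustive search over pairs or triples of terms would cover those at a higher cost.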

josseossa
  • At this point, someone who knows nothing about your task, really still knows nothing. In order for someone to begin to understand we would need to see example inputs matched to example outputs AND what you have already tried – JonSG Jun 26 '21 at 19:08

1 Answer


I found a self-modifying Python script that fits my need perfectly.

It lets me create a function out of a text string, and that's exactly what I need.
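The script itself is not shown here. A possibly simpler alternative, which is not necessarily what that script does, is to keep the chosen conditions as strings and let pandas evaluate them with `DataFrame.eval`. The sketch below assumes the column names are valid Python identifiers and uses only the columns visible in the question's example; `terms` and `expression` are hypothetical names.

```python
import pandas as pd

# Only the columns visible in the question's example.
df = pd.DataFrame({
    "INPUT1": [8, 3, 3, 1, 4],
    "INPUT2": [5, 2, 1, 2, 3],
    "INPUT3": [6, 5, 5, 5, 5],
    "OUTPUT": [1, 0, 1, 0, 0],
})

# Conditions collected so far, stored as plain text.
terms = ["INPUT2 < INPUT3", "INPUT1 > INPUT2"]

# Build one expression string and evaluate it against the columns.
expression = " & ".join(f"({t})" for t in terms)
mask = df.eval(expression)  # boolean Series, one value per row

print(expression)
print(mask.tolist())
```

Appending a new string to `terms` and rebuilding `expression` plays the role of editing the lambda by hand.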

Thanks!

josseossa