0

This is an extension to a question I posted earlier: Python Sum lookup dynamic array table with df column

I'm currently investigating a way to efficiently map a decision variable to a dataframe. The main DF and the lookup table will be dynamic in length (+15,000 lines and +20 lines, respectively). Thus was hoping not to do this with a loop, but happy to hear suggestions.

The DF (DF1) will mostly look like the following, where I would like to lookup/search for the decision.

Where the decision value is found on a separate DF (DF0).

For Example: the first DF1["ValuesWhereXYcomefrom"] value is 6.915 which is between 3.8>=(value)>7.4 on the key table and thus the corresponding value DF0["Decision"] is -1. The process then repeats until every line is mapped to a decision.

I was thinking to use the python bisect library, but have not prevailed to any working solution and also working with a loop. Now I'm wondering if I am looking at the problem incorrectly as mapping and looping 15k lines is time consuming.

Example Main Data (DF1):

time Value0 Value1 Value2 ValuesWhereXYcomefrom Value_toSum Decision Map
1 41.43 6.579482077 0.00531021
2 41.650002 6.756817908 46.72466411 6.915187703 0.001200456 -1
3 41.700001 6.221966706 11.64727001 1.871959552 0.000959257 -1
4 41.740002 6.230847055 46.92753343 7.531485368 0.006228989 1
5 42 6.637399856 8.031374656 1.210018204 0.010238095 -1
6 42.43 7.484894608 16.24547568 2.170434793 -0.007777563 -1
7 42.099998 7.595291765 38.73871244 5.100358702 0.003562993 -1
8 42.25 7.567457423 37.07538953 4.899319211 0.01088755 -1
9 42.709999 8.234795546 64.27986403 7.805884636 0.005151042 1
10 42.93 8.369526407 24.72700129 2.954408659 -0.003028209 -1
11 42.799999 8.146653099 61.52243361 7.55186613 0 1

Example KeyTable (DF0):

ValueX ValueY SUM Decision
0.203627201 3.803627201 0.040294925 -1
3.803627201 7.403627201 0.031630668 -1
7.403627201 11.0036272 0.011841521 1

1 Answers1

0

Here's how I would go about this, assuming your first DataFrame is called df and your second is decision:

def map_func(x):
    for i in range(len(decision)):
        try:
            if x < decision["ValueY"].iloc[i]:
                return decision["Decision"].iloc[i]
        except Exception:
            return np.nan
df["decision"] = df["ValuesWhereXYcomefrom"].apply(lambda x: map_func(x))

This will create a new row in your DataFrame called "decision" that contains the looked up value. You can then just query it:

df.decision.iloc[row]