Python Lookup - Mapping Dynamic Ranges (fast)

Question

This is an extension to a question I posted earlier: Python Sum lookup dynamic array table with df column

I'm currently investigating a way to efficiently map a decision variable to a dataframe. The main DF and the lookup table will be dynamic in length (+15,000 lines and +20 lines, respectively). Thus was hoping not to do this with a loop, but happy to hear suggestions.

The DF (DF1) will mostly look like the following, where I would like to lookup/search for the decision.

Where the decision value is found on a separate DF (DF0).

For Example: the first DF1["ValuesWhereXYcomefrom"] value is 6.915 which is between 3.8>=(value)>7.4 on the key table and thus the corresponding value DF0["Decision"] is -1. The process then repeats until every line is mapped to a decision.

I was thinking to use the python bisect library, but have not prevailed to any working solution and also working with a loop. Now I'm wondering if I am looking at the problem incorrectly as mapping and looping 15k lines is time consuming.

Example Main Data (DF1):

time	Value0	Value1	Value2	ValuesWhereXYcomefrom	Value_toSum	Decision Map
1	41.43	6.579482077			0.00531021
2	41.650002	6.756817908	46.72466411	6.915187703	0.001200456	-1
3	41.700001	6.221966706	11.64727001	1.871959552	0.000959257	-1
4	41.740002	6.230847055	46.92753343	7.531485368	0.006228989	1
5	42	6.637399856	8.031374656	1.210018204	0.010238095	-1
6	42.43	7.484894608	16.24547568	2.170434793	-0.007777563	-1
7	42.099998	7.595291765	38.73871244	5.100358702	0.003562993	-1
8	42.25	7.567457423	37.07538953	4.899319211	0.01088755	-1
9	42.709999	8.234795546	64.27986403	7.805884636	0.005151042	1
10	42.93	8.369526407	24.72700129	2.954408659	-0.003028209	-1
11	42.799999	8.146653099	61.52243361	7.55186613	0	1

Example KeyTable (DF0):

ValueX	ValueY	SUM	Decision
0.203627201	3.803627201	0.040294925	-1
3.803627201	7.403627201	0.031630668	-1
7.403627201	11.0036272	0.011841521	1

score 0 · Answer 1 · answered Apr 05 '21 at 07:14

Here's how I would go about this, assuming your first DataFrame is called df and your second is decision:

def map_func(x):
    for i in range(len(decision)):
        try:
            if x < decision["ValueY"].iloc[i]:
                return decision["Decision"].iloc[i]
        except Exception:
            return np.nan
df["decision"] = df["ValuesWhereXYcomefrom"].apply(lambda x: map_func(x))

This will create a new row in your DataFrame called "decision" that contains the looked up value. You can then just query it:

df.decision.iloc[row]

Python Lookup - Mapping Dynamic Ranges (fast)

1 Answers1