I have a wide table in a format as follows (for up to 10 people):
person1_status | person2_status | person3_status | person1_type | person2_type | person3_type
0 | 1 | 0 | 7 | 4 | 6
Here status is either 0 or 1 (the first 3 columns).
Type is a number from 4 to 7 (the last 3 columns). Each type corresponds to a value in a separate lookup table:
Type | Value
4 | 10
5 | 20
6 | 30
7 | 40
I need to calculate two columns, 'A', and 'B', where:
- A is the sum of values of each person's type (in that row) where status = 0.
- B is the sum of values of each person's type (in that row) where status = 1.
For example, the resulting columns 'A', and 'B' would be as follows:
A | B
70 | 10
An explanation of this:
'A' has the value 70 because person1 and person3 have status 0 and have types 7 and 6 (which correspond to values 40 and 30).
Similarly, 'B' has the value 10 because only person2 has status 1 and their type is 4 (which corresponds to a value of 10).
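To make the example reproducible, here is a minimal sketch of the calculation described above. It assumes the columns are named personN_status and personN_type and that the lookup table is held as a pandas Series indexed by type. The names df, type_to_value, and n_people are only illustrative, and this is one possible approach rather than the definitive one:

```python
import pandas as pd

# Sample row from the question (column naming is an assumption).
df = pd.DataFrame({
    "person1_status": [0], "person2_status": [1], "person3_status": [0],
    "person1_type": [7], "person2_type": [4], "person3_type": [6],
})

# Lookup table from the question, stored as a Series indexed by type.
type_to_value = pd.Series({4: 10, 5: 20, 6: 30, 7: 40})

n_people = 3  # would be up to 10 for the real table
status_cols = [f"person{i}_status" for i in range(1, n_people + 1)]
type_cols = [f"person{i}_type" for i in range(1, n_people + 1)]

status = df[status_cols].to_numpy()
types = df[type_cols].to_numpy()

# Translate every type to its value in one pass, then restore the 2-D shape.
values = type_to_value.reindex(types.ravel()).to_numpy().reshape(types.shape)

# Zero out the values whose status does not match, then sum across each row.
df["A"] = (values * (status == 0)).sum(axis=1)
df["B"] = (values * (status == 1)).sum(axis=1)
```

For the sample row this gives A = 70 and B = 10, matching the expected output. The status and type column lists must be in the same person order so that the boolean mask lines up with the values.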
This is probably a simple question, but how do I do this in a vectorized way? I'd like to avoid a for loop, since it would be less efficient.
I hope that makes sense. Could anyone help me? I've been stuck trying to figure this out.
For simpler calculated columns I was getting away with just np.where, but I'm stuck here because I need to sum values across multiple columns under certain conditions while pulling those values in from a separate table.