Questions tagged [boolean-indexing]
27 questions
0 votes
1 answer
Pandas dataframe data validation methods
We currently use a DataFrame to store information on data we've collected. Before submitting the data, we need to validate it against a list of rules. I'm trying to set up these validations in Python, and part of the problem is readability vs.…

Sea Bacon
0 votes
1 answer
Pandas Categorical Ignores Boolean Slicing? (remove "unused" Categories)
Oftentimes I have to convert even continuous data into a categorical datatype, since it helps my statistical analysis.
When I apply boolean indexing (values < 11) to categorical columns, they are not sliced as expected:
import matplotlib.pyplot as…
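A minimal sketch of the behavior described in this excerpt, using made-up data because the original example is truncated. Boolean indexing removes rows from a categorical Series but leaves the full category list in place; `Series.cat.remove_unused_categories()` is one way to trim it afterwards. The names `raw`, `values`, `subset`, and `trimmed` are illustrative assumptions.

```python
import pandas as pd

# Hypothetical data; the question's own example is cut off in the excerpt.
raw = pd.Series([3, 7, 10, 15, 20])
values = raw.astype("category")

# Boolean indexing keeps only the matching rows...
subset = values[raw < 11]

# ...but the categorical dtype still remembers every original category.
print(list(subset.cat.categories))

# Drop the categories that no longer occur in the data.
trimmed = subset.cat.remove_unused_categories()
print(list(trimmed.cat.categories))
```

The first print still lists 15 and 20 even though no remaining row carries those values, which is likely why plots and group counts look "unsliced".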

markur
0 votes
2 answers
Compare 2 DataFrames and drop rows that do not contain corresponding ID variables
I need to compare 2 DataFrames and drop rows in either that do not contain the corresponding IDs. As an example consider df1 and df2.
df1 = pd.DataFrame({'ID':[1,2,3,4],
'Food':['Ham','Cheese','Egg','Bacon',],
…
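One possible approach to the task in this excerpt is to build a boolean mask with `Series.isin` on each frame. This is only a sketch: `df1` follows the excerpt, while `df2` and its `Drink` column are hypothetical because the original definition is truncated.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2, 3, 4],
                    "Food": ["Ham", "Cheese", "Egg", "Bacon"]})

# Hypothetical second frame; the original df2 is not shown in the excerpt.
df2 = pd.DataFrame({"ID": [1, 2, 3, 5],
                    "Drink": ["Tea", "Milk", "Juice", "Water"]})

# Keep only the rows whose ID also appears in the other frame.
df1_kept = df1[df1["ID"].isin(df2["ID"])]
df2_kept = df2[df2["ID"].isin(df1["ID"])]

print(df1_kept)
print(df2_kept)
```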

John Conor
0 votes
2 answers
How to speed up pandas boolean indexing with multiple string conditions
I have a 73-million-row dataset, and I need to filter out rows that match any of a few conditions. I have been doing this with Boolean indexing, but it's taking a really long time (~30 minutes) and I want to know if I can make it faster (e.g. fancy…
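The excerpt does not show the actual conditions, so the following is only a sketch under the assumption that they are exact string matches on a single column. In that case, several `==` comparisons joined with `|` can be collapsed into one `Series.isin` call, which may be worth timing on the real data. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical data standing in for the large dataset.
df = pd.DataFrame({"status": ["ok", "bad", "ok", "void", "ok"],
                   "value": [1, 2, 3, 4, 5]})
unwanted = ["bad", "void"]

# One comparison per condition, combined with |.
chained_mask = (df["status"] == "bad") | (df["status"] == "void")

# A single membership test that produces the same mask.
isin_mask = df["status"].isin(unwanted)

# Invert the mask with ~ to filter the matching rows out.
filtered = df[~isin_mask]
print(filtered)
```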

travelsandbooks
0 votes
1 answer
Advanced boolean indexing
I want to select values with a mask and change those values using a mask array.
Code:
import numpy as np
a = np.zeros((2, 2), dtype=(np.uint8, 3))
x = np.arange(4, dtype=int).reshape((2, 2))
mask = np.logical_and(x < 3, x > 0)
a[mask] = (1, x[mask], 2)
I…
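A sketch of one way the assignment in this excerpt could be made to work, assuming the mask is meant to be built from `x` and that `a` is an ordinary `(2, 2, 3)` array of `uint8`. Instead of assigning a mixed tuple in one step, each of the three components is assigned separately.

```python
import numpy as np

# Assumption: a plain (2, 2, 3) uint8 array stands in for the excerpt's `a`.
a = np.zeros((2, 2, 3), dtype=np.uint8)
x = np.arange(4, dtype=int).reshape((2, 2))

# True where 0 < x < 3.
mask = np.logical_and(x < 3, x > 0)

# Assign each component of the selected cells on its own.
a[mask, 0] = 1
a[mask, 1] = x[mask]
a[mask, 2] = 2

print(a)
```

Here `a[mask, 1]` and `x[mask]` select the same cells in the same order, so the values line up one to one.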

Vladislav Nekto
0 votes
1 answer
Insert a customized series as a new column in a DataFrame with Pandas
Given this DataFrame with columns: category, Year, and Profit
data = {'category':pd.Series(['A','A','A','A','A','A']),
'Year':pd.Series([1,1,3,3,3,4]),
'Profit':pd.Series([10,11,5,6,30,31])}
df = pd.DataFrame(data)
display(df)
how…

Howard
0 votes
1 answer
Star (*) within Pandas boolean indexing
Because of a typo, I happened upon some Pandas DataFrame boolean indexing syntax that I'm not familiar with and I can't find any information describing what is actually happening.
I was trying to retrieve a dataframe based on two conditions with an…

jb1225
0 votes
2 answers
Boolean Indexing numpy Array with or logical operator
I was trying to do boolean indexing with a logical OR on a NumPy array, but I cannot find a good way.
The AND operator & works properly, like:
X = np.arange(25).reshape(5, 5)
# We print X
print()
print('Original X = \n', X)
print()
X[(X > 10) & (X < 17)]…
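For the OR case asked about here, the element-wise `|` operator (or `np.logical_or`) plays the same role that `&` plays for AND. A short sketch, with thresholds chosen only for illustration:

```python
import numpy as np

X = np.arange(25).reshape(5, 5)

# Element-wise OR; each comparison needs its own parentheses
# because | binds more tightly than < and >.
either = X[(X < 5) | (X > 20)]

# Equivalent spelling with the named function.
same = X[np.logical_or(X < 5, X > 20)]

print(either)
```

Python's plain `or` does not work in this position because it asks for a single truth value of the whole array, which NumPy refuses to provide for arrays with more than one element.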

Zioalex
0 votes
1 answer
pandas boolean indexing of dataframe in dictionary of data frames
So, this is probably a really simple problem, but I've not found a solution yet. I apologize for my stupidity. (I'm guessing my ignorance of terminology has impeded my searching here)
I have a dictionary of dataframes (showing 2 in here, but it…

forevernoob
0 votes
2 answers
Numpy: Overlay Boolean Array on "True"s of other boolean array
I have a 2D boolean array A in which the number of True values matches the size of a 2D boolean array B.
A = np.array([[False, True, True, False, True],[False, False, False, False, False],[False, True, True, False, True]])
B = np.array([[True, False, True],[True,…
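One way to read this question is that the values of `B` should be written, in row-major order, into the positions where `A` is True. A sketch under that assumption follows; `A` is taken from the excerpt, while the second row of `B` is hypothetical because the original is truncated.

```python
import numpy as np

A = np.array([[False, True, True, False, True],
              [False, False, False, False, False],
              [False, True, True, False, True]])

# Hypothetical B with as many elements as A has True values (6).
B = np.array([[True, False, True],
              [True, True, False]])

# Copy A, then overwrite its True positions with B's values in row-major order.
out = A.copy()
out[A] = B.ravel()

print(out)
```

Boolean mask assignment fills the selected positions in the order they appear when scanning `A` row by row, which is why flattening `B` with `ravel()` lines the values up.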

Jonas Jo
0 votes
2 answers
Boolean indexing, trying to search by label with two conditions but boolean and, bitwise &, and numpy logical_and all return errors
I am trying to return the rows of a dataframe in pandas that correspond to the label I choose. For example, in my function Female, it returns all the rows in which the patient is female. For AgeRange, I have run into issues satisfying both…
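A minimal sketch of combining two conditions in a pandas mask, since the question's own code is not shown. The `patients` frame, its `Gender` and `Age` columns, and the `age_range` helper are all hypothetical. The usual pitfalls are using Python's `and` (which raises a `ValueError` on a Series) and omitting the parentheses around each comparison when using `&`.

```python
import pandas as pd

# Hypothetical patient data for illustration only.
patients = pd.DataFrame({"Gender": ["F", "M", "F", "M"],
                         "Age": [34, 51, 67, 45]})

def age_range(df, low, high):
    """Return the rows whose Age lies between low and high, inclusive."""
    # Parenthesize each comparison: & binds more tightly than >= and <=.
    return df[(df["Age"] >= low) & (df["Age"] <= high)]

selected = age_range(patients, 40, 60)
print(selected)
```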
user10448598
-1 votes
1 answer
How do I capture all complying values using a mask in Pandas?
Running value_counts on one specific DataFrame column shows that there are 441 values lower than 10. When I apply a mask (boolean indexing) to access those values, it only returns 12 of the 441.
I thought it was a datatype issue.…