11

I wish to search a database that I have in a .pkl file.

I have loaded the .pkl file and stored it in a variable named load_data.

Now I need to read a string from the user with raw_input and search for it in one specific column, 'SMILES', of my dataset.

If the string matches, I need to display the whole row, i.e. all column values for that row.

Is that possible and if so, how should I go about it?

Rahul Agarwal
Devarshi Sengupta
  • Welcome to Stack Overflow, please show us what you have done so far, add some code and the results. Make sure to read [How to create a Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve) – lordrhodos Jun 18 '17 at 21:37

2 Answers

17

Use boolean indexing that returns all matching rows:

import pandas as pd

df = pd.DataFrame({'SMILES': ['a','dd b','f'],
                   'a': [1,3,4],
                   'c': [1,2,0]})
print (df)
  SMILES  a  c
0      a  1  1
1   dd b  3  2
2      f  4  0
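
In the question the data comes from a .pkl file rather than being built inline. Here is a minimal sketch of that step, assuming the pickle holds a pandas DataFrame (the question does not say so explicitly). The file name data.pkl is hypothetical, and the frame is written out first only so the example is self-contained:

```python
import os
import tempfile

import pandas as pd

# Stand-in for the real dataset; only the 'SMILES' column name comes from the question.
df = pd.DataFrame({'SMILES': ['a', 'dd b', 'f'],
                   'a': [1, 3, 4],
                   'c': [1, 2, 0]})

# Write the frame to a temporary .pkl file, then read it back.
# 'data.pkl' is a hypothetical file name used for illustration.
with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_path = os.path.join(tmp_dir, 'data.pkl')
    df.to_pickle(pkl_path)
    load_data = pd.read_pickle(pkl_path)

# Boolean indexing as shown in this answer only applies if this is a DataFrame.
print(isinstance(load_data, pd.DataFrame))
print(load_data.equals(df))
```

If load_data turns out to be a dict or a list instead, it would probably need converting with pd.DataFrame(load_data) before filtering. Only unpickle files from a source you trust.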

If you need an exact match on the whole string:

#raw_input for python 2, input for python 3
a = input('Enter String for SMILES columns: ') # f
#Enter String for SMILES columns: f
print (df[df['SMILES'] == a])
  SMILES  a  c
2      f  4  0
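
Typed input often carries stray spaces, and the exact match above would then find nothing. A possible sketch that trims both sides before comparing and reports when no row matches; find_exact is a hypothetical helper name, not a pandas function:

```python
import pandas as pd

def find_exact(frame, column, text):
    """Return every row whose value in `column` equals `text`, ignoring surrounding whitespace."""
    wanted = text.strip()
    return frame[frame[column].str.strip() == wanted]

df = pd.DataFrame({'SMILES': ['a', 'dd b', 'f'],
                   'a': [1, 3, 4],
                   'c': [1, 2, 0]})

# ' f ' stands in for what the user typed at the prompt.
hits = find_exact(df, 'SMILES', ' f ')
if hits.empty:
    print('No matching rows')
else:
    print(hits)

# A string that is not in the column gives an empty DataFrame.
missing = find_exact(df, 'SMILES', 'zzz')
```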

Or if you need to check a substring, use str.contains (which treats the input as a regular expression by default):

a = input('Enter String for SMILES columns: ') # b
#Enter String for SMILES columns: b
print (df[df['SMILES'].str.contains(a)])
  SMILES  a  c
1   dd b  3  2
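
One caveat for real SMILES strings: they usually contain characters such as ( ) [ ] + and . that have a special meaning in regular expressions, so a typed fragment can fail to match or raise an error. A minimal sketch, assuming a literal substring match is what is wanted: pass regex=False, and na=False so that missing values count as non-matches. The sample molecules and the find_substring helper are illustrative, not part of the original data:

```python
import pandas as pd

# Illustrative data; the None row stands in for a missing SMILES value.
smiles_df = pd.DataFrame({
    'SMILES': ['CC(=O)O', 'c1ccccc1', 'C[N+](C)(C)C', None],
    'name': ['acetic acid', 'benzene', 'tetramethylammonium', 'unknown'],
})

def find_substring(frame, column, text):
    """Return rows where `column` contains `text` as a literal substring."""
    # regex=False: parentheses, brackets and '+' are matched as plain characters.
    # na=False: rows with a missing value are treated as non-matching.
    mask = frame[column].str.contains(text, regex=False, na=False)
    return frame[mask]

carbonyl = find_substring(smiles_df, 'SMILES', '(=O)')
charged = find_substring(smiles_df, 'SMILES', '[N+]')
print(carbonyl)
print(charged)
```

With the default regex=True, '[N+]' would be read as a character class matching a single 'N' or '+', which is not the same search.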
Daniel Holmes
jezrael
0

The code below solved my problem. It searches a single column for any value matching a regex and returns all matching rows. Adjust the regex to suit your needs.

Search in a single column:

regex = ".*" + keyword + ".*"  # keyword: your search term (a str)

df.loc[df['your_col_name'].str.contains(regex, regex=True, case=False)]

Search in all columns:

df[df.apply(lambda row: row.astype(str).str.contains(regex, regex=True, case=False).any(), axis=1)]

https://pandas.pydata.org/docs/reference/api/pandas.Series.str.contains.html
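
For reference, a self-contained sketch of the all-columns search on a small sample frame. It assumes the keyword should be matched literally and case-insensitively, so it is passed through re.escape first. The leading and trailing ".*" are left out because str.contains already matches anywhere in the value. search_all_columns is a hypothetical helper name:

```python
import re

import pandas as pd

df = pd.DataFrame({'SMILES': ['a', 'dd b', 'f'],
                   'a': [1, 3, 4],
                   'c': [1, 2, 0]})

def search_all_columns(frame, keyword):
    """Return rows where any column, viewed as text, contains `keyword`."""
    # Escape regex metacharacters so the keyword is matched literally.
    pattern = re.escape(keyword)
    # Convert every column to str so numeric columns can be searched too.
    as_text = frame.astype(str)
    mask = as_text.apply(
        lambda col: col.str.contains(pattern, case=False, regex=True)
    ).any(axis=1)
    return frame[mask]

by_number = search_all_columns(df, '3')   # matches the row where a == 3
by_letter = search_all_columns(df, 'B')   # case-insensitive match on 'dd b'
by_dot = search_all_columns(df, '.')      # '.' is literal here, so nothing matches
print(by_number)
print(by_letter)
```

Converting every cell to text is convenient but can be slow on large tables, so restricting the search to the columns of interest is likely the better default.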

Vijay