0

Disclosure: I am not an expert in this, nor do I have much practice with this. I have however, spent several hours attempting to figure this out on my own.

I have an excel sheet with thousands of object serial numbers and addresses where they are located. I am attempting to write a script that will search columns 'A' and 'B' for either the serial number or the address, and returns the whole row with the additional information on the particular object. I am attempting to use python to write this script as I am trying to integrate this with a preexisting script I have to access other sources. Using Pandas, I have been able to load the .xls sheet and return the values of all of the spreadsheet, but I cannot figure out how to search it in such a way as to only return the row pertaining to what I am talking about.

excel sheet example

Here is the code that I have:

import pandas as pd
data = pd.read_excel('my/path/to.xls')
print(data.head())

I can work with the print function to print various parts of the sheet, however whenever I try to add a search function to it, I get lost and my online research is less than helpful. A.) Is there a better python module to be using for this task? Or B.) How do I implement a search function to return a row of data as a variable to be displayed and/or used for other aspects of the program?

D3alWyth1t
  • 69
  • 3
  • 11
  • can u share data.head() ? so we have an idea of what you want. also create an example of what your final output should be like . use this as a guide : https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – sammywemmy Feb 02 '20 at 00:03
  • How about using openpyxl? – rivques Feb 02 '20 at 00:03
  • `data[data['columnname'] == value]` - or - `data.query('colmnname' == value)` ,,, [Selection (cookbook)](https://pandas.pydata.org/docs/user_guide/cookbook.html#selection) ... [Dataframe filtering rows by column values](https://stackoverflow.com/questions/44482095/dataframe-filtering-rows-by-column-values) ... [How to select rows from a DataFrame based on column values?](https://stackoverflow.com/questions/17071871/how-to-select-rows-from-a-dataframe-based-on-column-values) – wwii Feb 02 '20 at 01:04

1 Answers1

1

Something like this should work, pandas works with rows, columns and indices. We can take advantage of all three to get your utility working.

import pandas as pd

serial_number = input("What is your serial number: ")
address = input("What is your address: ")
# read in dataframe
data = pd.read_excel('my/path/to.xls')
# filter dataframe with a str.contains method.

filter_df = data.loc[(data['A'].str.contains(serial_number)) 
          | data['B'].str.contains(address)]

print(filter_df)

if you have a list of items e.g

serial_nums = [0,5,9,11]

you can use isin which filters your dataframe based on a list.

data.loc[data['A'].isin(serial_nums)]

hopefully this gets you started.

Umar.H
  • 22,559
  • 7
  • 39
  • 74