I would like to compare two excel-tables and add data from one to the other.
Specifically, I have a table for "sorting information", which is structured as follows:
street | Hnr_from | Hnr_to | hnr_cond | Postcode | City | SortInfo |
---|---|---|---|---|---|---|
Hasel | 6 | 8 | 0 | 49082 | Osnb | OS1-01 |
Hasel | 1 | 5 | 1 | 49083 | Osnb | OS1-02 |
Mosel | 4 | 6 | 2 | 49084 | Munst | MU1-01 |
The hnr_cond is there to know if only the odd (1), only the even (2), or all the house numbers (0) belong in the range "from", "to" in this district.
My Data-Frame i want to Compare this to looks like the following:
street | Hnr | Postcode | City |
---|---|---|---|
Hasel | 7 | 49082 | Osnb |
Mosel | 4 | 49084 | Munst |
My first attempt to find out how to work with Polars in general looked like this:
import polars as pl
xlsx_file = "Path/to/Sortinfo-file.xlsx"
df = pl.read_excel(xlsx_file)
searched_nr = 3
searched_str = "Mosel"
searched_postc = 49088
searched_city = "Osnbr"
filtered_df = df.filter(
(searched_nr >= df['Hnr_from']) &
(searched_nr <= df['Hnr_to']) &
(((searched_nr % 2 == 0) & (df['hnr_cond'] == '2')) |
((searched_nr % 2 == 1) & (df['hnr_cond'] == '1')) |
(df['hnr_cond'] == '0')) &
(searched_str == df['street']) &
(searched_postc == df['Postcode']) &
(searched_city == df['City'])
)
SortInfo = None
if not filtered_df.is_empty():
bezeichnung = filtered_df.to_pandas()['bezirk'].iloc[0]
print(SortInfo)
I also tried to load the other Dataframe as a file but i only got that far that it read the first line and compared it to the SortInfo_df but it dismissed everything after the first line.
My End Result should be like the Following:
A Dataframe with:
street | Hnr | Postcode | City | SortInfo |
---|---|---|---|---|
Hasel | 7 | 49082 | Osnb | OS1-01 |
Mosel | 4 | 49084 | Munst | MU1-01 |
I tried to load the other file like this:
xlsx_file = "path/to/other-file.xlsx"
df_searched = pl.read_excel(xlsx_file)
searched_number = df_seached['Hnr']
searched_street = df_searched['street']
searched_postcode = df_searched['Postcode']
searched_City = df_searched['City']
Note i quckliy translated everything that was not in english. If i missed something or did something wrong please inform me.