I'm trying to add new columns into a new excel with addition to existing column using pandas chained assignment. But this is so slow and memory consumption to high and showing below warning
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stab
le/user_guide/indexing.html#returning-a-view-versus-a-copy
self.data_frame.loc[self.data_frame.index[self.row_index], config.col4] = self.col4
extract_comparsion.py:71: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
For better explanation
src_file = source.xlsv
data = pd.read_excel(src_file, header=1)
df = pd.DataFrame(data, columns=['col1', 'col2'])
result = df.loc[(df['col1'] != 'con') & (df['col2'] != 'con2')]
self.data_frame = result.head(10)
for index, row in self.data_frame.iterrows():
self.row_index = index
self.data_frame.loc[self.data_frame.index[self.row_index], config.status_label] = response
self.data_frame.loc[self.data_frame.index[self.row_index], config.col4] = self.col4
self.data_frame.loc[self.data_frame.index[self.row_index], config.col5l] = self.col5
self.data_frame.loc[self.data_frame.index[
self.row_index], config.ipact_col6 = self.col6
writer = pd.ExcelWriter(config.output_file, engine='xlsxwriter')
print(writer)
self.data_frame.to_excel(writer, sheet_name='Sheet1', index=False)
writer.save()
May I know is there any other alternative for this implementation. If so please share your methods Thanks in advance
#UPDATE Suppose this source excel data
COL 1 COL2 COL3
AAA BBB CCC
DDD EEE FFF
GGG HHH III
JJJ KKK LLL
MMM NNN OOO
Based on the sql query I will get some result for col AAA is 1 and Similarly for DDD will get 2. So I need to form a final excel like below
COL 1 COL2 COL3 COL4
AAA BBB CCC 1
DDD EEE FFF 2
GGG HHH III 3
JJJ KKK LLL 4
MMM NNN OOO 5
The thing I need to pass the value of COL1 & COL2 to get query result. that is why am using this loop. Hope you got clear idea