For context, I have a master file that was manually compiled to look like the master table. In regards to the table 1 and 2 examples shown, they're all Excel filed xlsx data tables I've opened from a folder that contains scatter information relevant to the clothing item. That's why the sizing detail columns for table 1 and 2 differs but I intend to standardize it to look like the master table.
I have two Dataframes like:
Table 1:
| No | Item | Style | Color | Description | ONE SIZE | 34 | 36 | 38 | 40 | QTY | Weight |
|---- |------- |------- |------- |------------- |---------- |---- |---- |---- |---- |----- |-------- |
| 1 | 11168 | Plaid | Brown | Shoes | 1 | 1 | | 1 | | 3 | 10 |
Table 2:
| No | Item | Style | Color | Description | XS | S | M | L | XL | QTY | Weight |
|---- |------- |-------- |------- |------------- |---- |--- |--- |--- |---- |----- |-------- |
| 1 | 11169 | Velvet | Red | Jacket | 10 | 5 | 6 | 8 | 3 | 32 | 40 |
I want to merge them, to get a result like:
| No | Item | Style | Color | Description | XS | S | M | L | XL | ONE SIZE | 34 | 36 | 38 | 40 | QTY | Weight |
|---- |------- |-------- |------- |------------- |---- |--- |--- |--- |---- |---------- |---- |---- |---- |---- |----- |-------- |
| 1 | 11168 | Plaid | Brown | Shoes | | | | | | 1 | 1 | | 1 | | 3 | 10 |
| 2 | 11169 | Velvet | Red | Jacket | 10 | 5 | 6 | 8 | 3 | | | | | | 32 | 40 |
I tried using merged_data = table1.merge(table2, on=['No', 'Item', 'Style', 'Color', 'Description', 'QTY', 'Weight'])
with the idea of merging on the common columns, but I get a result like this instead:
| XS_x | S_x | M_x | L_x | ... | BRAND | XS_y | S_y | M_y | L_y | XL_y | ONE SIZE | Unnamed: 14_y | Unnamed: 15_y | Unnamed: 16_y |
|-----: |----: |----: |----: |----: |------: |-----: |----: |----: |----: |-----: |---------: |--------------: |--------------: |--------------: |
I am primarily using Python pandas to work with these tables. Should I be using join()
or merge()
for this sort of task? What is wrong with the code, and how do I fix it?