0

I have two separate data-sets that I am working with:

The primary database, containing retail data (POS), has an item-code column, e.g (12831903, E03JD920). (I imported this as a DF AS importing as tibble threw parsing error)

The secondary database has the item-codes and the corresponding item description. I put this into an R list:

itemlist <- setNames(as.list(itemcodes$l1desc),itemcodes$itemcode)

The main dataset is stored as main. I am attempting to write a function / loop that checks every single row of the column itemcode in main and see if there is a match in itemlist. If there is a match, the function/loop pastes the corresponding value from the matching key into a new vector. This process loops until every single row has been found a match for.

How can I write this loop?

The image below shows what the list looks like

The second image is a snippet of what the main dataframe looks like

So to summarise again, I am looking to write a function that parses through the list to match values with the itemcodes in the main df. My eventual goal is to take the new vector with the matching item descriptions, and then merge that with the main-dataframe once again.

for(i in 1:nrow(transactions)) {
  if (transactions[i,4] == itemlist[i]) {
    #if item code in pos log is equal to item code in list, paste corresponding 
    ## value from key-value pair
    mapped_items[i] <- paste0(names(itemlist))
  }}

View(mapped_items)

This is what my pseudo-code looks like, I tried running a few initial variants of this, but kept getting thrown an error that NAs were being parsed through, but this may have been because of the tibble. Any ideas as to how I can elegantly implement this function/loop?

I am also filling to try this with dict in python if necessary

My python pseudocode is as follows:

def search(itemcode, code):
  match "" " "
for i in itemcode:
  if (code == i):
  match = code

return match


def mapitems(transactions,itemcode):
  
  mappedlist = {}
i = 0 

for code in transactions:
  mappedlist.append(search(itemcode,code))

return mappedlist

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • Seems better suited to a merge operation: https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right. It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Images of data aren't helpful because we can't copy/paste the data in R for testing. – MrFlick Feb 12 '21 at 07:04

1 Answers1

1

Actually, you should be using neither R nor Python to do what would be a very basic operation on your actual SQL database:

SELECT *
FROM item_codes ic
INNER JOIN retail_data rd
    ON rd.item_code = ic.item_code

Just execute the above query, and then access the result set as some kind of collection in your R or Python script.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360