
I wrote a function that fetches some data from an API and stores it in a dictionary. When I execute it on its own, it works fine. The problem appears when I try to run this function twice "at the same time" using the Ray library: the data is fetched from the API correctly, but it is not added to the dictionary.

import requests
import ray

Companies = dict()

def call_company_api(company_id, dictionary):
    data = requests.get(API_CALL_COMPANY_URL.format(company_id)).json()
    # name = data['data']['krs_podmioty.nazwa'] FULL NAME
    name = data['data']['krs_podmioty.nazwa_skrocona']
    city = data['data']['krs_podmioty.adres_poczta']
    nip = data['data']['krs_podmioty.nip']
    community_id = data['data']['krs_podmioty.gmina_id']
    county_id = data['data']['krs_podmioty.powiat_id']
    voivodeship_id = data['data']['krs_podmioty.wojewodztwo_id']

    try:
        community = gminy_list[community_id]
        county = powiaty_list[county_id]
        voivodeship = wojewodztwa_list[voivodeship_id]

    except KeyError:
        community = community_id
        county = county_id
        voivodeship = voivodeship_id

    dictionary[name] = [city, county, community, voivodeship, nip]

When I execute the code below, it works fine:


def call_company():
    for k in comapanies_list:
        call_company_api(k, Companies)

call_company()


print(Companies)  # --> {'BELKA19': ['Warszawa', 'Warszawa', 'Warszawa', 'Mazowieckie', '5252786971'], 'GSW CONSTRUCTION': ['Kraków', 'Kraków', 'Kraków', 'Małopolskie', '6762564804']}


In the case below, the data is not added to the dictionary. Do you know how to fix it? I also tried using a separate dictionary for each call_items function, but that did not work as I expected either.


ray.init()



@ray.remote
def call_l1_items():
    for k in l1:
        call_company_api(k, Companies)



@ray.remote
def call_l2_items():
    for k in l2:
        call_company_api(k, Companies)



ret_id1 = call_l1_items.remote()
ret_id2 = call_l2_items.remote()
ret1, ret2 = ray.get([ret_id1, ret_id2])

print(Companies)  # --> {}

1 Answer

The issue is that Ray tasks execute in separate worker processes (as opposed to threads). When you define the call_l1_items function, which uses the Companies dictionary, a copy of that dictionary is created on the worker processes that actually execute the task. So the remote copy of the dictionary gets mutated, but not the original in your main script.

You can fix this by returning the items from the function and then updating the original dictionary in the main script.
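A minimal sketch of that approach, under some assumptions: fetch_company is a hypothetical stand-in for the question's call_company_api (it fabricates placeholder values instead of calling the real API), and collect_companies and run_with_ray are illustrative names that do not appear in the question. Each task builds and returns its own dictionary, and the main script merges the results.

```python
def fetch_company(company_id):
    # Hypothetical stand-in for the question's API call.
    # Returns (name, fields) with placeholder values, no network access.
    name = f"COMPANY-{company_id}"
    fields = ["city", "county", "community", "voivodeship", "nip"]
    return name, fields


def collect_companies(company_ids):
    # Build and RETURN a local dict instead of mutating a global one,
    # so the result can be sent back from the worker process.
    result = {}
    for company_id in company_ids:
        name, fields = fetch_company(company_id)
        result[name] = fields
    return result


def run_with_ray(l1, l2):
    # Imported here so the helpers above can be used without Ray.
    import ray

    ray.init(ignore_reinit_error=True)

    # Equivalent to decorating collect_companies with @ray.remote.
    remote_collect = ray.remote(collect_companies)
    futures = [remote_collect.remote(l1), remote_collect.remote(l2)]

    # Merge the per-task dictionaries in the main process.
    companies = {}
    for partial in ray.get(futures):
        companies.update(partial)
    return companies
```

Calling run_with_ray(l1, l2) in the main script should then give a single dictionary containing the entries from both lists. If two tasks return the same company name, the later update overwrites the earlier one.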

Robert Nishihara