Parsing the data from Wikipedia takes an unacceptably long time. Instead of a single thread/process, I would like to use at least 5. After googling, I found that Python 3.5 has async for.
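For context, a minimal sketch of what async for does on its own, assuming nothing beyond the standard library. SlowItems and consume are hypothetical names used only for illustration. The loop pulls items from an asynchronous iterator one at a time, so each iteration finishes before the next one starts; by itself it does not run iterations in parallel.

```python
import asyncio


class SlowItems:
    """Hypothetical asynchronous iterator that yields items after a short delay."""

    def __init__(self, items):
        self._items = iter(items)

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            item = next(self._items)
        except StopIteration:
            raise StopAsyncIteration from None
        await asyncio.sleep(0.01)  # simulate waiting for the next item
        return item


async def consume(items):
    """Record when each loop body starts and ends."""
    order = []
    async for item in SlowItems(items):
        order.append(("start", item))
        await asyncio.sleep(0.01)  # simulate work inside the loop body
        order.append(("end", item))
    return order


def run_consume(items):
    # Explicit loop management, which also works on Python 3.5
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(consume(items))
    finally:
        loop.close()
```

Calling run_consume([1, 2]) records the start and end of item 1 before item 2 begins, which suggests that async for alone would not shorten the total parsing time.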
Below is a "very short" version of the current synchronous code to show the whole process (with comments to quickly explain what the code does).
    def update_data(region_id=None, country__inst=None, upper_region__inst=None):
        all_ids = []
        # Get data about countries or regions or subregions
        countries_or_regions_dict = OSM().get_countries_or_regions(region_id)
        # Loop that I want to make async
        for osm_id in countries_or_regions_dict:
            names = countries_or_regions_dict[osm_id]['names']
            if 'wiki_uri' in countries_or_regions_dict[osm_id]:
                wiki_uri = countries_or_regions_dict[osm_id]['wiki_uri']
                # PARSER: Gets translations of countries or regions or subregions from Wikipedia
                translated_names = Wiki().get_translations(wiki_uri, osm_id)
            if not region_id:  # Means it is a country
                country__inst = Countries.objects.update_or_create(
                    osm_id=osm_id,
                    defaults={**countries_or_regions_dict[osm_id]})[0]
            else:  # Means it is a region/subregion (in case of recursion)
                upper_region__inst = Regions.objects.update_or_create(
                    osm_id=osm_id,
                    country=country__inst,
                    region=upper_region__inst,
                    defaults={**countries_or_regions_dict[osm_id]})[0]
            # Add translated names from wiki to the DB
            for lang_code in names:
                ###
            # RECURSION: If country has regions or region has subregions, start recursion
            if 'divisions' in countries_or_regions_dict[osm_id]:
                regions_list = countries_or_regions_dict[osm_id]['divisions']
                for division_id in regions_list:
                    all_regions_osm_ids = update_data(region_id=division_id,
                                                      country__inst=country__inst,
                                                      upper_region__inst=upper_region__inst)
                    all_ids += all_regions_osm_ids
        return all_ids
I realized that I need to change def update_data to async def update_data and, accordingly, for osm_id in countries_or_regions_dict to async for osm_id in countries_or_regions_dict, but I could not find out whether I need to use get_event_loop() in my case and, if so, where. I also could not find out how and where to specify how many iterations of the loop may run simultaneously. Could someone please help me make the for loop asynchronous?
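To make the goal concrete, here is a minimal sketch of the kind of thing being asked for, not a definitive implementation. It assumes the slow step is the blocking Wikipedia lookup, so fetch_translations is a hypothetical stand-in for Wiki().get_translations with made-up return data, and fetch_limited, fetch_all and run_fetch_all are illustrative names. The blocking call is pushed to the default thread pool with loop.run_in_executor, the coroutines are collected with asyncio.gather, and an asyncio.Semaphore caps the number of simultaneous lookups at 5.

```python
import asyncio
import time

MAX_CONCURRENT = 5  # at most 5 lookups in flight at once


def fetch_translations(wiki_uri, osm_id):
    """Hypothetical stand-in for the blocking Wiki().get_translations call."""
    time.sleep(0.05)  # simulate network latency
    return {"en": "name-%s" % osm_id}


async def fetch_limited(semaphore, wiki_uri, osm_id):
    loop = asyncio.get_event_loop()  # the loop this coroutine is running on
    async with semaphore:
        # Run the blocking call in the default thread pool executor
        names = await loop.run_in_executor(None, fetch_translations, wiki_uri, osm_id)
    return osm_id, names


async def fetch_all(countries_or_regions_dict):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    tasks = [
        fetch_limited(semaphore, data["wiki_uri"], osm_id)
        for osm_id, data in countries_or_regions_dict.items()
        if "wiki_uri" in data  # entries without a wiki_uri are skipped
    ]
    results = await asyncio.gather(*tasks)
    return dict(results)


def run_fetch_all(countries_or_regions_dict):
    # Explicit loop management; asyncio.run() is only available from Python 3.7
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(fetch_all(countries_or_regions_dict))
    finally:
        loop.close()
```

Under these assumptions, the event loop is only created and driven in one place (run_fetch_all), and the concurrency limit lives in the semaphore rather than in the loop statement. The database writes and the recursion could then stay in an ordinary for loop that reads the translations from the returned dictionary.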