We have existing code to get some material properties for many materials (>60,000).
from pymatgen import MPRester

mpr = MPRester(api_key="")
criteria = {"nelements": {"$lt": 4}}
properties = ["pretty_formula", "cif", "material_id", "formation_energy_per_atom", "band_gap"]
c = mpr.query(criteria=criteria, properties=properties)
For this project, however, we need the information in a specific form, namely as structures. I can get these structures easily by requesting them for every material ID individually:
structures = []
for mid in mid_list:
    structures.append(mpr.get_structure_by_material_id(mid))
This calls the following function in matproj.py:
def get_structure_by_material_id(self, material_id, final=True,
                                 conventional_unit_cell=False):
    """
    Get a Structure corresponding to a material_id.

    Args:
        material_id (str): Materials Project material_id (a string,
            e.g., mp-1234).
        final (bool): Whether to get the final structure, or the initial
            (pre-relaxation) structure. Defaults to True.
        conventional_unit_cell (bool): Whether to get the standard
            conventional unit cell

    Returns:
        Structure object.
    """
The problem is that this takes a very long time (>4 hours) and sometimes gets stuck during the call to the API.
Is there a way to avoid calling the API 60,000 times and convert the initial query results instead?
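One direction that might work, since the query above already requests the "cif" property: parse each returned CIF string locally instead of going back to the API. The following is only a sketch under assumptions. It assumes every result dict has a "material_id" and a complete CIF string under "cif", and that Structure.from_str(..., fmt="cif") from pymatgen.core.structure is available in the installed pymatgen version. The helper name structures_from_query and its parse_cif parameter are made up for illustration and are not part of pymatgen.

```python
def structures_from_query(results, parse_cif=None):
    """Build structures from already-downloaded query results.

    results:   iterable of dicts with "material_id" and "cif" keys
               (the shape the query above is assumed to return).
    parse_cif: callable turning a CIF string into a structure.
               Hypothetical hook; defaults to pymatgen's CIF parser.

    Returns (structures, failed): a dict mapping material_id to the
    parsed structure, and a list of (material_id, exception) pairs
    for entries that could not be parsed.
    """
    if parse_cif is None:
        # Imported lazily so the helper can be tried with a stub parser.
        from pymatgen.core.structure import Structure

        def parse_cif(text):
            return Structure.from_str(text, fmt="cif")

    structures = {}
    failed = []
    for entry in results:
        mid = entry["material_id"]
        try:
            structures[mid] = parse_cif(entry["cif"])
        except Exception as exc:
            # Keep going; one bad or missing CIF should not stop the batch.
            failed.append((mid, exc))
    return structures, failed
```

With the query result c from above, this would be called as structures, failed = structures_from_query(c). No further API calls are made, so the runtime should be dominated by local CIF parsing. Whether the parsed structures match what get_structure_by_material_id returns (for example with respect to the conventional cell setting) would still need to be checked on a few materials.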