I am running into problems while migrating from HDF5 to the NetCDF data format for storing a dictionary of pandas DataFrames, which contain the data and results of a pyomo model.
The current HDF5 saving script, which works without any problems, is as follows:
import pandas as pd

def save(prob, filename):
    with pd.HDFStore(filename, mode='w') as store:
        for name in prob._data.keys():
            store['data/'+name] = prob._data[name]
        for name in prob._result.keys():
            store['result/'+name] = prob._result[name]
where prob is a solved pyomo model instance.
Since we are migrating our project to PyPy for runtime reasons, and PyPy does not support h5py at the moment, we also want to switch from HDF5 to NetCDF for storing our model instances.
For this I use xarray Datasets, which seem to be compatible with the NetCDF format:
import xarray as xr

def save(prob, filename):
    ds = xr.Dataset()
    for name in prob._data.keys():
        ds['data/'+name] = prob._data[name]
    for name in prob._result.keys():
        ds['result/'+name] = prob._result[name]
    ds.to_netcdf(filename)
Despite looking quite analogous to the preceding HDF5 script, I get the following error here:
    urbs.save(prob, os.path.join(result_dir, '{}.nc'.format(sce)))
  File "/home/scandas/nas/pypy_for_asinus/urbs_pypy/urbs/saveload.py", line 63, in save
    ds['data/'+name] = prob._data[name]
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataset.py", line 899, in __setitem__
    self.update({key: value})
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataset.py", line 2305, in update
    variables, coord_names, dims = dataset_update_method(self, other)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/merge.py", line 580, in dataset_update_method
    indexes=dataset.indexes)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/merge.py", line 434, in merge_core
    aligned = deep_align(coerced, join=join, copy=False, indexes=indexes)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/alignment.py", line 213, in deep_align
    exclude=exclude)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/alignment.py", line 164, in align
    new_obj = obj.reindex(copy=copy, **valid_indexers)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataarray.py", line 906, in reindex
    indexers=indexers, method=method, tolerance=tolerance, copy=copy)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataset.py", line 1812, in reindex
    tolerance, copy=copy)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/alignment.py", line 324, in reindex_variables
    int_indexer = get_indexer_nd(index, target, method, tolerance)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/indexing.py", line 117, in get_indexer_nd
    flat_indexer = index.get_indexer(flat_labels, **kwargs)
  File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/pandas/core/indexes/multi.py", line 2042, in get_indexer
    indexer = self._engine.get_indexer(target)
  File "pandas/_libs/index.pyx", line 654, in pandas._libs.index.BaseMultiIndexCodesEngine.get_indexer
ValueError: operands could not be broadcast together with shapes (244,3) (4,) (244,3)
It looks like, for some keys in prob._data, there is a mismatch of shapes (during concatenation?), which raises the error when the elements are assigned to the xarray Dataset. However, the HDF5 storing procedure, with similar assignments, runs without any error.
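The three shapes in the message can be reproduced with plain NumPy, which may help in reading it. The sketch below is only an illustration under an assumption I have not confirmed in the pandas source: that the failing step combines, in place, a (244, 3) array (244 index entries with 3 levels each) with a length-4 array (one value per level of a 4-level index). The array names `codes` and `offsets` are made up for the example.

```python
import numpy as np

# Hypothetical stand-ins (assumption, not taken from pandas):
# 244 labels coming from a 3-level index, and one value per level
# of a 4-level index.
codes = np.zeros((244, 3), dtype="uint64")
offsets = np.zeros(4, dtype="uint64")

message = None
try:
    # In-place element-wise operation; the trailing axes (3 vs. 4)
    # differ, so NumPy cannot broadcast the operands.
    codes <<= offsets
except ValueError as exc:
    message = str(exc)
    print(message)
```

If that reading is right, the length 4 would match the four index levels of the 'transmission' frame shown further down (Site In, Site Out, Transmission, Commodity), while the 3 would belong to a three-level index that is already in the Dataset.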
Edit: the prob._data dictionary looks as follows:
{'global_prop': value description
Property
CO2 limit 150000000 Limits the sum of all created (as calculated b..., 'site': area
Name
Mid 280000000
South 5000000000,
'commodity': price max maxperhour
Site Commodity Type
Mid Biomass Stock 6.0 inf inf
CO2 Env 0.0 inf inf
Coal Stock 7.0 inf inf
Elec Demand NaN NaN NaN
Gas Stock 27.0 inf inf
Hydro SupIm NaN NaN NaN
Lignite Stock 4.0 inf inf
Slack Stock 999.0 inf inf
Solar SupIm NaN NaN NaN
Wind SupIm NaN NaN NaN
South Biomass Stock 6.0 inf inf
CO2 Env 0.0 inf inf
Coal Stock 7.0 inf inf
Elec Demand NaN NaN NaN
Elec buy Buy 1.0 inf inf
Elec sell Sell 3.0 inf inf
Gas Stock 27.0 inf inf
Hydro SupIm NaN NaN NaN
Lignite Stock 4.0 inf inf
Slack Stock 999.0 inf inf
Solar SupIm NaN NaN NaN
Wind SupIm NaN NaN NaN,
'process': inst-cap cap-lo cap-up max-grad min-fraction inv-cost fix-cost var-cost wacc depreciation area-per-cap annuity-factor
Site Process
Mid Biomass plant 0 0 5000 1.200000 0.00 875000 28000 1.40 0.07 25 NaN 0.085811
Gas plant 0 0 80000 4.800000 0.25 450000 6000 1.62 0.07 30 NaN 0.080586
Hydro plant 0 0 1400 inf 0.00 1600000 20000 0.00 0.07 50 NaN 0.072460
Lignite plant 0 0 60000 0.900000 0.65 600000 18000 0.60 0.07 40 NaN 0.075009
Photovoltaics 0 15000 160000 inf 0.00 600000 12000 0.00 0.07 25 14000.0 0.085811
Slack powerplant 999999 999999 999999 inf 0.00 0 0 100.00 0.07 1 NaN 1.070000
Wind park 0 0 13000 inf 0.00 1500000 30000 0.00 0.07 25 NaN 0.085811
South Biomass plant 0 0 2000 1.200000 0.00 875000 28000 1.40 0.07 25 NaN 0.085811
Coal plant 0 0 100000 0.600000 0.50 600000 18000 0.60 0.07 40 NaN 0.075009
Feed-in 0 0 1500 inf 0.00 0 0 0.00 0.07 1 NaN 1.070000
Gas plant 0 0 100000 4.800000 0.25 450000 6000 1.62 0.07 30 NaN 0.080586
Hydro plant 0 0 0 inf 0.00 1600000 20000 0.00 0.07 50 NaN 0.072460
Photovoltaics 0 20000 600000 inf 0.00 600000 12000 0.00 0.07 25 14000.0 0.085811
Purchase 0 0 1500 inf 0.00 0 80 0.00 0.07 1 NaN 1.070000
Slack powerplant 999999 999999 999999 inf 0.00 0 0 999.00 0.07 1 NaN 1.070000
Wind park 0 0 200000 inf 0.00 1500000 30000 0.00 0.07 25 NaN 0.085811,
'process_commodity': ratio ratio-min
Process Commodity Direction
Biomass plant Biomass In 1.0000 NaN
CO2 Out 0.0000 NaN
Elec Out 0.3500 NaN
Coal plant CO2 Out 0.3000 NaN
Coal In 1.0000 1.4
Elec Out 0.4000 NaN
Feed-in Elec In 1.0000 NaN
Elec sell Out 1.0000 NaN
Gas plant CO2 Out 0.2000 NaN
Elec Out 0.6000 NaN
Gas In 1.0000 1.2
Hydro plant Elec Out 1.0000 NaN
Hydro In 1.0000 NaN
Lignite plant CO2 Out 0.4000 NaN
Elec Out 0.4000 NaN
Lignite In 1.0000 2.0
Photovoltaics Elec Out 1.0000 NaN
Solar In 1.0000 NaN
Purchase CO2 Out 0.0005 NaN
Elec Out 1.0000 NaN
Elec buy In 1.0000 NaN
Slack powerplant CO2 Out 0.0000 NaN
Elec Out 1.0000 NaN
Slack In 1.0000 NaN
Wind park Elec Out 1.0000 NaN
Wind In 1.0000 NaN,
'transmission': eff inv-cost fix-cost var-cost inst-cap cap-lo cap-up wacc depreciation annuity-factor
Site In Site Out Transmission Commodity
Mid South hvac Elec 0.9 1650000 16500 0 0 0 inf 0.07 40 0.075009
South Mid hvac Elec 0.9 1650000 16500 0 0 0 inf 0.07 40 0.075009,
'storage': inst-cap-c cap-lo-c cap-up-c inst-cap-p cap-lo-p cap-up-p eff-in eff-out inv-cost-p inv-cost-c fix-cost-p fix-cost-c var-cost-p var-cost-c wacc depreciation init discharge annuity-factor
Site Storage Commodity
Mid Hydrogen Elec 0 0 inf 0 0 inf 0.64 0.64 42000 6.54 0 0.327 0.02 0 0.07 50 0.5 0.000003 0.07246
Pump storage Elec 0 60000 inf 0 8000 inf 0.94 0.94 100000 0.00 20000 0.000 0.02 0 0.07 50 0.5 0.000000 0.07246
South Hydrogen Elec 0 0 inf 0 0 inf 0.64 0.64 42000 6.54 0 0.327 0.02 0 0.07 50 0.5 0.000003 0.07246
Pump storage Elec 0 163000 inf 0 500 inf 0.94 0.94 100000 0.00 20000 0.000 0.02 0 0.07 50 0.5 0.000000 0.07246,
'demand': Mid South North
Elec Elec Elec
t
0 0.000000 0.00000 0.00000
1 43102.490062 4877.39981 11001.19176,
'supim': Mid South North
Wind Solar Hydro Wind Solar Hydro Wind Solar Hydro
t
0 0.000000 0 0.000000 0.000000 0 0.000000 0.000000 0 0.000000
1 0.935265 0 0.416194 0.457772 0 0.353497 0.602583 0 0.651799,
'buy_sell_price': Elec buy Elec sell
t
0 0.00 0.00000
1 0.08 -0.02106,
'dsm': Empty DataFrame
Columns: [delay, eff, recov, cap-max-do, cap-max-up]
Index: []}
where ['global_prop', 'site', 'commodity', 'process', 'process_commodity', 'transmission', 'storage', 'demand', 'supim', 'buy_sell_price', 'dsm'] is the list of the dict keys that I iterate over to build the xarray Dataset (which I then want to save to a NetCDF file). To be specific, I get the mentioned error at the step name='transmission'.
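To see which frames carry which kind of index before they are assigned to the Dataset, a small diagnostic along the following lines could be used. `describe_frames` is a hypothetical helper (not part of urbs or pandas), and the two frames are cut-down stand-ins for the real 'storage' and 'transmission' entries.

```python
import pandas as pd

def describe_frames(frames):
    """Summarise index level names, column level count and shape
    for each DataFrame in a dict (hypothetical helper)."""
    summary = {}
    for name, df in frames.items():
        summary[name] = {
            "index_levels": list(df.index.names),
            "column_levels": df.columns.nlevels,
            "shape": df.shape,
        }
    return summary

# Cut-down stand-in for the 'storage' frame (3 index levels).
storage = pd.DataFrame(
    {"eff-in": [0.64, 0.94]},
    index=pd.MultiIndex.from_tuples(
        [("Mid", "Hydrogen", "Elec"), ("Mid", "Pump storage", "Elec")],
        names=["Site", "Storage", "Commodity"],
    ),
)

# Cut-down stand-in for the 'transmission' frame (4 index levels).
transmission = pd.DataFrame(
    {"eff": [0.9, 0.9]},
    index=pd.MultiIndex.from_tuples(
        [("Mid", "South", "hvac", "Elec"), ("South", "Mid", "hvac", "Elec")],
        names=["Site In", "Site Out", "Transmission", "Commodity"],
    ),
)

summary = describe_frames({"storage": storage, "transmission": transmission})
for name, info in summary.items():
    print(name, info)
```

On the real model this would be called as describe_frames(prob._data), which should show at a glance which frames have a different number of index levels than the ones assigned before them.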