
I am running into problems while migrating from HDF5 to the NetCDF data format for storing a dictionary of pandas DataFrames, which hold the data and results of a pyomo model.

The current HDF5 saving script, which works without a problem, is as follows:

import pandas as pd
def save(prob, filename):
    with pd.HDFStore(filename, mode='w') as store:
        for name in prob._data.keys():
            store['data/'+name] = prob._data[name]
        for name in prob._result.keys():
            store['result/'+name] = prob._result[name]

where prob is a solved pyomo model instance.

We are migrating our project to PyPy for runtime reasons. Since PyPy does not support h5py at the moment, we also want to switch from HDF5 to NetCDF for storing our model instances.

For this I use xarray Datasets, which seem to be compatible with the NetCDF format:

import xarray as xr
def save(prob, filename):
    ds = xr.Dataset()
    for name in prob._data.keys():
        ds['data/'+name] = prob._data[name]
    for name in prob._result.keys():
        ds['result/'+name] = prob._result[name]
    ds.to_netcdf(filename)

Although this looks quite analogous to the preceding HDF5 script, it raises the following error:

  urbs.save(prob, os.path.join(result_dir, '{}.nc'.format(sce)))
File "/home/scandas/nas/pypy_for_asinus/urbs_pypy/urbs/saveload.py", line 63, in save
  ds['data/'+name] = prob._data[name]
File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataset.py", line 899, in __setitem__
  self.update({key: value})
File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataset.py", line 2305, in update 
  variables, coord_names, dims = dataset_update_method(self, other)
File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/merge.py", line 580, in dataset_update_method
  indexes=dataset.indexes)
File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/merge.py", line 434, in merge_core
  aligned = deep_align(coerced, join=join, copy=False, indexes=indexes)
File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/alignment.py", line 213, in deep_align
  exclude=exclude)
File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/alignment.py", line 164, in align
  new_obj = obj.reindex(copy=copy, **valid_indexers)
File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataarray.py", line 906, in reindex
  indexers=indexers, method=method, tolerance=tolerance, copy=copy)
File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/dataset.py", line 1812, in reindex
  tolerance, copy=copy)
File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/alignment.py", line 324, in reindex_variables
  int_indexer = get_indexer_nd(index, target, method, tolerance)
File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/xarray/core/indexing.py", line 117, in get_indexer_nd
  flat_indexer = index.get_indexer(flat_labels, **kwargs)
File "/home/scandas/nas/pypy_for_asinus/pypy3-v6.0.0-linux64/site-packages/pandas/core/indexes/multi.py", line 2042, in get_indexer
  indexer = self._engine.get_indexer(target)
File "pandas/_libs/index.pyx", line 654, in pandas._libs.index.BaseMultiIndexCodesEngine.get_indexer
ValueError: operands could not be broadcast together with shapes (244,3) (4,) (244,3)

It looks as though, for some keys in prob._data, there is a shape mismatch (during concatenation?) that raises an error when the entries are assigned to the xarray Dataset. However, the HDF5 version, with analogous assignments, runs without any error.

Edit: prob._data dictionary looks as follows:

{'global_prop':                value                                        description
Property                                                               
CO2 limit  150000000  Limits the sum of all created (as calculated b..., 'site':              area
Name             
Mid     280000000
South  5000000000, 
'commodity':                         price  max  maxperhour
Site  Commodity Type                          
Mid   Biomass   Stock     6.0  inf         inf
      CO2       Env       0.0  inf         inf
      Coal      Stock     7.0  inf         inf
      Elec      Demand    NaN  NaN         NaN
      Gas       Stock    27.0  inf         inf
      Hydro     SupIm     NaN  NaN         NaN
      Lignite   Stock     4.0  inf         inf
      Slack     Stock   999.0  inf         inf
      Solar     SupIm     NaN  NaN         NaN
      Wind      SupIm     NaN  NaN         NaN
South Biomass   Stock     6.0  inf         inf
      CO2       Env       0.0  inf         inf
      Coal      Stock     7.0  inf         inf
      Elec      Demand    NaN  NaN         NaN
      Elec buy  Buy       1.0  inf         inf
      Elec sell Sell      3.0  inf         inf
      Gas       Stock    27.0  inf         inf
      Hydro     SupIm     NaN  NaN         NaN
      Lignite   Stock     4.0  inf         inf
      Slack     Stock   999.0  inf         inf
      Solar     SupIm     NaN  NaN         NaN
      Wind      SupIm     NaN  NaN         NaN, 
'process':              inst-cap  cap-lo  cap-up  max-grad  min-fraction  inv-cost  fix-cost  var-cost  wacc  depreciation  area-per-cap  annuity-factor
Site  Process                                                                                                                                           
Mid   Biomass plant            0       0    5000  1.200000          0.00    875000     28000      1.40  0.07            25           NaN        0.085811
      Gas plant                0       0   80000  4.800000          0.25    450000      6000      1.62  0.07            30           NaN        0.080586
      Hydro plant              0       0    1400       inf          0.00   1600000     20000      0.00  0.07            50           NaN        0.072460
      Lignite plant            0       0   60000  0.900000          0.65    600000     18000      0.60  0.07            40           NaN        0.075009
      Photovoltaics            0   15000  160000       inf          0.00    600000     12000      0.00  0.07            25       14000.0        0.085811
      Slack powerplant    999999  999999  999999       inf          0.00         0         0    100.00  0.07             1           NaN        1.070000
      Wind park                0       0   13000       inf          0.00   1500000     30000      0.00  0.07            25           NaN        0.085811
South Biomass plant            0       0    2000  1.200000          0.00    875000     28000      1.40  0.07            25           NaN        0.085811
      Coal plant               0       0  100000  0.600000          0.50    600000     18000      0.60  0.07            40           NaN        0.075009
      Feed-in                  0       0    1500       inf          0.00         0         0      0.00  0.07             1           NaN        1.070000
      Gas plant                0       0  100000  4.800000          0.25    450000      6000      1.62  0.07            30           NaN        0.080586
      Hydro plant              0       0       0       inf          0.00   1600000     20000      0.00  0.07            50           NaN        0.072460
      Photovoltaics            0   20000  600000       inf          0.00    600000     12000      0.00  0.07            25       14000.0        0.085811
      Purchase                 0       0    1500       inf          0.00         0        80      0.00  0.07             1           NaN        1.070000
      Slack powerplant    999999  999999  999999       inf          0.00         0         0    999.00  0.07             1           NaN        1.070000
      Wind park                0       0  200000       inf          0.00   1500000     30000      0.00  0.07            25           NaN        0.085811, 
'process_commodity':                   ratio  ratio-min
Process          Commodity Direction                   
Biomass plant    Biomass   In         1.0000        NaN
                 CO2       Out        0.0000        NaN
                 Elec      Out        0.3500        NaN
Coal plant       CO2       Out        0.3000        NaN
                 Coal      In         1.0000        1.4
                 Elec      Out        0.4000        NaN
Feed-in          Elec      In         1.0000        NaN
                 Elec sell Out        1.0000        NaN
Gas plant        CO2       Out        0.2000        NaN
                 Elec      Out        0.6000        NaN
                 Gas       In         1.0000        1.2
Hydro plant      Elec      Out        1.0000        NaN
                 Hydro     In         1.0000        NaN
Lignite plant    CO2       Out        0.4000        NaN
                 Elec      Out        0.4000        NaN
                 Lignite   In         1.0000        2.0
Photovoltaics    Elec      Out        1.0000        NaN
                 Solar     In         1.0000        NaN
Purchase         CO2       Out        0.0005        NaN
                 Elec      Out        1.0000        NaN
                 Elec buy  In         1.0000        NaN
Slack powerplant CO2       Out        0.0000        NaN
                 Elec      Out        1.0000        NaN
                 Slack     In         1.0000        NaN
Wind park        Elec      Out        1.0000        NaN
                 Wind      In         1.0000        NaN, 
'transmission':                          eff  inv-cost  fix-cost  var-cost  inst-cap  cap-lo  cap-up  wacc  depreciation  annuity-factor
Site In Site Out Transmission Commodity                                                                                                 
Mid     South    hvac         Elec       0.9   1650000     16500         0         0       0     inf  0.07            40        0.075009
South   Mid      hvac         Elec       0.9   1650000     16500         0         0       0     inf  0.07            40        0.075009, 
'storage':                    inst-cap-c  cap-lo-c  cap-up-c  inst-cap-p  cap-lo-p  cap-up-p  eff-in  eff-out  inv-cost-p  inv-cost-c  fix-cost-p  fix-cost-c  var-cost-p  var-cost-c  wacc  depreciation  init  discharge  annuity-factor
Site  Storage      Commodity                                                                                                                                                                                                              
Mid   Hydrogen     Elec                0         0       inf           0         0       inf    0.64     0.64       42000        6.54           0       0.327        0.02           0  0.07            50   0.5   0.000003         0.07246
      Pump storage Elec                0     60000       inf           0      8000       inf    0.94     0.94      100000        0.00       20000       0.000        0.02           0  0.07            50   0.5   0.000000         0.07246
South Hydrogen     Elec                0         0       inf           0         0       inf    0.64     0.64       42000        6.54           0       0.327        0.02           0  0.07            50   0.5   0.000003         0.07246
      Pump storage Elec                0    163000       inf           0       500       inf    0.94     0.94      100000        0.00       20000       0.000        0.02           0  0.07            50   0.5   0.000000         0.07246, 
'demand':   Mid       South        North
           Elec        Elec         Elec
t                                       
0      0.000000     0.00000      0.00000
1  43102.490062  4877.39981  11001.19176,
'supim':                Mid                     South                     North                
       Wind Solar     Hydro      Wind Solar     Hydro      Wind Solar     Hydro
t                                                                              
0  0.000000     0  0.000000  0.000000     0  0.000000  0.000000     0  0.000000
1  0.935265     0  0.416194  0.457772     0  0.353497  0.602583     0  0.651799, 
'buy_sell_price':   Elec buy Elec sell
t                   
0                       0.00   0.00000
1                       0.08  -0.02106
'dsm': Empty DataFrame
Columns: [delay, eff, recov, cap-max-do, cap-max-up]
Index: []}

where ['global_prop', 'site', 'commodity', 'process', 'process_commodity', 'transmission', 'storage', 'demand', 'supim', 'buy_sell_price', 'dsm'] is the list of dict keys over which I iterate to build the xarray Dataset (which I then want to save to a NetCDF file). Specifically, the error occurs at the step name='transmission'.
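One thing visible in the printout above is that 'transmission' is indexed by four levels (Site In, Site Out, Transmission, Commodity), whereas 'commodity', 'process_commodity' and 'storage' each have three. Whether that difference is what triggers the broadcast error is only a guess on my part, but the index depth of every frame is easy to list. The sketch below uses small toy stand-ins for two of the frames; the helper name index_summary is hypothetical, and with the real model one would pass prob._data instead of the toy dict.

```python
import pandas as pd


def index_summary(frames):
    """Return {name: (number of index levels, level names, row count)}."""
    summary = {}
    for name, df in frames.items():
        idx = df.index
        summary[name] = (idx.nlevels, list(idx.names), len(df))
    return summary


# Toy stand-ins mirroring the index layout shown in the question.
commodity = pd.DataFrame(
    {"price": [6.0, 0.0]},
    index=pd.MultiIndex.from_tuples(
        [("Mid", "Biomass", "Stock"), ("Mid", "CO2", "Env")],
        names=["Site", "Commodity", "Type"],
    ),
)
transmission = pd.DataFrame(
    {"eff": [0.9, 0.9]},
    index=pd.MultiIndex.from_tuples(
        [("Mid", "South", "hvac", "Elec"), ("South", "Mid", "hvac", "Elec")],
        names=["Site In", "Site Out", "Transmission", "Commodity"],
    ),
)

frames = {"commodity": commodity, "transmission": transmission}
summary = index_summary(frames)

# Print one line per frame so differing index depths stand out.
for name, (nlevels, names, nrows) in summary.items():
    print(name, nlevels, names, nrows)
```

If the frames that were assigned before 'transmission' all report three levels and 'transmission' reports four, that would at least be consistent with the (244,3) and (4,) shapes in the traceback.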

scandas
  • Could you kindly share what `prob._data[name]` and `prob._result[name]` look like for your data? (I'm familiar with xarray but not pyomo.) – shoyer Jul 03 '18 at 17:15
  • I have included the contents of `prob._data` in my question. – scandas Jul 04 '18 at 14:38
  • OK, this is still a little tricky to debug in the abstract. But please file a bug report (on the xarray GitHub page) if you can come up with a minimal example: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports – shoyer Jul 06 '18 at 21:35
