
I am trying to develop a model using PLSR (Partial Least Squares Regression) in Python 3, using the code provided at https://github.com/pgbrodrick/ensemblePLSR. Sample data is also provided there.

When I try to run the code, it gives me an error:

$ python3 ensemble_plsr.py example_settings.txt

I am using Python 3.7.3 with the modules scikit-learn 0.20.2 and pandas 0.23.3. The full output is:

/usr/lib/python3/dist-packages/sklearn/externals/joblib.py:1: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
n bad bands 57
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pandas/core/indexes/base.py", line 3078, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: -1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ensemble_plsr.py", line 173, in <module>
    df.pop(col)
  File "/usr/lib/python3/dist-packages/pandas/core/generic.py", line 760, in pop
    result = self[item]
  File "/usr/lib/python3/dist-packages/pandas/core/frame.py", line 2688, in __getitem__
    return self._getitem_column(key)
  File "/usr/lib/python3/dist-packages/pandas/core/frame.py", line 2695, in _getitem_column
    return self._get_item_cache(key)
  File "/usr/lib/python3/dist-packages/pandas/core/generic.py", line 2491, in _get_item_cache
    values = self._data.get(item)
  File "/usr/lib/python3/dist-packages/pandas/core/internals.py", line 4115, in get
    loc = self.items.get_loc(item)
  File "/usr/lib/python3/dist-packages/pandas/core/indexes/base.py", line 3080, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: -1
MGD

1 Answer


In short, line 173 of ensemble_plsr.py tries to remove a column that does not exist in the DataFrame, hence the KeyError. Under the hood, df.pop(col) looks the column up by label before removing it; the traceback shows that the label was -1, and since no column has that name, pandas raises KeyError: -1. There are different ways to resolve this, but the following will fix the error you are seeing:

Replace lines 172 and 173 of ensemble_plsr.py with the following:

for col in sf.get_setting('ignore columns'):
    if col != 'nothing_here':
        df.pop(col)

Replace line 16 of example_settings.txt with the following:

ignore columns(any other columns to remove) = nothing_here
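To see why the guard helps, here is a minimal sketch on a toy DataFrame. The column names and the `ignore_columns` list are made up for illustration; in the real script the list comes from `sf.get_setting('ignore columns')`, and I am assuming it returns the placeholder as a plain string.

```python
import pandas as pd

# Toy stand-in for the real data; these column names are hypothetical.
df = pd.DataFrame({"band_1": [0.1, 0.2], "band_2": [0.3, 0.4], "notes": ["a", "b"]})

# Popping a label that is not a column raises KeyError, as in the traceback.
try:
    df.pop(-1)
    raised = False
except KeyError:
    raised = True

# Guarded loop: skip the placeholder value, pop every other listed column.
ignore_columns = ["nothing_here", "notes"]  # assumed shape of the parsed setting
for col in ignore_columns:
    if col != "nothing_here":
        df.pop(col)

remaining = list(df.columns)
print(raised, remaining)
```

The placeholder never reaches `df.pop`, so the loop only touches columns you actually named.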

Good news, you're done with this issue. Bad news, you're going to hit the next error down the line but you're on your way!
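If you would rather not edit the settings file, one alternative (my own suggestion, not something the repository does) is to let pandas skip labels that are not present, using `DataFrame.drop` with `errors="ignore"`. The helper name `drop_ignored` below is hypothetical.

```python
import pandas as pd


def drop_ignored(df, ignore_columns):
    """Return a copy of df without those ignore_columns that actually exist."""
    # errors="ignore" makes pandas skip labels that are not columns
    # instead of raising KeyError.
    return df.drop(columns=list(ignore_columns), errors="ignore")


# Toy data with hypothetical column names.
df = pd.DataFrame({"band_1": [0.1, 0.2], "notes": ["a", "b"]})
cleaned = drop_ignored(df, [-1, "notes"])
print(list(cleaned.columns))
```

Unlike `pop`, `drop` returns a new DataFrame, so in the script you would assign the result back to `df`. The trade-off is that a misspelled column name is skipped silently rather than reported.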

benfarr
This solution worked, and yes, I am now stuck at a new error. I'm not a programmer, so it will take me a long time to correct these errors. – MGD Mar 12 '20 at 03:55