When using scikit-learn's GroupKFold, I get an error message that I cannot understand given the documentation.

The error message is:

ValueError: too many values to unpack (expected 2)

The documentation states:

[image: screenshot of the GroupKFold documentation]

For a reproducible example:

import numpy as np
from sklearn.model_selection import GroupKFold
X1 = np.random.randint(1, 10, size = (100, 2))

groups1 = np.random.choice([1,2,3, 4, 5], size = 100, p = [.15, .2, .3, .15, .2])


gkf1 = GroupKFold(5)

train, test = gkf1.split(X = X1, groups = groups1 )

Which yields the following error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-56-911681dea183> in <module>
      8 gkf1 = GroupKFold(5)
      9 
---> 10 train, test = gkf1.split(X = X1, groups = groups1 )

ValueError: too many values to unpack (expected 2)
halfer
user8270077

1 Answer

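The split method returns a generator, not a single (train, test) pair. With GroupKFold(5) it yields five (train, test) tuples of index arrays, one per fold, so unpacking it into two variables fails with "too many values to unpack". You have to iterate over the generator to get the train and test indices for each fold.

As in the example, loop over it:

for train_index, test_index in gkf1.split(X1, groups=groups1):
    print(train_index, test_index)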
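For completeness, here is a minimal sketch of the two usual ways to consume the generator: collecting every fold with list(), or taking only the first fold with next(). It reuses the names X1, groups1 and gkf1 from the question. The seeded generator and the evenly assigned groups are assumptions made here so that the run is repeatable; they are not part of the original code.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Assumption: seeded data and evenly assigned groups, for repeatability only
rng = np.random.default_rng(0)
X1 = rng.integers(1, 10, size=(100, 2))
groups1 = np.arange(100) % 5 + 1  # group labels 1..5, 20 samples each

gkf1 = GroupKFold(n_splits=5)

# split() is a generator; list() collects all (train, test) index pairs
folds = list(gkf1.split(X1, groups=groups1))

for fold, (train_index, test_index) in enumerate(folds):
    # A group never appears on both sides of the same fold
    assert set(groups1[train_index]).isdisjoint(groups1[test_index])
    print(fold, len(train_index), len(test_index))

# If only a single split is needed, take the first pair from a fresh generator
train, test = next(gkf1.split(X1, groups=groups1))
```

The original line train, test = gkf1.split(...) would only work if the generator produced exactly two items, which is not the case with five splits.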
abhilb