
Please read the given problem.

You need to use subsets of the original cats_vs_dogs data, which is entirely in the 'train' split. That is, 'train' contains 25,000 records, of which 1,738 are corrupted images, so in total you have 23,262 images.

You will split it up to get

  • The first 10% is the new training set
  • The last 10% is the new validation and test sets, split down the middle (i.e. the first half of the last 10% is validation and the second half is test, 5% each)

These 3 recordsets should be called train_examples, validation_examples and test_examples respectively.

Note: Remember to use cats_vs_dogs:4.*.* as the dataset, because 4.0 supports the new Splits API.


I wrote the corresponding code as follows:

splits = ['train[:10%]', 'train[-10% :-5%]', 'train[-5%:]']

splits, info = tfds.load('cats_vs_dogs:4.*.*', split=splits, data_dir=filePath, with_info=True)

(train_examples, validation_examples, test_examples) = splits
    
train_len = len(list(train_examples))
validation_len = len(list(validation_examples))
test_len = len(list(test_examples))
print(train_len)
print(validation_len)
print(test_len)

I ran the above code and got the following error.

AssertionError: Unrecognized instruction format: train[-10% :-5%]

Please help me write the proper split.

Nicolas Gervais
Fawad

1 Answer


Just remove the space there:

'train[-10% :-5%]'
           ^
       remove me

Corrected:

'train[-10%:-5%]'
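
For completeness, here is a minimal sketch of the whole load with the corrected split strings. The helper normalize_split and the wrapper load_subsets are hypothetical names added for illustration, not part of TFDS; the sketch only assumes that tfds.load accepts a list of split strings and returns the datasets in the same order, as in the question's code.

```python
def normalize_split(spec: str) -> str:
    """Strip all whitespace from a split spec such as 'train[-10% :-5%]'.

    Hypothetical helper (not a TFDS function): the parser in the question
    rejected the spec because of the space before the colon.
    """
    return "".join(spec.split())


# The three specs from the question, with stray whitespace removed.
SPLITS = [
    normalize_split(s)
    for s in ("train[:10%]", "train[-10% :-5%]", "train[-5%:]")
]


def load_subsets(data_dir=None):
    """Load the train/validation/test subsets (hypothetical wrapper)."""
    # Imported lazily so normalize_split can be used without TFDS installed.
    import tensorflow_datasets as tfds

    (train_examples, validation_examples, test_examples), info = tfds.load(
        "cats_vs_dogs:4.*.*",
        split=SPLITS,
        data_dir=data_dir,
        with_info=True,
    )
    return train_examples, validation_examples, test_examples, info
```

Calling load_subsets(filePath) should then give the three datasets without the AssertionError, assuming the dataset is available under that directory.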
Nicolas Gervais