TypeError: Trying to split data randomly in training and test set

Question

I want to take the first 70% of my shuffled data as training data and the rest as test data, but I get the error shown below.

I have looked at other code examples with that error but don't get it, sorry.

import numpy as np

segment_relative_path = ["a", "b", "c", "d", "e", "f"]
idx = np.random.permutation(len(segment_relative_path))
train_data = segment_relative_path[idx[:int(0.7*len(idx))]]

This gives:

TypeError: only integer scalar arrays can be converted to a scalar index.

What do I have to change to avoid that error?

This will work - your code has a syntax error. `train_data = segment_relative_path[:int(.7*len(idx))]` — rhn89, Apr 12 '19 at 19:02
Thanks for your answer, but what I basically want to do is this: https://stackoverflow.com/questions/43229034/randomly-shuffle-data-and-labels-from-different-files-in-the-same-order — Anno, Apr 12 '19 at 19:25
The answer you mentioned is just randomizing the data [independent and dependent variables], not splitting it into train/test. — rhn89, Apr 12 '19 at 19:41
Yes, I want to randomize the data and then use the first 70% of the randomized data for the training set. — Anno, Apr 12 '19 at 19:45

score 0 · Answer 1 · answered Apr 13 '19 at 03:02

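You are indexing a Python list with a NumPy index array. A list accepts only a single integer or a slice as an index, so Python asks NumPy to convert the array to one scalar index, which fails because the array has more than one element. If you convert segment_relative_path into a NumPy array first, indexing with an index array works: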

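import numpy as np

segment_relative_path = ["a", "b", "c", "d", "e", "f"]
# Random ordering of the positions 0..len-1
idx = np.random.permutation(len(segment_relative_path))
# Index the array (not the list) with the first 70% of the shuffled positions
train_data = np.array(segment_relative_path)[idx[:int(0.7*len(idx))]]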
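
For completeness, here is a minimal sketch of the full 70/30 split, since the question also asks for the test set. It is only one way to do it. The names rng, split, train_idx, test_idx, paths, train_list and test_list are my own, and the seed 0 is an arbitrary choice to make the shuffle reproducible. The second option shows how to stay with plain lists if you would rather not convert to an array.

```python
import numpy as np

segment_relative_path = ["a", "b", "c", "d", "e", "f"]

# Seeded generator so the shuffle is reproducible (seed value is arbitrary).
rng = np.random.default_rng(0)
idx = rng.permutation(len(segment_relative_path))

# Position at which the shuffled indices are cut into train and test parts.
split = int(0.7 * len(idx))
train_idx, test_idx = idx[:split], idx[split:]

# Option 1: convert to an array and index it with the index arrays.
paths = np.array(segment_relative_path)
train_data = paths[train_idx]
test_data = paths[test_idx]

# Option 2: keep plain lists and pick the elements one integer at a time.
train_list = [segment_relative_path[i] for i in train_idx]
test_list = [segment_relative_path[i] for i in test_idx]
```

Because train_idx and test_idx are kept as separate variables, the same indices can be reused on a second list of equal length (for example labels), so data and labels stay in the same order.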

answered Apr 13 '19 at 03:02 by Amir Hajibabaei

That's it! Thank you! – Anno Apr 13 '19 at 09:44
