I am trying to train a YOLO network on my custom dataset. I have some images (*.jpg) and the matching labels/annotations in YOLO format, one txt file per image.
Now I want to split the data into a training set and a validation set. The result should be a train folder and a validation folder, each containing its own images and annotations.
I tried something like this:
from sklearn.model_selection import train_test_split
import glob
# Get all paths to the image files and the annotation (text) files
PATH = '../TrainingsData/'
img_paths = glob.glob(PATH+'*.jpg')
txt_paths = glob.glob(PATH+'*.txt')
X_train, X_test, y_train, y_test = train_test_split(img_paths, txt_paths, test_size=0.3, random_state=42)
After saving the sets to new folders, the images and annotations were mixed up. For example, some images in the train folder had no annotation (it had ended up in the validation folder), and some annotations were there without their image.
Can you help me split my dataset so that each image stays together with its annotation?
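One likely cause is that the two glob.glob calls do not guarantee the same file order, so train_test_split pairs up unrelated images and labels. Below is a minimal sketch of one way around this, assuming every label has the same base name as its image (e.g. img1.jpg and img1.txt) and lives in the same folder. The function name split_yolo_dataset, the output folder names train and val, and the choice to skip images without a label are assumptions made for illustration. It uses only the standard library; calling train_test_split on the list of pairs should work just as well.

```python
import random
import shutil
from pathlib import Path


def split_yolo_dataset(src_dir, dst_dir, val_fraction=0.3, seed=42):
    """Copy image/label pairs from src_dir into dst_dir/train and dst_dir/val."""
    src, dst = Path(src_dir), Path(dst_dir)

    # Build the pairs from the image list only, so each label path is derived
    # from its image name instead of relying on the order of two glob calls.
    pairs = []
    for img in sorted(src.glob("*.jpg")):
        label = img.with_suffix(".txt")
        if label.exists():  # assumption: images without a label are skipped
            pairs.append((img, label))

    # Shuffle the pairs as units, then cut the list once.
    random.Random(seed).shuffle(pairs)
    n_val = round(len(pairs) * val_fraction)
    splits = {"val": pairs[:n_val], "train": pairs[n_val:]}

    # Copy each image together with its label into the same target folder.
    for name, items in splits.items():
        out = dst / name
        out.mkdir(parents=True, exist_ok=True)
        for img, label in items:
            shutil.copy2(img, out / img.name)
            shutil.copy2(label, out / label.name)
    return splits
```

It could be called as split_yolo_dataset('../TrainingsData/', '../TrainingsDataSplit/'), where the second path is a hypothetical output folder. Because the files are copied rather than moved, the original folder stays untouched.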