I have a folder with 48 ECG records. Each record consists of a .dat signal file and a .atr annotation file. I want to split them into train and test sets to train an AI model. I will be using PyTorch and would like to know a simple way to do this in Python. I prefer a custom split, with a specific set of files in train and the rest in test.

E.g.: Train: ['101', '104', '107'], Test: ['102', '105', '106']

Thanks

Fathima

1 Answer

First, store the input and annotation (attribute) file locations in a Python dictionary, with the input (.dat) file name as the key and the annotation (.atr) file name as the value.

Then you can split the keys of the dictionary and use them as the input.

import random
from glob import glob

MainFolder = "<Your Folder Name>"

# Map each input (.dat) file to its annotation (.atr) file
Data = {}
for file in sorted(glob(MainFolder + "/*.dat")):
    At_file = file[:-3] + "atr"
    Data[file] = At_file

# Data now holds input and annotation file names as key/value pairs

# To split the data:
Key_data = list(Data)
random.shuffle(Key_data)

# Specify the train/test split ratio here
split = int(len(Key_data) * 0.8)

Train_in = Key_data[:split]
Test_in = Key_data[split:]
Train_at = [Data[i] for i in Train_in]
Test_at = [Data[i] for i in Test_in]

print(Train_in, Train_at, Test_in, Test_at)

Here Train_in holds the input files and Train_at holds the corresponding annotation files.

This should solve your problem. Comment if you get any errors implementing the above code.
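The question asks for a custom split by record name rather than a random ratio, so here is a minimal sketch of that variant. The function name `split_by_record` and its parameters are hypothetical, and it assumes every record has both a .dat and a .atr file with the same base name (for example 101.dat and 101.atr) in one folder.

```python
from pathlib import Path


def split_by_record(folder, train_ids, test_ids=None):
    """Split (.dat, .atr) file pairs into train and test lists by record ID.

    If test_ids is None, every record not listed in train_ids goes to test.
    """
    # The record ID is the file name without its extension, e.g. "101".
    pairs = {
        p.stem: (str(p), str(p.with_suffix(".atr")))
        for p in sorted(Path(folder).glob("*.dat"))
    }

    train_set = set(train_ids)
    if test_ids is None:
        test_ids = [rid for rid in pairs if rid not in train_set]

    # Fail early on typos in the ID lists or on leakage between the sets.
    unknown = [rid for rid in list(train_ids) + list(test_ids) if rid not in pairs]
    if unknown:
        raise FileNotFoundError(f"No .dat file found for records: {unknown}")
    overlap = train_set & set(test_ids)
    if overlap:
        raise ValueError(f"Records in both train and test: {sorted(overlap)}")

    train = [pairs[rid] for rid in train_ids]
    test = [pairs[rid] for rid in test_ids]
    return train, test
```

Calling `split_by_record(MainFolder, ['101', '104', '107'])` would put those three records in train and all remaining records in test. Passing `test_ids` explicitly restricts the test set to the listed records.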

Kalyan Reddy
  • Thank you, Kalyan. The code works. However, it only splits the .dat files, not the annotation files (.atr). This is the dataset I am trying to use: https://physionet.org/content/mitdb/1.0.0/ – Fathima May 04 '22 at 09:27
  • @Fathima You can get the annotation file name from the Data dictionary, e.g. Data[i], where i is an element of the Train or Test list. I think this answers your question. Do accept and upvote the answer if this solves your problem. – Kalyan Reddy May 04 '22 at 10:42
  • @Fathima See the edits: I now print both the input and annotation files. But I recommend using the Train_in list and, while you iterate over Train_in during training, using Data[i] to get the corresponding annotation file for each element. – Kalyan Reddy May 04 '22 at 12:11
  • @Fathima Do mark the answer as correct if it answers your question and upvote if you like it :)) – Kalyan Reddy May 04 '22 at 17:17
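The suggestion in the comments (iterate over Train_in and look up Data[i]) could be wrapped in a small container like the one below. This is only a sketch: the class name `PairedRecords` is hypothetical, the file paths are made-up placeholders, and it returns file paths rather than loaded signals. With PyTorch one would probably subclass `torch.utils.data.Dataset`, whose map-style datasets need the same `__len__` and `__getitem__` methods, and load the actual samples and annotations inside `__getitem__` (for PhysioNet formats, possibly with the wfdb package).

```python
class PairedRecords:
    """Index -> (input file, annotation file), looked up through a dict."""

    def __init__(self, inputs, lookup):
        self.inputs = list(inputs)  # e.g. Train_in
        self.lookup = lookup        # e.g. Data

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, idx):
        dat_file = self.inputs[idx]
        # Data[i] from the answer: the annotation file for this input file
        return dat_file, self.lookup[dat_file]


# Placeholder paths standing in for the real Data and Train_in
Data = {
    "mitdb/101.dat": "mitdb/101.atr",
    "mitdb/102.dat": "mitdb/102.atr",
    "mitdb/104.dat": "mitdb/104.atr",
}
Train_in = ["mitdb/104.dat", "mitdb/101.dat"]

train_records = PairedRecords(Train_in, Data)
for idx in range(len(train_records)):
    dat_file, atr_file = train_records[idx]
    print(dat_file, atr_file)
```

Keeping the lookup in one place means the train and test lists only ever hold the .dat names, so the two file types cannot drift out of sync.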