
I am trying to run GPT-2 on my local machine, because Google restricted my resources after I trained for too long in Colab.

However, I cannot see how to load the dataset. The original Colab notebook https://colab.research.google.com/drive/1VLG8e7YSEwypxU-noRNhsv5dW4NfTGce uses the command gpt2.copy_file_from_gdrive(), which I cannot use on my local machine.

In the GitHub repo https://github.com/minimaxir/gpt-2-simple the example simply passes the file name shakespeare.txt to the function gpt2.finetune and it works somehow, but this doesn't work for me.

Any help would be much appreciated.

Guy Coder
Hans Geber

1 Answer


If I read the example on GitHub correctly, it loads shakespeare.txt if the file is already present on the machine and downloads it if it isn't. For a local dataset, I simply put a .txt file in the same folder as the script and set file_name to the name of that file.

You should be able to remove the logic around if not os.path.isfile(file_name): since that check only triggers the download of shakespeare.txt and shouldn't be needed if you use a local file.
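
As a rough sketch of what that could look like, assuming gpt-2-simple and TensorFlow are installed locally: the helper names resolve_dataset and finetune_on_local_file are made up for this illustration, and the model size "124M" and steps=1000 are just the values used in the repo's README example, not requirements. The gpt2 calls used (download_gpt2, start_tf_sess, finetune) are the ones shown in that README.

```python
import os


def resolve_dataset(file_name, base_dir="."):
    """Return the absolute path of a local text file, or fail early if it is missing.

    Hypothetical helper: it replaces the download branch of the README example
    with a plain existence check on the local file.
    """
    path = os.path.abspath(os.path.join(base_dir, file_name))
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Dataset not found: {path}")
    return path


def finetune_on_local_file(file_name, model_name="124M", steps=1000):
    """Fine-tune GPT-2 on a local .txt file (hypothetical wrapper)."""
    # Imported here so the path check above can be used without TensorFlow.
    import gpt_2_simple as gpt2

    dataset = resolve_dataset(file_name)

    # Download the base model only if it is not already in ./models
    if not os.path.isdir(os.path.join("models", model_name)):
        gpt2.download_gpt2(model_name=model_name)

    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, dataset=dataset, model_name=model_name, steps=steps)
    return sess
```

With a file called my_data.txt (a placeholder name) next to the script, calling finetune_on_local_file("my_data.txt") should start training. If the file is somewhere else, passing a full path works too, and a wrong path fails immediately with a readable error instead of somewhere inside the training code.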

tdnvl