
I am trying to build a sequence-to-sequence model for translating German to English with PyTorch, in an online notebook such as a Kaggle notebook or Google Colab.

import torch
import torch.nn as nn
import torch.optim as optim
from torchtext.datasets import Multi30k
from torchtext.data import Field, BucketIterator
import numpy as np
import spacy
import random
from torch.utils.tensorboard import SummaryWriter  # to print to tensorboard

With the libraries imported, I load the spaCy models as below:

spacy_ger = spacy.load("de")
spacy_eng = spacy.load("en")

This raises the following error: OSError: [E050] Can't find model 'de'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

Every explanation I can find covers 'en', but not 'de'. Any help would be appreciated.

Specification:

Package : Version

  • spacy : 2.3.1
  • pytorch-crf : 0.7.0
  • torch : 1.5.1
  • torchnlp : 0.0.0.1
  • torchtext : 0.4.0
  • torchvision : 0.6.1
  • jupyter-tensorboard : 0.2.0
  • tensorboard : 2.2.2
  • tensorboard-plugin-wit : 1.7.0

Thanks in advance for helping.

2 Answers


After a whole month of trying other things and exploring issues and questions related to this topic, I found a way to do this:

  import spacy.cli 
  spacy.cli.download("en_core_web_md")

With this method you can download and then use any spaCy model, including the medium and large ones. Without the download, spacy.load fails when the model is not already installed in the notebook environment, which is typically the case for anything other than the small (sm) models in Google Colab, Kaggle notebooks, or other online notebooks.
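A minimal sketch of that download-then-load flow, applied to the German model from the question. The helper name load_spacy_model and its load_fn / download_fn parameters are assumptions made for illustration; by default it relies only on spacy.load and spacy.cli.download. It uses the full package name de_core_news_sm rather than the 'de' shortcut.

```python
import importlib


def load_spacy_model(name, load_fn=None, download_fn=None):
    """Try to load a spaCy model; download it once and retry if it is missing.

    load_fn and download_fn are hypothetical hooks (not part of spaCy) that
    default to spacy.load and spacy.cli.download.
    """
    if load_fn is None or download_fn is None:
        # Import spaCy lazily, only when the real functions are needed.
        import spacy
        import spacy.cli

        load_fn = load_fn or spacy.load
        download_fn = download_fn or spacy.cli.download

    try:
        return load_fn(name)
    except OSError:
        # The question's E050 error is an OSError: the model is not installed.
        download_fn(name)
        # Let Python notice packages installed while this process was running.
        importlib.invalidate_caches()
        return load_fn(name)
```

In the notebook this would be used as spacy_ger = load_spacy_model("de_core_news_sm") and spacy_eng = load_spacy_model("en_core_web_sm"). If the second load still raises OSError, restarting the notebook runtime after the download is usually enough.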

  • You can definitely load `md` and `lg` models in notebooks. You just need to restart your runtime after downloading, to make sure the new packages are registered correctly by Python. Or better yet: run your notebook in a virtual environment that you set up beforehand with the right packages installed. – Sofie VL Aug 19 '20 at 10:50
  • @Sofie VL Actually, I don't have an NVIDIA GPU on my PC, so I have no choice other than online notebooks. That's why I didn't use a virtual environment; otherwise it works fine on my machine. – simarpreetsingh.019 Aug 20 '20 at 11:18

The accepted answer did not work for me, and the question is about German, not English.

For German, you need to download the de files.

Run the following in a terminal:

python -m spacy download de

After downloading finishes, you should be able to use spacy.load("de") without any problems.

If you are using English, download the English files with:

python -m spacy download en
StuckInPhDNoMore
  • Well, actually you are right: I gave the example of how to download the package for English. In the same way, we can download any other language's package. For German, you can use !python -m spacy download de_core_news_md – simarpreetsingh.019 Dec 10 '21 at 06:27