I am working on a cross-cultural language study involving English and Indonesian participants.
For the English participants, I can successfully load the pre-trained word2vec vectors trained on the Google News corpus (file: GoogleNews-vectors-negative300.bin).
However, I cannot load the pre-trained vectors for the Indonesian language in the same way (file: id.bin, file source: https://github.com/Kyubyong/wordvectors).
Here is the working code:
import gensim
from gensim import models
from gensim.models import Word2Vec
import math
import sys
import warnings
warnings.filterwarnings(action='ignore', category=UserWarning, module='gensim')
model = gensim.models.word2vec.Word2Vec.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
Here is the code that does not work:
import gensim
from gensim import models
from gensim.models import Word2Vec
import math
import sys
import warnings
warnings.filterwarnings(action='ignore', category=UserWarning, module='gensim')
model = gensim.models.word2vec.Word2Vec.load_word2vec_format('id.bin', binary=True)
What is the correct way to load id.bin?
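One thing that may be worth trying, sketched below, is to fall back to gensim's native loader. This assumes id.bin was written with gensim's own save() rather than in the original word2vec C binary format, which would explain why load_word2vec_format rejects it. I have not confirmed that assumption, and gensim version differences may also matter. The helper name load_vectors and its loaders parameter are made up for illustration; KeyedVectors.load_word2vec_format and Word2Vec.load are the gensim calls I believe exist in gensim 1.0 and later.

```python
def load_vectors(path, loaders=None):
    """Try each loader in turn and return the first result that loads.

    `load_vectors` and `loaders` are hypothetical names for this sketch.
    """
    if loaders is None:
        # Imported lazily so the helper only needs gensim when no
        # custom loaders are supplied.
        from gensim.models import KeyedVectors, Word2Vec

        loaders = [
            # Original word2vec C binary format
            # (e.g. GoogleNews-vectors-negative300.bin).
            lambda p: KeyedVectors.load_word2vec_format(p, binary=True),
            # gensim's native format written by model.save()
            # (ASSUMED here to be the format of id.bin).
            lambda p: Word2Vec.load(p),
        ]

    errors = []
    for loader in loaders:
        try:
            loaded = loader(path)
        except Exception as exc:  # remember the failure, try the next format
            errors.append(exc)
            continue
        # A full Word2Vec model keeps its vectors under `.wv`;
        # a KeyedVectors object can be returned unchanged.
        return getattr(loaded, "wv", loaded)

    raise ValueError(f"could not load {path!r}: {errors}")
```

With this, load_vectors('GoogleNews-vectors-negative300.bin') and load_vectors('id.bin') would go through the same call, and the returned object would support lookups such as most_similar in either case if the assumption about the file format holds.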