I am trying to train skip-gram word embeddings using the example posted at https://github.com/nzw0301/keras-examples/blob/master/Skip-gram-with-NS.ipynb
on a GeForce GTX 1080 GPU, using the English Wikipedia corpus (~100M sentences).
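For reference, as far as I can tell the notebook builds the standard skip-gram-with-negative-sampling architecture: two embedding layers, a dot product of the target and context vectors, and a sigmoid output. A minimal sketch of that setup (the embedding dimension and variable names here are my own illustration, not copied from the notebook):

```python
from keras.layers import Input, Embedding, Dot, Reshape, Activation
from keras.models import Model

vocab_size = 50000  # matches the 50k vocabulary mentioned below
embed_dim = 128     # illustrative; not necessarily what the notebook uses

# One (target, context) pair per sample; the context word is either a true
# neighbour (label 1) or a negative sample from the vocabulary (label 0).
w_target = Input(shape=(1,), dtype='int32')
w_context = Input(shape=(1,), dtype='int32')

target_vec = Embedding(vocab_size, embed_dim)(w_target)    # (batch, 1, dim)
context_vec = Embedding(vocab_size, embed_dim)(w_context)  # (batch, 1, dim)

# Dot product of the two embeddings gives the logit for
# "is this a genuine (word, context) pair?"
similarity = Dot(axes=-1)([target_vec, context_vec])
similarity = Reshape((1,))(similarity)
output = Activation('sigmoid')(similarity)

model = Model(inputs=[w_target, w_context], outputs=output)
model.compile(loss='binary_crossentropy', optimizer='adam')

# Training feeds batches of index pairs plus 0/1 labels, e.g.:
# model.train_on_batch([target_ids, context_ids], labels)
```

I believe the (pair, label) data itself is generated with Keras' `skipgrams` preprocessing utility.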
The training time is extremely slow: an estimated ~27 days per epoch with a 50k vocabulary, which seems surprisingly slow for such a simple model. I am using CUDA 8 and cuDNN 5.1; the backend is TensorFlow 1.2.0 and I am on Keras 2.0.2.

Has anyone trained a skip-gram model with a Keras implementation before? Any thoughts on why the implementation above is so slow? I have made sure that preprocessing is not the major bottleneck.

Thanks,