
I'm trying to train GloVe (https://github.com/stanfordnlp/GloVe/blob/master/src/glove.c) on a pretty big dataset, the newest Wikipedia dump (a 22 GB text file). The vocabulary I'm training on has 1.7 million words. Every step before glove (vocab_count, cooccur, shuffle) runs smoothly without any memory error. (My RAM = 64 GB.)

However, when I run glove, I get "Segmentation fault (core dumped)":

aerin@capa:~/Desktop/GloVe/build$ ./glove -input-file cooccurrence.shuf.bin -vocab-file vocab.txt -save-file glove300 -iter 25 -gradsq-file gradsq -verbose 2 -vector-size 300 -threads 1 -alpha 0.75 -x-max 100.0 -eta 0.05 -binary 2 -model 2
TRAINING MODEL
Read 1939406304 lines.
Initializing parameters...done.
vector size: 300
vocab size: 1737888
x_max: 100.000000
alpha: 0.750000
Segmentation fault (core dumped)

I tried different numbers of threads as well: 1, 2, 4, 8, 16, 32, etc. Nothing runs. Can someone please point me to where to look?

Update

I cut the vocabulary from 1.7 million words down to 1 million and glove.c runs without the "segmentation fault" error, so it is a memory error. But I would love to learn how to resolve this error and be able to train a model on the larger dataset! Any comment will be highly valued. Thanks.
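For a rough sense of scale: glove.c's initialize_parameters() allocates two big arrays of doubles, the parameters `W` and their squared-gradient accumulators `gradsq`. Here is a minimal standalone sketch (not part of glove.c) to estimate that footprint, assuming each array holds 2 * vocab_size * (vector_size + 1) doubles, with the +1 being the bias term:

```c
/* Rough estimate of glove.c's two biggest allocations, W and gradsq,
 * under the assumption stated above. */
#include <stdio.h>

int main(void) {
    long long vocab_size  = 1737888;  /* vocab size printed in the log above */
    long long vector_size = 300;

    /* Done in long long on purpose: 32-bit int arithmetic would overflow
     * once the element count is multiplied by sizeof(double). */
    long long n_doubles = 2LL * vocab_size * (vector_size + 1);
    double gib = (double)n_doubles * sizeof(double)
                 / (1024.0 * 1024.0 * 1024.0);

    printf("per array : %lld doubles (~%.1f GiB)\n", n_doubles, gib);
    printf("W + gradsq: ~%.1f GiB\n", 2.0 * gib);
    return 0;
}
```

For vocab_size = 1,737,888 and vector_size = 300 this prints roughly 7.8 GiB per array, about 16 GiB total, which would fit comfortably in 64 GB. If that assumption about the allocation sizes is right, the segfault may be less about raw memory exhaustion and more about an unchecked allocation failure or an integer-overflow/indexing bug; the comments below probe both possibilities.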

  • You've given no code here, so it's anyone's guess as to why it blew up. – tadman Jul 11 '18 at 20:49
  • The code was hyperlinked but here it is again: https://github.com/stanfordnlp/GloVe/blob/master/src/glove.c – aerin Jul 11 '18 at 21:15
  • Try and keep your question self-contained. If the code is important, please include it in the question itself. – tadman Jul 11 '18 at 21:33
  • Obviously there is a limit to what `glove.c` can handle. As it is on GitHub, you may want to raise an issue there. It sounds kind of too broad for SO – you do *not* want to copy the entire source over here, nor – probably – share your giga data set. – Jongware Jul 11 '18 at 21:42
  • The file `glove.c`, when compiled, outputs MANY warnings about implicit conversions (i.e., the code does not compile cleanly). – user3629249 Jul 12 '18 at 02:52
  • OT: for ease of readability and understanding, follow the axiom: *only one statement per line and (at most) one variable declaration per statement.* – user3629249 Jul 12 '18 at 02:55
  • OT: when calling any of the heap allocation functions (`malloc`, `calloc`, `realloc`), always check (!= NULL) the returned value to assure the operation was successful (see the sketch after these comments). – user3629249 Jul 12 '18 at 02:56
  • Regarding `int num_rare_words = vocab_size < 100 ? vocab_size : 100; for (a = vocab_size - num_rare_words; a < vocab_size; a++)`: the index `a` (a terrible name for an index) is allowed to go all the way to `vocab_size`, but the code only handles (properly) up to the first 100 names. – user3629249 Jul 12 '18 at 03:14
  • Regarding `char *word = malloc(sizeof(char) * MAX_STRING_LENGTH + 1);`: the expression `sizeof(char)` is defined by the C standard as 1; multiplying anything by 1 has absolutely no effect and just clutters the code. Also, this is missing a check (!= NULL) of the returned value to assure the operation was successful. – user3629249 Jul 12 '18 at 03:27
  • What have you done to debug this code? Please show that effort. Suggest using a debugger, like `gdb`, to find out what the code actually does (this should have been done before posting the question). We don't have your data files, so it is unlikely that we can reproduce your problem. – user3629249 Jul 12 '18 at 03:28
  • OT: when an error indication is returned from a C library function, use `perror()` so that both your error message AND the OS's textual reason for the error go to `stderr`. – user3629249 Jul 12 '18 at 03:36
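Pulling the last few comments together, here is a minimal sketch of the pattern being recommended: drop the redundant `sizeof(char)`, check the allocation, and report failures through `perror()`. The constant's value here is illustrative (glove.c defines its own `MAX_STRING_LENGTH`); this shows the general idiom, not a patch to glove.c.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_STRING_LENGTH 1000  /* illustrative value; glove.c has its own */

int main(void) {
    /* sizeof(char) is 1 by definition, so it is dropped entirely */
    char *word = malloc(MAX_STRING_LENGTH + 1);
    if (word == NULL) {
        /* perror() appends the OS's reason, e.g. "Cannot allocate memory" */
        perror("malloc(word)");
        exit(EXIT_FAILURE);
    }

    /* ... use the buffer ... */

    free(word);
    return 0;
}
```

With allocations checked this way, an out-of-memory condition would fail loudly at the allocation site instead of surfacing later as a segfault on a NULL pointer.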

0 Answers