
I'm trying to convert a Mallet instance list file into a document-topics matrix in R. To do this, I read the instance list file into a topic trainer variable called 'topic.model'. Below is the function call I use to create the document-topics matrix:

theta <- mallet::mallet.doc.topics(topic.model, smoothed = TRUE, normalized = TRUE)

This works on a smaller instance list file (< 1 GB), but for a larger instance list (~15 GB) I get the following error:

Error in .jcall(wrapper, "[D", "flat_double") :
java.lang.NegativeArraySizeException
Calls: myfunc ... .jevalArray -> newArray -> structure -> .jcall -> .jcheck
Execution halted

I suspect there is an integer overflow somewhere: INT_MAX is exceeded, an array size becomes negative, and the NegativeArraySizeException is thrown. Interestingly, from the command line Mallet was able to write the document-topics file (> 150 GB) using the --output-doc-topics parameter. Any suggestions would be greatly appreciated.
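
To illustrate the suspicion, here is a rough sketch of the arithmetic. It assumes the wrapper returns the whole matrix as one flat Java double[] of length documents x topics (suggested by the "flat_double" call in the error, but not verified against the Mallet source). Java arrays are indexed by a 32-bit signed int, so the length cannot exceed 2^31 - 1. The helper names and the document/topic counts are hypothetical.

```r
# Hypothetical helper: would a docs x topics matrix fit in a single Java double[]?
fits_in_java_array <- function(n_docs, n_topics) {
  # Multiply as doubles so the product itself cannot overflow in R.
  n_cells <- as.numeric(n_docs) * as.numeric(n_topics)
  n_cells <= .Machine$integer.max  # 2^31 - 1, also Java's Integer.MAX_VALUE
}

# Hypothetical helper: emulate 32-bit signed wraparound of a value,
# as Java int arithmetic would produce (assumption: the size is computed as an int).
wrap_int32 <- function(x) {
  r <- x %% 2^32
  if (r >= 2^31) r - 2^32 else r
}

# Made-up sizes, for illustration only.
small_ok  <- fits_in_java_array(1e6, 100)  # 1e8 cells
large_ok  <- fits_in_java_array(3e7, 100)  # 3e9 cells
wrapped   <- wrap_int32(3e7 * 100)         # what an overflowing int product would hold

print(small_ok)
print(large_ok)
print(wrapped)
```

If the wrapped value comes out negative for the real document and topic counts, that would be consistent with the NegativeArraySizeException; it would not prove that this is where the overflow happens.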

LocoGris
mootechs

0 Answers