0

I have a data file of 50,000,000 lines and need to bootstrap each line using Java. Right now I am using Math.random() to generate random numbers and then doing the bootstrapping in a brute-force way, but it takes forever. Is there a Java library that does this efficiently, or should I call another language from inside Java? Either way, my goal is to optimize the whole process. Thank you!

Anantha Raju C
Luna Park

1 Answer

0

Provided the entire dataset fits in memory (which may be feasible on a typical high-end laptop with, say, 8 GB of RAM, assuming the individual lines of your file are not too long), you might be able to use the Resample Java class from Weka. Resample comes in both a supervised and an unsupervised version. You can download Weka here.
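If pulling in Weka is more than you need, the same in-memory idea can be sketched in plain Java. This is only a rough illustration, not Weka's API: the class and method names (BootstrapSketch, bootstrapIndices, resample) are made up for this example, and it assumes the lines are already loaded into a List<String>. It draws indices with replacement using java.util.SplittableRandom instead of Math.random(), so the data is never copied until you actually need the resampled lines. For 50,000,000 lines the index array alone is about 200 MB (4 bytes per int).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SplittableRandom;

// Hypothetical helper class for illustration; not part of Weka or the JDK.
final class BootstrapSketch {

    // Draws sampleSize indices uniformly at random, with replacement, from [0, n).
    static int[] bootstrapIndices(int n, int sampleSize, long seed) {
        SplittableRandom rng = new SplittableRandom(seed);
        int[] indices = new int[sampleSize];
        for (int i = 0; i < sampleSize; i++) {
            indices[i] = rng.nextInt(n); // upper bound is exclusive
        }
        return indices;
    }

    // Builds one bootstrap replicate of the same size as the input.
    // Assumes the lines are already held in memory.
    static List<String> resample(List<String> lines, long seed) {
        int[] indices = bootstrapIndices(lines.size(), lines.size(), seed);
        List<String> replicate = new ArrayList<>(indices.length);
        for (int index : indices) {
            replicate.add(lines.get(index));
        }
        return replicate;
    }
}
```

Calling BootstrapSketch.resample(lines, seed) with a different seed each time gives independent replicates, and reusing a seed reproduces the same replicate. If you only need a statistic per replicate, you can loop over the index array directly and skip building the new list.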

stachyra