I have a data file of 50,000,000 lines and need to bootstrap each line using Java. Right now I am using Math.random() to generate random numbers and then doing the bootstrapping in a brute-force way, but at this rate it will take forever. Is there a Java library that does this efficiently, or should I call another language from inside Java? Either way, my goal is to optimize the whole process. Thank you!
What is "bootstrap each lines"? – user2864740 Apr 27 '14 at 03:52
1 Answer
Provided the entire dataset fits in memory (which might be feasible on a typical high-end laptop with, say, 8 GB of RAM, assuming the individual lines of your file are not too long), you might be able to use the Resample Java class from Weka. The Resample class comes in both a supervised and an unsupervised version. You can download Weka here.
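
If pulling in Weka is more than you need, sampling with replacement is also short to write against the plain JDK. The following is only a rough sketch, not Weka's Resample and not necessarily what the question means by "bootstrap each line": it assumes the lines are already loaded into a List, and the class name BootstrapSketch, the method resample and the seed parameter are made up for illustration. It uses java.util.SplittableRandom instead of Math.random(), which avoids the double-to-int conversion and the shared generator behind Math.random().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SplittableRandom;

// Hypothetical helper, not part of Weka or any other library.
class BootstrapSketch {

    // Draws sampleSize items from data uniformly at random, WITH replacement.
    // The same seed always produces the same replicate.
    static <T> List<T> resample(List<T> data, int sampleSize, long seed) {
        if (data.isEmpty()) {
            throw new IllegalArgumentException("data must not be empty");
        }
        SplittableRandom rng = new SplittableRandom(seed);
        List<T> sample = new ArrayList<>(sampleSize);
        int n = data.size();
        for (int i = 0; i < sampleSize; i++) {
            // nextInt(n) returns an index in [0, n)
            sample.add(data.get(rng.nextInt(n)));
        }
        return sample;
    }
}
```

Calling BootstrapSketch.resample(lines, lines.size(), 42L) would give one bootstrap replicate of the same size as the input. For 50,000,000 lines it may be cheaper to store only the drawn indices in an int[] rather than copying the strings, but that depends on what you do with each replicate.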

stachyra