You typically use our ETL library, DataVec, for that. I'm not sure where you were looking, but our examples cover preprocessing of CSV, image, and text data. Which one applies depends on what you're doing.
For CSV, you found the right starting point. That will load from a directory of CSVs.
Here is one of the examples from there:
int numLinesToSkip = 0;
char delimiter = ',';
String localDataPath = DownloaderUtility.IRISDATA.Download();
RecordReader recordReader = new CSVRecordReader(numLinesToSkip,delimiter);
recordReader.initialize(new FileSplit(new File(localDataPath,"iris.txt")));
int labelIndex = 4;
int numClasses = 3;
DataSetIterator iterator = new RecordReaderDataSetIterator(recordReader,10,labelIndex,numClasses);
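This sets up a record reader for parsing the data. Calling initialize points that reader at a particular file or directory (the data itself can be anything). The iterator then wraps the reader and produces batches of 10 examples, reading the label from column 4 and treating it as one of 3 classes.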
If you want something more complex, you typically either hand-code the pipeline yourself or use DataVec's transform process.
It really depends on your use case.
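To give a feel for what hand-coding the pipeline can mean, here is a minimal sketch in plain Java with no DataVec involved. It assumes rows shaped like the iris file above: numeric features plus an integer class label in one column. The class and method names (HandCodedCsvPipeline, parseRow, Example) are made up for illustration and are not part of any library.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper, not a DataVec class: parses one CSV row by hand.
class HandCodedCsvPipeline {

    /** Features and one-hot label for a single CSV row. */
    static final class Example {
        final double[] features;
        final double[] oneHotLabel;

        Example(double[] features, double[] oneHotLabel) {
            this.features = features;
            this.oneHotLabel = oneHotLabel;
        }
    }

    static Example parseRow(String line, char delimiter, int labelIndex, int numClasses) {
        // Quote the delimiter so characters such as '|' are not treated as regex syntax.
        String[] tokens = line.split(Pattern.quote(String.valueOf(delimiter)), -1);
        if (labelIndex < 0 || labelIndex >= tokens.length) {
            throw new IllegalArgumentException("labelIndex " + labelIndex + " is outside the row");
        }
        double[] features = new double[tokens.length - 1];
        double[] oneHot = new double[numClasses];
        int next = 0;
        for (int i = 0; i < tokens.length; i++) {
            String token = tokens[i].trim();
            if (i == labelIndex) {
                // The label is assumed to be an integer class index in [0, numClasses).
                int cls = Integer.parseInt(token);
                if (cls < 0 || cls >= numClasses) {
                    throw new IllegalArgumentException("class " + cls + " is out of range");
                }
                oneHot[cls] = 1.0;
            } else {
                // Throws NumberFormatException if the token is not a number.
                features[next++] = Double.parseDouble(token);
            }
        }
        return new Example(features, oneHot);
    }

    public static void main(String[] args) {
        Example ex = parseRow("5.1,3.5,1.4,0.2,0", ',', 4, 3);
        System.out.println(Arrays.toString(ex.features));    // [5.1, 3.5, 1.4, 0.2]
        System.out.println(Arrays.toString(ex.oneHotLabel)); // [1.0, 0.0, 0.0]
    }
}
```

You would still have to batch these rows and turn them into arrays for the network yourself, which is the part the record reader and iterator handle for you.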
As for your specific problem with a NumberFormatException, I can't say much without more detail.
As anyone on here would ask, I'd need the complete context: the stack trace and the full error message, not a partial description.
Going on what I have, it's probably because you're passing in words or something else that's not a number. All machine learning involves converting everything, whatever it is, to numbers. I don't want to do a whole ML course in one post, but if you can be more specific I can give you hints about what you need to do for your particular case.
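As a rough illustration of that guess (I haven't seen your data, so this is an assumption), the plain-Java sketch below finds the tokens in a row that won't parse as numbers and then maps string labels to integer indices. The class and method names are hypothetical, and the species strings are just the usual Iris names used as sample values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; none of this is DataVec API.
class LabelEncodingSketch {

    /** Returns true if the token can be parsed by Double.parseDouble. */
    static boolean isNumeric(String token) {
        try {
            Double.parseDouble(token.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Gives each distinct label the next free index, in order of first appearance. */
    static Map<String, Integer> buildIndex(String[] labels) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (String label : labels) {
            if (!index.containsKey(label)) {
                index.put(label, index.size());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        String[] row = {"5.1", "3.5", "1.4", "0.2", "Iris-setosa"};
        for (int i = 0; i < row.length; i++) {
            if (!isNumeric(row[i])) {
                // This is the kind of value that triggers a NumberFormatException.
                System.out.println("column " + i + " is not numeric: " + row[i]);
            }
        }

        String[] labels = {"Iris-setosa", "Iris-versicolor", "Iris-setosa", "Iris-virginica"};
        // Prints {Iris-setosa=0, Iris-versicolor=1, Iris-virginica=2}
        System.out.println(buildIndex(labels));
    }
}
```

Once the label column holds 0, 1, 2 instead of text, a setup like the CSV one above can read it as the class index.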