I am trying to run an OLS regression. The code works fine, except that it seems to concatenate the variables into one string. I have tried to find out what the problem could be, but I have not been able to find the cause.

@Component
public class AppRunner implements CommandLineRunner{

    private static final String DATA_FILE =  "/path/to/datafile.csv";

    @Override
    public void run(String... args) throws Exception {
        buildModel(DATA_FILE, 2304);
    }

    public static OLSMultipleLinearRegression buildModel( String filenames, int SIZE ) throws IOException {
        double[] Y = new double[SIZE];
        double[][] X = new double[SIZE][];

        BufferedReader bufferedReader = new BufferedReader(new FileReader(filenames));
        try {
            String record;
            int index = 0;
            while ((record = bufferedReader.readLine()) != null) {
                String[] tokens = StringUtils.split(record, ",");
                Y[index] = Double.parseDouble(tokens[0]);
                double[] features = new double[tokens.length - 1];

                for (int i = 0; i < features.length; i++) {
                    features[i] = Double.parseDouble(tokens[i + 1]);
                }
                X[index] = features;
                index++;
            }
        } finally {
            bufferedReader.close();
        }
        OLSMultipleLinearRegression regression = new OLSMultipleLinearRegression();
        regression.newSampleData(Y, X);
        regression.setNoIntercept(false);
        System.out.println(regression);
        return regression;
    }

}

It throws the following error:

Caused by: java.lang.NumberFormatException: For input string: "226.12000000,160.99000000,50.69000000"
    at java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2054) ~[na:na]
    at java.base/jdk.internal.math.FloatingDecimal.parseDouble(FloatingDecimal.java:110) ~[na:na]
    at java.base/java.lang.Double.parseDouble(Double.java:543) ~[na:na]
    at io.christdoes.wealth.generator.cmdrunner.AppRunner.buildModel(AppRunner.java:42) ~[classes/:na]
    at io.christdoes.wealth.generator.cmdrunner.AppRunner.run(AppRunner.java:25) ~[classes/:na]
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:784) [spring-boot-2.2.1.RELEASE.jar:2.2.1.RELEASE]
    ... 5 common frames omitted

From what I can tell, it seems to be concatenating the values instead of reading them one by one.

ken4ward
  • StringUtils.split(",") apparently doesn't split correctly. Use your debugger to find out what the string contains before being split, and what the split() method returns. Check its source code. – JB Nizet Nov 28 '19 at 20:40
  • Seems `StringUtils.split(s, sep)` isn't doing what you think it does, because if it had done the same as `s.split(sep)`, then that error would be impossible. You should look at what `StringUtils.split(s, sep)` actually does. Since you didn't share it with us, we can't help with that. – Andreas Nov 28 '19 at 20:41
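
One possible explanation, offered as an assumption because the question does not show its imports: if `StringUtils` here is Spring's `org.springframework.util.StringUtils`, its `split(String, String)` splits only at the first occurrence of the delimiter and returns a two-element array. That would leave everything after the first comma in `tokens[1]`, which matches the input string reported in the `NumberFormatException`. A minimal sketch of parsing one record with the JDK's own `String.split` instead is shown below; the class name `CsvLineParser`, the helper `parseRecord`, and the sample record are hypothetical and not taken from the original code.

```java
import java.util.Arrays;

public class CsvLineParser {

    // Splits one CSV record on every comma and parses each field as a double.
    // Assumes plain numeric fields with no quoting or embedded commas.
    static double[] parseRecord(String record) {
        String[] tokens = record.split(",");
        double[] values = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Double.parseDouble(tokens[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical record: the first value is made up, the remaining
        // three are the ones visible in the stack trace.
        String record = "1.5,226.12000000,160.99000000,50.69000000";
        double[] values = parseRecord(record);

        // First column is the response, the rest are the features,
        // mirroring the layout used in buildModel.
        double y = values[0];
        double[] features = Arrays.copyOfRange(values, 1, values.length);
        System.out.println(y + " -> " + Arrays.toString(features));
    }
}
```

In `buildModel`, the equivalent change would be replacing `StringUtils.split(record, ",")` with `record.split(",")`. If the import turns out to be Apache Commons Lang's `StringUtils`, whose `split(String, String)` does split on every separator, then this is not the cause and the contents of `record` should be inspected in a debugger as the comments suggest.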
