
I need to browse through around 3000 folders, each of which contains 300 CSV files.

This is the error that occurs at the line while ((nextLine=csvReader.readNext()) != null):

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at au.com.bytecode.opencsv.CSVParser.parseLine(CSVParser.java:206)
    at au.com.bytecode.opencsv.CSVParser.parseLineMulti(CSVParser.java:174)
    at au.com.bytecode.opencsv.CSVReader.readNext(CSVReader.java:237)
    at DA.readTelemetryData(DA.java:78)
    at DA.main(DA.java:24)

How can I solve this issue? Why does it occur, and what is wrong in my code?

Here is the code:

private static HashMap<Integer,HashMap<Integer,List<double[]>>> readTelemetryData() throws Exception
    {
        HashMap<Integer,HashMap<Integer,List<double[]>>> xy_total = new HashMap<Integer,HashMap<Integer,List<double[]>>>(); 

        for (int i=0; i<Constants.MAX_FOLDERS; i++)
        {
            HashMap<Integer,List<double[]>> xy_total_per_folder= new HashMap<Integer,List<double[]>>();
            for (int j=0; j<Constants.MAX_FILES_INSIDE_FOLDER; j++)
            {               
                CSVReader csvReader = null;
                File f = new File("data/"+ (i+1) +"/"+ (j+1) +".csv");
                if(f.exists())
                {
                    csvReader = new CSVReader(new FileReader(f));
                    List<double[]> xyArr = new ArrayList<double[]>();
                    String[] firstLine=csvReader.readNext();
                    if (firstLine != null) 
                    {
                      String[] nextLine=null;
                      while ((nextLine=csvReader.readNext()) != null) 
                      {
                          double[] d = new double[2];
                          d[0]=Double.parseDouble(nextLine[0]);
                          d[1]=Double.parseDouble(nextLine[1]);
                          xyArr.add(d);
                      }
                    }

                    xy_total_per_folder.put(j, xyArr);

                    csvReader.close();
                }
            }
            xy_total.put(i, xy_total_per_folder);
        }
        return xy_total;
    }
Klausos

2 Answers


You are running out of memory.

HashMap<Integer,V> is a rather bad choice. It needs 16 bytes for the key and probably 24 bytes for each entry, plus dead space. Your double[] then needs 32 bytes (for storing 16 bytes of payload). In the array list, you need another 4 bytes for the reference...

So each line will cost you 36 bytes instead of 16, for example.
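As a rough per-line breakdown (assuming a typical 64-bit JVM with compressed references, i.e. about 16 bytes of object header for a small array):

    double[2] object    : 16 bytes header + 16 bytes payload = 32 bytes
    ArrayList slot      :  4 bytes for the reference
    total per line      : 36 bytes, versus 16 bytes of actual data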

Consider using more compact data structures. GNU Trove is a library offering great collections for primitive types; but don't underestimate the value of arrays...

For processing large amounts of primitive types (int, double, etc.), stay away from the java.util collections. Instead, spend extra time on organizing your memory.

For example, you could use Trove's TDoubleArrayList: one for all the x values and one for all the y values, instead of one array per line. When you have finished reading a file, you can convert them to minimal double[] x; double[] y; arrays, and reuse the TDoubleArrayList instances for parsing the next file.
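A minimal sketch of that idea, assuming GNU Trove 3.x (gnu.trove.list.array.TDoubleArrayList) is on the classpath; readOneFile and the field names are illustrative, not part of the original code:

import gnu.trove.list.array.TDoubleArrayList;
import au.com.bytecode.opencsv.CSVReader;
import java.io.File;
import java.io.FileReader;

// Buffers reused across files, so the backing arrays are allocated only once.
private static final TDoubleArrayList xs = new TDoubleArrayList();
private static final TDoubleArrayList ys = new TDoubleArrayList();

// Reads one CSV file into two right-sized double[] arrays (x column, y column).
private static double[][] readOneFile(File f) throws Exception
{
    xs.resetQuick();   // empty the buffers but keep their capacity
    ys.resetQuick();
    CSVReader csvReader = new CSVReader(new FileReader(f));
    try
    {
        String[] line = csvReader.readNext();        // skip the header line
        while ((line = csvReader.readNext()) != null)
        {
            xs.add(Double.parseDouble(line[0]));
            ys.add(Double.parseDouble(line[1]));
        }
    }
    finally
    {
        csvReader.close();
    }
    // toArray() copies into minimal arrays: two objects per file
    // instead of one double[2] plus a list slot per line.
    return new double[][] { xs.toArray(), ys.toArray() };
}

Returning two right-sized arrays per file keeps the per-file cost to a handful of objects instead of one small object per line.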

Last but not least, Java by default only uses 25% of your memory. Use -Xmx to increase the limit.

Run a memory profiler. Where is most of the memory allocated? Is all of this needed? Maybe that CSVReader you are using has a memory leak! Using a memory profiler is an easy way to find out.

But do the math. How many lines do you have - can you fit all of them into memory?
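To make that concrete (the question does not say how many lines each file has, so this is only an illustration): 3000 folders × 300 files is 900,000 files; at just 1,000 lines per file that is 900 million lines, and at ~36 bytes per line roughly 30 GB of heap, before counting the HashMap and ArrayList overhead.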

Has QUIT--Anony-Mousse

There are generally two reasons for such behavior:

  1. Memory leak. It means that your program stores data that is not needed any more. Analyze a memory dump to fix it (see the example command after this list).

  2. Not enough memory, because your program actually needs that much memory. You can simply give it more memory, or try to change your algorithms and data structures.
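For both cases, a reasonable starting point (these are standard HotSpot options; adjust the heap size to your machine) is to run with a larger heap and let the JVM write a heap dump when it runs out of memory:

    java -Xmx4g -XX:+HeapDumpOnOutOfMemoryError DA

The resulting .hprof file can be opened in a heap analyzer such as Eclipse MAT or VisualVM to see what is actually holding on to the memory.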

talex