0

The last iteration of my "while(res.next())"-loop takes 1.5 seconds whereas the previous iterations only take around ~0,003 milliseconds. I'm very confused.

I'm getting a ResultSet from my SQLite database with a common JDBC query. The ResultSet is not huge, it's always 4 columns wide and between 0 and 100 rows long. Then I want to save the ResultSet-data in POJOs and store those POJOs in an ArrayList. I use the usual iteration-method where you simply put up a "while(res.next())"-loop and iterate over the entire ResultSet. I recognized a bottleneck in my code and could narrow it down to the while(res.next())-loop. I started debugging and also measuring execution-times in my code and it turns out that the last .next()-call, which should return false so that the loop stops, takes a very long time.

    ArrayList<Trade> trades = new ArrayList<>();
    try
    {
      Statement statement = connector.getConnection().createStatement();
      statement.setFetchSize(1337);

      ResultSet res = statement.executeQuery("SELECT * FROM trades WHERE timestamp >= 0 AND timestamp < 1337;");
      res.setFetchSize(1337);

      int numberOfTrades = 0;
      long startTime = System.nanoTime();
      while(res.next())
      {
        long addObjectTime = System.nanoTime();
        trades.add(new Trade(res.getLong("id"),res.getLong("timestamp"),res.getDouble("price"),res.getDouble("amount")));
        numberOfTrades++;
        System.out.println("Added trade to list within "+(System.nanoTime()-addObjectTime)+"ns.");
      }
      System.out.println("List-creation completed within "+(System.nanoTime()-startTime)+"ns.");
      System.out.println("Number of trades: "+numberOfTrades);
    } catch (SQLException e)
    {
      e.printStackTrace();
    }

That's my code. As you can see I already tried playing around with various fetchSizes as other people mentioned in performance-threads concerning the .next()-method. I tried everything I could but the outcome of it all looks still like this:

Added trade to list within 46000ns.
Added trade to list within 3200ns.
Added trade to list within 2400ns.
Added trade to list within 2300ns.
Added trade to list within 2300ns.
Added trade to list within 2300ns.
Added trade to list within 2300ns.
Added trade to list within 4500ns.
Added trade to list within 2300ns.
Added trade to list within 2300ns.
Added trade to list within 3100ns.
Added trade to list within 2400ns.
Added trade to list within 2300ns.
Added trade to list within 2300ns.
Added trade to list within 2400ns.
Added trade to list within 2400ns.
Added trade to list within 2300ns.
Added trade to list within 2200ns.
Added trade to list within 2300ns.
Added trade to list within 2300ns.
Added trade to list within 2300ns.
Added trade to list within 11100ns.
List-creation completed within 1548543200ns.
Number of trades: 22

Adding a POJO to my ArrayList with the data from the ResultSet usually takes between 2-5 microseconds. So all in all, the loop shouldn't take much longer than all the execution-times for adding trades combined right? In my example that would be 0,1073 milliseconds. Instead the loop takes a total of more than 1,5 seconds which would be 10,000x the amount that I'd expect. I actually have zero clue what's happening here. And this is a severe problem for my program because the code-fragment is executed about 100,000 times, so 150,000 seconds would be 40 hours of performance-loss :(

Jonas
  • 121,568
  • 97
  • 310
  • 388
Michzor
  • 11
  • 3
  • You didn't mention anything about your environment. This is a perfect place to use a profiler to determine exactly where the time is going. – Gene Sep 02 '19 at 02:58
  • 1
    Probably because the first *N-1* results were already sent and the final one required an extra request and send. Investigate your fetch sizes. – user207421 Sep 02 '19 at 03:39
  • @Gene I got myself a profiler and the outcome oft the profiling is that JDBC3ResultSet.java:74 org.sqlite.core.NativeDB.step(long) takes up all the time. I can't navigate further into the method. I'm going to try and research what this is all about. Thanks for your input. – Michzor Sep 02 '19 at 11:38
  • @user207421 I thought about that, too. But given what the OP is doing, 1.5 seconds is far from normal. – Gene Sep 04 '19 at 03:29

1 Answers1

0

I actually solved the problem but I am not 100% sure why it's solved now. The database I was querying had many millions of entries and was always accessed by a single column (timestamp). I indexed timestamp and the performance issues vanished completely. I still don't know why the .next()-method behaved the way it did.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Michzor
  • 11
  • 3
  • Ha ha. The natural scan order of the table hit all entries satisfying the WHERE clause early on and very quickly. But all the rows had to be scanned to the end to verify there were no more. The index puts the field in sorted order, so only the timestamp range of interest must be scanned. Then scanning stops. – Gene Sep 04 '19 at 03:46
  • This is basically just a reminder of "Keep your f***in indices straight". There is so much stuff where indexing was just the problem... i think you know what i mean ;) thanks for your help my friend. – Michzor Sep 06 '19 at 10:12