3

Background: I am attempting to determine statistics for ambulances using Apache Commons Math. I am able to do very basic univariate statistics for one ambulance, however I get stuck when I want to determine statistics for all the ambulances in my fleet.

Goal: My goal is to generate a basic result set using JDBC, then parse the information out into statistical information. For example, I would like to take the result set and make it look like a table showing the Ambulance, average for 2014, average for 2015 to be the header. The the table details would show each ambulance and the averages for each header

<table>
<tr><th>ambulance</th><th>average response time for year 2014</th><th>average response time for year 2015</th></tr>
<tr><td>Medic1</td><td>62</td><td>74</td></tr>
<tr><td>Medic2</td><td>83</td><td>79</td></tr>
<tr><td>Medic3</td><td>68</td><td>71</td></tr>
</table>

Attempted pseudocode: The pseudocode would look something like this; 1.) Assign a variable for the calendar year 2014 average response time. 2.) loop through all the ambulances in the resultset if the calendar year is 2014 then calculate an average. 3.) assign a variable for the calendar year 2015 average response time. 4.) loop through all the ambulances and if the calendar year is 2015 then calculate the average. 5.) output the Ambulance, average response time for 2014, average response time for 2015

Comments: This would be a good start. At least the logic and format would be present to do more sophisticated analysis such as determine the differences year over year. But I'm stuck. I'm not sure how to do an iteration over each ambulance to generate the average.

I am able to write SQL queries that would generate averages for each ambulance. But I want to use Apache Commons Math because it offers Skew, Kurtosis and other measures. What you see above this paragraph is a simplified example of something more complex.

Java code:

package EMSResearch;

import java.sql.*;
import org.apache.commons.math3.stat.descriptive.DescriptiveStatistics;

public class EMSResearch
{

    public static void main(String[] args)
    {
        Connection conn = null;
        Statement stmt = null;
        try
        {
            conn = DriverManager.getConnection("jdbc:sqlserver://MyDatabase;database=Emergencies;integratedsecurity=false;user=MyUserName;password=MyPassword");
            stmt = conn.createStatement();
            String strSelect = "SELECT EmergencyID, YearOfCall, ResponseTime, Ambulance";
            ResultSet rset = stmt.executeQuery(strSelect);

            DescriptiveStatistics ds = new DescriptiveStatistics();
/*the following code does the job of generating average response time for Medic1 for year 2015. But I want it to loop through and get all the ambulances for year 2015*/
            while (rset.next())
            {
                if (rset.getString("Ambulance").equals("Medic1") && rset.getInt("YearOfCall") == 2015)
                {
                    String event = rset.getString("I_EventNumber");
                    int year = rset.getInt("YearOfCall");
                    int responseTime = rset.getInt("ResponseTime");
                    String truck = rset.getString("Ambulance");
                    ds.addValue(responseTime);
                }
            }
            System.out.println("mean average value " + ds.getMean());


        } catch (SQLException ex)
        {
            ex.printStackTrace();
        } finally
        {
Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
David Fort Myers
  • 333
  • 1
  • 5
  • 17
  • Consider using the [SQL window functions](https://msdn.microsoft.com/en-us/library/ms189461%28v=sql.110%29.aspx) instead. – Mark Rotteveel May 26 '16 at 14:11
  • That is what I usually do, However in this case I wanted to use Apache Commons Math because it has methods for Skew and Kurtosis. Also, once I get proficient I can do a lot more cool statistical functions. – David Fort Myers May 26 '16 at 14:22

2 Answers2

0

SOmething like this might help. If you use a Map to store all the data for all the years and trucks, you can get all you need I think. THis code is not fully baked, but it's pretty sweet in concept I think.

  private static void getstats(ResultSet rset) throws SQLException {
    Map<Integer, Map<String, DescriptiveStatistics>> stats = new HashMap<>();
    while (rset.next()) {

      String event = rset.getString("I_EventNumber");
      int year = rset.getInt("YearOfCall");
      int responseTime = rset.getInt("ResponseTime");
      String truck = rset.getString("Ambulance");
      if (stats.containsKey(year)) {
        Map<String, DescriptiveStatistics> get = stats.get(year);
        if (get.containsKey(truck)) {
          get.get(truck).addValue(responseTime);
        } else {
          Map<String, DescriptiveStatistics> newmap = new HashMap<>();
          DescriptiveStatistics newDs = new DescriptiveStatistics();
          newDs.addValue(responseTime);
          newmap.put(truck, newDs);
        }

      } else {

        Map<String, DescriptiveStatistics> newmap = new HashMap<>();
        DescriptiveStatistics newDs = new DescriptiveStatistics();
        newDs.addValue(responseTime);
        newmap.put(truck, newDs);
        stats.put(year, newmap);
      }

    }
    for(Integer year : stats.keySet()){
      for(String truck : stats.get(year).keySet()){
        DescriptiveStatistics ds = stats.get(year).get(truck);
        /**do stuff with the ds for this year and this truck**/

      }
    }

  }
Mark Giaconia
  • 3,844
  • 5
  • 20
  • 42
  • Thanks markg: I am going to pull out my college text books and read up on maps. I am only a novice programmer. By the way, I think Tommy Boy was a hilarious movie. – David Fort Myers May 26 '16 at 14:13
  • Maps are just key->value structures. The get(...) method takes the key and returns the value, whatever you set it too in your generic type args. Maps are powerful, I recommend you get to know them! – Mark Giaconia May 26 '16 at 14:44
0

As markg said, a Map will help you out greatly. Just to add a little bit onto it though, I'd also group your data in a meaningful manner. For instance, your current implementation contains this:

DescriptiveStatistics ds = new DescriptiveStatistics();

while (rset.next())
{
    if (rset.getString("Ambulance").equals("Medic1") && rset.getInt("YearOfCall") == 2015)
    {
        String event = rset.getString("I_EventNumber");
        int year = rset.getInt("YearOfCall");
        int responseTime = rset.getInt("ResponseTime");
        String truck = rset.getString("Ambulance");
        ds.addValue(responseTime);
    }
}

What you're essentially doing now is determining if the data meets a specific criteria, add it to your single dataset. But if you wanted to check another criteria, you'd need to initialize another dataset, add another if statement, copy the code down there; it isn't scalable.

Instead, consider creating an object that you can use group your data by:

public class DataPoint {
    // Consider private members with public getters/setters.
    public String ambulance;
    public int year;

    public DataPoint(String ambulance, int year) {
        this.ambulance = ambulance;
        this.year = year;
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result
                + ((ambulance == null) ? 0 : ambulance.hashCode());
        result = prime * result + year;
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        DataPoint other = (DataPoint) obj;
        if (ambulance == null) {
            if (other.ambulance != null)
                return false;
        } else if (!ambulance.equals(other.ambulance))
            return false;
        if (year != other.year)
            return false;
        return true;
    }
}

The hashCode() and equals() overrides are important, but are tangential to this discussion. Basically, they make sure that the Map can find and determine that two different objects with the same parameters are equal.

Now, with our new DataPoint object, we can map data we receive to specific datasets. So your implementation I outlined above would be replaced with:

Map<DataPoint, DescriptiveStatistics> map = new HashMap<DataPoint, DescriptiveStatistics>();

while (rset.next())
{
    // Get parameters we differentiate based on.
    String truck = rset.getString("Ambulance");
    int year = rset.getInt("YearOfCall");

    // Create the data point.
    DataPoint point = new DataPoint(truck, year);

     // Get data set for point; if it doesn't exist, create it. 
    if (map.get(point) == null) {
        map.put(new DescriptiveStatistics());
    }
    DescriptiveStatistics ds = map.get(point);

    // Add the data of interest to the given data set.
    int responseTime = rset.getInt("ResponseTime");
    ds.addValue(responseTime);
}

When the while loop finishes, you will have a Map filled with a mapping of specific data points and their associated data sets. From there just iterate through the map entries and you can do whatever you want with the datasets:

for (Entry<DataPoint, DescriptiveStatistics> entry : map.entrySet())
...

Hope that clarifies a little.

Ironcache
  • 1,719
  • 21
  • 33