0

I have often ran into a situation where I have a many to one relationship in a database (for a contrived example, say players to team; where a team has many players, but a player only plays for a single team at a given time) and I am looking to run a query which gets information from both tables in order to populate a java object, where the object representing the "one" side of the relationship has a list of the "many" objects. In order to do this, I have seen two different approaches, which I explain below. Which do you prefer, and why? Please be specific and cite clear (and measurable) pros/cons of each approach, and feel free to offer other approaches that are not listed.

--SQL: just a quick example to show my point, syntax may not be perfect
CREATE TABLE TEAM(
  TEAM_SID SMALLINT NOT NULL AUTO_INCREMEMENT, 
  NAME VARCHAR(16)
  --Other fields omitted for brevity
  PRIMARY KEY(TEAM_SID)
);

CREATE TABLE PLAYER(
  PLAYER_SID INT NOT NULL AUTO_INCREMENT,
  TEAM_SID INT NOT NULL,
  AGE TINYINT NOT NULL,
  --Other fields omitted for brevity
  PRIMARY KEY(PLAYER_SID),
  INDEX PLAYER_X01 (TEAM_SID),
  FOREIGN KEY (TEAM_SID)
     REFERENCES TEAM(TEAM_SID)
     ON DELETE SET NULL
);



//Java model classes: just a quick example to show my point, syntax may not be perfect
class Team{
  private int teamSid;
  private String name;
  private List<Player> playersCurrentlyOnTeam;
  //getters and setters ommitted for brevity
}

class Player{
  private int playerSid;
  private int age;
  private int teamSid;
 //getters and setters ommitted for brevity
}

Option 1: Run a single query, (Inner) joining the two tables (recognizing that any time a team has more than one player, all data from the team table will be duplicated), then in the java code, loop over the result set and populate the objects. Pros:

  • one database call (and hence one round trip over the network)

Cons:

  • more memory consumption (on both database and java side)
  • more memory consumption (on both database and java side)

Example:

--SQL query ran (for this only)
SELECT T.TEAM_SID, T.NAME, P.PLAYER_SID, P.AGE
FROM TEAM T
JOIN PLAYER P
ON T.TEAM_SID = P.TEAM_SID
WHERE T.TEAM_SID = :TEAM_SID

//just a quick example to show my point, syntax may not be perfect (typed it as I go here, so give me some slack, you get the idea)
class Dao

Optional<Team> getTeamById(int teamId){
//logging and other code ommitted for brevity
Team teamExtractedFromRs = namedParameterJdbcTemplate.query(sqlString, parametersForQuery, (ResultSet rs) -> {
    Map<Integer, Team> teamsKeyedToTeamId = new HashMap<>();
       while(rs.next()){
          int teamIdForThisRow = rs.getInt("TEAM_SID");
          Team teamForId = teamsKeyedToTeamId.putIfAbsent(teamIdForThisRow, teamId -> { 
            Team newTeamForThisId = new Team();
            newTeamForThisId.setTeamId(teamIdForThisRow);
            newTeamForThisId.setName(rs.getString("NAME"));
            newTeamForThisId.setPlayers(Arrays.asList());
            return newTeamForThisId;
           });
         Player playerForCurrentRow = new Player();
         playerForCurrentRow.setPlayerId(rs.getInt("PLAYER_SID"));
         playerForCurrentRow.setTeamId(teamIdForThisRow);
         teamForId.getPlayers().add(playerForCurrentRow);
       }
       return teamsKeyedToTeamId.values();
  });
  return Optional.ofNullable(teamExtractedFromRs);
}

Option 2: make a database call to retrieve the team specific information and populate it into the team java object. Next, make a second database call to retrieve all players for a given team, and set them into the list on the java object. Pros:

  • simple logic (as each row only comes from one table)

Cons:

  • two database calls (as opposed to the one above)

Example:

--SQL QUERY ONE, RAN FIRST  (referenced as sqlStringForTeamQuery in java below)
SELECT T.TEAM_SID, T.NAME
FROM TEAM T
WHERE TEAM_SID = :TEAM_SID

--SQL QUERY TWO, (POTENTIALLY) RAN SECOND (referenced as sqlStringForPlayerQuery in java below)
SELECT P.PLAYER_SID, P.AGE
FROM PLAYER P
WHERE P.TEAM_SID = :TEAM_SID

class dao {
  Optional<Team> getTeamById(int teamId){
    //logging and other code ommitted for brevity
    Team teamFromRow = namedParameterJdbcTemplate.query(sqlStringForTeamQuery, parametersForQuery, (ResultSet rs, int rowNumber) -> {
       int teamIdForThisRow = rs.getInt("TEAM_SID");
       Team newTeamForThisId = new Team();
       newTeamForThisId.setTeamId(teamIdForThisRow);
       newTeamForThisId.setName(rs.getString("NAME"));
       newTeamForThisId.setPlayers(Arrays.asList());
     }
  });
  if(teamFromRow != null){
     List<Player> playersCurrentlyOnTeam = namedParameterJdbcTemplate.query(sqlStringForPlayerQuery, parametersForQuery, (ResultSet rs, int rowNumber) -> {
        Player player = new Player();
        player.setPlayerId(rs.getInt("PLAYER_SID"));
        player.setTeamId(rs.getInt("TEAM_SID"));
        return player;
     });
   teamFromRow.setPlayers(playersCurrentlyOnTeam);
  }

  return Optional.ofNullable(teamFromRow);

}

Which do you prefer, and why?

Connor Butch
  • 648
  • 1
  • 10
  • 28
  • 1
    Ask this in different domain ```https://codereview.stackexchange.com/``` – Uday Chauhan May 14 '20 at 03:28
  • If the alternatives are "one query"/"two queries" then **maybe** the latter variant is more suitable. But if the alternatives are "one query"/"more than one (maybe a lot) queries" then the former variant only. – Akina May 14 '20 at 04:35
  • In general, the fewer round trips to the database the better, but we don't need to cite anything. You can add a thousand teams and ten thousand players and test this for yourself. – Strawberry May 14 '20 at 06:42

0 Answers0