I have a lot of data in R for sports teams and their starting line-ups for matches. An example of my dataset is below:

matchdata <- data.frame(match_id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2), player_name = c("andrew", "david", "james", "steve", "tim", "dan",
"john", "phil", "matthew", "simon", "ben", "brian", "evan", "tony", "will",
"alister", "archie", "paul", "peter", "warren"), played_for = c("team a", "team a",
"team a", "team a", "team a", "team b", "team b", "team b", "team b", "team b",
"team c", "team c", "team c", "team c", "team c", "team d", "team d", "team d",
"team d", "team d"), played_against = c("team b", "team b", "team b", "team b",
"team b", "team a", "team a", "team a", "team a", "team a", "team d", "team d",
"team d", "team d", "team d", "team c", "team c", "team c", "team c", "team c"),
score_for = c(2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0),
score_against = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 3, 3, 3, 3, 3))

What I am trying to achieve is to create a separate entry for each 'player vs player' match-up on each matchday. I want my output to look something like:

output <- data.frame(match_id = 1, player_name = "andrew", played_against = c("dan",
"john", "phil", "matthew", "simon"), score_for = 2, score_against = 1)

So, rather than each player playing against each team on that day, I can analyse and compare performances on a one-v-one basis.

EDIT: I only want to compare players with the players on the OPPOSING team, and only with the team they faced ON THAT MATCH_ID. So, in this example, each player would have 5 rows (one for each player on the team they played AGAINST in that particular match).

Can anyone recommend the best way to achieve this, please? I have some experience using reshape or melt but cannot get them to produce what I want in this instance.

user3198404
  • Do I understand correctly that even though you display a 1-to-1 relationship, your data (score_for, score_against) remains team-vs-team? There are no player-specific data to compare, it seems. – talat Jul 28 '14 at 16:11
  • Hi, yes the score is the team score. I want to have a separate entry for each player-v-player 'matchup' but then the score to relate to the final score of the match between the teams. – user3198404 Jul 28 '14 at 16:15
  • In terms of what I am trying to achieve, I want to experiment with some player contribution scoring systems such as those used in online gaming, e.g. Elo / TrueSkill – user3198404 Jul 28 '14 at 16:15
  • Do you want to compare each player to each other player, even if they are on the same team (like andrew and tim)? If not, please edit your question to make that clear. – talat Jul 28 '14 at 16:40
  • Hi, yes you are correct I only want to compare players to players on the opposing team. I have edited my original post now – user3198404 Jul 28 '14 at 17:10

1 Answer

Maybe you're looking for something like this?

# Keep only the columns needed in the output.
md <- matchdata[c('match_id', 'player_name', 'played_for', 'score_for', 'score_against')]
# Build every ordered pair of players.
player.combos <- with(matchdata, expand.grid(player_name=player_name, played_against=player_name))
# Look up the team of the second player in each pair.
player.combos.teams <- merge(player.combos, md, by.x='played_against', by.y='player_name')[c('player_name', 'played_against', 'played_for')]
# Attach the first player's row, then keep only pairs from different teams.
subset(merge(md, player.combos.teams, by='player_name'), 
    played_for.x != played_for.y, select=c('match_id', 'player_name', 'played_against', 'score_for', 'score_against'))

# HEAD:
# 
#   match_id player_name played_against score_for score_against
# 2        1      andrew           john         2             1
# 6        1      andrew          simon         2             1
# 7        1      andrew            dan         2             1
# 8        1      andrew        matthew         2             1
# 9        1      andrew           phil         2             1
# 
#   ---  40  rows omitted ---
# 
# TAIL:
#     match_id player_name played_against score_for score_against
# 91         1         tim          simon         2             1
# 95         1         tim           john         2             1
# 96         1         tim            dan         2             1
# 99         1         tim        matthew         2             1
# 100        1         tim           phil         2             1
Matthew Plourde
  • Hi, this looks great, exactly what I was looking for. Thank you both for the swift replies – user3198404 Jul 28 '14 at 17:21
  • Hi, I have had a chance to come back to this now. Although this solution was exactly what I asked for in my original question, it isn't quite what I am trying to achieve in the project. Hopefully there is a tweak to the original answer that can do this. – user3198404 Aug 05 '14 at 11:22
  • In the original data I only presented team line-ups from a single match_id. The problem I am encountering is when I try to extend to multiple match_ids. I only want to create player-v-player match-ups within a single match_id – user3198404 Aug 05 '14 at 11:23
  • NOTE: I have edited my original data now to incorporate two match_ids – user3198404 Aug 05 '14 at 11:29
  • ... subset the data.frame by each `match_id`, then do this on each subset – Matthew Plourde Aug 05 '14 at 12:23
  • Hi Matthew, I will eventually have hundreds (maybe thousands) of individual matches I want to run this on at the same time. Is there a way I can get this to run on all of the match_ids in one action? Apologies if this part is actually straightforward but I am still a bit of an R novice :-) – user3198404 Aug 05 '14 at 12:30
  • Yes, use `split`, which will give you back a list. Then pack this code into a function, and supply the split data.frame and the function to `lapply`. – Matthew Plourde Aug 05 '14 at 12:40
  • Thanks, I shall try that :-) – user3198404 Aug 05 '14 at 12:41
  • and you can collapse the final result back into a data.frame with `do.call(rbind, final_result_list)`. – Matthew Plourde Aug 05 '14 at 12:52
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/58709/discussion-between-user3198404-and-matthew-plourde). – user3198404 Aug 05 '14 at 15:33
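
For anyone landing here later, the approach suggested in the comments above (`split` by `match_id`, wrap the answer's code in a function, `lapply`, then `do.call(rbind, ...)`) could be sketched roughly as follows. This is only a sketch: the function name `matchups_for_one_match` and the variable names `by_match`, `result_list` and `all_matchups` are made up for illustration, the sample data is the two-match example from the question written more compactly with `rep`, and it assumes each player appears at most once within a given match.

```r
# Two-match sample data from the question, written compactly with rep().
matchdata <- data.frame(
  match_id = rep(c(1, 2), each = 10),
  player_name = c("andrew", "david", "james", "steve", "tim",
                  "dan", "john", "phil", "matthew", "simon",
                  "ben", "brian", "evan", "tony", "will",
                  "alister", "archie", "paul", "peter", "warren"),
  played_for = rep(c("team a", "team b", "team c", "team d"), each = 5),
  played_against = rep(c("team b", "team a", "team d", "team c"), each = 5),
  score_for = rep(c(2, 1, 3, 0), each = 5),
  score_against = rep(c(1, 2, 0, 3), each = 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the answer's code wrapped in a function that
# expects the rows of ONE match only (assumption: one row per player).
matchups_for_one_match <- function(one_match) {
  md <- one_match[c("match_id", "player_name", "played_for",
                    "score_for", "score_against")]
  # Every ordered pair of players within this match.
  player.combos <- expand.grid(player_name = one_match$player_name,
                               played_against = one_match$player_name,
                               stringsAsFactors = FALSE)
  # Look up the team of the second player in each pair.
  player.combos.teams <- merge(player.combos, md,
                               by.x = "played_against",
                               by.y = "player_name")[
    c("player_name", "played_against", "played_for")]
  # Keep only pairs whose players are on different teams.
  subset(merge(md, player.combos.teams, by = "player_name"),
         played_for.x != played_for.y,
         select = c("match_id", "player_name", "played_against",
                    "score_for", "score_against"))
}

# One data.frame per match, processed independently, then stacked.
by_match <- split(matchdata, matchdata$match_id)
result_list <- lapply(by_match, matchups_for_one_match)
all_matchups <- do.call(rbind, result_list)
rownames(all_matchups) <- NULL
```

Because each match is processed on its own, players are only paired with opponents from the same `match_id`, so cross-match pairs such as andrew vs ben never appear. With thousands of matches, `do.call(rbind, ...)` on a long list can become slow, in which case a dedicated row-binding function from a data-manipulation package may be worth trying.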