-1

I'm trying to clean some data for some soccer analysis I wanted to do. My problem is that I can't figure out how to properly change my dataframe. Without a picture, there's really nothing I can do to properly explain it.

Here's a screenshot of my data.

https://i.stack.imgur.com/5pAyl.jpg

I would like the data to look like this

Columns: Team XG Offense XG Defense

Row A: Team A Associated Measure.Value Associated Measure.Value

ROW B: ... ... ...

Emily Kothe
  • 842
  • 1
  • 6
  • 17

1 Answers1

1

Create the dataframe for demo purposes

df <- structure(
  list(
    Measure.Names = structure(
      c(1L, 1L, 1L, 1L, 2L,
        2L, 2L, 2L),
      .Label = c("Over Perform XG - Defense", "Over Perform XG - Offense"),
      class = "factor"
    ),
    Team = structure(
      c(4L, 3L, 2L, 1L, 4L,
        3L, 2L, 1L),
      .Label = c(
        "Bournemouth",
        "Brighton and Hove Albion",
        "Burnley",
        "Cardiff City"
      ),
      class = "factor"
    ),
    Measure.Values = structure(
      1:8,
      .Label = c("1",
                 "2", "3", "4", "5", "6", "7", "8"),
      class = "factor"
    )
  ),
  class = "data.frame",
  row.names = c(NA,-8L)
)

df
#>               Measure.Names                     Team Measure.Values
#> 1 Over Perform XG - Defense             Cardiff City              1
#> 2 Over Perform XG - Defense                  Burnley              2
#> 3 Over Perform XG - Defense Brighton and Hove Albion              3
#> 4 Over Perform XG - Defense              Bournemouth              4
#> 5 Over Perform XG - Offense             Cardiff City              5
#> 6 Over Perform XG - Offense                  Burnley              6
#> 7 Over Perform XG - Offense Brighton and Hove Albion              7
#> 8 Over Perform XG - Offense              Bournemouth              8

User tidyr::spread to transpose

tidyr::spread(df, Measure.Names, Measure.Values)
#>                       Team Over Perform XG - Defense
#> 1              Bournemouth                         4
#> 2 Brighton and Hove Albion                         3
#> 3                  Burnley                         2
#> 4             Cardiff City                         1
#>   Over Perform XG - Offense
#> 1                         8
#> 2                         7
#> 3                         6
#> 4                         5

Created on 2019-02-06 by the reprex package (v0.2.0).

Emily Kothe
  • 842
  • 1
  • 6
  • 17
  • I really appreciate the drawn out answer, but I won't be able to put in each associated value by hand for this portion. I need to automatically put in the associated values that are already in my dataframe: Measure.Values = structure( 1:8, .Label = c("1", "2", "3", "4", "5", "6", "7", "8"), class = "factor" ) . – Arian Modarres Feb 05 '19 at 20:51
  • You only need to run `tidyr::spread(df, Measure.Names, Measure.Values)` on your data, I created the demo based on fake numbers because you provided a screenshot and I needed a minimum reprex. – Emily Kothe Feb 05 '19 at 20:52
  • Wow. Thank you so much. – Arian Modarres Feb 05 '19 at 20:55
  • @ArianModarres if this solves your question you should accept it as the answer. – Emily Kothe Feb 05 '19 at 20:57