-1

I am trying to reshape a data frame from a long vector to a table with the date being first column (ideally its the index as I want to make this an xts format ultimately) and the place_id's being new columns extending outward from left to right. The values contained for the place_id and corresponding date would be populating the table "index_nsa".

place_id index_nsa New_Date
1   DV_ENC    100.00   1991-1
2   DV_ENC    100.99   1991-2
3   DV_ENC    101.36   1991-3
4   DV_ENC    101.75   1991-4
5   DV_ENC    102.39   1991-5
6   DV_ENC    102.81   1991-6


dput: 
structure(list(place_id = structure(c(414L, 414L, 414L, 414L, 
414L, 414L), .Label = c("10180", "10420", "10500", "10540", "10580", 
"10740", "10780", "10900", "11020", "11100", "11180", "11244", 
"11260", "11460", "11500", "11540", "11700", "12020", "12060", 
"12100", "12220", "12260", "12420", "12540", "12580", "12620", 
"12700", "12940", "12980", "13020", "13140", "13220", "13380", 
"13460", "13740", "13780", "13820", "13900", "13980", "14010", 
"14020", "14100", "14260", "14454", "14500", "14540", "14740", 
"14860", "15180", "15260", "15380", "15500", "15540", "15680", 
"15764", "15804", "15940", "15980", "16020", "16060", "16180", 
"16220", "16300", "16540", "16580", "16620", "16700", "16740", 
"16820", "16860", "16940", "16984", "17020", "17140", "17300", 
"17420", "17460", "17660", "17780", "17820", "17860", "17900", 
"17980", "18020", "18140", "18580", "18700", "18880", "19060", 
"19124", "19140", "19180", "19300", "19340", "19430", "19460", 
"19500", "19660", "19740", "19780", "19804", "20020", "20100", 
"20220", "20260", "20500", "20700", "20740", "20940", "20994", 
"21060", "21140", "21300", "21340", "21420", "21500", "21660", 
"21780", "21820", "22020", "22140", "22180", "22220", "22380", 
"22420", "22500", "22520", "22540", "22660", "22744", "22900", 
"23060", "23104", "23224", "23420", "23460", "23540", "23580", 
"23844", "23900", "24020", "24140", "24220", "24260", "24300", 
"24340", "24420", "24500", "24540", "24580", "24660", "24780", 
"24860", "25060", "25180", "25220", "25260", "25420", "25500", 
"25540", "25620", "25860", "25940", "25980", "26140", "26300", 
"26380", "26420", "26580", "26620", "26820", "26900", "26980", 
"27060", "27100", "27140", "27180", "27260", "27340", "27500", 
"27620", "27740", "27780", "27860", "27900", "27980", "28020", 
"28100", "28140", "28420", "28660", "28700", "28740", "28940", 
"29020", "29100", "29180", "29200", "29340", "29404", "29420", 
"29460", "29540", "29620", "29700", "29740", "29820", "29940", 
"30020", "30140", "30300", "30340", "30460", "30620", "30700", 
"30780", "30860", "30980", "31020", "31084", "31140", "31180", 
"31340", "31420", "31460", "31540", "31700", "31740", "31860", 
"31900", "32580", "32780", "32820", "32900", "33124", "33140", 
"33220", "33260", "33340", "33460", "33540", "33660", "33700", 
"33740", "33780", "33860", "33874", "34060", "34100", "34580", 
"34620", "34740", "34820", "34900", "34940", "34980", "35004", 
"35084", "35100", "35154", "35300", "35380", "35614", "35660", 
"35840", "35980", "36084", "36100", "36140", "36220", "36260", 
"36420", "36500", "36540", "36740", "36780", "36980", "37100", 
"37340", "37460", "37620", "37860", "37900", "37964", "38060", 
"38220", "38300", "38340", "38540", "38860", "38900", "38940", 
"39100", "39150", "39300", "39340", "39380", "39460", "39540", 
"39580", "39660", "39740", "39820", "39900", "40060", "40140", 
"40220", "40340", "40380", "40420", "40484", "40580", "40660", 
"40900", "40980", "41060", "41100", "41140", "41180", "41420", 
"41500", "41540", "41620", "41660", "41700", "41740", "41884", 
"41940", "42020", "42034", "42100", "42140", "42200", "42220", 
"42340", "42540", "42644", "42680", "42700", "43100", "43300", 
"43340", "43420", "43580", "43620", "43780", "43900", "44060", 
"44100", "44140", "44180", "44220", "44300", "44420", "44700", 
"44940", "45060", "45104", "45220", "45300", "45460", "45500", 
"45540", "45780", "45820", "45940", "46060", "46140", "46220", 
"46300", "46340", "46520", "46540", "46660", "46700", "47020", 
"47220", "47260", "47300", "47380", "47460", "47580", "47664", 
"47894", "47940", "48060", "48140", "48260", "48300", "48424", 
"48540", "48620", "48660", "48700", "48864", "48900", "49020", 
"49180", "49340", "49420", "49620", "49660", "49700", "49740", 
"AK", "AL", "AR", "AZ", "CA", "CO", "CT", "DC", "DE", "DV_ENC", 
"DV_ESC", "DV_MA", "DV_MT", "DV_NE", "DV_PAC", "DV_SA", "DV_WNC", 
"DV_WSC", "FL", "GA", "HI", "IA", "ID", "IL", "IN", "KS", "KY", 
"LA", "MA", "MD", "ME", "MI", "MN", "MO", "MS", "MT", "NC", "ND", 
"NE", "NH", "NJ", "NM", "NV", "NY", "OH", "OK", "OR", "PA", "PR", 
"RI", "SC", "SD", "TN", "TX", "USA", "UT", "VA", "VT", "WA", 
"WI", "WV", "WY"), class = "factor"), index_nsa = c(100, 100.99, 
101.36, 101.75, 102.39, 102.81), New_Date = structure(c(65L, 
69L, 70L, 71L, 72L, 73L), .Label = c("1975-1", "1975-2", "1975-3", 
"1975-4", "1976-1", "1976-2", "1976-3", "1976-4", "1977-1", "1977-2", 
"1977-3", "1977-4", "1978-1", "1978-2", "1978-3", "1978-4", "1979-1", 
"1979-2", "1979-3", "1979-4", "1980-1", "1980-2", "1980-3", "1980-4", 
"1981-1", "1981-2", "1981-3", "1981-4", "1982-1", "1982-2", "1982-3", 
"1982-4", "1983-1", "1983-2", "1983-3", "1983-4", "1984-1", "1984-2", 
"1984-3", "1984-4", "1985-1", "1985-2", "1985-3", "1985-4", "1986-1", 
"1986-2", "1986-3", "1986-4", "1987-1", "1987-2", "1987-3", "1987-4", 
"1988-1", "1988-2", "1988-3", "1988-4", "1989-1", "1989-2", "1989-3", 
"1989-4", "1990-1", "1990-2", "1990-3", "1990-4", "1991-1", "1991-10", 
"1991-11", "1991-12", "1991-2", "1991-3", "1991-4", "1991-5", 
"1991-6", "1991-7", "1991-8", "1991-9", "1992-1", "1992-10", 
"1992-11", "1992-12", "1992-2", "1992-3", "1992-4", "1992-5", 
"1992-6", "1992-7", "1992-8", "1992-9", "1993-1", "1993-10", 
"1993-11", "1993-12", "1993-2", "1993-3", "1993-4", "1993-5", 
"1993-6", "1993-7", "1993-8", "1993-9", "1994-1", "1994-10", 
"1994-11", "1994-12", "1994-2", "1994-3", "1994-4", "1994-5", 
"1994-6", "1994-7", "1994-8", "1994-9", "1995-1", "1995-10", 
"1995-11", "1995-12", "1995-2", "1995-3", "1995-4", "1995-5", 
"1995-6", "1995-7", "1995-8", "1995-9", "1996-1", "1996-10", 
"1996-11", "1996-12", "1996-2", "1996-3", "1996-4", "1996-5", 
"1996-6", "1996-7", "1996-8", "1996-9", "1997-1", "1997-10", 
"1997-11", "1997-12", "1997-2", "1997-3", "1997-4", "1997-5", 
"1997-6", "1997-7", "1997-8", "1997-9", "1998-1", "1998-10", 
"1998-11", "1998-12", "1998-2", "1998-3", "1998-4", "1998-5", 
"1998-6", "1998-7", "1998-8", "1998-9", "1999-1", "1999-10", 
"1999-11", "1999-12", "1999-2", "1999-3", "1999-4", "1999-5", 
"1999-6", "1999-7", "1999-8", "1999-9", "2000-1", "2000-10", 
"2000-11", "2000-12", "2000-2", "2000-3", "2000-4", "2000-5", 
"2000-6", "2000-7", "2000-8", "2000-9", "2001-1", "2001-10", 
"2001-11", "2001-12", "2001-2", "2001-3", "2001-4", "2001-5", 
"2001-6", "2001-7", "2001-8", "2001-9", "2002-1", "2002-10", 
"2002-11", "2002-12", "2002-2", "2002-3", "2002-4", "2002-5", 
"2002-6", "2002-7", "2002-8", "2002-9", "2003-1", "2003-10", 
"2003-11", "2003-12", "2003-2", "2003-3", "2003-4", "2003-5", 
"2003-6", "2003-7", "2003-8", "2003-9", "2004-1", "2004-10", 
"2004-11", "2004-12", "2004-2", "2004-3", "2004-4", "2004-5", 
"2004-6", "2004-7", "2004-8", "2004-9", "2005-1", "2005-10", 
"2005-11", "2005-12", "2005-2", "2005-3", "2005-4", "2005-5", 
"2005-6", "2005-7", "2005-8", "2005-9", "2006-1", "2006-10", 
"2006-11", "2006-12", "2006-2", "2006-3", "2006-4", "2006-5", 
"2006-6", "2006-7", "2006-8", "2006-9", "2007-1", "2007-10", 
"2007-11", "2007-12", "2007-2", "2007-3", "2007-4", "2007-5", 
"2007-6", "2007-7", "2007-8", "2007-9", "2008-1", "2008-10", 
"2008-11", "2008-12", "2008-2", "2008-3", "2008-4", "2008-5", 
"2008-6", "2008-7", "2008-8", "2008-9", "2009-1", "2009-10", 
"2009-11", "2009-12", "2009-2", "2009-3", "2009-4", "2009-5", 
"2009-6", "2009-7", "2009-8", "2009-9", "2010-1", "2010-10", 
"2010-11", "2010-12", "2010-2", "2010-3", "2010-4", "2010-5", 
"2010-6", "2010-7", "2010-8", "2010-9", "2011-1", "2011-10", 
"2011-11", "2011-12", "2011-2", "2011-3", "2011-4", "2011-5", 
"2011-6", "2011-7", "2011-8", "2011-9", "2012-1", "2012-10", 
"2012-11", "2012-12", "2012-2", "2012-3", "2012-4", "2012-5", 
"2012-6", "2012-7", "2012-8", "2012-9", "2013-1", "2013-10", 
"2013-11", "2013-12", "2013-2", "2013-3", "2013-4", "2013-5", 
"2013-6", "2013-7", "2013-8", "2013-9", "2014-1", "2014-10", 
"2014-11", "2014-12", "2014-2", "2014-3", "2014-4", "2014-5", 
"2014-6", "2014-7", "2014-8", "2014-9", "2015-1", "2015-10", 
"2015-11", "2015-12", "2015-2", "2015-3", "2015-4", "2015-5", 
"2015-6", "2015-7", "2015-8", "2015-9", "2016-1", "2016-10", 
"2016-11", "2016-12", "2016-2", "2016-3", "2016-4", "2016-5", 
"2016-6", "2016-7", "2016-8", "2016-9", "2017-1", "2017-10", 
"2017-11", "2017-12", "2017-2", "2017-3", "2017-4", "2017-5", 
"2017-6", "2017-7", "2017-8", "2017-9", "2018-1", "2018-10", 
"2018-11", "2018-12", "2018-2", "2018-3", "2018-4", "2018-5", 
"2018-6", "2018-7", "2018-8", "2018-9", "2019-1", "2019-2", "2019-3", 
"2019-4", "2019-5", "2019-6", "2019-7", "2019-8"), class = "factor")), row.names = c(NA, 
6L), class = "data.frame")

I've tried tidyr such as spread(data7, key= "New_Date", value= "index_nsa") spread(data7, key= "place_id", value= "index_nsa")

However, neither instance produces the desired result but it does give me a number of errors. There are a few instances where the index_nsa has blanks for certain dates and id's but that's fine, I'd just want them to be NA or 0 in those cases.

Any help to solve this would be appreciated!

phiver
  • 23,048
  • 14
  • 44
  • 56
js80
  • 385
  • 2
  • 11
  • Please show data as output from dput(X) command. Do not use images as no one can use those without retyping. See instructions at top of [tag:r] tag page. – G. Grothendieck Nov 14 '19 at 14:37
  • How do I do this? I tried the dput command on the data frame but it's quite large. – js80 Nov 14 '19 at 14:40
  • Create a *minimal* example that still illustrates the problem rather than showining the original data. `head`or ordinary subsetting may be useful for that. – G. Grothendieck Nov 14 '19 at 14:47
  • Is that any easier? That's the first 5 lines. – js80 Nov 14 '19 at 14:49
  • The output using dput just for those 5 is quite large still... – js80 Nov 14 '19 at 14:50
  • Does this work? `pivot_wider(data7, id_cols = New_Date, names_from = place_id, values_from = index_nsa)` –  Nov 14 '19 at 15:04
  • I am having trouble using that call with my version of tidyr/tidyverse. I installed the devtools but its not showing me pivot_wider – js80 Nov 14 '19 at 15:15
  • Adam, your approach worked to create the columns but the values contained in each are either "NULL" or "". Also, the date index repeats itself which is odd. Any idea what's going on? – js80 Nov 14 '19 at 15:49

1 Answers1

0

First we improve the minimal example giving data8 which has 2 id's (we call the second one X) since there can be special issues with one id. We drop the unused levels in place_id. Now use read.zoo . Its split argument will cause the data frame to be split by that column and its entries will be shown as the column names.

library(xts)

# create an example with two id's
data8 <- rbind(data7, transform(data7, place_id = "X"))
data8 <- transform(data8, place_id = droplevels(place_id))

z <- read.zoo(data8, split = 1, index = 3, FUN = as.yearmon)
x <- as.xts(z)
x

giving:

         DV_ENC      X
Jan 1991 100.00 100.00
Feb 1991 100.99 100.99
Mar 1991 101.36 101.36
Apr 1991 101.75 101.75
May 1991 102.39 102.39
Jun 1991 102.81 102.81
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • The issue is I need to keep the place ID because there are about 400 total in this data set and I want them to expand wider adding a new column for each new place_id while keeping the date index the same. – js80 Nov 14 '19 at 15:27
  • I have improved the example to have two id's so we can see that the id's are shown as the column names. (If were only one Id you can special case it.) – G. Grothendieck Nov 14 '19 at 15:38
  • This is the error I receive: z <- read.zoo(data8, split = 1, index = 3, FUN = as.yearmon) Error in merge.zoo(`10180` = c(107.69, 94.76, 101.03, 100.59, 93.88, 90.91, : series cannot be merged with non-unique index entries in a series In addition: There were 50 or more warnings (use warnings() to see the first 50) – js80 Nov 14 '19 at 15:45
  • Please cut your example down to something small but shows the error and put it in your question. When you do this please make sure that your data does not have unused levels so it is not so large. – G. Grothendieck Nov 14 '19 at 16:11
  • You have it working right for transforming a small piece of the vector. The issue comes up when the next place ID comes up and a new column with its index_nsa values needs to be put into it with the same date index. How large a sample should I include here? – js80 Nov 14 '19 at 16:15
  • And thank you for your help. I greatly appreciate it. – js80 Nov 14 '19 at 16:15
  • The issue comes up when the next unique place_id shows up in the vector. The number of months for each place_id is 344.... Jan 1991 to August 2019. – js80 Nov 14 '19 at 16:20
  • Make the smallest example that illustrates the problem. You may have to put in substantial work to get it down but there is no other way to really know what the problem is. – G. Grothendieck Nov 14 '19 at 16:20
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/202353/discussion-between-js80-and-g-grothendieck). – js80 Nov 14 '19 at 16:24