I am downloading data from census.gov using the R package tidycensus, then reshaping it with spread(). Each GEOID has many variables, each with its own estimate value, but after spreading, most of the resulting columns are filled with NA.

actual data

data after applying spread function

Please help me to correct the data.

Dput:

structure(list(GEOID = c(13001950100, 13001950100, 13001950100, 
13001950100, 13001950100, 13001950100), NAME = c("Census Tract 9501, Appling County, Georgia", 
"Census Tract 9501, Appling County, Georgia", "Census Tract 9501, Appling County, Georgia", 
"Census Tract 9501, Appling County, Georgia", "Census Tract 9501, Appling County, Georgia", 
"Census Tract 9501, Appling County, Georgia"), variable = c("S2401_C01_001", 
"S2401_C01_002", "S2401_C01_003", "S2401_C01_004", "S2401_C01_005", 
"S2401_C01_006"), estimate = c(1406, 271, 54, 54, 0, 0), moe = c(214, 
87, 43, 43, 13, 13)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))
Matt

2 Answers

If you want each ID to be in one row:

library(tidyverse)     
df <- df %>%
     pivot_wider(names_from = variable, values_from = c("estimate", "moe"))
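
For context on why spread() left NA values: the question does not show the exact call, so the sketch below assumes it was spread(variable, estimate). In that case moe stays behind as an identifying column, and rows with different moe values cannot collapse into a single row. Pivoting estimate and moe together avoids this. The object names df, wide_na, and wide_ok are illustrative only.

```r
library(tidyr)

# Sample rows taken from the dput() output in the question
df <- data.frame(
  GEOID = rep(13001950100, 6),
  NAME = rep("Census Tract 9501, Appling County, Georgia", 6),
  variable = paste0("S2401_C01_00", 1:6),
  estimate = c(1406, 271, 54, 54, 0, 0),
  moe = c(214, 87, 43, 43, 13, 13),
  stringsAsFactors = FALSE
)

# Assumed original call: moe is not spread, so it acts as a row identifier
# and each distinct moe value keeps its own row, padded with NA
wide_na <- spread(df, key = variable, value = estimate)

# Spreading estimate and moe together leaves only GEOID and NAME as identifiers
wide_ok <- pivot_wider(df, names_from = variable, values_from = c(estimate, moe))
```

If only the estimates are needed, dropping moe before reshaping (for example with dplyr::select(df, -moe)) should also let the rows collapse.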
Matt
  • Awesome. If this solved your question, feel free to click on the checkmark on the left to show that it's accepted. – Matt Apr 02 '20 at 18:23

An option with dcast

library(data.table)
dcast(setDT(df), GEOID + NAME ~ variable, value.var = c("estimate", "moe"))
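
A self-contained sketch of the same call on the sample rows from the question. It uses as.data.table(), which makes a copy, in place of setDT(), which converts df by reference; that substitution and the name wide are choices made for this example.

```r
library(data.table)

# Sample rows taken from the dput() output in the question
df <- data.frame(
  GEOID = rep(13001950100, 6),
  NAME = rep("Census Tract 9501, Appling County, Georgia", 6),
  variable = paste0("S2401_C01_00", 1:6),
  estimate = c(1406, 271, 54, 54, 0, 0),
  moe = c(214, 87, 43, 43, 13, 13),
  stringsAsFactors = FALSE
)

# One row per GEOID + NAME; one column per (value column, variable) pair
wide <- dcast(
  as.data.table(df),
  GEOID + NAME ~ variable,
  value.var = c("estimate", "moe")
)
```

With more than one value.var, the new column names are prefixed with the value column, for example estimate_S2401_C01_001 and moe_S2401_C01_001.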
akrun