First time posting, mainly because I got tired of banging my head against the wall.
Thanks in advance for looking at this.
I have a data frame that looks like this:
state city x y z
1 OR Portland 8 10 1
2 OR Portland 8 10 4
3 OR Portland 8 10 10
4 NY New York 29 15 10
5 NY New York 29 15 18
6 NJ Trenton 8 10 50
7 NJ Trenton 8 10 60
8 NJ Trenton 8 10 70
9 WA Seattle 1 70 6
10 WA Seattle 1 70 7
11 WA Seattle 1 70 8
12 WA Seattle 1 70 9
13 WA Seattle 1 70 10
14 WA Seattle 1 70 11
I have been trying to reshape it to look like this:
state city x y z.1 z.2 z.3 z.4 z.5 z.6
OR Portland 8 10 1 4 10
NY New York 29 15 10 18
NJ Trenton 8 10 50 60 70
WA Seattle 1 70 6 7 8 9 10 11
I have been using the package reshape2 and the code looks like this:
df <- melt(data,id.vars = c("state","city","x","y"),measure.vars = "z")
wide <- dcast(df, state + city + x + y ~ variable)
Which returns a count of variable z for each set of id.vars.
I also tried this:
wide <- dcast(df, state + city + x + y ~ value)
Which looks like this:
state city x y 1 4 6 7 etc...
OR Portland 8 10 1 1 0 0
NY New York 29 15 0 0 0 0
NJ Trenton 8 10 0 0 0 0
WA Seattle 1 70 0 0 1 1
This is closer to what I'm looking for but would be very difficult to use for looking up information.
Tell me if I'm wrong, but it looks like I need an id variable for each duplicate value of state, city, x, y.
I haven't been able to think up or find anything that will allow me to create column that will number duplicate values like below.
state city x y z num
1 OR Portland 8 10 1 1
2 OR Portland 8 10 4 2
3 OR Portland 8 10 10 3
4 NY New York 29 15 10 1
5 NY New York 29 15 18 2
6 NJ Trenton 8 10 50 1
7 NJ Trenton 8 10 60 2
8 NJ Trenton 8 10 70 3
9 WA Seattle 1 70 6 1
10 WA Seattle 1 70 7 2
11 WA Seattle 1 70 8 3
12 WA Seattle 1 70 9 4
13 WA Seattle 1 70 10 5
14 WA Seattle 1 70 11 6
I would appreciate any help or an idea of where to keep looking for solutions.
Best,
-n