Let's say you have a data set that looks like this:
          Vietnam  Gulf War  Iraq War
veteran1        1         0         0
veteran2        0         1         0
veteran3        0         0         1
veteran4        0         1         1   # <---- Note this row
You want to consolidate these columns into a single column, without affecting the other columns in the data frame, like so:
          Service
veteran1        1
veteran2        2
veteran3        3
veteran4        2   # <---- Note this row
Where 1 = Vietnam, 2 = Gulf War, 3 = Iraq War.
- If a veteran has served in two or more wars, only one should be picked (as is the case with veteran4, where the left-most column was picked).
- There are many other columns in the data frame, and they shouldn't be affected by any of this.
Question:
How would you do this in R?
(Note: if it's easier to do in some other free open source program, please feel free to share which program and how you would do it. This is a massive dataset: 3 million rows, the American Community Survey.)
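To make the intended rule concrete, here is a minimal sketch in base R on a toy version of the data above. It relies on `max.col()` with `ties.method = "first"`, which returns the index of the left-most maximum in each row. The column names `Gulf.War` and `Iraq.War`, the stand-in column `other`, and the choice to give `NA` to veterans with no service in any of the three columns are assumptions for illustration only; they are not taken from the actual survey data. Dropping the three original columns afterwards is likewise an assumption about what "consolidate" should mean.

```r
# Toy data mirroring the example; `other` is a hypothetical stand-in
# for the many columns that must stay untouched.
df <- data.frame(
  Vietnam  = c(0, 0, 0, 0) + c(1, 0, 0, 0),
  Gulf.War = c(0, 1, 0, 1),
  Iraq.War = c(0, 0, 1, 1),
  other    = c("a", "b", "c", "d"),
  row.names = paste0("veteran", 1:4)
)

# The columns to consolidate, in left-to-right priority order.
war_cols <- c("Vietnam", "Gulf.War", "Iraq.War")
m <- as.matrix(df[war_cols])

# For a 0/1 matrix, the first maximum in a row is the left-most 1.
service <- max.col(m, ties.method = "first")

# Assumption: a row with no 1 at all gets NA rather than a code.
service[which(rowSums(m) == 0)] <- NA

# Add the consolidated column and drop the three originals (assumption).
df$Service <- service
df <- df[setdiff(names(df), war_cols)]

print(df)
```

Because `max.col()` works on the whole matrix at once rather than row by row, this approach should remain practical for a few million rows, though that has not been measured here.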