Not so -- in Stata (and I would be surprised at a problem in R, but others must speak to that).
Missing observations -- in this context, and any similar one, better called absent -- are not a problem. Here's a demonstration. merge is smart enough to notice gaps and make them explicit as missings. You could "fix" them yourself ahead of the merge, but that is pointless.
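* using dataset: both years for each state
clear
input state year y
1 2019 1
1 2020 2
2 2019 3
2 2020 4
end
save tomerge
* master dataset: 2019 only, so the 2020 observations are absent
clear
input state year x
1 2019 42
2 2019 84
end
* 1:1 on state and year; absent observations become explicit missings
merge 1:1 state year using tomerge
list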
Results
. merge 1:1 state year using tomerge
    Result                      Number of obs
    -----------------------------------------
    Not matched                             2
        from master                         0  (_merge==1)
        from using                          2  (_merge==2)

    Matched                                 2  (_merge==3)
    -----------------------------------------
.
. list
     +----------------------------------------+
     | state   year    x   y           _merge |
     |----------------------------------------|
  1. |     1   2019   42   1      Matched (3) |
  2. |     2   2019   84   3      Matched (3) |
  3. |     1   2020    .   2   Using only (2) |
  4. |     2   2020    .   4   Using only (2) |
     +----------------------------------------+
Otherwise put, 1:1 as syntax specifies the overall pattern and doesn't rule out 0:1 or 1:0 matches. merge will in effect append observations if identifiers don't match at all. You do need the key variables to exist under identical names in both datasets.
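To illustrate the last point, here is a minimal sketch of the extreme case in which no identifiers match at all. The data values and the tempfile name nomatch are made up for illustration; run it as a do-file so that the local macro created by tempfile stays in scope.

```stata
* Sketch only: invented data in which no state-year pair is shared
clear
input state year y
3 2019 5
3 2020 6
end
tempfile nomatch
save `nomatch'

clear
input state year x
1 2019 42
2 2019 84
end

* Same 1:1 syntax as before; nothing matches, so the result is in effect an append
merge 1:1 state year using `nomatch'

* No observation should be matched, and all 2 + 2 observations should be kept
assert _merge != 3
assert _N == 4
list, sepby(_merge)
```

The master-only observations should show y as missing and the using-only observations should show x as missing, just as append would leave them.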