I think the title speaks for itself.
But, to give an example: in a recent post, 37signals shows its real downtime and compares it with other web services. They have very little downtime, and most companies probably don't achieve that. But to measure it you would need a bulletproof monitoring system with 100% uptime, or at least some kind of heuristic to approximate that. In their case they use Pingdom, but any similar service should be capable of doing the same.
So, how do they do that? Do they keep 2 or 3 servers polling the site and take an average, discounting their own downtime? Is it trivial or complex?
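To make my question concrete, here is the kind of thing I imagine (a minimal sketch in Python; the health-check URL, the number of probes, and the quorum rule are all my own assumptions, not how Pingdom actually works):

```python
import urllib.error
import urllib.request

# In a real setup each "probe" would run on a separate machine in a different
# network/region; here they are just simulated as repeated checks.
TARGET_URL = "https://example.com/health"   # hypothetical health endpoint
NUM_PROBES = 3

def single_probe(url, timeout=5):
    """One probe's verdict: True = up, False = down, None = probe itself failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except urllib.error.URLError:
        return False            # target unreachable from this probe
    except Exception:
        return None             # probe-side problem: produce no data point

def check_once():
    """Aggregate the probes' votes and discard intervals with too few answers."""
    votes = [single_probe(TARGET_URL) for _ in range(NUM_PROBES)]
    valid = [v for v in votes if v is not None]
    if len(valid) < 2:
        return None             # monitoring itself degraded: exclude this interval
    return sum(valid) * 2 > len(valid)   # strict majority says "up"
```

The idea would be that intervals where the monitoring itself is degraded (`None`) get excluded from the uptime denominator instead of silently counting as "up", which is how the monitor could discount its own downtime. Is that roughly what happens?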
P.S.: A better definition of "precision" here would be measuring without mistakes, i.e. without missing any downtime. If the service is down, you know it, 100% of the time. Otherwise you end up with a biased measurement.
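To make that bias concrete, a rough back-of-the-envelope calculation with made-up numbers:

```python
# Hypothetical figures: how a monitor's own downtime biases the result
true_uptime    = 0.999   # the service is really up 99.9% of the time
monitor_uptime = 0.99    # the monitor itself is only up 99% of the time

# If the intervals the monitor misses are silently counted as "up",
# only monitor_uptime of the real downtime is ever observed:
true_downtime     = 1 - true_uptime
observed_downtime = true_downtime * monitor_uptime
reported_uptime   = 1 - observed_downtime

print(f"true uptime:     {true_uptime:.4%}")      # 99.9000%
print(f"reported uptime: {reported_uptime:.4%}")  # 99.9010% -- optimistically biased
```

The error is small here, but it always points in the flattering direction, which is exactly the kind of bias I mean.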