I'm looking into using a CASS-Certified address validation service to correct user-provided street addresses at the time of entry. (Specifically, I'm looking at SmartyStreets' LiveAddress.) However, USPS dictates that a correct address must be in all caps, so CASS services almost uniformly return addresses that way. When mailing to the client at that address, though, it would be preferable to use a more humane, conventional casing.
The question, of course, is how to make that happen. I know there's no such thing as a perfect solution that doesn't involve a complete nationwide database of correctly capitalized street and city names. A set of passable heuristics might be good enough, though, since we will probably be kicking the corrected address back to the user, ultimately leaving it up to them.
A short list of problems that I was able to come up with after a few minutes of thought:
- `SW FIRST ST` should be `SW First St`, not `Sw First St`.
- `MCDOUGLE ST` should be `McDougle St`, not `Mcdougle St`.
- `MACDOUGLE ST` should probably be `Macdougle St` rather than `MacDougle St`, since `Macoroni Rd` should usually not be `MacOroni Rd`.
- `1ST ST` should be `1st St`, not `1St St`.
- Without knowing whether a street name is based on a surname, we probably can't safely turn `VAN` into `van`, but `VON` can probably become `von`.
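To make the cases above concrete, here is a minimal sketch of the kind of per-word heuristic I have in mind. It is written in Python only for brevity, with the intent of porting it to C#. The names `humanize_word` and `humanize_address` and both word lists are my own illustrative assumptions, not part of LiveAddress or any existing library, and the lists are deliberately incomplete.

```python
import re

# Hypothetical, incomplete word lists; these are assumptions for illustration.
DIRECTIONALS = {"N", "S", "E", "W", "NE", "NW", "SE", "SW"}
LOWERCASE_PARTICLES = {"VON"}  # VAN is left out on purpose: too ambiguous

ORDINAL = re.compile(r"^\d+(ST|ND|RD|TH)$")   # 1ST, 22ND, 3RD, 44TH
MC_PREFIX = re.compile(r"^MC[A-Z]{2,}$")      # MCDOUGLE, but not bare "MC"


def humanize_word(word: str) -> str:
    """Apply casing heuristics to one upper-case token of an address."""
    if word in DIRECTIONALS:
        return word                            # SW stays SW
    if ORDINAL.match(word):
        return word.lower()                    # 1ST -> 1st, never 1St
    if word in LOWERCASE_PARTICLES:
        return word.lower()                    # VON -> von
    if MC_PREFIX.match(word):
        return "Mc" + word[2:].capitalize()    # MCDOUGLE -> McDougle
    # Default: first letter upper, rest lower. MAC is intentionally not
    # special-cased, so MACDOUGLE -> Macdougle and MACORONI -> Macoroni.
    return word.capitalize()


def humanize_address(line: str) -> str:
    """Re-case a whitespace-separated, all-caps address line."""
    return " ".join(humanize_word(w) for w in line.split())
```

With these rules, `humanize_address("SW FIRST ST")` gives `SW First St` and `humanize_address("MCDOUGLE ST")` gives `McDougle St`. The sketch ignores hyphens, apostrophes, and unit designators such as `4B`, so those would still need their own rules.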
Are there any existing libraries that could at least get me started? Addresses are complicated and fickle things, and I'd rather not home-brew the whole thing if I don't have to. I'm using C#, but I'm open to porting code from another language.
Barring that, does anyone have a decent reference for common capitalization exceptions in street and/or city names?