2

My data is in Excel. I have several sheets of data where the address is always in the same column on every sheet. Examples of the address formats include:

1155 15th Street NW Suite 600 Washington, DC 20005 US
4600 Emperor Blvd #200 Durham, NC 27703-8577 US
200 Stevens Drive Philadelphia, PA 19113 US
505 City Parkway West Orange, CA 92868 US
550 S Caldwell St, Charlotte, NC 28202-2633 US
1643 NW 136th Ave Ste H200 Sunrise, FL 33323-2857 US

I have tried the code below, but get an error at this point in the code "sCity = Trim(Mid$(rCell.Value, Len(sAddress) + 1, lStatePos - Len(sAddress) - 1))"

Can anyone help me figure out how to resolve this issue?

Sub SplitAddresses()

    Dim vaStates As Variant
    Dim vaStreets As Variant
    Dim i As Long
    Dim rCell As Range
    Dim sAddress As String
    Dim sCity As String, sState As String
    Dim sZip As String
    Dim lStreetPos As Long, lStatePos As Long

    vaStates = Array(“ AL “, “ AK “, “ AZ “, “ AR “, “ CA “, “ CO “, “ CT “, “ DE “, “ DC “, “ FL “, “ GA “, “ HI “, “ ID “, “ IL “, “ IN “, “ IA “, “ KS “, “ KY “, “ LA “, “ ME “, “ MD “, “ MA “, “ MI “, “ MN “, “ MS “, “ MO “, “ MT “, “ NE “, “ NV “, “ NH “, “ NJ “, “ NM “, “ NY “, “ NC “, “ ND “, “ OH “, “ OK “, “ OR “, “ PA “, “ RI “, “ SC “, “ SD “, “ TN “, “ TX “, “ UT “, “ VT “, “ VA “, “ WA “, “ WV “, “ WI “, “ WY “, “ GU “, “ PR “)
    vaStreets = Array(" CR ", " BLVD ", " RD ", " ST ", " AVE ", " CT ")

    For Each rCell In Sheet1.Range("A1:A5").Cells
        sAddress = "": sCity = "": sZip = "": sState = ""
        For i = LBound(vaStreets) To UBound(vaStreets)
            lStreetPos = InStr(1, rCell.Value, vaStreets(i))
            If lStreetPos > 0 Then
                sAddress = Trim(Left$(rCell.Value, lStreetPos + Len(vaStreets(i)) - 1))
                Exit For
            End If
        Next i

        For i = LBound(vaStates) To UBound(vaStates)
            lStatePos = InStr(1, rCell.Value, vaStates(i))
            If lStatePos > 0 Then
                sCity = Trim(Mid$(rCell.Value, Len(sAddress) + 1, lStatePos - Len(sAddress) - 1))
                sState = Trim(Mid$(rCell.Value, lStatePos + 1, Len(vaStates(i)) - 1))
                sZip = Trim(Mid$(rCell.Value, lStatePos + Len(vaStates(i)), Len(rCell.Value)))
                Exit For
            End If
        Next i

        rCell.Offset(0, 1).Value = "'" & sAddress
        rCell.Offset(0, 2).Value = "'" & sCity
        rCell.Offset(0, 3).Value = "'" & sState
        rCell.Offset(0, 4).Value = "'" & sZip

    Next rCell

End Sub

This is the error I get: error_image

Kristan
  • 23
  • 4
  • maybe the error is coming from mid$? shouldnt it just be Mid? or if it is a formula, maybe you want to put those in double quotation mark. EX: "=TRIM (...) -1)) " – victor song Oct 13 '20 at 16:28
  • 1
    Mid$ is a typed function and is fine. You need to change the quotes back to the right type as they should be "" not ““ – QHarr Oct 13 '20 at 16:38
  • 2
    With the examples you have given, and changing the quotes, code runs fine but doesn't look like logic is right. – QHarr Oct 13 '20 at 16:48
  • 1
    This is not something I would try on my own. Rather, I would lean towards a third-party library. There are a number of questions on this site concerning this topic. – Brian M Stafford Oct 13 '20 at 17:45
  • I have posted the error that I get once the quotes have been corrected. – Kristan Oct 13 '20 at 19:35

2 Answers2

1

There are some inconstancies in your splitting logic, not counting that you'd have to compare your uppercase street array also with Ucase() string values.

Good news, however - as you seem to apply a consequent address logic, i.e. grouping city, state + zip around a last colon delimiter, you could try the following code:

Option Explicit             ' declaration head of code module
Enum c                      ' define column constants
    [_Start] = 0
    add1
    City
    State
    Zip
End Enum

Sub SplitAddresses()
With Sheet1
    'define dataset
    Dim lastRow As Long: lastRow = .Range("A" & .Rows.Count).End(xlUp).Row
    Dim rng As Range: Set rng = .Range("A2:A" & lastRow)
    'assign to variant datafield array (provide for 4 columns: Add+City+State+ZIP)
    Dim v: v = rng.Resize(columnsize:=4).Value2
    'split data
    doSplit v
    'write split results to any target, e.g. B:B
    .Range("B2").Resize(UBound(v), 4) = v
End With
End Sub

Help procedure doSplit

Sub doSplit(data)
Dim i As Long
For i = LBound(data) To UBound(data)
    Dim curAddress As String: curAddress = data(i, c.add1)
    
    Dim tokens, tmp
    tokens = Split(curAddress, ",")
    
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    'a) analyze string part after last ","
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    tmp = Split(Trim(tokens(UBound(tokens))) & " ", " ", 2)
    'aa) add State + Zip (to columns 3..4)
    data(i, c.State) = tmp(0): data(i, c.Zip) = tmp(1)
    
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    'b) analyze first string part
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    tmp = Split(tokens(UBound(tokens) - 1), " ")
    'data(i, c.City) = tmp(UBound(tmp))   '<< only for 1-word city names
     data(i, c.City) = getCity(tmp)       '<< see edit below
    'bb) add City + Address
    data(i, c.add1) = Split(curAddress, data(i, c.City), 2)(0)
    data(i, c.add1) = Replace(data(i, c.add1), ",", "")
Next i
End Sub

Help function // Edit due to @RonRosenfeld's comment

As there will be city names consisting of compound words, the city string assignment in above sub has to be changed from data(r, c.City) = tmp(UBound(tmp)) to

    data(r, c.City) = getCity(tmp)  ' << function call

Function getCity()

Includes checks for common first parts as "North", "West" or "New" to avoid at least to check an exhaustive list with compound city names. All other needed city names with more than one word have to be defined in an additional list cities:

Function getCity(tmp) As String
'Purp.: return valid city names of either one or two parts
'[1]Definitions
    'a) List common first parts of city names like "West" in "West Orange"
        Dim common$: common = "North,West,South,East,Grand,New"
    'b) List all other needed cities consisting of compound words
        Dim cities$: cities = "Sterling Heights,Ann Arbor"
'[2]Get potential city name
    'a) Define tmp indices of potential city tokens
        Dim first&: first = UBound(tmp) - 1
        Dim secnd&: secnd = UBound(tmp)
    'b) Build city name as compound string of tmp tokens
        Dim City As String
        City = Trim(IIf(first < 0, "", tmp(first) & " ") & tmp(secnd))
'[3]Check common first parts plus additional cities list
    'a) Check for common name parts like e.g. "West" in "West Orange"
        If InStr(common & ",", tmp(first) & ",") Then getCity = City: Exit Function
    'b) Check rest in listed cities and return function result
        getCity = IIf(InStr(cities, City) > 0, City, tmp(secnd))
End Function

includes 2-word city names

T.M.
  • 9,436
  • 3
  • 33
  • 57
  • 1
    I have a problem with city names that consist of more than one word. For example, without a list, how can one tell if, on line 5, the City is `Orange` or `West Orange` – Ron Rosenfeld Oct 13 '20 at 19:41
  • Appreciate your hint, please see edit with additional function. - @RonRosenfeld – T.M. Oct 14 '20 at 09:19
  • @T.M. This definitely looks promising. I get an error message "Subscript out of range (Error 9)" at this line ``` tmp = Split(Trim(tokens(UBound(tokens))) & " ", " ", 2)``` – Kristan Oct 14 '20 at 16:06
  • 1
    @Kristan Can you tell me the current string which raises the issue? - Possibly you have *empty* cells between the first and the last row. If so enclose the entire a) and b) blocks within the `doSplit()` procedure with the condition `If UBound(tokens) > 0 Then` ... `End If` (i.e. the `End If` line just before `Next i`) – T.M. Oct 14 '20 at 16:55
  • @T.M. This correction works. I no longer get the error. However, cities like New York, Baton Rouge, Salt Lake City, etc. come back split after the first word. The data actually has a hard return before each city name. There are other hard returns in the cell, so we would need to use the final hard return to delineate the street address from the city. Any updates to the code would be appreciated! – Kristan Oct 15 '20 at 21:42
  • As there seem to be return characters included, see @RonRosenfeld 's answer. Of course you could develop mine, too by adding replacements. – T.M. Oct 16 '20 at 06:54
1

With your comment that there is a return character to delineate the street address from the city, and the regular format of the addresses: street|City, State Zip Country the algorithm becomes much simpler as a series of Split functions can separate the address parts.

I also used a Type statement -- not necessary but makes the code clearer, IMO. Depending on the formatting, some of the Trim statements may not be necessary, but they won't hurt.

Note that you can change the ranges/sheets of your data Source and Results location to suit your specific requirements.

EDIT: I just read your comment that there might be multiple returns prior to the return setting off the city from the street address.

Code for .street altered accordingly

Option Explicit
Type Address
    street As String
    city As String
    state As String
    zip As String
    country As String
End Type
Sub splitAddresses()
    Dim wsSrc As Worksheet, wsRes As Worksheet, rRes As Range
    Dim vSrc As Variant, vRes As Variant
    Dim myAdr As Address
    Dim v, w, x, y
    Dim I As Long
    
Set wsSrc = Worksheets("sheet1")

'read into vba array for faster processing
With wsSrc
    vSrc = .Range(.Cells(2, 1), .Cells(.Rows.Count, 1).End(xlUp))
End With

Set wsRes = Worksheets("Sheet1")
    Set rRes = wsRes.Cells(1, 3)


ReDim vRes(0 To UBound(vSrc), 1 To 5)

'Headers
    vRes(0, 1) = "Street"
    vRes(0, 2) = "City"
    vRes(0, 3) = "State"
    vRes(0, 4) = "Zip"
    vRes(0, 5) = "Country"
    
For I = 1 To UBound(vSrc)
    v = Split(vSrc(I, 1), vbLf)
    With myAdr
        y = v
        ReDim Preserve y(UBound(y) - 1)
        .street = WorksheetFunction.Trim(Join(y, " "))

    w = Split(Trim(v(UBound(v))), ",")
        .city = w(0)
    
    x = Split(Trim(w(1)))
        .state = Trim(x(0))
        .zip = Trim(x(1))
        .country = Trim(x(2))
    
    vRes(I, 1) = .street
    vRes(I, 2) = .city
    vRes(I, 3) = .state
    vRes(I, 4) = .zip
    vRes(I, 5) = .country
End With

Set rRes = rRes.Resize(rowsize:=UBound(vRes, 1) + 1, columnsize:=UBound(vRes, 2))
With rRes
    .EntireColumn.Clear
    .Value = vRes
    .Rows(1).Font.Bold = True
    .Columns(4).NumberFormat = "@"
    .EntireColumn.AutoFit
End With
    
Next I

End Sub

enter image description here

Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60
  • 1
    See recent edit to handle possibility of there being multiple returns prior to the one separating the street address from the city – Ron Rosenfeld Oct 16 '20 at 13:02
  • This looks like it will be helpful. I am however getting an error at the first Set. My data is in column G starting at cell G5. It is the same in all of the worksheets in the workbook. @Ron Rosenfeld – Kristan Oct 19 '20 at 16:28
  • @Kristan *"Note that you can change the ranges/sheets of your data Source and Results location to suit your specific requirements."* Review the changes that you made in the code to adapt it to your own setup. – Ron Rosenfeld Oct 19 '20 at 16:58
  • My apologies. I'm saying that I do not know how to adjust the code to have it apply to all sheets for the range of Column G. Can you help me with that? – Kristan Oct 19 '20 at 18:44
  • No, I do not. There are 23 worksheets in the workbook. All data is in Column G and begins on row 5. – Kristan Oct 20 '20 at 15:12
  • @Kristan First get it working on one sheet by setting the wsSrc, vSrc and rRes to the correct worksheets and ranges. – Ron Rosenfeld Oct 20 '20 at 17:12
  • I'm sorry. I am a beginner with VBA. I've tried to put my values in for my first worksheet "Behavioral Health" and my range "G5:G" and none of it is working correctly for me. – Kristan Oct 20 '20 at 18:47
  • @Kristan Hmmm. The code you provided looks like it was done by more than a beginner. You need to change the worksheet names and range addresses in the code. A worksheet name looks like "sheet1" and the ranges are defined by code that looks like `.Cell(rowNum,colNum)`. So you would replace `sheet1` with the name of your worksheet, and also change the row and column designator in the Cells property to reflect your location. Take a look at VBA help for the Range object and Cells property to better understand. – Ron Rosenfeld Oct 20 '20 at 18:55
  • I'm getting errors when I update the code as follows: `Set wsSrc = Worksheets("Behavioral Health") Set rRes = Range("G5:G") 'read into vba array for faster processing With wsSrc vSrc = .Range(.Cells(2, 1), .Cells(.Rows.Count, 1).End(xlUp)) End With Set wsRes = Worksheets("Behavioral Health") Set rRes = wsRes.Cells(1, 3)` – Kristan Oct 23 '20 at 16:08
  • @Kristan `wsSrc` & `wsRes` are correct. Since your data starts in `G5`, the range you need to read into `vSrc` has to start at `G5` (which could be represented as `Cells(5,7)`. So `vSrc = .Range(.Cells(5, 7), .Cells(.Rows.Count, 7).End(xlUp))`. And, since you don't want to overwrite your data with your results, if you are writing on the same worksheet, you need to set `rRes` to an appropriate location. e.g: `Set rRes = wsRes.Cells(1, 1)` would start the Results output in `A1`. If you wrote it to a different worksheet, you would change `wsRes` to reflect the name of the results worksheet. – Ron Rosenfeld Oct 23 '20 at 18:36
  • I get an error at `Type Address` with code update: `Sub splitAddresses() Dim wsSrc As Worksheet, wsRes As Worksheet, rRes As Range Dim vSrc As Variant, vRes As Variant Dim myAdr As Address Dim v, w, x, y Dim I As Long Set wsSrc = Worksheets("Behavioral Health") Set rRes = Range("G5:G") 'read into vba array for faster processing With wsSrc vSrc = .Range(.Cells(5, 7), .Cells(.Rows.Count, 7).End(xlUp)) End With Set wsRes = Worksheets("Behavioral Health") Set rRes = wsRes.Cells(5, 8) ReDim vRes(0 To UBound(vSrc), 1 To 5)` – Kristan Oct 23 '20 at 19:58
  • @Kristan `I get an error` is fairly useless information. Is there some reason you don't want to indicate what the error is? Also, I don't understand why you have `Set rRes = Range("G5:G")`. What are you expecting to accomplish with that line? – Ron Rosenfeld Oct 23 '20 at 21:19
  • I get a compile error: cannot define a public user-error defined type within an object module at `Type Address`. I just removed `Set rRes = Range("G5:G")` from the code and get the same error as before. – Kristan Oct 26 '20 at 15:28
  • @Kristan Sounds like you may have your code in the wrong place. Ordinary VBA code should be be going into **regular** modules. Worksheet, Workbook and Class modules are generally reserved for specific uses -- and this is NOT one of them. – Ron Rosenfeld Oct 26 '20 at 17:43