2

I want to parse out the year info from a string like this one

$8995 Apr 18 2008 Honda Civic Hybrid $8995 (Orem) pic map cars & trucks - by owner

Since I retrieve this string online, sometimes the year element is not at the same place. The way I do it is to split the string by space using split function, then check if each node of the array contains only numeric digits.

However when i use the function IsNumeric, it also returns "$8995" node as true as well.

What is a good way to check if a string contains only numbers, no "$", no ".", not anything else?

Or in my situation, is there a better way to retrieve the year information?

Thanks.

TitanTheYaphet
  • 113
  • 1
  • 4
  • 15
  • 1
    You could check Isnumeric and value between 1998 and current year... – Sparky Apr 20 '14 at 01:32
  • I was thinking that what if they only entered like 06 or 07, but i guess it works by check if the value between 00 to 14 or if the value between 60 to 99 – TitanTheYaphet Apr 20 '14 at 01:39
  • @Sparky but it still doesnt solve the problem that what if a car is only $1999 and i cannot tell the price from the year apart. – TitanTheYaphet Apr 20 '14 at 01:41
  • I think regular expressions might be the way to go. You'll probably want to check for multiple patterns in order of likelihood, then pick the first one that matches. This might be useful: http://stackoverflow.com/questions/19481175/general-purpose-udfs-for-using-regular-expressions-in-excel – maybeWeCouldStealAVan Apr 20 '14 at 02:18
  • You also could have difficulty if they didn't include the $ sign. You can isolate all the numbers like you currently do, they write some business logic to "guess" which is the year and which is the price. That business logic might be several rules, first of which is, does one of the numeric fields contain a $ sign. It would be difficult to extract the year only from, what about an Audi 500? – Sparky Apr 20 '14 at 02:34

4 Answers4

4

This can be accomplished as a single line of code, using the Like operator

Function StringIsDigits(ByVal s As String) As Boolean
    StringIsDigits = Len(s) And (s Like String(Len(s), "#"))
End Function
sam-6174
  • 3,104
  • 1
  • 33
  • 34
1

Will it be the case that all the strings with "years" will have substrings that look like dates? If that is the case, you could just cycle through the string looking for the first group of three that looks like a date, extracting the year from that:

Option Explicit
Function FindYear(S As String) As Long
    Dim SS As Variant
    Dim sDate As String
    Dim I As Long, J As Long

SS = Split(S, " ")
For I = 0 To UBound(SS) - 2
    sDate = ""
    For J = 0 To 2
        sDate = " " & sDate & " " & SS(I + J)
    Next J
    sDate = Trim(sDate)
    If IsDate(sDate) Then
        FindYear = Year(sDate)
        Exit Function
    End If
Next I
End Function
Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60
0

WIthout using Regular Expressions or some very complicated logic, it's going to be difficult to be perfect.

This code will return the pure numeric substrings, but in the case of your example it will return "18" and "2008". You could obviously try to add some more logic to disallow "18" (but allow "13" or "09", etc., but like I said that starts getting complicated. I am happy to help with that, but not knowing exactly what you want, I think it's best to leave that up to you for now.

Const str$ = "$8995 Apr 18 2008 Honda Civic Hybrid $8995 (Orem) pic map cars & trucks - by owner"
Option Explicit
Sub FindNumericValues()

Dim var() As String
Dim numbers As Variant

var = Split(str, " ")

numbers = GetNumerics(var)

MsgBox Join(numbers, ",")

End Sub

Function GetNumerics(words() As String) As Variant
Dim tmp() As Variant
Dim i As Integer
Dim n As Integer
Dim word As Variant
Dim bNumeric As Boolean

For Each word In words
    n = 0
    bNumeric = True
    Do While n < Len(word)
    n = n + 1
        If Not IsNumeric(Mid(word, n, 1)) Then
            bNumeric = False
            Exit Do
        End If
    Loop
    If bNumeric Then
        ReDim Preserve tmp(i)
        tmp(i) = word
        i = i + 1
    End If
Next

GetNumerics = tmp
End Function
David Zemens
  • 53,033
  • 11
  • 81
  • 130
0

You could parse the year out using RegEx:

Public Function GetYear(someText As String) As Integer
    With CreateObject("VBScript.RegExp")
        .Global = False
        .MultiLine = False
        .IgnoreCase = True
        .Pattern = " [\d]{4} "
        If .Test(testString) Then
            GetYear = CInt(.Execute(testString)(0))
        Else
            GetYear = 9999
        End If
    End With
End Function

Example code:

Public Const testString As String = "$8995 Apr 18 2008 Honda Civic Hybrid $8995 (Orem) pic map cars & trucks - by owner "

Public Function GetYear(someText As String) As Integer
    With CreateObject("VBScript.RegExp")
        .Global = False
        .MultiLine = False
        .IgnoreCase = True
        .Pattern = " [\d]{4} "
        If .Test(testString) Then
            GetYear = CInt(.Execute(testString)(0))
        Else
            GetYear = 9999
        End If
    End With
End Function

Sub Foo()
    Debug.Print GetYear(testString) '// "2008"
End Sub
SierraOscar
  • 17,507
  • 6
  • 40
  • 68