8

In an MS Access 2007 project report, I have the following (redacted) query:

SELECT SomeCol FROM SomeTable

The problem is, that SomeCol apparently contains some invisible characters. For example, I see one result returned as 123456 but SELECT LEN(SomeCol) returns 7. When I copy the result to Notepad++, it shows as ?123456.

The column is set to TEXT. I have no control over this data type, so I can't change it.

How can I modify my SELECT query to strip out anything non-numeric. I suspect RegEx is the way to go... alternatively, is there a CAST or CONVERT function?

Danny Beckett
  • 20,529
  • 24
  • 107
  • 134

4 Answers4

11

You mentioned using a regular expression for this. It is true that Access' db engine doesn't support regular expressions directly. However, it seems you are willing to use a VBA user-defined function in your query ... and a UDF can use a regular expression approach. That approach should be simple, easy, and faster performing than iterating through each character of the input string and storing only those characters you want to keep in a new output string.

Public Function OnlyDigits(ByVal pInput As String) As String
    Static objRegExp As Object

    If objRegExp Is Nothing Then
        Set objRegExp = CreateObject("VBScript.RegExp")
        With objRegExp
            .Global = True
            .Pattern = "[^\d]"
        End With
    End If
    OnlyDigits = objRegExp.Replace(pInput, vbNullString)
End Function

Here is an example of that function in the Immediate window with "x" characters as proxies for your invisible characters. (Any characters not included in the "digits" character class will be discarded.)

? OnlyDigits("x1x23x")
123

If that is the output you want, just use the function in your query.

SELECT OnlyDigits(SomeCol) FROM SomeTable;
HansUp
  • 95,961
  • 11
  • 77
  • 135
  • After seeing your answer, I agree it's better, since it doesn't require every character to be looped through (useful when theres up to 9 characters per string, and 9,500 rows). Thanks! – Danny Beckett Sep 25 '12 at 11:07
2

There is no RegEx in Access, at least not in SQL. If you venture to VBA, you might as well use a custom StripNonNumeric VBA function in the SQL statement.

e.g. SELECT StripNonNumeric(SomeCol) as SomeCol from SomeTable

Function StripNonNumeric(str)
      keep = "0123456789"
      outstr = ""
      For i = 1 to len(str)
          strChar = mid(str,i,1)
          If instr(keep,strChar) Then
              outstr = outstr & strChar
          End If
      Next
      StripNonNumeric = outstr
  End Function
RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
  • This is perfect! Just for anyone else who happens upon this question. Simply go to the Create tab, and where it says Macro, press the arrow and select Module. Copy this code into the new module, save and close. Then use as `SELECT StripNonNumeric(SomeCol) FROM SomeTable`. Very happy, thank you Richard! – Danny Beckett Sep 24 '12 at 11:44
  • 2
    +1 As you say **RegEx** isn't available in SQL but can be used in Access VBA. As a slight change although I don't imagine it would either increase or decrease the run time, you could always drop the **keep** string and use the built in **IsNumeric** function instead. – Matt Donnan Sep 24 '12 at 11:52
0

You can do it all in a query, combining this question with your previous question, you get:

SELECT IIf(IsNumeric([atext]),
           IIf(Len([atext])<4,Format([atext],"000"),
               Replace(Format(Val([atext]),"#,###"),",",".")),
           IIf(Len(Mid([atext],2))<4,Format(Mid([atext],2),"000"),
               Replace(Format(Val(Mid([atext],2)),"#,###"),",","."))) AS FmtNumber
FROM Table AS t;
Community
  • 1
  • 1
Fionnuala
  • 90,370
  • 7
  • 114
  • 152
  • `SELECT VAL(SomeCol) FROM SomeTable WHERE SomeCol = '123456'` returns `0` (where there is some invisible character before the 1 - the WHERE by itself works) – Danny Beckett Sep 24 '12 at 12:03
  • The first `IIf(IsNumeric([atext]` means that `val` is applied to the text if the text does not have an odd start character, `val` is applied to `Mid([atext],2)`, that is, remove the start character, when the text does have an odd, non-numeric, start character. The above sql was tested. – Fionnuala Sep 24 '12 at 12:06
  • And if the character is not at position 1? What then? It might be in the middle of the string, e.g. – Danny Beckett Sep 24 '12 at 12:18
0
Public Function fExtractNumeric(strInput) As String
' Returns the numeric characters within a string in
' sequence in which they are found within the string
Dim strResult As String, strCh As String
Dim intI As Integer
If Not IsNull(strInput) Then
    For intI = 1 To Len(strInput)
        strCh = Mid(strInput, intI, 1)
        Select Case strCh
            Case "0" To "9"
                strResult = strResult & strCh
            Case Else
        End Select
    Next intI
End If
fExtractNumeric = strResult

End Function

Masum
  • 119
  • 11