1

Background

I know how to extract all numbers from an alphanumeric string using functions like TEXTJOIN() and SUBSTITUTE() etc. However, I'd be interested in a non-array entered way and thought it might be done through FILTERXML() as I'm trying to learn a bit more on the topic.

Excel's FILTERXML() function is available from Excel 2013 onwards (not online or Mac) and supports XPATH 1.0. Therefor I tried to implement the double translate() trick as can be found here.


Question

So how would one go about translating all non-numeric characters a string using the double translate() function through XPATH in Excel?

| Sample    | Desired Output |
|-----------|----------------|
| aBc1d5_x9 | 159            |
| 12:2&0    | 1220           |
| ABC123    | 123            |

I tried: =FILTERXML("<t><s>"&A2&"</s></t>","//s[translate(.,translate(.,'0123456789',''),'')]"). However, this returns the exact same value as A2 (nothing changed). Unsupported XPATH functions or faulty syntax normally would return a #VALUE error. So this looks like it is indeed supported yet somewhere in my syntax I went wrong, but I'm out of guesses.

JvdV
  • 70,606
  • 8
  • 39
  • 70

2 Answers2

2

Depending on your version of excel, you can use:

=CONCAT(IF(ISNUMBER(-MID(A2,SEQUENCE(LEN(A2)),1)),MID(A2,SEQUENCE(LEN(A2)),1),""))

enter image description here

I, too, have not been able to figure out how to use translate with FILTERXML. It seems to work OK with the online xpath testers (compatible with 1.0) but doesn't seem to work with Excel.

Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60
  • Thanks for testing Ron. Could this be a bug? Your formula is a neat alternative though. – JvdV May 11 '20 at 07:54
  • 1
    @JvdV I don't know. And I've not found MS documentation on using functions with `FILTERXML/xPath` arguments. But certain xPath 1.0 functions don''t seem to work even though they work ok on this [online xPath tester](https://www.freeformatter.com/xpath-tester.html). In your case, the syntax would be `//s/translate(.,translate(.,'0123456789',''),'')`, but that doesn't work with Excel. --> `#VALUE!` – Ron Rosenfeld May 11 '20 at 10:22
  • Playing around with `translate` lead me to believe we can only use `FILTERXML` to *select* nodes from XML. Transformation of data seems to not be allowed =( – JvdV May 11 '20 at 13:53
1

Disclaimer: This answer is not about XPath, but provides a one-cell, non-array entered solution to the problem.

With Office 365 there no longer is a need to array-enter formulas. With the new Filter() function, everything can be done in one cell.

Using the new Let() function (currently in Insider preview), the formula can be a lot shorter than previously required.

=LET(MyText,MID(A1,ROW(INDIRECT("A1:A"&LEN(A1))),1),
TEXTJOIN("",TRUE,FILTER(MyText,(CODE(MyText)<=57)*(CODE(MyText)>=48))))+0

enter image description here

If you don't yet have LET(), use the longer version.

=TEXTJOIN("",TRUE,FILTER(MID(A1,ROW(INDIRECT("A1:A"&LEN(A1))),1),(CODE(MID(A1,ROW(INDIRECT("A1:A"&LEN(A1))),1))<=57)*(CODE(MID(A1,ROW(INDIRECT("A1:A"&LEN(A1))),1))>=48)))+0

This will also work in Excel online and mobile devices.

If you want the result to be text, remove the +0 at the end of the formula.

teylyn
  • 34,374
  • 4
  • 53
  • 73