2

The matrix management feature in Google Spreadsheets is convenient and combined with functions that handle regular expressions like REGEXMATCH and REGEXEXTRACT, among others, makes it especially useful.

I have a case in which I do not know what is due that does not give the expected result. Here's what I'm trying to do:

Spreadsheet Settings
Regional Settings: Mexico, use . (dot) as decimal separator.

Entry
A1: abcde

Formula
B1: =ArrayFormula(REGEXEXTRACT(A1,{".{1}",".{2}"}))

Expected result
B1: a
B2: ab

Obtained result
B1: a
B2:

Known workaround
=ArrayFormula(TRANSPOSE(REGEXEXTRACT(A1,{".{1}";".{2}"})))

This question also has being posted on the Spanish site -> https://es.stackoverflow.com/q/55704/65

Rubén
  • 34,714
  • 9
  • 70
  • 166
  • Related: http://stackoverflow.com/q/26137115/1595451. See the comment at the end of the [answer](http://stackoverflow.com/a/26138109/1595451) by AdamL. – Rubén Mar 16 '17 at 05:44

2 Answers2

4

Quoting Jean-Pierre Verhulst on a similar case in REGEXEXTRACT Array Mysteriously Stopped Working Today (Google Docs Help Forum):

The team is well aware of the issue and a fix should be there soon.


Coincidentally, it was published Jan 4th, 2017, the same date AdamL modified his answer to ARRAYFORMULA() does not work with SPLIT(), explaining that:

REGEXEXTRACT no longer appears to support an array for the second argument.



We can conclude that this behaviour in ArrayFormula is due to a modification in Google Sheets, allowing SPLIT in array formulas, with the consequence of REGEXEXTRACT not accepting multiple columns as input in the regex.

This is probably because REGEXEXTRACT, with multiple capture groups in the regular expression, yields an horizontal array, one cell for each group. Having an array as argument, the behaviour may be undefined, but that is plain argumentative on my side.

Community
  • 1
  • 1
Mariano
  • 6,423
  • 4
  • 31
  • 47
  • 1
    Translated from my answer to [¿A qué se debe que REGEXEXTRACT devuelve un único valor cuando se espera que devuelva una matriz de 1 renglón y varias columnas?](http://es.stackoverflow.com/a/56089/127) – Mariano Mar 17 '17 at 17:16
3

The reason your not seeing expected results in your particular formula, is the order in which you are using arrayformula and regexextract - you need to either modify your regex syntax to extract 2 groups, or you need to make an array to separate each regex extract function.

There are a few ways to do it, 1 way is to create a literal array and just specify the 2 extract patterns:

={REGEXEXTRACT(A1,"^."),REGEXEXTRACT(A1,"^.{2}")}

enter image description here

The other is to create 2 capture groups, only thing about the second one is by default is comes back in reversed order, but you can easily swap that by sort():

=REGEXEXTRACT(A1,"^((.).)")

enter image description here

Aurielle Perlmann
  • 5,323
  • 1
  • 15
  • 26
  • This is helpful b/c shares other workarounds but I'm not sure about accepting this answer b/c it doesn't explain if there is an error on the formula/regular-expression (user side) or if there is an issue with the REGEXEXTRACT implementation (Google side) – Rubén Mar 16 '17 at 21:13
  • updated - I believe its more so in the order of which your nesting your functions – Aurielle Perlmann Mar 16 '17 at 23:23