0
Requirement

I wish to manipulate string values in a LibreOffice Basic macro using regular expression replacement? I should like to use something like y = regex (x, "^¿(.*)\?$", "$1") to operate on string variables (rather than cells of a document). The macro is to be used from LibreOffice Calc.

This is not for the above (trivial) problem, but a possibility that I frequently miss.

Apparent status
Only search exposed to scripts?

Of course the LibreOffice engine performs regular expression replacement (both in interactive Find/Replace and in the REGEX function). However, it seems that only regular expression search is exposed to Basic – and that is not what I am looking for. (It is provided by the somewhat cumbersome TextSearch service, which (if I recall correctly) can also search in string values in Basic variables.)

Spreadsheet functions: do not work

I thought one was meant to be able to call Calc functions in LO Basic, but calls like

z   = regex (x, "^¿(.*)\?$", "$1")
z   = regex (x, "^¿(.*)\?$", "$1", "g")

just said “Basic runtime error. Sub-procedure or function procedure not defined”.

Hack

As a horrible hack, one could manipulate some cells containing appropriate formulae in a scratch part of my sheet, but that has so many evident disadvantages that I would only use it in extremis!

Various sources

The question Using regex in a libreoffice calc macro to extract text from parentheses in a cell is about using REs to find text in cells.

Searching the API documentation for “regular” only threw up the TextSearch Service mentioned above.

Several forums have posts like this that only address search; I have found nothing about replacement.

PJTraill
  • 1,353
  • 12
  • 30
  • 1
    See [**HERE**](https://forum.openoffice.org/en/forum/viewtopic.php?f=20&t=19149) – JohnSUN May 31 '21 at 16:40
  • @JohnSUN: That is only about _search_. I have clarified my question to make clearer that I want RE **replacement**. – PJTraill Jun 02 '21 at 11:54
  • Well, use `z = createUnoService("com.sun.star.sheet.FunctionAccess").callFunction("REGEX", Array(x,"^¿(.*)\?$", "$1", "g"))` like as [**here**](https://stackoverflow.com/a/49429556/14094617) (thanks @Jim K!) – JohnSUN Jun 02 '21 at 13:12
  • @JohnSUN: Thanks for the tip, which sounds like an interesting technique I have not yet come across – I shall try it out and report back. – PJTraill Jun 03 '21 at 12:50
  • @JohnSUN: I have just tried it out, and it worked for me. I wonder if `createUnoService` is at all expensive, but I do not suppose it would matter much in most contexts. – PJTraill Jun 03 '21 at 14:53
  • Yes, it doesn't matter - you can create it once, store it in a variable, and reuse it multiple times. The REGEX() function itself is a bit slow (like all functions that have to process text strings), but it won't be noticeable either. – JohnSUN Jun 03 '21 at 15:04
  • A way of doing it without using worksheet functions would be to extract the non-matching sub-strings using the `com.sun.star.util.TextSearch` service and then stitch them together with the replacement text in-between. It may be fiddly to do but once encapsulated in a function no longer so. – Howard Rudd Sep 09 '21 at 10:53
  • @HowardRudd: While that technique may be applicable to many specific cases, I do not think that it is feasible for arbitrary RE replacements (with a given RE syntax), unless you build a very complex RE replacement parser; one risks creating yet another dialect for RE replacements; one would not benefit from any future extensions to the RE replacement syntax used by Calc – PJTraill Sep 09 '21 at 16:52
  • I wasn't suggesting the construction of a new regex to match the parts of the string not matched by the original regex. The functions provided by the `TextSearch` service return the positions of the first character and one after the last character of each match. By definition the boundaries of the matches are also the boundaries of the non-matches, so you can use them to extract the non-matches from the string just as easily as the matches. – Howard Rudd Sep 10 '21 at 06:30
  • @HowardRudd: How do you deal with capture groups? I just checked the [ICU RE docs](https://unicode-org.github.io/icu/userguide/strings/regexp.html) – I see that under **Find and Replace** there are two constructs available in the replacement: `$` _n_ | `${` _name_ `}` and ` \\ ` | `\$`, which serve to specify captured groups and quoted characters. As I understand the [result of `TextSearch`](https://www.openoffice.org/api/docs/common/ref/com/sun/star/util/SearchResult.html) it only returns matches of the full RE, not of capture groups. – PJTraill Sep 11 '21 at 10:20
  • @HowardRudd: Also rolling ones own would mean missing out on any future enhancements: for starters I could image `${` _name_ `/u}` to insert a capture group converted to upper case. – PJTraill Sep 11 '21 at 10:24
  • `TextSearch` can cope with capture groups within the regular expression, e.g. to find repeated words, but not in the replacement text, since it doesn't have any functionality for returning the content of captured groups. You can replace "The cow cow jumped over the moon moon" with "The dog jumped over the dog" but not "The cow jumped over the moon". However, the spreadsheet function `REGEX` does allow this. So `=REGEX("The cow cow jumped over the moon moon","(\s\S+)\1","$1","g")` returns "The cow jumped over the moon". So it looks like the spreadsheet function is the way to go. – Howard Rudd Sep 12 '21 at 06:17

1 Answers1

1

A spreadsheet demonstrating this function is at RegEx Demo.ODS

Function Digits( strVal As String ) As String ' return only the digits from the input
  Const cnsSvcNam    = "com.sun.star.sheet.FunctionAccess" ' UnoService for calling      spreadsheet functions
  Const cnsShtFNm    = "REGEX"                             ' RegExp Search+Replace: what spreadsheet function we will call
' Const cnsPtnDgtYes = "[:digit:]"                         ' RegExp Search+Replace: named regular expression for yes-digits
  Const cnsPtnDgtNon = "[^0-9]"                            ' RegExp Search+Replace: named regular expression for not-digits
  Const cnsStrMpt    = ""                                  ' empty string
  Const cnsFlgGlb    = "g"                                 ' RegExp Search+Replace: flag for Global operation

  Dim strRtn   As String  ' function result
  ' variables for calling REGEX( Text ; Expression [ ; [ Replacement ] [ ; Flags|Occurrence ] ] ) ' spreadsheet function signature
  Dim strSch   As String  ' RegExp Search+Replace: text       to search in
  Dim strExp   As String  ' RegExp Search+Replace: expression to match  on
  Dim strRpl   As String  ' RegExp Search+Replace: string     to insert as replacements
  Dim strFlg   As String  ' RegExp Search+Replace: flags / occurrence
  Dim arrFnc() As String  ' paramter-values to pass as arguments to spreadsheet function (array of strings)

  ' Drop all non-digit characters from input
' strRtn = RegEx( strVal, "[:digit:]", "", "g" ) ' RegEx is not defined in scope of Basic code, but it *is* defined as a spreadsheet function
  strSch = strVal                                                          ' text    to search in
  strExp = cnsPtnDgtNon                                                    ' pattern to match  on: match all non-digits
  strRpl = cnsStrMpt                                                       ' replacement data    : replace with nothing
  strFlg = cnsFlgGlb                                                       ' operation flags     : do this for all instances found
  arrFnc = Array( strSch, strExp, strRpl, strFlg )                         ' populate parameter values / arguments to   spreadsheet function
  strRtn = createUnoService( cnsSvcNam ).callFunction( cnsShtFNm, arrFnc ) ' execute                                    spreadsheet function

  Digits = strRtn                                                          ' pass result back to caller
End Function ' Digits
Bilbo
  • 358
  • 1
  • 10
  • What is that site, which asks me to log in? This is basically JohnSUN’s comment, but having it in an answer is useful. I have preferred to factor out all the kerfuffle into an extra function and and use that to make wrappers for those spreadsheet functions I find I want. – PJTraill Sep 28 '22 at 17:18
  • Aha: https://en.wikipedia.org/wiki/Mega_(service) tells us that it offers free encrypted distributed cloud storage. That article sounds in at least one place like advertising, but interesting. – PJTraill Sep 28 '22 at 17:24
  • Yeah I looked at the file-attachment sites that SO recommended: one was not-found and the other was hot garbage. Can anyone suggest a fileshare host that does not want a login and not Ggl or M$ft? – Bilbo Sep 29 '22 at 15:51