0

I am looking for a way to use a regex as value in a named parameter in the Java SDK. According to the documentation, there is no datatype for that, and using a String parameter does not work.

Is there any way to use a regex as value in a named parameter?

QueryParameterValue Class has no datatype for that: https://googleapis.dev/java/google-cloud-clients/0.91.0-alpha/com/google/cloud/bigquery/QueryParameterValue.html#of-T-com.google.cloud.bigquery.StandardSQLTypeName-

A regex in the query would e.g. look like this:

REGEXP_CONTAINS(some_attribute, r"^any\sregex\ssearchstring$") 

and should be replaced by a named parameter like:

REGEXP_CONTAINS(some_attribute, @named_regex_parameter)

I tried different syntax in the query like

REGEXP_CONTAINS(some_attribute, r@named_regex_parameter)

etc. but none of them worked. The @named_regex_parameter is of type String. I tried to use values in the form of r"regex_expression" and just the regex_expression in the parameter value.

Seems like I need to build the query String without a named parameter for the regex part. Any hints to solve this with parameters would be really appreciated!

//Edit: added code example how the named parameters are used in the query config

QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(query)

            .setDestinationTable(TableId.of(destinationDataset, destinationTable))
            .setAllowLargeResults(true)
            .setCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
            .setTimePartitioning(partitioning)
            .setWriteDisposition(WriteDisposition.WRITE_APPEND)
            .setParameterMode("NAMED")
            .addNamedParameter("regexExpressionParam", QueryParameterValue.string(someRegexExpressionStringVariable)) //this does not work
            .addNamedParameter("someStringParam", QueryParameterValue.string(stringVariable))
            .setPriority(Priority.BATCH)
            .build();

The query should use the parameter @regexExpressionParam like so:

REGEXP_CONTAINS(theAttributeToQuery, @regexExpressionParam)) 
fire
  • 149
  • 1
  • 5
  • What exactly you want to achieve. Could you elaborate what you tried? It was inside the BigQuery or did you use CLI? What error did you get? – PjoterS Sep 10 '21 at 10:54
  • @PjoterS I added an excerpt of the code. If I e.g. put the regex inside the string value in the form of r"THE_REGEX" there is no error message but the expression will never match – fire Sep 13 '21 at 10:52

1 Answers1

1

You need to pass the regular expression string without r'...'

I had a very similar problem with running parameterized queries on Python: it was something like this.


from google.cloud import bigquery

regex_input = "^begin_word.*end_here$"

# Construct a BigQuery client object.
client = bigquery.Client()

query = """
    SELECT word, word_count
    FROM `bigquery-public-data.samples.shakespeare`
    WHERE REGEXP_CONTAINS(word, @regex)
    ORDER BY word_count DESC;
"""
job_config = bigquery.QueryJobConfig(
    query_parameters=[
        bigquery.ScalarQueryParameter("regex", "STRING", f"r'{regex_input}'"),
    ]
)
query_job = client.query(query, job_config=job_config)

At first, I thought the input had to be wrapped by r'...'; just like how I normally write a regex on BQ explorer. I tried to modify the string input to make it like a regular expression, which was this pard f"r'{regex_input}'" of the code.

but apparently BQ correctly escapes string without our help and I can just pass down the regex string like bigquery.ScalarQueryParameter("regex", "STRING", regex_input)

tawago
  • 11
  • 1
  • is correct. You don't need to use the r'...' the way one does in BQ UI. I was struggling with this for quite a time because I was trying to mix named parameters along with regex insensitive case. – Henrique Poleselo Aug 24 '22 at 12:01