3

I am trying to run the following query on Google Big Query.

It shows as valid in the console, when running it, I get this error message:

"Array cannot have a null element; error in writing field result"

How do resolve this?

standardSQL

SELECT
  ARRAY(
    SELECT regexp_extract(x, r'^(.*?)\:')
    FROM UNNEST(split(V2Themes,',')) AS x
  ) AS result
FROM `gdelt-bq.gdeltv2.gkg_partitioned`
WHERE _PARTITIONTIME BETWEEN TIMESTAMP('2019-02-02') AND TIMESTAMP('2019-02-02')
Son
  • 159
  • 1
  • 11
  • Possible duplicate [here](https://stackoverflow.com/questions/42584478/how-to-exclude-nulls-from-array-so-query-wont-fail) – realr Jul 26 '19 at 14:55

3 Answers3

2

Remove the NULL value:

ARRAY(
    SELECT regexp_extract(x, r'^(.*?)\:')
    FROM UNNEST(split(V2Themes,',')) AS x
    WHERE regexp_extract(x, r'^(.*?)\:') IS NOT NULL AND 
  )
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Below is for BigQuery Standard SQL

#standardSQL
SELECT * FROM (
  SELECT
    ARRAY(
      SELECT REGEXP_EXTRACT(x, r'^(.*?)\:')
      FROM UNNEST(SPLIT(V2Themes,',')) AS x
      WHERE REGEXP_CONTAINS(x, r'^(.*?)\:') -- << removes nulls from array
    ) AS result
  FROM `gdelt-bq.gdeltv2.gkg_partitioned`
  WHERE _PARTITIONTIME BETWEEN TIMESTAMP('2019-02-02') AND TIMESTAMP('2019-02-02')
)
WHERE ARRAY_LENGTH(result) > 0  -- << filters out output with empty array 
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
0

maybe you want more than just first character before \:? here is some tests

select
    x,
    regexp_extract(x, r'^(.*?)\\:') as only_first_occurrence_nongreedy,
    regexp_extract(x, r'^(.*)\\:') as only_first_occurrence_greedy,
    regexp_extract_all(x, r'(.*?)\\:') as all_occurrences_nongreedy,
    regexp_extract_all(x, r'(.*?)\\:') as all_occurrences_greedy  -- same as regexp_extract, but returns array
from
    unnest(
        array[
            struct(r'abc\:' as x),
            struct(r'123\:abc\:'),
            struct(r'\:'),
            struct(r'\\')
        ]
    )
Y.K.
  • 682
  • 4
  • 10