7

I have a table with a jsonb column where I store variable data. I would like to search this column and also find fragments (leading or trailing whitespace). I think I know how to do this with text columns but cannot wrap my head around how to achieve this with jsonb columns.

There are two scenarios that I would like to achieve:

  1. Search a specific key inside the jsonb column only (for example data->>company)
  2. Search the whole jsonb column

For text columns I generate gin indexes using pg_trgm.

Install extension pg_trgm:

CREATE extension if not exists pg_trgm;

Create table & index:

CREATE TABLE tbl (
    col_text  text,
    col_json  jsonb
);

CREATE INDEX table_col_trgm_idx ON tbl USING gin (col_text gin_trgm_ops);

Example query:

SELECT * FROM tbl WHERE col_text LIKE '%foo%';   -- leading wildcard
SELECT * FROM tbl WHERE col_text ILIKE '%foo%';  -- works case insensitive as well

Trying the same with the jsonb column fails. If I try to index the whole column

CREATE INDEX table_col_trgm_idx ON tbl USING gin (col_json gin_trgm_ops);

I get the error

ERROR (datatype_mismatch): operator class "gin_trgm_ops" does not accept data type jsonb

(Which makes sense). If I try to index just one key of the jsonb column I also receive an error:

CREATE INDEX table_col_trgm_idx ON tbl USING gin (col_json->>company gin_trgm_ops);

Error:

ERROR (syntax_error): syntax error at or near "->>"

I used this answer by @erwin-brandstetter as a reference. Any help is highly appreciated (and no, I don't want to implement Elasticsearch as of now :) ).


Edit: Creating the index like this actually works:

CREATE INDEX table_col_trgm_idx ON tbl USING gin ((col_json->>'company') gin_trgm_ops);

And querying it also doesn't lead to an error:

SELECT * FROM tbl WHERE col_json->>'company' LIKE '%foo%'; 

But the result is always empty.

Community
  • 1
  • 1
Ole Spaarmann
  • 15,845
  • 27
  • 98
  • 160
  • Your code works for me. – Michas Nov 03 '16 at 15:44
  • @Michas But do you also get results? – Ole Spaarmann Nov 03 '16 at 15:51
  • Cross post: http://dba.stackexchange.com/questions/154164/postgresql-pg-trgm-full-text-search-with-jsonb-columns –  Nov 03 '16 at 15:56
  • Sorry. Bit desperate. Deleted the other post – Ole Spaarmann Nov 03 '16 at 15:57
  • @ØleBjarnstroem I get results and the query plan use the index. Do You have entities in table? – Michas Nov 03 '16 at 16:01
  • I have entries. I search for a value that is in the database. And I even ran `REINDEX TABLE tbl;` just to make sure. Result is still 0 rows... – Ole Spaarmann Nov 03 '16 at 16:10
  • @Michas: I just started to wonder that maybe the data in my column is malformed since I use postgresql together with Elixir and Ecto seems to have a weird way of using the jsonb column... – Ole Spaarmann Nov 03 '16 at 16:26
  • @ØleBjarnstroem Try `SELECT col_json::text FROM tbl LIMIT 10;` I hope this will be enough to debug data. – Michas Nov 03 '16 at 16:29
  • I get this result: `"{\"website\":\"Website Content\",\"company\":\"Company Content\"}` - Is it possible that the content was escaped by Elixir and not strictly jsonb? – Ole Spaarmann Nov 03 '16 at 16:34
  • 1
    Case closed. I was using a custom Ecto DataType on this column, encoding and decoding content with Poison. This was due to the fact that Ecto back then didn't support jsonb and this was a (bad) way to implement it. I will try with a regular jsonb column and report back. Thank you for your help. – Ole Spaarmann Nov 03 '16 at 17:00

0 Answers0