-1

I want to search for a string that is inside of a column of some table. The system has about 200 tables. I need to search all columns of all tables to achieve what I want.

Is it possible in Postgresql 12?

Diego Alves
  • 2,462
  • 3
  • 32
  • 65
  • 1) Write a script that loops over all the tables and columns. 2) Dump the data with `pg_dump -a -Fp` and search in the dump file. – Adrian Klaver Jul 25 '23 at 15:29
  • relational databases are not designed to achieve this type of search. I don't know any easy solution to this problem only using SQL. I would suggest you write a script which loops through all tables and columns and executes a LIKE or Regex search on each column. – Silvan Jul 25 '23 at 15:56
  • If you don't care how long it takes, then construct queries like `select * from t where t::text ilike '%%'`. Loop through `information_schema.tables` to build this query. You will have to eyeball the results or come up with another set of queries to figure out which column contains it. – Mike Organek Jul 25 '23 at 17:45
  • Are you trying to find just the tables and columns that the string occurs in; the tables, rows, and columns; or the tables, rows, columns, and values? What should the output look like? – JohnH Jul 25 '23 at 23:11
  • There's no way using plain SQL. You could export the database and use CLI tools like grep. Or write a script to query the table names from the catalog and `select *` from them and check the col values,, or do it in pl/sql (harder). – Bohemian Jul 26 '23 at 01:23
  • @Bohemian, this is certainly possible in PostgreSQL using plain SQL; however, depending on the amount of data, it might not be practical. I've often used variations of the technique I present in my answer. – JohnH Jul 26 '23 at 17:21

1 Answers1

0

If all that is needed are the tables and columns in which the value occurs, then the following query can be used:

WITH
parameters(search_pattern, schemas) AS (
  VALUES ('regex', '{public}'::text[])),
text_columns AS (
  SELECT t.table_schema,
         t.table_name,
         c.column_name
    FROM parameters p
    JOIN information_schema.tables t
      ON     t.table_type = 'BASE TABLE'
         AND t.table_schema = ANY(p.schemas)
    JOIN information_schema.columns c
      ON     t.table_schema = c.table_schema
         AND t.table_name = c.table_name
    WHERE     c.data_type IN ('character', 'character varying', 'text'))
SELECT t.table_schema,
       t.table_name,
       t.column_name
  FROM parameters p
  CROSS JOIN text_columns t
  WHERE (xpath('/table/row/has_string/text()',
              query_to_xml(format($$SELECT true AS has_string FROM %I.%I WHERE %I ~ %L LIMIT 1$$,
                                  t.table_schema, t.table_name, t.column_name, p.search_pattern),
                           false, false, '')))[1]::text::boolean
  ORDER BY t.table_schema, t.table_name, t.column_name;

Replace regex with a regular expression that will be true if the desired string occurs within the column and expand the schema list if needed.

The query works by searching each character type column of each table for the pattern. For efficiency, the search of each column terminates as soon as a match is found.

JohnH
  • 2,001
  • 1
  • 2
  • 13