1

I started using 'ANY()' function in query instead of 'IN' due to some parameter bound error. Currently it's something like that.

Select * 
FROM geo_closure_leaf 
WHERE geoId = ANY(:geoIds)

But it has a huge impact on performance. Using the query with IN is very much faster than with ANY.

Any suggestion how can we bound array of string parameters can be passed in 'IN' expression.

I have tried temporary fix using

Select * 
FROM geo_closure_leaf 
WHERE geoId IN (''('' || array_to_string(:geoIds::text[] ,''),('') || '')'')

Select * 
FROM geo_closure_leaf 
WHERE geoId IN (select unnest(:geoIds::text[]))

geoIds = array of strings

It's working this way.

**public override T Query<T>(string query, IDictionary<string, object> parameters, Func<IDataReader, T> mapper)**
        {
            T Do(NpgsqlCommand command)
            {
                IDataReader reader = null;
                try
                {
                   ** command.CommandText = query;
                    reader = command.AddParameters(parameters).ExecuteReader();**
                    return mapper(reader);
                }
                finally
                {
                    CloseDataReader(reader);
                }
            }

            return Execute(Do);
        }

Object is array of string.

Expected is: I should be able to do this without having to put extra logic in sql.

Select * 
FROM geo_closure_leaf 
WHERE geoId IN (:geoIds)
Ved
  • 11
  • 4
  • 3
    PostgreSQL internally transforms `IN` into `= ANY`, so that is surprising. Can we have `EXPLAIN (ANALYZE, BUFFERS)` output for both queries? – Laurenz Albe Mar 25 '19 at 12:04
  • With IN operator `"Seq Scan on geo_closure_leaf (cost=0.00..12.13 rows=2 width=436) (actual time=0.000..0.000 rows=0 loops=1)" " Filter: (("geoId")::text = ANY ('{a,b}'::text[]))" "Planning time: 0.075 ms" "Execution time: 0.022 ms" ` With Any operator any idea how can i run the query in pgadmin? – Ved Mar 26 '19 at 12:01
  • This solves my problem and much cleaner than earlier `Select * FROM geo_closure_leaf WHERE geoId IN (select unnest(:geoIds::text[]))` – Ved Mar 26 '19 at 12:03
  • @LaurenzAlbe I believe it was in PostgreSQL 9.2 that they changed it so that `IN` is transformed into `= ANY` and is able to use indexes. Earlier versions do not use indexes when using arrays. The problem is likely the use of an old server. – coladict Mar 27 '19 at 10:22
  • @coladict I believe it was added in 8.2 with [this commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3d376fce8dd45d43fb6dbeb5a08c08400a589ff8). – Laurenz Albe Mar 27 '19 at 10:39
  • https://www.postgresql.org/docs/9.2/release-9-2.html#AEN114667 `Allow indexed_col op ANY(ARRAY[...]) conditions to be used in plain index scans and index-only scans. Formerly such conditions could only be used in bitmap index scans.` – coladict Mar 27 '19 at 10:42

2 Answers2

1

The performance difference cannot be IN versus = ANY, because PostgreSQL will translate IN into = ANY during query optimization.

The difference must be the subselect. If you are using unnest, PostgreSQL will always estimate that the subquery returns 100 rows, because that is how unnest is defined.

It must be that the estimate of 100 somehow produces a different execution plan that happens to work better.

We'd need the complete execution plans to say anything less uncertain.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
0

https://dba.stackexchange.com/questions/125413/index-not-used-with-any-but-used-with-in

Found this post explaining how indeexs are getting used in different constructors of 'ANY' & 'IN'.

Ved
  • 11
  • 4