1

I'm trying to figure when temporary tables are created and deleted.

I thought that a temporary table is dropped at the end of a session, but this test makes me unsure of that.

The connection string has Pooling=true;Maximum Pool Size=50 so there is pooling, with a max of 50 connections.

I run the following piece of code concurrently, 30 jobs at a time. I can see that there are 30 distinct process id values used (which makes me assume that a total of 30 connections are opened and reused) no matter how many times i run the code. So the pooling seems to work.

using (var connection = new NpgsqlConnection(connectionString))
{
    var insertIds = GetInsertIds(1000).ToList();

    await connection.OpenAsync();

    var transaction = connection.BeginTransaction();

    await connection.ExecuteAsync(@"
        CREATE TEMP TABLE tmp_data
        (
            insertid INTEGER NOT NULL
        );", transaction: transaction);

    const string copyItemSql = @"
        COPY tmp_data(
            insertid
        ) FROM STDIN (FORMAT BINARY)
    ";

    using (var writer = connection.BeginBinaryImport(copyItemSql))
    {
        foreach (var insertId in insertIds)
        {
            writer.StartRow();
            writer.Write(insertId, NpgsqlDbType.Integer);
        }
    }

    await connection.ExecuteAsync(@"
        INSERT INTO data (insertid)
            SELECT tmpd.insertid
            FROM tmp_data tmpd;
    ", transaction: transaction);

    transaction.Commit();
}

My assumptions where that:

  1. I would get an exception when a connection is taken from the pool and reused, because tmp_data already exists. This does not happen, the code runs just fine.
  2. I could work around this by doing CREATE TEMP TABLE IF NOT EXISTS tmp_data .... But then I would have the problem that since tmp_data already exists, there would still be rows from the previous insert. And I would be able to work around that by doing ON COMMIT DELETE ROWS.

So am I wrong about this? Isn't a connection in the pool the same thing as a session? Why doesn't the temp table exist the second time a connection is reused?

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
Joel
  • 8,502
  • 11
  • 66
  • 115
  • Hard to say without knowing your connection pool, but it is typical for connection pools to run `DISCARD ALL` when a session is returned, which would get rid of all temporary tables. – Laurenz Albe Jan 22 '19 at 08:38
  • @LaurenzAlbe Ok! Makes sense if that is the case then. I'm actually trying to fix this problem that you answered a while back: https://stackoverflow.com/questions/50366509/temporary-tables-bloating-pg-attribute. But if temporary tables are dropped after each insert, setting ON COMMIT DELETE ROWS wouldn't help in my case anyway right? (regarding the bloaded pg_attribute table). A new tmp table would be created for each insert anyway. – Joel Jan 22 '19 at 08:44

1 Answers1

3

I looked at Npgsql's implementation, and this is what runs when a connection is returned to the pool:

void GenerateResetMessage()
{
    var sb = new StringBuilder("SET SESSION AUTHORIZATION DEFAULT;RESET ALL;");
    var responseMessages = 2;
    if (DatabaseInfo.SupportsCloseAll)
    {
        sb.Append("CLOSE ALL;");
        responseMessages++;
    }
    if (DatabaseInfo.SupportsUnlisten)
    {
        sb.Append("UNLISTEN *;");
        responseMessages++;
    }
    if (DatabaseInfo.SupportsAdvisoryLocks)
    {
        sb.Append("SELECT pg_advisory_unlock_all();");
        responseMessages += 2;
    }
    if (DatabaseInfo.SupportsDiscardSequences)
    {
        sb.Append("DISCARD SEQUENCES;");
        responseMessages++;
    }
    if (DatabaseInfo.SupportsDiscardTemp)
    {
        sb.Append("DISCARD TEMP");
        responseMessages++;
    }

    responseMessages++;  // One ReadyForQuery at the end

    _resetWithoutDeallocateMessage = PregeneratedMessage.Generate(WriteBuffer, QueryMessage, sb.ToString(), responseMessages);
}

NpgsqlDatabaseInfo.SupportsDiscardTemp is set like this:

public virtual bool SupportsDiscardTemp => Version >= new Version(8, 3, 0);

So you get that behavior whenever you use a PostgreSQL version of at least 8.3.

You can avoid this reset using the connection string parameter No Reset On Close=true, as commented by Shay Rojansky below, but then you must yourself take care that you leave the session “clean”.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • Ah. Thank you for looking into this! Well that explains the outcome of my tests then. On a related note. Would you say it's better to have a single permanent table where I insert and delete temporary data (used by my INSERT logic), instead of creating these temporary tables that bloat pg_cataglog tables? – Joel Jan 22 '19 at 10:12
  • If you have many concurrent sessions and each needs its own temporary table, a permanent table might be tricky. But other than that, it works just as fine. You could just `TRUNCATE` it before using it. – Laurenz Albe Jan 22 '19 at 10:23
  • Yes, I do have many concurrent sessions/transactions. So I would have to use some kind of "transactionid" to keep track of which rows corresponds to which transaction in the permanent table. My main goal is to avoid bloating the pg_catalog tables, which I am doing right now since I'm creating loads of temp tables (we do about 2-3 inserts per second) – Joel Jan 22 '19 at 10:30
  • Or you have several such tables. You could try `LOCK TABLE ... NOWAIT` until you find the first one that is free and use that in your transaction. – Laurenz Albe Jan 22 '19 at 10:36
  • Hm okey. I'll look into that as well. Thanks! – Joel Jan 22 '19 at 10:43
  • 1
    The above is correct - Npgsql tries to make pooled connection closing look like physical connection closing, to avoid leaking any state changes (like temp tables) across pooled connections. Note that it usually sends `DISCARD ALL` instead of the complicated query above which is only needed when there are prepared statements). – Shay Rojansky Jan 22 '19 at 14:37
  • 2
    Note that you can opt-out of this by setting `No Reset On Close=true` on the connection string - this means nothing is done when you close pooled connections, and your temp tables will continue existing. This can allow you to continue using a connection-specific temp table that isn't reset all the time. – Shay Rojansky Jan 22 '19 at 14:38
  • @ShayRojansky Thanks, I didn't see that. I took the liberty of adding it to the answer, since it is relevant. – Laurenz Albe Jan 22 '19 at 15:33