
I'm learning about indexing in PostgreSQL. I started by creating an index and analyzing how it affects execution time. I created some tables with the following columns:

[screenshots: column definitions of the two tables]

I also filled them with data. After that, I created my custom index:

create index events_organizer_id_index on events(organizer_ID);

and executed this command (the events table contains 148 rows):

explain analyse select * from events where events.organizer_ID = 4;

I was surprised that the query ran without using my index, and I got this result:

[screenshot: EXPLAIN ANALYZE output showing a sequential scan on events with "Rows Removed by Filter: 125"]

As far as I know, if my index had been used in the search, the plan would contain text like "Index Scan ... on events". So, can someone explain, or point me to references, how to use indexes effectively and where I should use them to see the difference?

Dark_Phoenix
  • 368
  • 3
  • 14

2 Answers

3

From "Rows removed by filter: 125" I see there are too few rows in the events table. Just add couple of thousands rows and give it another go

From the docs:

Use real data for experimentation. Using test data for setting up indexes will tell you what indexes you need for the test data, but that is all.

It is especially fatal to use very small test data sets. While selecting 1000 out of 100000 rows could be a candidate for an index, selecting 1 out of 100 rows will hardly be, because the 100 rows probably fit within a single disk page, and there is no plan that can beat sequentially fetching 1 disk page.

In most cases, when the database uses an index, it only gets the address where the row is located. That address consists of the block id and the offset within the block, because there can be many rows in one block of 4 or 8 KB.

So the database first searches the index for the block address, then it looks up the block on disk, reads it, and extracts the row you need.
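You can actually look at that address in PostgreSQL: every row has a hidden system column ctid, which is exactly (block number, offset within the block). The values in the comment below are just illustrative.

select ctid, organizer_ID from events limit 5;
-- typical output for a tiny table: ctid values like (0,1), (0,2), (0,3), ...
-- i.e. all rows sit in block 0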

When there are only a few rows, they fit into one or a couple of data blocks, which makes it easier and quicker for the database to read the whole table without using the index at all.
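If you want to check how many blocks the table really occupies, you can ask the catalog (a small sketch; relpages is only an estimate that gets refreshed by VACUUM/ANALYZE):

select relpages, reltuples from pg_class where relname = 'events';

-- or the exact current size in blocks:
select pg_relation_size('events') / current_setting('block_size')::int as blocks;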

ekochergin
  • 4,109
  • 2
  • 12
  • 19
0

Look at it the following way: the database decides which way is faster to find your tuple (= record) with organizer_id 4. There are two ways: a) read the index and then jump to the block that contains the data, or b) read the heap and find the record there.

The information in your screenshot shows 126 records (125 skipped + your record) with a length ("width") of 62 bytes. Including overhead, this data fits into two database blocks of 8 KB. Since a rotating disk or an SSD reads a series of blocks anyway - they always read more blocks into the buffer - it is one read operation for these two blocks.

So the database decides that it is pointless to first read the index to find the correct record (out of, in our case, two blocks) and then read the data from the heap using the information from the index. That would be two read operations. Even with technology newer than rotating disks, this takes more time than just scanning the two blocks. That's why the database doesn't use the index.
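If you only want to see what the index plan would look like on this small table, you can temporarily discourage sequential scans in your session (a debugging trick, not something to leave enabled):

set enable_seqscan = off;

explain analyse select * from events where events.organizer_ID = 4;
-- now the plan should show "Index Scan using events_organizer_id_index on events",
-- typically with a higher estimated cost than the sequential scan had

reset enable_seqscan;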

Indexes on such small tables aren't useful for searching. Nevertheless, unique indexes still prevent duplicate entries.
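For example (a hypothetical sketch: it assumes events has a name column that should be unique, which may not be true for your schema):

create unique index events_name_unique on events (name);

insert into events (name, organizer_ID) values ('PGConf', 4);  -- succeeds
insert into events (name, organizer_ID) values ('PGConf', 7);  -- fails with a duplicate key error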