5

Example (fake) case: indexing a column containing e-mails (text type) when the column will exclusively be queried for an exact string match, ex.

SELECT * FROM mytable WHERE email = 'test@test.com'

Do hash indexes provide advantages over B-TREEs with no drawbacks in these cases?

Do they affect the performance of inserts/updates differently?

(edit: and never ordered by this column)

Z. M.
  • 329
  • 5
  • 13
  • this is good question. nowaday many cases are analytic workload which are write-once table and simple match query for text. the *theoretical* optimal index type for which is Hash for O(1). – WeiChing 林煒清 Jan 03 '22 at 14:08

2 Answers2

4

The only clear advantage is when the text string to be indexed can be quite long. Then the hash index can be much smaller and therefore better cached. In the extreme, inserting values longer than about 2700 bytes can error-out altogether with btree indexes, but will still work with hash indexes.

With shorter strings, hash indexes have more disadvantages than advantages. In addition to not supporting ordering, they also don't support unique constraints (and thus can't be on the receiving end of foreign keys) and in the non-unique side they tend to become unbalanced if some values are very much more common than other values. Also, being less used means they are less well-tested and so more likely to have undiscovered bugs; and less work has gone into optimizing them.

Also, I think the metapage of hash indexes is still a point of contention for insertions, which can lower performance.

jjanes
  • 37,812
  • 5
  • 27
  • 34
  • Can I assume from your answer that it is not work creating an index of you are only doing exact match queries? Like the OP, my text data is not long (< 20 characters) – GraSim May 23 '22 at 18:42
1

if you are only comparing email column with equal operator and there is no ordering , then hash index is much faster.

  • B-trees can handle equality and range queries on data that can be sorted into some ordering. In particular, the PostgreSQL query planner will consider using a B-tree index whenever an indexed column is involved in a comparison using one of these operators: <, <=, =, >=, >

  • Hash indexes can only handle simple equality comparisons. The query planner will consider using a hash index whenever an indexed column is involved in a comparison using the = operator.

here is a comparison between two index types in mysql docs but applies to postgresql as well MySQl Comparison of B-Tree and Hash Indexes

eshirvana
  • 23,227
  • 3
  • 22
  • 38