2

Let's say I am programming a MMORPG. I modeled a Entity character that can have a multitude of attributes like coating, strength, color and so on. Because I do not know these attributes in advance (what and how many of them), I thought I create an extra table for it, like so:

CREATE TABLE character (INTEGER id, VARCHAR name, INTEGER player_id);

and

CREATE TABLE attributes (INTEGER character_id, VARCHAR key, VARCHAR value);

I would then be able to introduce a multitude of new attributes. However, how would I query this construct? The query

SELECT * FROM character JOIN attributes ON character.id=attributes.character_id;

will obvioulsy only work for a single attribute. Do I have to JOIN the attributes table more than once or is there another solution?

Is there a way to have different types for the attribute.value Part? Doing it the way I am doing now would limit me to a VARCHAR representation.

AME
  • 2,499
  • 5
  • 29
  • 45
  • 1
    You are after an EAV model, check this wikipedia link [EAV model](http://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model). Magento uses this model extensively, here another link [Magento EAV model](http://www.magentocommerce.com/wiki/2_-_magento_concepts_and_architecture/magento_database_diagram) – Yaroslav Sep 27 '12 at 09:52

2 Answers2

4

Another possibility is to use hstore instead of EAV model.

CREATE TABLE character (id INTEGER, name VARCHAR,
                        player_id INTEGER, attributes hstore);

That way you can store the attributes as a map (key - value).

insert into character (id, name, player_id, attributes)
values (1, 'test', 1, '"attribute1"=>"value1","attribute2"=>"value2"')
      ,(2, 'test', 1, '"attribute1"=>"value1","attribute3"=>"value3"');

select (each(attributes)).key, (each(attributes)).value 
from character where id = 1;

  key text    value text
  --------------------------
   attribute1   value1
   attribute2   value2

select id, attributes->'attribute3' as value 
from character WHERE exist(attributes,'attribute3');

  id    value
  ---------------
   2   "value3"

Hope this helps.

UPDATE

I made a small benchmark to compare hstore vs two tables.

CREATE OR REPLACE FUNCTION create_dummy_data()
RETURNS integer AS
$BODY$
DECLARE
   cont1       INTEGER;
   cont2       INTEGER;
   sqlInsert   VARCHAR;

BEGIN
   CREATE TABLE character (id INTEGER PRIMARY KEY
                          ,name VARCHAR
                          ,player_id INTEGER);

   CREATE TABLE attributes (character_id INTEGER
                           ,key VARCHAR
                           ,value VARCHAR
                           ,FOREIGN KEY (character_id) REFERENCES character);

   cont1 := 1;
   WHILE cont1 < 10000 LOOP
      sqlInsert := 'INSERT INTO character (id, name, player_id) VALUES (' || cont1 || ', ''character' || cont1 || ''', ' || cont1 || ');';
      EXECUTE sqlInsert;
      cont1 := cont1 + 1;
   END LOOP;

   cont1 := 1;
   WHILE cont1 < 10000 LOOP
      cont2 := 1;
      WHILE cont2 < 10 LOOP   
         sqlInsert := 'INSERT INTO attributes (character_id, key, value) VALUES (' || cont1 || ', ''key' || cont2 || ''', ' || cont2 || ');';
         EXECUTE sqlInsert;
         cont2 := cont2 + 1;
      END LOOP;       
      cont1 := cont1 + 1;
    END LOOP;

    CREATE TABLE character_hstore (id INTEGER
                                  ,name VARCHAR
                                  ,player_id INTEGER
                                  ,attributes hstore);
    cont1 := 1;
    WHILE cont1 < 10000 LOOP
       sqlInsert := 'INSERT INTO character_hstore (id, name, player_id, attributes) VALUES (' || cont1 || ', ''character' || cont1 || ''', ' || cont1 || ', ''"key1"=>"1","key2"=>"2","key3"=>"3","key4"=>"4","key5"=>"5"'');';
       EXECUTE sqlInsert;
       cont1 := cont1 + 1;
    END LOOP;   

    RETURN 1;
 END;
 $BODY$
 LANGUAGE plpgsql;

 select * from create_dummy_data();

 DROP FUNCTION create_dummy_data();

And I've got the following results:

explain analyze
SELECT ca.* 
FROM character ca
JOIN attributes at ON ca.id = at.character_id
WHERE at.value = '1';
"Hash Join  (cost=288.98..2152.77 rows=10076 width=21) (actual time=2.788..23.186 rows=9999 loops=1)"

CREATE INDEX ON attributes (value);

explain analyze
SELECT ca.* 
FROM character ca
JOIN attributes at ON ca.id = at.character_id
WHERE at.value = '1';
"Hash Join  (cost=479.33..1344.18 rows=10076 width=21) (actual time=4.330..13.537 rows=9999 loops=1)"

And using hstore:

explain analyze
SELECT * 
FROM character_hstore
WHERE attributes @> 'key1=>1';
"Seq Scan on character_hstore  (cost=0.00..278.99 rows=10 width=91) (actual time=0.012..3.530 rows=9999 loops=1)"

explain analyze
SELECT * 
FROM character_hstore
WHERE attributes->'key1' = '1';
"Seq Scan on character_hstore  (cost=0.00..303.99 rows=50 width=91) (actual time=0.016..4.806 rows=9999 loops=1)"
doctore
  • 3,855
  • 2
  • 29
  • 45
  • Thanks! How would I query for all characters having a value for an attribute. For example how could I find all charaters having `coating = fur` efficiently? – AME Sep 27 '12 at 11:26
  • 2
    `select * from charaters where attributes @> 'coating=>fur'` or `select * from charaters where attributes->'coating' = 'fur'`. It's also a column type that allows indices (see F.13.3 section of hstore link) – doctore Sep 27 '12 at 11:44
  • Cool. so could you give an estimate, how that compares speedwise to normal columns? I have the suspicion that it might take drastically more time...(?) – AME Sep 27 '12 at 12:09
  • Thank you very much! The conclusion seems to be that hstore is the way to go, right? :) – AME Sep 28 '12 at 12:11
  • In my opinion, in your case it's a better option than EAV model because you stored the information as key - value, and you don't need any referential integrity as: foreign keys, check constraints, etc. in `attributes` table. – doctore Sep 28 '12 at 12:28
2

What you are describing is an Entity-Attribute-Value model. Google EAV, you'll find lots of information about it.

And yes querying is the big drawback of this kind of query. Selects get very large, complex and possibly slow.

As @doctore describes there are database specific features available:

CREATE TABLE character (INTEGER id, VARCHAR name,
                    INTEGER player_id, attributes hstore);

You can read more about hstore, follow the link.

If this entity is central to your application you might want to considere an dedicated key value store for this kind of data. For example CouchDB

Jens Schauder
  • 77,657
  • 34
  • 181
  • 348