0

I have to decide how I want to implement dynamic schema. I will try to put some more description where my problem is.

I use the postgres DB in latest version and app is written in Ruby and Hanami. I have requirement to implement view where I will display list of users in our application (table view). Here comes first requirement that it needs to contain dynamic data. For example user has name and birth_date and I store this data in Users table. In other table I store metrics related to user like number of views and comments. This is only part of data. I have couple more tables related to my User and I need to join them into one view. It doesn't sound very dynamic. I know. But I have another requirement. I want to allow users to add custom column to my table view.

At this point we have table with following predefined columns:

name, birth_date, number_of_views, number_of_comments 

and we can add some custom ones.

Now It becomes dynamic. It is not the end. I need to add filtering on this columns. So for example if given user has columns I mentioned earlier I should allow to filter on all of them (predefined and custom). I am trying to investigate how I should implement this feature.

Now the question part :) I am thinking about how I should store this data. I am thinking about storing it as a JSONB column in my postgres. Scale of this is about 5kk of users. So this mean 5kk rows in this table. Not so much. I am wondering about filtering this data using dynamic filters. I tried to make some benchmarks for 5kk table and it worked well if I used pagination (without indexes). For querying whole dataset it took like 10sec so wasn't perfect.

Questions

  1. Is JSONB good idea? I am not sure If I can define proper indexes if I will filter dynamically on this data (every query can be different). How to define indexes for this scenario?
  2. Is a good idea to store predefined columns as normal columns in table and only custom ones in JSONB? Will it give me any profit?
  3. I should I use postgres to that? I don't believe that postgres isn't good enough for this scale and I prefer to not change tool.

Thanks for all answers. I don't have huge experience with dynamic schemas so every tip can be usefull ;P

dewastator
  • 213
  • 4
  • 13
  • 1
    You should not alter schema. You should rethink the problem (think vertical not horizontal). We have plenty of "Free Form" products with attributes. these are stored in an attribution table which can be as simple as product_id, attribute_name, attribute_value and attribute_type (for casting as needed). – engineersmnky Aug 08 '17 at 20:50
  • Can you extend your comment? What do you mean that I should not alter schema? – dewastator Aug 08 '17 at 21:12
  • 1
    What I meant is a dynamic schema is unpredictable. You could store the additional data as Json but this too has limits and sorting seems far more difficult than using a true table with a one to many relationship – engineersmnky Aug 08 '17 at 23:40

2 Answers2

1

I would recommend using NoSQL database like mongoDB. Your problem statement requires schema-less document store supporting fast reads ie. MongoDB or Couchbase which has great support

0
    -- DROP FUNCTION IF EXISTS padrao.func_create_schema_and_table(text);

CREATE OR REPLACE FUNCTION padrao.func_create_schema_and_table(
    sufixo_schema text)
    RETURNS boolean
    LANGUAGE 'plpgsql'
    COST 100
    VOLATILE PARALLEL UNSAFE
AS $BODY$
declare
    sql_cod text;
begin  
    --STRING QUE ARMAZENA O CÓDIGO QUE SERÁ EXECUTADO
    sql_cod = '
    
----------------------------------- CRIAÇÃO SCHEMA ----------------------------------------  
    
CREATE SCHEMA IF NOT EXISTS schema_'|| sufixo_schema ||' AUTHORIZATION pitjndjzwxxx;
    
---------------------------------- CRIAÇÃO DAS TABELAS ------------------------------------

CREATE TABLE IF NOT EXISTS schema_'|| sufixo_schema ||'.produtos (
    "pk_id_produto" serial NOT NULL,
    "fk_software_user" serial NOT NULL,
    "produto_codigo" varchar(10) NOT NULL UNIQUE,
    "produto_descricao" varchar(255) NOT NULL,
    "produto_atualizado" varchar(20) NOT NULL,
    "produto_situacao" varchar(20),
    "produto_dados" jsonb,
    PRIMARY KEY ("pk_id_produto"),
    UNIQUE (produto_codigo,fk_software_user)
); 
/*----------------------------------------------------------------------------------------------*/  
    ';
    
    --EM CASO DE ERRO SERÁ RETORNADO A MENSAGEM ABAIXO
    raise notice '### ERRO NA EXECUÇÃO DA FUNÇÃO POSTGRES ###';
    
    --EXECUÇÃO DO CÓDIGO QUE FOI SALVO NA VARIÁVEL:sql_cod 
    EXECUTE format(sql_cod, sufixo_schema);

    return true; 
          
END;
$BODY$;

ALTER FUNCTION padrao.func_create_schema_and_table(text)
    OWNER TO pitjndjzwqrxxx;
  • 1
    Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Mario Mateaș Feb 15 '23 at 04:17