9

This is related to the previous question I asked, Saved View with a timestamp expression, about storing information in a (non-materialized) View. How would the data be stored and then retrieved when a user does:

CREATED MATERIALIZED VIEW mv AS SELECT person_id, name, NOW() as now FROM table
# is this more-or-less the same as:
# CREATED TABLE tb AS SELECT person_id, name, NOW() as now FROM table
#  "AND UPDATE EVERY..."

Is the NOW() expression saved as a value to storage, or are any functions evaluated at query-time for a materialized view? Is a materialized view the same thing as a table, which has some sort of optimizations/refreshing done at the storage level, or am I missing the boat on that?

This post here suggests that (functionally speaking at least) a Materialized View can be emulated as a table with triggers: https://www.materialized.info/.

David542
  • 104,438
  • 178
  • 489
  • 842
  • 1
    What database are you suing? – Gordon Linoff Oct 18 '20 at 22:52
  • @GordonLinoff well, mysql but that doesn't even have a materialized view -- so I'd say postgres or mssql. – David542 Oct 18 '20 at 23:08
  • 1
    According to [Wikipedia](https://en.wikipedia.org/wiki/Materialized_view), Postgres and MSSQL do them very differently. Postgres is much like you say (a table that is refreshed) whereas MS SQL's indexed views are much different - they are always up-to-date. I haven't used Postgres, but indexed views (I believe) don't allow you to include non-deterministic data like now(). – seanb Oct 18 '20 at 23:29
  • 1
    "now" is a fleeting moment which moves into history almost immediately. Theoretically it isn't possible to store "now", but you can store events close to "now" that happened just moments ago. So if you are considering this theoretical question all you need to answer is, do you need "now" or "when"? (e.g. when was it updated?). Using a function `now() ` in a select statement is an instruction to return the time when executed. Storing that point in time will no longer be "now" instead it will be "when". Attempting to use `now()` in a materialized view would be futile. – Paul Maxwell Oct 19 '20 at 02:12
  • @Used_By_Already I think using `now()` would be quite common in a query (ignoring whether the query is cached, a view, materialized, etc.) For example, just looking at your post above mine that says "17 hours ago" that is doing something like `now() - date_posted`. – David542 Oct 19 '20 at 20:10
  • 1
    @David542 of course it is very common indeed, I did not want to suggest it should not be used, I was merely trying to point out that as I now type this in, you will get to see it in the past. i.e. that storing "now" cannot really happen – Paul Maxwell Oct 20 '20 at 02:30
  • In SQL Server there's a whole bunch of limitations on indexed views. If I recall, using non deterministic functions is one of them. The view simply won't create. You're really posing a hypothetical situation here. If you were to actually try it you'd get an error. – Nick.Mc Oct 20 '20 at 23:48
  • `NOW()` is not a valid function in Oracle. If you want the equivalent then you want `SYSDATE`, `SYSTIMESTAMP` or `LOCALTIMESTAMP` instead. – MT0 Oct 21 '20 at 00:13
  • perd definition a materialued voew is a table that gets refreshed automatically, so basically it is onöly a short cut, for the table solution and has too many restictions – nbk Oct 23 '20 at 02:45
  • @nbk -- then why would it be used in the first place? – David542 Oct 23 '20 at 17:37
  • 1
    as you can see in your link in mysql you have to program all the function yourself, materialized views are useful if you need a complex select always up to date data, without you running the select again and again, also you can use indexes on the result(major upside), this makes the code automatically. Materialized views are quite resource hungry, but the upsides are worth it. but as far as i can see, the manual code and the materialzed code is almost identocal in his base form- but as the answers already tol you some rdms extended the functionality because it is so useful – nbk Oct 23 '20 at 17:47

5 Answers5

10

This will depend on the database platform you are using. Seeing as you tagged this with 'oracle', here is what happens in Oracle.

The "now" value (in this case 'sysdate') is evaluated at materialized view instantiation time. This is easy to demonstrate

SQL> create materialized view MV as select e.*, sysdate d from emp e;

Materialized view created.

SQL> select * from mv;

     EMPNO ENAME      JOB              MGR HIREDATE                   SAL       COMM     DEPTNO D
---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------------
      7369 SMITH      CLERK           7902 17/12/1980 00:00:00        800                    20 21/10/2020 12:18:26
      7499 ALLEN      SALESMAN        7698 20/02/1981 00:00:00       1600        300         30 21/10/2020 12:18:26
      7521 WARD       SALESMAN        7698 22/02/1981 00:00:00       1250        500         30 21/10/2020 12:18:26
      7566 JONES      MANAGER         7839 02/04/1981 00:00:00       2975                    20 21/10/2020 12:18:26
      7654 MARTIN     SALESMAN        7698 28/09/1981 00:00:00       1250       1400         30 21/10/2020 12:18:26
      7698 BLAKE      MANAGER         7839 01/05/1981 00:00:00       2850                    30 21/10/2020 12:18:26
      7782 CLARK      MANAGER         7839 09/06/1981 00:00:00       2450                    10 21/10/2020 12:18:26
      7788 SCOTT      ANALYST         7566 09/12/1982 00:00:00       3000                    20 21/10/2020 12:18:26
      7839 KING       PRESIDENT            17/11/1981 00:00:00       5000                    10 21/10/2020 12:18:26
      7844 TURNER     SALESMAN        7698 08/09/1981 00:00:00       1500                    30 21/10/2020 12:18:26
      7876 ADAMS      CLERK           7788 12/01/1983 00:00:00       1100                    20 21/10/2020 12:18:26
      7900 JAMES      CLERK           7698 03/12/1981 00:00:00        950                    30 21/10/2020 12:18:26
      7902 FORD       ANALYST         7566 03/12/1981 00:00:00       3000                    20 21/10/2020 12:18:26
      7934 MILLER     CLERK           7782 23/01/1982 00:00:00       1300                    10 21/10/2020 12:18:26

[wait 10 seconds]

14 rows selected.

SQL> select * from mv;

     EMPNO ENAME      JOB              MGR HIREDATE                   SAL       COMM     DEPTNO D
---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------------
      7369 SMITH      CLERK           7902 17/12/1980 00:00:00        800                    20 21/10/2020 12:18:26
      7499 ALLEN      SALESMAN        7698 20/02/1981 00:00:00       1600        300         30 21/10/2020 12:18:26
      7521 WARD       SALESMAN        7698 22/02/1981 00:00:00       1250        500         30 21/10/2020 12:18:26
      7566 JONES      MANAGER         7839 02/04/1981 00:00:00       2975                    20 21/10/2020 12:18:26
      7654 MARTIN     SALESMAN        7698 28/09/1981 00:00:00       1250       1400         30 21/10/2020 12:18:26
      7698 BLAKE      MANAGER         7839 01/05/1981 00:00:00       2850                    30 21/10/2020 12:18:26
      7782 CLARK      MANAGER         7839 09/06/1981 00:00:00       2450                    10 21/10/2020 12:18:26
      7788 SCOTT      ANALYST         7566 09/12/1982 00:00:00       3000                    20 21/10/2020 12:18:26
      7839 KING       PRESIDENT            17/11/1981 00:00:00       5000                    10 21/10/2020 12:18:26
      7844 TURNER     SALESMAN        7698 08/09/1981 00:00:00       1500                    30 21/10/2020 12:18:26
      7876 ADAMS      CLERK           7788 12/01/1983 00:00:00       1100                    20 21/10/2020 12:18:26
      7900 JAMES      CLERK           7698 03/12/1981 00:00:00        950                    30 21/10/2020 12:18:26
      7902 FORD       ANALYST         7566 03/12/1981 00:00:00       3000                    20 21/10/2020 12:18:26
      7934 MILLER     CLERK           7782 23/01/1982 00:00:00       1300                    10 21/10/2020 12:18:26

14 rows selected.

The "D" column is unchanged no matter how times you query the materialized view. If I now issue a refresh command for the materialized view, we are in effect re-running the defining query, hence sysdate (and hence column "D") will be picked up as the moment of refresh.

SQL> exec dbms_mview.refresh('MV')

PL/SQL procedure successfully completed.

SQL> select * from mv;

     EMPNO ENAME      JOB              MGR HIREDATE                   SAL       COMM     DEPTNO D
---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------------
      7369 SMITH      CLERK           7902 17/12/1980 00:00:00        800                    20 21/10/2020 12:19:12
      7499 ALLEN      SALESMAN        7698 20/02/1981 00:00:00       1600        300         30 21/10/2020 12:19:12
      7521 WARD       SALESMAN        7698 22/02/1981 00:00:00       1250        500         30 21/10/2020 12:19:12
      7566 JONES      MANAGER         7839 02/04/1981 00:00:00       2975                    20 21/10/2020 12:19:12
      7654 MARTIN     SALESMAN        7698 28/09/1981 00:00:00       1250       1400         30 21/10/2020 12:19:12
      7698 BLAKE      MANAGER         7839 01/05/1981 00:00:00       2850                    30 21/10/2020 12:19:12
      7782 CLARK      MANAGER         7839 09/06/1981 00:00:00       2450                    10 21/10/2020 12:19:12
      7788 SCOTT      ANALYST         7566 09/12/1982 00:00:00       3000                    20 21/10/2020 12:19:12
      7839 KING       PRESIDENT            17/11/1981 00:00:00       5000                    10 21/10/2020 12:19:12
      7844 TURNER     SALESMAN        7698 08/09/1981 00:00:00       1500                    30 21/10/2020 12:19:12
      7876 ADAMS      CLERK           7788 12/01/1983 00:00:00       1100                    20 21/10/2020 12:19:12
      7900 JAMES      CLERK           7698 03/12/1981 00:00:00        950                    30 21/10/2020 12:19:12
      7902 FORD       ANALYST         7566 03/12/1981 00:00:00       3000                    20 21/10/2020 12:19:12
      7934 MILLER     CLERK           7782 23/01/1982 00:00:00       1300                    10 21/10/2020 12:19:12

14 rows selected.

SQL>

But each platform may have its own characteristics.

Treatment of Oracle materialised views including definition and materialised view logs is covered here

https://docs.oracle.com/en/database/oracle/oracle-database/19/dwhsg/basic-materialized-views.html#GUID-A7AE8E5D-68A5-4519-81EB-252EAAF0ADFF

and some of the more advanced topics (partitioning, indexing, etc) of materialized views and how they can pertain to automatic query rewrite is here

https://docs.oracle.com/en/database/oracle/oracle-database/19/dwhsg/advanced-materialized-views.html#GUID-F7394DFE-7CF6-401C-A312-C36603BEB01B

Connor McDonald
  • 10,418
  • 1
  • 11
  • 16
  • thanks for this answer. What then would be the advantage of using a `mv` vs a normal table in Oracle? Is it to have easy-access to the `exec dbms_mview.refresh('MV')` command? and are materialized views frequently used in Oracle? – David542 Oct 21 '20 at 04:53
  • @Nick.McDermaid thanks could you provide a link to that then? I don't have Oracle installed. – David542 Oct 21 '20 at 18:05
  • 1
    Its a big topic :-) https://docs.oracle.com/en/database/oracle/oracle-database/19/dwhsg/part-optimizing-dw.html#GUID-9A97B601-4BD2-40B6-81F5-44F679F82C6F – Connor McDonald Oct 22 '20 at 05:17
  • @ConnorMcDonald thanks for this answer and taking the time. Would you want to add just a few links to things like the mv, mvlog for oracle so it'll be a good reference for people that may come to this in the future? – David542 Oct 27 '20 at 23:08
4

I have used Materialized views extensively in Oracle and I can answer from that point of view. I would imagine the general principle would be same for other databases as well with tiny variations.

First off, Materialized views are physical storage compared to normal views. Normal views stores only the query and execute the queries at run time. So, if you stored a current timestamp in Mat view it would show you the old value unless you refresh the view. In oracle, there are several refresh strategies for Materialized views as described below -

1. Manual Refresh: MView can be refreshed on demand by using the standard package dbms_snapshot.refresh_mview

2. Automatic Refresh: MView can be refreshed as soon as any changes are made in the table underlying the MView using “On Commit”

Automatic refreshes can be of various types -

2.1. Complete(Full) – Whenever the base table is modified, MView will be truncated first and will be loaded with the data. As the name suggests, it is refreshed completely.

2.2. Fast – Whenever the base table is modified, only the updated records are updated/inserted in the MView. We need “mvlog” file to make “Fast” refresh.

2.3. Force – It will first try to do the “Fast” refresh. If for some reasons, “Fast” refresh fails then it will do the “Complete” refresh.

So, structurally Materialized view is exactly similar to a physical table. Then the question is why do we create MV? & what's the advantage of creating a Materialized view after all? Well, here comes the interesting part.

Typically, when base tables contain large amount of data, it is expensive and time-consuming to compute the required aggregates or to compute joins between these tables. In such cases, queries can take minutes or even hours. So, in materialized views one can store precomputed aggregates and joins and when one tries to execute aggregate SQL queries or use those same joins in the queries, the database engine would actually not execute the query at runtime instead it will fetch the pre-computed results stored on Mat view and get back those results to the client, which is much faster as it prevents those computations to happen at runtime. So, how does the engine able to do that? - Oracle Database employs an extremely powerful process called query rewrite to quickly answer the query using materialized views.

A query undergoes several checks to determine whether it is a candidate for query rewrite. If the query fails any of the checks, then the query is applied to the detail tables rather than the materialized view. This can be costly in terms of response time and processing power.

The optimizer uses two different methods to recognize when to rewrite a query in terms of a materialized view. The first method is based on matching the SQL text of the query with the SQL text of the materialized view definition. If the first method fails, the optimizer uses the more general method in which it compares joins, selections, data columns, grouping columns, and aggregate functions between the query and materialized views.

This infact is a very useful application of Materialized view and can speed up queries 5X -100X times. I can give you a real life example, on one of the projects that I worked on, there were several aggregate reports which needed to be prepared weekly and if we used aggregate queries on top of data warehouse tables, it used to take hours for us to build those reports. Instead what we did later on was after every weekly batch of data warehouse load we used to do a full refresh of the Mat views (using the automatic refresh feature) say every Friday night and enable query re-write on those materialized views (having pre-aggregated/pre-computed results). Then our reporting queries used to be kick-off over the weekend through an automated process and due to this optimization of pre-aggerated results in Mat views with query re-write feature, our reports build time got a lot more faster and we were able to validate and deliver all the reports to the stakeholders by Monday morning without any hiccups whereas we were always on our toes when we had the previous process in place without Mat views, since the build used to take hours/sometimes days depending on the report. We similarly used the same process for non-aggregated reports as well which involved lots of joins with dimension and facts tables. So, in those Materialized views we used to store the data in a de-normalized format so that for building reports the joins need not be performed at run time. The pre-aggregation optimization yielded us 50x- 100x gains while the de-normalized materialized views used to give us anywhere between 5x - 30x depending on several factors.

You can read up the implementation & syntax level details in Oracle docs. Would be happy to provide more details if needed but just wanted to share a real-life 1st-hand Use-case to get the point across how useful it can be if we are able to make use of it in the right scenario. Below is a sample syntax in Oracle for reference.

CREATE MATERIALIZED VIEW department_mv
refresh complete on commit --Automatic Complete refresh (as described above)
enable query rewrite -- this feature enables optimization (as described above)
as
SELECT deptno, dept_name, SUM(salary)
FROM department
GROUP BY deptno, dept_name;
Somy
  • 1,474
  • 1
  • 4
  • 13
  • 1
    @David542 - Let me know if my response helps answer your question to an extent. Would be curious to hear thoughts/comments. – Somy Oct 24 '20 at 17:02
3

In simple words: Materialized view in SQL is a physical constructs which is stored physically on the disc. But views are just logical constructs which are created where it is required in the query..

When you create a Materialized view, Oracle Database creates one internal table and at least one index, and may create one view, all in the schema of the materialized views.

Following is the syntax of materialized view:

Create materialized view View_Name

Build [Immediate/Deffered]

Refresh [Fast/Complete/Force]

on [Commit/Demand]

as Select ..........;

when a DML is executed in the Master table ,then oracle stores rows defining changes in the MV log and uses it to refresh the MV .Its called fast refresh.You can check the behaviour in mlog$_tablename. MV's are also called snapshots. using select name,table_name,refresh_method from user_snapshots, you can check the statuses.

To reflect the changes of master table in MV: execute DBMS_MVIEW.REFRESH('name_of the view');

for single snapshot: execute DBMS_SNAPSHOT.REFRESH( 'v_materialized_foo_tbl','f');

REFRESH procedure Below procedure refreshes a list of snapshots.

Syntax

DBMS_SNAPSHOT.REFRESH (
   { list                 IN     VARCHAR2,
   | tab                  IN OUT DBMS_UTILITY.UNCL_ARRAY,},...);

list: Comma-separated list of snapshots that you want to refresh. tab :A string of refresh methods indicating how to refresh the listed snapshots. F' or f' indicates fast refresh, ?' indicates force refresh, C' or c' indicates complete refresh, and A' or `a' indicates always refresh,

The picture below shows general idea: enter image description here

So, NOW()'s value is called when you do refreshes on it ,through execute DBMS_MVIEW.REFRESH() or DBMS_SNAPSHOT.REFRESH

Moreover using triggers on a MV is not a good practice though mysql can .Be it on read only MV. Oracle does not support a Execute command(to refresh MV) inside a trigger because trigger is itself a single transaction and exceute command commit in it makes it two commits inside one transaction.

Does an insert trigger need a commit statement

Triggers on a read only materialized view might appear to work - but you cannot rely on them working.

An update of a row in a materialized view might be done as a DELETE+INSERT.

A refresh of a materialized view might involve a delete + insert of every row.

A refresh might involve a truncate plus direct path load of ever row.

(the last two will probably happen sometime, you would lose anything your trigger had done in the past)"--- https://asktom.oracle.com/pls/apex/asktom.search%3Ftag%3Dtriggers-on-materialized-views#:~:text=Triggers%20on%20a%20read%20only,delete%20%2B%20insert%20of%20every%20row.

Nikhil S
  • 3,786
  • 4
  • 18
  • 32
2

The indexed view (or so called materialized view) can be changed with normal table and triggers.

You know that in PostgreSQL one needs to refresh it if the data of underlying table is changed. In Microsoft SQL Server an indexed view is refresh automatically, but there are many requirements to fulfill to create index on a view.

If you have two tables referred in a indexed view you will need to triggers on each table in order to use the alternative solution. Also, it's common that people write triggers which do not perform fast (for example a common mistake is one to use Row By Agonizing Row instead batch processing).

So, it seems that the indexed view is there for making our live easier but clearly it can be replace with a table and triggers to others table as the last solution is not facing the limitations of the first but it is more complicated.

For me, the PostgreSQL implementation is useless us I want to my statistics to be correct once a transaction is committed, not when a routine is called to refresh them.

I have used various indexed views in the context of SQL Server in order to optimize different cases. I have used triggers to precalculated data for the same purposes, too. I doubt there is definite answer which one to use. For me, if you can use an indexed view - use it and leave the engine to handle the difficult stuff. If you are facing some RDMS limitations - you have no choice but to use a triggers to precalculated the data.

In the context of SQL Server, the indexed view maintenance cost is similar to the cost of having an index.

buddemat
  • 4,552
  • 14
  • 29
  • 49
gotqn
  • 42,737
  • 46
  • 157
  • 243
1

A materialized view is a view that has been materialized - ie pre evaluated and written to disk/memory. Its valuable in comparison to a view because it allows fast and efficient access to data.

For instance if you have large complex queries with millions of rows it may take some time to join the tables and run aggregate or analytic functions etc. For an online system users tend to get bored if they have to wait 10 seconds. A MV allows this to be pre-processed. (Maybe overnight, maybe every few hours etc)

This could of course be done with a procedure to copy data to a table. However MVs are set up for this purpose and have many smart optimisations such as only updating/inserting rows in the MV that have changed in the underlying tables...this may mean that the MV can be updated almost instantly as needed. These extra features may make them more suitable then writing the logic in a procedure or trigger...but sometimes the procedure is still a good way to go!

So yes something very like a MV could be written with triggers/procedures but it's likely an unnecessarily complicated way to do thing when Oracle has built in support for MVs.

Emu
  • 494
  • 6
  • 15