The documentation is quite clear:
materialize
Allows caching a subquery result during the time of query execution in
a way that other subqueries can reference the partial result.
views
Views are virtual tables based on the result-set of a Kusto Query
Language query. Just like a real table, a view contains rows and
columns. Unlike a real table, a view doesn't hold its own data
storage.
Views are defined through user-defined functions with the following
requirements:
- The result of the function must be tabular (for example, it cannot be a scalar value).
- The function must take no arguments.
The view keyword
By default, operators that support a wildcard syntax to specify table
names will not reference views, even if the view's name matches the
wildcard. An example of this type of operator is the union operator.
In this case, use the view keyword to have the view included as well.
Materialized views
Materialized views expose an aggregation query over a source table, or
over another materialized view.
As the name suggests, the aggregation results are materialized, meaning - stored.
The results are being updated constantly, while the data keeps being ingested.
In your case, there seems to be no reason to use materialize()
, nor the view
keyword.
A materialized view such as the following (with filter on CustomerId) might improve performance when used instead of table1
.
.create materialized-view Customer_mv on table Customer
{
Customer
| summarize by CustomerId, City
}
Update
Here are couple of examples that demonstrate the benefits of materialize()
1.
let t = print x = rand(1000);
union t, t, t
Fiddle
vs.
let t = materialize(print x = rand(1000));
union t, t, t
Fiddle
2.
let t1 = range i from 1 to 100000000 step 1 | summarize count() by i = i%2;
let t2 = t1;
t1
| join kind=inner t2 on i
i |
count_ |
i1 |
count_1 |
1 |
50000000 |
1 |
50000000 |
0 |
50000000 |
0 |
50000000 |
Execution Time: 4.4375515
Fiddle
vs.
let t1 = materialize(range i from 1 to 100000000 step 1 | summarize count() by i = i%2);
let t2 = t1;
t1
| join kind=inner t2 on i
i |
count_ |
i1 |
count_1 |
1 |
50000000 |
1 |
50000000 |
0 |
50000000 |
0 |
50000000 |
Execution Time: 2.5313002
Here is an example that demonstrates the benefits of a view
.create-or-alter function StormEvents_top_5_deaths_v ()
{
cluster("help").database("Samples").StormEvents
| project Duration = EndTime - StartTime
,Distance = round(geo_distance_2points(BeginLon, BeginLat, EndLon, EndLat))
,TotalDeath = DeathsDirect + DeathsIndirect
,TotalInjuries = InjuriesIndirect + InjuriesIndirect
,TotalDamage = DamageCrops + DamageProperty
| where TotalDeath > 0
| top 5 by TotalDeath
}
StormEvents_top_5_deaths_v
Duration |
Distance |
TotalDeath |
TotalInjuries |
TotalDamage |
20.07:00:00 |
0 |
14 |
0 |
0 |
00:18:00 |
20609 |
13 |
0 |
46000000 |
01:02:00 |
26080 |
11 |
0 |
250000000 |
2.20:00:00 |
0 |
10 |
0 |
0 |
18.09:00:00 |
0 |
10 |
0 |
0 |
Fiddle