Query optimizers typically use summaries of data distributions to estimate the sizes of the intermediate tables generated during query processing. One popular such summarization scheme is a histogram, whereby the input range is partitioned into buckets and a cumulative count is maintained of the number of tuples falling in each bucket. The distribution within a bucket is assumed to be uniform for the purposes of estimation.
The following shows one such histogram for a relation R on a discrete attribute a with domain [1..10]:
Bucket 1: range = [1..2] Cumulative tuple count = 6
Bucket 2: range = [3..8] Cumulative tuple count = 30
Bucket 3: range = [9..10] Cumulative tuple count = 10
What is the estimated size of the self-join R ⋈ R on attribute a?
A) 46
B) 218
C) 248
D) 1,036
E) 5,672
Answer given in the solutions: B
How is the answer to be calculated?
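One way to arrive at 218 can be sketched in Python. The sketch makes three assumptions: the join is an equi-join on a, each bucket's count is the number of tuples in that bucket alone (not a running total), and every integer value in a bucket's range occurs equally often. The function name `estimate_self_join_size` is hypothetical and used only for illustration. Under uniformity, a value in a bucket with count c and d distinct values appears c/d times, so it contributes (c/d)^2 join tuples, and the bucket contributes d * (c/d)^2.

```python
# Buckets copied from the question: (low, high, tuple_count), bounds inclusive.
buckets = [(1, 2, 6), (3, 8, 30), (9, 10, 10)]


def estimate_self_join_size(buckets):
    """Estimate |R join R| on the histogram attribute, assuming uniformity per bucket."""
    total = 0.0
    for low, high, count in buckets:
        distinct_values = high - low + 1       # number of integer values in the bucket
        per_value = count / distinct_values    # estimated tuples per value
        # Each value matches itself per_value * per_value times in the self-join.
        total += distinct_values * per_value ** 2
    return total


print(estimate_self_join_size(buckets))  # 218.0
```

Bucket by bucket this gives 2 * 3^2 = 18, 6 * 5^2 = 150, and 2 * 5^2 = 50, which sum to 218 and match option B.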