My application allows users to collect measurement data as part of an experiment, and needs to have the ability to report on all of the measurements ever taken.
Below is a very simplified version of the tables I have:
CREATE TABLE EXPERIMENTS(
EXPT_ID INT,
EXPT_NAME VARCHAR2(255 CHAR)
);
CREATE TABLE USERS(
USER_ID INT,
EXPT_ID INT
);
CREATE TABLE SAMPLES(
SAMPLE_ID INT,
USER_ID INT
);
CREATE TABLE MEASUREMENTS(
MEASUREMENT_ID INT,
SAMPLE_ID INT,
MEASUREMENT_PARAMETER_1 NUMBER,
MEASUREMENT_PARAMETER_2 NUMBER
);
In my database there are 2000 experiments, each of which has 18 users. Each user has 6 samples to measure, and would do 100 measurements per sample.
This means that there are 2000 * 18 * 6 * 100 = 21600000 measurements currently stored in the database.
I'm trying to write a query that will get the AVG() of measurement parameter 1 and 2 for each user - that would return about 36,000 rows.
The query I have is extremely slow - I've left it running for over 30 minutes and it doesn't come back with anything. My question is: is there an efficient way of getting the averages? And is it actually possible to get results back for this amount of data in a reasonable time, say 2 minutes? Or am I being unrealistic?
Here's (again a simplified version) the query I have:
SELECT
E.EXPT_ID,
U.USER_ID,
AVG(MEASUREMENT_PARAMETER_1) AS AVG_1,
AVG(MEASUREMENT_PARAMETER_2) AS AVG_2
FROM
EXPERIMENTS E,
USERS U,
SAMPLES S,
MEASUREMENTS M
WHERE
U.EXPT_ID = E.EXPT_ID
AND S.USER_ID = U.USER_ID
AND M.SAMPLE_ID = S.SAMPLE_ID
GROUP BY E.EXPT_ID, U.USER_ID
This will return a row for each expt_id/user_id combination and the average of the 2 measurement parameters.