I have 3 Mysql tables:
[block_value]
- id_block_value
- file_id
[metadata]
- id_metadata
- metadata_name
[metadata_value]
- meta_id
- value
- blockvalue_id
In these tables, there are pairs: metadata_name
= value
And list of pairs are put in blocks (id_block_value
)
(A) If I want height = 1080:
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080");
+---------+
| file_id |
+---------+
| 21 |
| 22 |
(...)
| 6962 |
(...)
| 8146 |
| 8147 |
+---------+
794 rows in set (0.06 sec)
(B) If I want file extension = mpeg:
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "file extension" and value = "mpeg");
+---------+
| file_id |
+---------+
| 6889 |
| 6898 |
| 6962 |
+---------+
3 rows in set (0.06 sec)
BUT, if I want:
- A and B
- A or B
- A and not B
Then, I don't know what is the best.
For A or B
, I tried A union B
which seems to do the trick.
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080")
UNION
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "file extension" and value = "mpeg");
+---------+
| file_id |
+---------+
| 21 |
| 22 |
| 34 |
(...)
| 6889 |
| 6898 |
+---------+
796 rows in set (0.13 sec)
For A and B
, since there are no intersect
in Mysql, I tried A and file_id in(B)
, but look at perfs (>4mn)...
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080")
and file_id in(
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "file extension" and value = "mpeg"));
+---------+
| file_id |
+---------+
| 6962 |
+---------+
1 row in set (4 min 36.22 sec)
I tried B and file_id in(A)
too, which is a lot better, but I will never know how which one to put first.
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "file extension" and value = "mpeg")
and file_id in(
SELECT DISTINCT file_id
FROM metadata_value MV
INNER JOIN metadata M ON MV.meta_id = M.id_metadata
INNER JOIN block_value BV ON MV.blockvalue_id = BV.id_block_value
WHERE (metadata_name = "height" and value = "1080"));
+---------+
| file_id |
+---------+
| 6962 |
+---------+
1 row in set (0.75 sec)
So... what do I do now? Is there any better way for boolean operations? Any tip? Did I miss something?
EDIT: what data looks like:
This database contains a row in FILE
table for each audio/video file inserted:
- 10, /path/to/file.ts
- 11, /path/to/file2.mpeg
There is a row in METADATA
table for each potential information:
- 301, height
- 302, file extension
Then, a row in BLOCK
table define a container:
- 101, Video
- 102, Audio
- 104, General
A file can have several blocks of metadata, a BLOCK_VALUE
table contains instances of BLOCKS:
- 402, 101, 10 // Video 1
- 403, 101, 10 // Video 2
- 404, 101, 10 // Video 3
- 405, 102, 10 // Audio
- 406, 104, 10 // General
In this example, file 10 has 5 blocks: 3 Video (101) + 1 Audio (102) + 1 General (104)
Values are stored in METADATA_VALUE
- 302, 406, "ts" // file extension, General
- 301, 402, "1080" // height, Video 1
- 301, 403, "720" // height, Video 2
- 301, 404, "352" // height, Video 3