84

Ok, here is my dilemma I have a database set up with about 5 tables all with the exact same data structure. The data is separated in this manner for localization purposes and to split up a total of about 4.5 million records.

A majority of the time only one table is needed and all is well. However, sometimes data is needed from 2 or more of the tables and it needs to be sorted by a user defined column. This is where I am having problems.

data columns:

id, band_name, song_name, album_name, genre

MySQL statment:

SELECT * from us_music, de_music where `genre` = 'punk'

MySQL spits out this error:

#1052 - Column 'genre' in where clause is ambiguous

Obviously, I am doing this wrong. Anyone care to shed some light on this for me?

OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
Jayrox
  • 4,335
  • 4
  • 40
  • 43

6 Answers6

181

I think you're looking for the UNION clause, a la

(SELECT * from us_music where `genre` = 'punk')
UNION
(SELECT * from de_music where `genre` = 'punk')
Mihai Limbășan
  • 64,368
  • 4
  • 48
  • 59
  • @mihai-limban - sorry to disturb you, but is there a way to recognise from the result set that "which result came from which table". Beacuse, if we need to update/delete a record from this result set, there is no way to know. – web-nomad Apr 10 '12 at 13:17
  • 8
    @Pushpesh add an unique string identifier to every `SELECT`, e.g.: `(SELECT 'us_music' AS from_table, * FROM us_music WHERE genre = 'punk') UNION ...` – jkrcma Jun 11 '12 at 14:19
  • What is the value of genre is unknown but ids should match in two tables? Can you do something like this? `(SELECT 1) AS select1 UNION (SELECT 2) AS select2 WHERE select1.id=select2.id` – ZurabWeb Dec 17 '13 at 18:29
  • Perfect, exactly why I love Stack! Google, find stack question and answer already here! Thanks! – Rocco The Taco Jan 02 '15 at 13:07
  • What's the syntax to group by on the UNION of the result set and then also perform order by? Let's say it's `viewCount` and `movieTitle` where there's one DB for each month. You union all 12 tables together which is fine but then you get 12 individual result sets in the output. What if you just wanted one result set where all results were grouped by `movieTitle` and the `viewCount` value was summed up for each `movieTitle` row? – fIwJlxSzApHEZIl Mar 14 '19 at 02:50
20

It sounds like you'd be happer with a single table. The five having the same schema, and sometimes needing to be presented as if they came from one table point to putting it all in one table.

Add a new column which can be used to distinguish among the five languages (I'm assuming it's language that is different among the tables since you said it was for localization). Don't worry about having 4.5 million records. Any real database can handle that size no problem. Add the correct indexes, and you'll have no trouble dealing with them as a single table.

Ned Batchelder
  • 364,293
  • 75
  • 561
  • 662
  • 1
    i originally had all of my data in a single table, but it began to nearly crawl for 5-10 seconds after around 3.5 million records. i found splitting it up worked best for me because it was much faster. i have a new webhost now, so it may be better, but seems like too much of a hassle to combine it – Jayrox Jan 03 '09 at 20:12
  • 27
    Sounds like you need to add indexes to the tables. – Ned Batchelder Jan 03 '09 at 21:32
  • 2
    Yes, you essentially treated a symptom of a problem without solving the core issue (improper/insufficient indexing). What will happen next if one of your 5 tables reach 4.5M rows and starts crawling again? – Lo-Tan Nov 01 '11 at 19:25
4

Any of the above answers are valid, or an alternative way is to expand the table name to include the database name as well - eg:

SELECT * from us_music, de_music where `us_music.genre` = 'punk' AND `de_music.genre` = 'punk'
4

The column is ambiguous because it appears in both tables you would need to specify the where (or sort) field fully such as us_music.genre or de_music.genre but you'd usually specify two tables if you were then going to join them together in some fashion. The structure your dealing with is occasionally referred to as a partitioned table although it's usually done to separate the dataset into distinct files as well rather than to just split the dataset arbitrarily. If you're in charge of the database structure and there's no good reason to partition the data then I'd build one big table with an extra "origin" field that contains a country code but you're probably doing it for legitimate performance reason. Either use a union to join the tables you're interested in http://dev.mysql.com/doc/refman/5.0/en/union.html or by using the Merge database engine http://dev.mysql.com/doc/refman/5.1/en/merge-storage-engine.html.

4

Your original attempt to span both tables creates an implicit JOIN. This is frowned upon by most experienced SQL programmers because it separates the tables to be combined with the condition of how.

The UNION is a good solution for the tables as they are, but there should be no reason they can't be put into the one table with decent indexing. I've seen adding the correct index to a large table increase query speed by three orders of magnitude.

staticsan
  • 29,935
  • 4
  • 60
  • 73
3

The union statement cause a deal time in huge data. It is good to perform the select in 2 steps:

  1. select the id
  2. then select the main table with it
Jérôme Verstrynge
  • 57,710
  • 92
  • 283
  • 453