1

How do I adapt the query below so that it doesn't perform unnecessary UNION ALL unless the SELECT statement above it doesn't find a match?

SELECT LATITUDE, LONGITUDE FROM coordinates WHERE address = ? AND community = ?
UNION ALL
SELECT LATITUDE, LONGITUDE FROM coordinates WHERE address::text = ? AND community::text LIKE ? 
UNION ALL
SELECT LATITUDE, LONGITUDE FROM coordinates WHERE address::text LIKE ? AND community::text LIKE ?

My coordinates table has columns | ID | ADDRESS | CITY | LATITUDE | LONGITUDE

I'm trying to extract values from latitude and longitude in the most efficient manner, while also performing the strongest queries first, meaning those SELECT statements that will return the most reliable result. If a result is found, return it and discard the remaining queries.

Right now I have all of my queries in an ArrayList which I then loop and execute individually. If a match() is found then I return the resultset. The queries at the top of the list are those that are stricter, using = instead of LIKE, getting gradually less strict if no results are found.

This is obviously inefficient since I'm performing nearly 100 SELECT queries separately, so I've looked into using UNION ALL and my plan is to create one giant query with UNION ALL in between each SELECT statement. However I'm now confused how to abort any subsequent queries if a match is found?

Current approach:

            queryList.add("SELECT LATITUDE, LONGITUDE FROM coordinates WHERE address = ? AND community = ?");
            queryList.add("SELECT LATITUDE, LONGITUDE FROM coordinates WHERE address::text = ? AND community::text LIKE ?");
            queryList.add("SELECT LATITUDE, LONGITUDE FROM coordinates WHERE address::text LIKE ? AND community::text LIKE ?")        

            PreparedStatement pStatement = null;
            ResultSet rs = null;
            Connection conn = null;

            for (String currentQuery: queryList) {
                Connection conn = getConnection();
                pStatement = conn.prepareStatement(currentQuery);
                pStatement.setString(1, "742 Evergreen Terrace");
                pStatement.setString(2, "Springfield");
                // execute query
                // if match, break loop and use resultset
            } 

So with UNION ALL, how do I abort if the first SELECT finds a match?

EDIT:

I'm trying this query, but it returns ALL latitutudes and longitudes instead of just the ones which caused the match to be found. How do I alter this to only return the relevant row?

SELECT LATITUDE, LONGITUDE FROM coordinates
WHERE EXISTS(SELECT LATITUDE, LONGITUDE FROM coordinates WHERE 
(address = ? AND community = ?) OR WHERE (address::text = ? AND community::text LIKE ?))
IanSky
  • 31
  • 6

2 Answers2

0

You have to dummy the data. Make the first select some dummy record so that it never fails. or determine what will not fail before you start. You can then use a nested select to remove your dummy row. Basic syntax example.

select Latitude, longitude from
(
select 'dummy data' as Lattitude, 'AltDummy' as Longitude
union all 
SELECT LATITUDE, LONGITUDE FROM coordinates WHERE address = ? AND    community = ?
UNION ALL
SELECT LATITUDE, LONGITUDE FROM coordinates WHERE address::text = ? AND community::text LIKE ? 
UNION ALL
SELECT LATITUDE, LONGITUDE FROM coordinates WHERE address::text LIKE ? AND community
) x
where Lattitude <> 'dummy data' and Longitude <> 'AltDummy' 

You will also need to cast your dummy data in the exact same data type as your actual data for this to work right 100% of the time. Submit it to the sql server all as one query and let the sql server do the work.

M T Head
  • 1,085
  • 9
  • 13
  • I don't see how this answers my question of how to **abort subsequent `UNION ALL` commands if a match is found**. How does creating a dummy row (?) in my table cause subsequent `UNION ALL` commands to abort if a match is found above it? – IanSky Jul 11 '17 at 19:10
  • Sorry I did not understand what you seek. I thought you were getting an error. To do what you seek you need a stored procedure. You can't do it in just straight sql. – M T Head Jul 11 '17 at 19:37
  • Actually if you do the first select into a temp table. Then check the count of the temp table and optionally exit you can do this. But temp tables can be slow with large data sets. – M T Head Jul 11 '17 at 19:39
  • You can select count(*) on the first query. If you get what you want then do the rest of the query. This will speed you up when you want to exit out. But you will have to run the first query twice if you run all the parts of the union. – M T Head Jul 11 '17 at 19:42
  • What Sql platform are you on? – M T Head Jul 11 '17 at 19:42
  • Thanks for that temp table suggestion, I'll look into it now and report back. I'm using H2. Do you know of any helpful guides to creating temp tables + optionally exiting? I'm assuming this is done in a query and not stored procedure? – IanSky Jul 11 '17 at 19:45
  • sorry h2? (Two possible solutions as I see it. Temp tables or use of rowcount) – M T Head Jul 11 '17 at 19:46
  • Ahhhh h2 is a java sql engine. I did a web search. Now I understand you. – M T Head Jul 11 '17 at 19:47
  • Exactly, I'm trying to speed up my Java code by optimizing the query by returning as soon as a match is found (and using that match's row data) Do you know of any helpful guides to creating temp tables + optionally exiting? I'm assuming this is done in a query and not stored procedure? – IanSky Jul 11 '17 at 19:49
  • I've updated my original question to show how I'm using `WHERE EXISTS` to return as soon as a match is found, but it returns ALL rows instead of just the row causing the match. Do you know how to adapt this, or if `WHERE EXISTS` is the right approach? – IanSky Jul 11 '17 at 19:54
  • 1
    I don't know h2. But I know loads of Sql platforms. I can give you the pseudo code. 1 select first query into the temp table (no longer use union all) After that finishes 2 select count of the temp table. 3 if your condition is met based on the count, continue or select * from the temp table returning it to the requester. 4 if you continue select into the temp table created with the first step. Loop as needed. Try doing this is segments. First just create a temp table. Second populate the temp table, third test branching logic. – M T Head Jul 11 '17 at 19:55
  • Where exists is for sub queries. I don't see how that will help you here. – M T Head Jul 11 '17 at 19:56
  • Cool, that was well explained. I'll do my best and will only report back if super stuck. Thanks again – IanSky Jul 11 '17 at 19:58
  • The question is a little confusing as to what you are seeking. Can you possibly clean it up a bit. It is not generally a good practice to do a select, then select from the select for something else. Why not just select what you want the first time with all your detail. I can't extract from your question exactly what data you are seeking..... – M T Head Jul 11 '17 at 19:58
  • My main goal is to `SELECT` the latitude and longitude (2 values total) from the FIRST row that satisfies my query. My query consists of 75+ `SELECT` statements which differ slightly from one another. I want a match to be returned **as soon** as a `SELECT` statement is satisfied, instead of having the SQL server execute the entire query if not necessary. The result must contain the latitude/longitude for the row satisfying the `SELECT`. I think you're onto something with the temp table thing, unless creating a new table each time will hog resources everytime it doesn't find a match. – IanSky Jul 11 '17 at 20:06
  • You can speed this up by doing some sort of a limit in your first query set to 2. What that means is if the database has 1000 records and the first two records meet your criteria it will stop reading the 1000 records and only read the first two. That should speed you up quite a bit. – M T Head Jul 11 '17 at 21:01
  • Temp table with a limit of 2 rows will be fast. Temp tables are slow with high volume. – M T Head Jul 11 '17 at 21:02
  • SELECT LATITUDE, LONGITUDE FROM coordinates WHERE address = ? AND community = ? LIMIT 2 – M T Head Jul 11 '17 at 21:04
  • I'm assuming this all must be done in stored procedures, since we're talking about loops. H2 doesn't seem to support stored procedures directly, but uses aliases in the Java code. Seems superfluous but I'll see what comes of it. By the way, wouldn't the `LIMIT` only need to be 1 not 2, since my latitude and longitude come from the same row? Thank you – IanSky Jul 11 '17 at 21:15
  • So far this can be done in a script and does not have to be a stored procedure. But it is easier in a stored procedure. select into temp table with limit 2 and you have the first step. I don't know H2 well enough to know how to do this but you can do this in Mysql, Pgsql, Oracle, Db2, Sybase with out stored procedures. Does require some branching logic though. – M T Head Jul 11 '17 at 21:23
  • I haven't forgot about you, and in fact have been working tirelessly to wrap my head around this dilemma. I think I got ahead of myself and have since tried to re-iterate my problem in a fresh question found [here](https://stackoverflow.com/questions/45089887/efficiently-query-an-embedded-database-in-a-loop-with-prepared-statements). No obligation to respond, but if you have any solution I'll happily award you. Thanks for your time thus far :) – IanSky Jul 13 '17 at 19:57
0

Other approach:

SELECT LATITUDE, LONGITUDE FROM coordinates 
WHERE (address = ? AND community = ?)
or (text = ? AND community::text LIKE ? )
or (text LIKE ? AND community::text LIKE ?) 

Same net result. Speed differences vary depending on what platform you are working on. But this is likely faster than the previous.

M T Head
  • 1,085
  • 9
  • 13
  • Thanks, I spent the past few hours manually altering my code and it's slower now than before. This is because using `OR` doesn't abort the subsequent `WHERE` clauses if a match is found. Instead, your suggestion means the ENTIRE query must be executed in full, before any match (or not) will be returned. Perhaps I'm missing something?? – IanSky Jul 11 '17 at 19:08
  • No that is correct. I was not fully following what was requested. Depending on the server you will see faster or slower results with the different approaches. Try the temp table option above. That will reduce the reads which is what you seem to seek. – M T Head Jul 11 '17 at 19:40