4

I have a query that works on MySQL but doesn't work on Oracle, and I'm trying to convert. This is my table:

unique_row_id  http_session_id  page_name   page_hit_timestamp
----------------------------------------------------------------
0              123456789        index.html  2010-01-20 15:00:00
1              123456789        info.html   2010-01-20 15:00:05
2              123456789        faq.html    2010-01-20 15:00:15
3              987654321        index.html  2010-01-20 16:00:00
4              987654321        faq.html    2010-01-20 16:00:05
5              987654321        info.html   2010-01-20 16:00:15
6              111111111        index.html  2010-01-20 16:01:00
7              111111111        faq.html    2010-01-20 16:01:05
8              111111111        info.html   2010-01-20 16:01:15

The SQL is

select http_session_id, unique_row_id, page_name, page_hit_timestamp 
from page_hits 
group by http_session_id;

On MySQL, this will return 3 rows (one for each unique http_session_id).

On Oracle, I get a "ORA-00979: not a GROUP BY expression" error. I've tried playing around with distinct too, but I can't get it to work.

Just to be clear - I would like a ResultSet that contains one row per unique http_session_id. It is preferable that the unique_row_id would be the max one (e.g. 2 for http_session_id==123456789), but this is not significant.

I'm on the verge of breaking this into multiple separate sql statements (one "select distinct http_session_id", and the other to iterate through all these and select the max(unique_row_id). Any pointers would be gratefully received - I would love to avoid this!

Rgds, Kevin.

OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
Kevin
  • 11,521
  • 22
  • 81
  • 103

5 Answers5

10

The reason you encounter the ORA error is because MySQL supports non-standard GROUP BY clauses, calling it a "feature". It's documented here.

The standard SQL GROUP BY clause must include ALL columns specified in the SELECT clause, that are not wrapped in aggregate functions (LIKE COUNT, MAX/MIN, etc), to be specified in the GROUP BY clause.

If you want one, unique row per http_session_id value - look at using ROW_NUMBER:

SELECT x.*
  FROM (select http_session_id, unique_row_id, page_name, page_hit_timestamp,
               ROW_NUMBER() OVER (PARTITION BY http_session_id 
                                      ORDER BY http_session_id) AS rank
          FROM page_hits) x
 WHERE x.rank = 1
Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
  • You can add an ORDER BY to the ROW_NUMBER if you want to tweak which row is returned. MySQL doesn't have analytical/windowing/rank functionality, so you can't port the query back to MySQL I'm afraid :( – OMG Ponies Jul 06 '10 at 22:24
  • Thanks for the comprehensive answer - useful to understand that MySQL's Group By is non-standard. BTW - I got a ORA-30485 error executing the select clause specified. – Kevin Jul 06 '10 at 22:38
  • I believe that select is SQL Server syntax. – Dave Neeley Jul 06 '10 at 23:04
  • 1
    @Kevin: Interesting - [ORA-30485 is because Oracle expects an ORDER BY clause when using an analytic function](http://www.google.ca/search?q=ORA-30485&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a). I updated to correct that. – OMG Ponies Jul 06 '10 at 23:07
2

Would this work:

select max(unique_row_id), http_session_id
from page_hits
group by http_session_id

Incidentally; what does my sql return in your resultset for columsn that are included in the resultset but not in the group by clause? (page_name, page_hit_timestamp)

Dog Ears
  • 9,637
  • 5
  • 37
  • 54
  • Hi - yep, this does the trick. In full: select http_session_id, max(unique_row_id), max(page_name), max(page_hit_timestamp) from page_hits group by http_session_id; – Kevin Jul 06 '10 at 22:31
  • And to answer your question about what MySQL does :- it seems to return the columns from the row associated with the first matching http_session_id. I can't be certain because I haven't tested it too much - it could well return last under different conditions. – Kevin Jul 06 '10 at 22:33
  • yep, I'll change to first(page_name). – Kevin Jul 07 '10 at 07:23
0

I think that a GROUP BY requires a variable to have been used in a WHERE clause or aggregation function in the SQL standard?

Try using SELECT MAX(unique_row_id) GROUP BY http_session_id.

Borealid
  • 95,191
  • 9
  • 106
  • 122
0

In standard SQL, if you have a GROUP BY clause, all columns that are not part of it have to be in aggregates. In MySQL, this rule was relaxed by design.

For instance, this is allowed in MySQL but not in standard SQL:

SELECT customer_id, country, SUM(amount) FROM records GROUP BY customer_id

There's one caveat: MySQL assumes you know what you're doing. If the same customer has records in multiple countries, the query will simply grab the first country in the table, disregarding all others. Furthermore, since the order of rows is undefined, and there is no ORDER BY, you might get different results each time you run the query.

In Standard SQL, you have two choices:

SELECT customer_id, country, SUM(amount) FROM records GROUP BY customer_id, country

or

SELECT customer_id, MIN(country), SUM(amount) FROM records GROUP BY customer_id
MapDot
  • 315
  • 1
  • 7
0

Another option in Oracle, if you want:

select DISTINCT
       FIRST_VALUE(unique_row_id)
       OVER (PARTITION BY http_session_id
             ORDER BY unique_row_id DESC) unique_row_id,
       http_session_id,
       FIRST_VALUE(page_name)
       OVER (PARTITION BY http_session_id
             ORDER BY unique_row_id DESC) page_name,
       FIRST_VALUE(page_hit_timestamp)
       OVER (PARTITION BY http_session_id
             ORDER BY unique_row_id DESC) page_hit_timestamp
from page_hits;

This will get a distinct set of http_session_id's, and for each, returns the unique_row_id, page_name and page_hit_timestamp from the row with the greatest unique_row_id for that http_session_id, e.g.:

unique_row_id  http_session_id  page_name   page_hit_timestamp
----------------------------------------------------------------
2              123456789        faq.html    2010-01-20 15:00:15
5              987654321        info.html   2010-01-20 16:00:15
8              111111111        info.html   2010-01-20 16:01:15
Jeffrey Kemp
  • 59,135
  • 14
  • 106
  • 158