0

I am trying to check if there are better solution to this query question:

Find the pairs of PC models having identical speeds and RAM.

As a result, each resulting pair is shown only once, i.e. (i, j) but not (j, i). Result set: model with higher number, model with lower number, speed, and RAM.

I came up with:

select max(model), min(model), speed, ram 
from PC 
group by speed, ram 
having count(model) = 2 and max(model) <> min(model)

and get

        speed   ram
--------------------------------    
1260    1232    500 32
1233    1232    500 64
1233    1121    750 128

The PC table looks like this:

model   speed   ram
------------------------    
1232    500 64
1121    750 128
1233    500 64
1121    600 128
1121    600 128
1233    750 128
1232    500 32
1232    450 64
1232    450 32
1260    500 32
1233    900 128
1233    800 128
Andy Lester
  • 91,102
  • 13
  • 100
  • 152
leo
  • 547
  • 6
  • 11
  • 2
    define "better": more readable? less sql? more performant? what? – Fabian Bigler Jun 12 '14 at 15:44
  • done, in terms of performance – leo Jun 12 '14 at 15:45
  • If your SQL system has the `explain` keyword it can sometimes show you what you can improve, or at least what indexes it may (or may not) be using. – Majenko Jun 12 '14 at 15:46
  • 4
    If you want us to help optimize a query, **you need to show us the table and index definitions**, as well as row counts for each of the tables. Maybe your tables are defined poorly. Maybe the indexes aren't created correctly. Maybe you don't have an index on that column you thought you did. Without seeing the table and index definitions, we can't tell. We also need row counts because that can affect query optimization greatly. If you know how to do an `EXPLAIN` or get an execution plan, put the results in the question as well. If you have no indexes, visit http://use-the-index-luke.com ASAP. – Andy Lester Jun 12 '14 at 15:46
  • Table PC has the following columns (code, model, speed, ram, hd, cd, price), where model is the primary key. I am more interested in getting better performance with better sql query itself, such as less io hit and low cpu usage etc. – leo Jun 12 '14 at 15:54
  • Which DBMS are you using? Postgres? Oracle? –  Jun 12 '14 at 16:14

1 Answers1

0

Improved indexing and proper keys (as well as, of course, a well written query) are the best way to achieving better performance. Take a look at the following.

[The following SQL Server-based SQL Fiddle example highlights the code listed below. This sample can, of course, be modified for other DBMSs.]

// Your table's columns were not detailed, so this example uses [int] datatype
CREATE TABLE [PCs] (
  [id]    int NOT NULL IDENTITY(1,1) PRIMARY KEY,
  [model] int NOT NULL,
  [speed] int NOT NULL,
  [ram]   int NOT NULL
);

INSERT INTO [PCs]
VALUES (1232, 500,  64),
       (1121, 750, 128),
       (1233, 500,  64),
       (1121, 600, 128),
       (1121, 600, 128),
       (1233, 750, 128),
       (1232, 500,  32),
       (1232, 450,  64),
       (1232, 450,  32),
       (1260, 500,  32),
       (1233, 900, 128),
       (1233, 800, 128);

Based on the generated PCs table, the following SQL statement returns all computers (in pairs, be they a single pair or multiple pairs) that share both speed and ram:

SELECT [PCs].*
FROM [PCs]
INNER JOIN (
    SELECT [speed], [ram]
    FROM [PCs]
    GROUP BY [speed], [ram]
    HAVING COUNT(*) % 2 = 0
) AS dt ON ([PCs].[speed] = dt.[speed]
            AND [PCs].[ram] = dt.[ram])
ORDER BY [PCs].[speed] ASC,
         [PCs].[ram] ASC,
         [PCs].[model] ASC;

To improve performance, I would suggest adding an INDEX on the speed and ram columns

CREATE NONCLUSTERED INDEX [IX_PCs_speed_ram] ON [PCs] ([speed], [ram]);

I hope this example helps you get to the solution that you need. Good luck.

László Koller
  • 1,139
  • 6
  • 15