Get distinct rows based on one column with T-SQL

Question

I have a column in the following format:

Time  Value
17:27   2
17:27   3

I want to get the distinct rows based on one column: Time. So my expected result would be one result. Either 17:27 3 or 17:27 3.

Distinct

T-SQL uses distinct on multiple columns instead of one. Distinct would return two rows since the combinations of Time and Value are unique (see below).

select distinct [Time], * from SAPQMDATA

would return

Time  Value
17:27   2
17:27   3

instead of

Time  Value
17:27   2

Group by

Also group by does not appear to work

select * from table group by [Time]

Will result in:

 Column 'Value' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Questions

How can I select all unique 'Time' columns without taking into account other columns provided in a select query?
How can I remove duplicate entries?

can you say why you showed a "3" in your desired result? Why not 2? If it can be either, then just use another aggregate function on your "value" coulmn. "MAX" or "MIN", but I wouldn't give that as the answer until I know your answer to my questions here. not sure yet. — jamie, Jun 05 '19 at 15:44

score 0 · Accepted Answer · answered Jun 05 '19 at 18:54

This is where ROW_NUMBER will be your best friend. Using this as your sample data...

time                 value
-------------------- -----------
17:27                2
17:27                3
11:36                9
15:14                5
15:14                6

.. below are two solutions with that you can copy/paste/run.

DECLARE @youtable TABLE  ([time] VARCHAR(20), [value] INT);
INSERT  @youtable VALUES ('17:27',2),('17:27',3),('11:36',9),('15:14',5),('15:14',6);

-- The most elegant way solve this
SELECT TOP (1) WITH TIES t.[time], t.[value]
FROM   @youtable AS t
ORDER BY ROW_NUMBER() OVER (PARTITION BY t.[time] ORDER BY (SELECT NULL));

-- A more efficient way solve this
SELECT t.[time], t.[value]
FROM
(
  SELECT t.[time], t.[value], ROW_NUMBER() OVER (PARTITION BY t.[time] ORDER BY (SELECT NULL)) AS RN
  FROM   @youtable AS t
) AS t
WHERE t.RN = 1;

Each returns:

time                 value
-------------------- -----------
11:36                9
15:14                5
17:27                2

Thank you Alan this answers my first question and I will accept your answer for this. However my second question remains unanswered. Would you mind giving an answer on this question? I understand the subquery is providing an extra column called 'RN'. This column numbers the amount of times a 'Time' value exists. RN is always 1. When a second 'Time' value has been found RN = RN + 1 (RN=2). The previous found row of this 'Time' value still has the old RN value (in this case RN=1). I am wondering how I can delete duplicate 'Time' rows (using this logic). — Jouke, Jun 06 '19 at 08:27
For example. I have three entries (T=0 V=1), (T=0 V=2), (T=1 V=1). How can I select only the rows where T doesn't exist multiple times? In this case the query should return T=1 V=1. — Jouke, Jun 06 '19 at 08:33

Get distinct rows based on one column with T-SQL

Distinct

Group by

Questions

1 Answers1