0

The question's title might not be optimal.

What i'm trying to achieve is to retrieve company_id from table site where site has not been deleted. But, a company has multiple site. I don't want to select a company that has one or more non-deleted website.

SELECT DISTINCT(company_id) FROM site WHERE _isDeleted = 1;

Gives me the companies which has a site deleted (8000+ results).

But when I try:

SELECT DISTINCT(company_id) FROM site WHERE _isDeleted = 1 
AND company_id NOT IN (SELECT company_id FROM site WHERE _isDeleted = 0);

It gives me 0 results. Why ?

Performance is not at question there because it will be a run once policy query.

sf_tristanb
  • 8,725
  • 17
  • 74
  • 118
  • 2
    Is it possible that there is a null value for company_id? because that would cause problems with using the "Not In" statement, and would require "Not Exists" instead. I suppose the other question is have you made sure that your subquery, when run by itself, has results that are not distinct from your primary query's results for company_id. – TPhe Mar 23 '15 at 18:13
  • I tried `SELECT DISTINCT(company_id) FROM site WHERE _isDeleted = 1 AND NOT _isDeleted = 0;` and it shows one null result. I don't know if it's the result i'm looking for but you're right about the null value – sf_tristanb Mar 23 '15 at 18:14
  • To confirm that there is a null value for company id in your subquery, I would run this : SELECT company_id FROM site WHERE _isDeleted = 0 and company_id is null – TPhe Mar 23 '15 at 18:20
  • The first query tells me you have 8000+ companies with at least one deleted site. The second tells me that of those 8000+ companies, all of them have at least one site that is NOT deleted. Perhaps 0 is a valid return? – Russ Mar 23 '15 at 20:46

2 Answers2

1

I tend to really dislike most uses of the idiom NOT IN (<subquery>). For this particular need, I would consider a query such as this, instead:

SELECT company_id
FROM site
GROUP BY company_id
HAVING COUNT(*) != COUNT(NULLIF(_isDeleted, 0))

That accommodates any value of _isDeleted, with NULL being effectively equivalent to 0, and all other values being effectively equivalent to 1. I also find its meaning a bit clearer than the original query's, and if you cared about performance then it would matter that it probably performs better.

John Bollinger
  • 160,171
  • 8
  • 81
  • 157
1

Based on your requirements, you need all companies from the table site in which the site is not deleted, you can easily do the following

select distinct company_id from site where _isdeleted=0 and company_id is not null

and to get companies that has no site deleted, you can use

select distinct company_id from site p 
where not exists(select 1 from site x 
                 where x.company_id=p.company_id and x._isdeleted=1)
and p._isdeleted=0

hope it will help you

Louay Fadel
  • 70
  • 1
  • 8