0

I have views in two databases V1 and V2 with same set of columns col1, col2.

Now I would like to know if they contains identical rows. In this context I have gone through this post: SQL compare data from two tables.

Following this I am running the below query:

  select * from v1.A
  minus
  select * from v2.A;

But I am getting the below error:

 ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'select * from v2.A' at line 3

I am not getting any clue regarding how to fix this error.

Joy
  • 4,197
  • 14
  • 61
  • 131
  • *I would like to know if they contains identical rows* what does *identical* mean? How come 2 datasets that return the same number of rows be *identical*? – forpas Apr 04 '20 at 13:33
  • @forpas, I have a use case where I am running two experiments and dumping the data in two tables in two different databases with same schema. – Joy Apr 04 '20 at 13:36
  • Fine, but I'm sure that what you don't want is to consider identical 2 tables with the same number of rows, right? My question was posted after you accepted the answer that did exactly that. – forpas Apr 04 '20 at 13:37
  • Yeah you are absolutely right. @forpas, I mistakenly accepted this. I realiized even if the values of a field differ in two tables, then also I am getting count1+count2=count3 – Joy Apr 04 '20 at 13:40

1 Answers1

1

You can simulate minus which is not supported in MySql with NOT EXISTS:

select t1.* from v1.A t1
where not exists (
  select * from v2.A t2
  where t2.col1 <=> t1.col1 and t2.col2 <=> t1.col2
)

I use the NULL-safe equal operator <=> instead of just = just in case there are nulls that need to be compared.

But if this query does not return any results it does not mean that the 2 views return the same rows, because v1.A may return a subset of the rows returned by v1.B.
So you also need:

select t2.* from v2.A t2
where not exists (
  select * from v1.A t1
  where t2.col1 <=> t1.col1 and t2.col2 <=> t1.col2
)

and maybe better a UNION ALL of the 2 queries to be sure:

select 1 from_view, t1.* from v1.A t1
where not exists (
  select * from v2.A t2
  where t2.col1 <=> t1.col1 and t2.col2 <=> t1.col2
)
union all
select 2 from_view, t2.* from v2.A t2
where not exists (
  select * from v1.A t1
  where t2.col1 <=> t1.col1 and t2.col2 <=> t1.col2
)
forpas
  • 160,666
  • 10
  • 38
  • 76