0

I have several tables of academic journal rankings, from different sources and with different citation metrics. I want to combine them in a single table, aggregating data by journal.

For example, the first table has columns:

ISSN1 ISSN2 Journal_Name_1 Impact_Factor

and the second table has columns

ISSN3 ISSN4 ISSN5 Journal_Name_2 Citations

Journal names vary between data sources, and any journal may have 0 or more ISSN numbers. What I would like to do is creating a new table with columns

ISSN1 ISSN2 ISSN3 ISSN4 ISSN5 Journal_name_1 Journal_Name_2 Impact_factor Citations

that includes all journals listed in table 1 or table 2 by merging a row from the first table with one from the second table if the condition

( Journal_Name_1 = Journal_Name_2 ) OR ( at least one of ISSN1 or ISSN2 is equal to at least one of ISSN3, ISSN4, ISSN5)

What is the most effective way to create that result?

(The data is in CSV files at the moment, I could use sqlite/mysql/postgresql/any other DBMS)

AndreA
  • 295
  • 2
  • 12

1 Answers1

1

SQL Fiddle for playing.

select t1.*, t2.*
from t1
left join t2 on (journal_name1 = journal_name2)
                or (issn1 in (issn3, issn4, issn5))
                or (issn2 in (issn3, issn4, issn5))
union all
select t1.*, t2.*
from t1
right join t2 on (journal_name1 = journal_name2)
                 or (issn1 in (issn3, issn4, issn5))
                 or (issn2 in (issn3, issn4, issn5));
Olaf Dietsche
  • 72,253
  • 8
  • 102
  • 198