0

I have created two tables Industry and Salary with a many to many relationship

Industry table

  • Industry PK

Salary

  • SalaryID pk

  • Income

  • Year

IndustrySalary junction table

  • Industry FK
  • SalaryID FK

I have crawled data from two different sites with one site containing information about the industry names and the other with income level, industy that it is relates to and the year.

Whats the best way i should go about to populate the junction table?

I was thinking of storing the data gathered from the salary site in an array [ [industry,salary,year] ,[industry,salary,year]...]

Populating the industry and salary table

For each array stored in the array, check if industry has been stored in the industry DB table, get its ID

Check the remaining year and salary in that appended array if it exists in the salary DB table, get its ID

Insert ID of both tables into junctiom table

Repeat these steps until all arrays in the larger array has been accounted for

Any opinions kn this current method and its viability?

Yeo Bryan
  • 331
  • 4
  • 24

1 Answers1

0

Assuming the Salary table has the industry ID in it, I would suggest write a SQL query, similar to:

SELECT DISTINCT IndustryID,SalaryID from Salary

And do an INSERT INTO the junction table.

The actual syntax would depend upon your SQL product

Sparky
  • 14,967
  • 2
  • 31
  • 45
  • But if i store the industryID in the salary table i wouldnt need the junction table anymord? – Yeo Bryan Oct 04 '19 at 05:12
  • No you wouldn't. What key in the salary table lets you determine the industry? IF we know that, we can adjust the query – Sparky Oct 04 '19 at 14:21