Consider a scenario where I have the data about same entity from two different sources. As an example, the camera Nikon D3200, Nikon mentions the dimensions as 5.0 in. (125 mm) x 3.8 in. (96 mm) x 3.1 in. (76.5 mm)
where as on amazon website its 3.1 x 3.8 x 5 inches
. Now if I want to combine the data from both sources in a single table, how can I go about it? How can I confirm that the these two dimensions are actually for the same product?
Any help with the strategies for record linkage will be appreciated. T
Asked
Active
Viewed 121 times
0

Brian Tompsett - 汤莱恩
- 5,753
- 72
- 57
- 129

trailblazer
- 1,421
- 5
- 20
- 43
-
Sort the dimension in descending order before comparing. You're going to have to convert metric to English (or English to metric) and get the dimension numbers from several different string formats. – Gilbert Le Blanc May 09 '14 at 18:41
1 Answers
0
What I would do is create 2 columns in a new table for each source. For example:
productID_s1 | productID_s2 | name | dimensions_source1 | dimensions_source2 CAM110 CAM101 Nikon Camera Model 82 5.0 in. (125 mm) x 3.8 in. (96 mm) x 3.1 in. (76.5 mm) 3.1 x 3.8 x 5 inches
With Query:
CREATE TABLE Destination( productID INT IDENTITY PRIMARY KEY, productID_s1 NVARCHAR(40), productID_s2 NVARCHAR(40), name NVARCHAR(40), dimensions_source1 NVARCHAR(255), dimensions_source2 NVARCHAR(255) ) SELECT s1.productID, s1.name, s1.dimensions FROM Source1 s1 UPDATE Destination SET productID_s2 = productID_fromsource2 FROM Source2 WHERE name_fromsource2 LIKE name UPDATE Destination SET dimensions_source2 = dimensions_fromsource2 FROM Source2 WHERE name_fromsource2 LIKE name
This way you know it is the same product...I don't know if this simple answer is in any way helpful.

Just Helpin'
- 1
- 1