I want to convert the Azure Synapse script into GCP BigQuery. The Azure Datalake script is written in T-SQL, I want to convert it to BigQuery script. Please guide me - is there any procedure to convert a T-SQL query to BigQuery, like similar to SQL ? Thank you
-
"SQL" is a standard, "TSQL" and "BigQuery" (now called "GoogleSQL") are different implementations of that standard and each implementation differs in some details.(Often in what functions exist, particularly date functions). There may be commercial products that will assist in such a conversion but even they may ne be perfect. Simple TSQL will transfer quite easily, but it gets harder as the code gets more complex. It will become very apparent what things cause conversion issues. – Paul Maxwell May 07 '23 at 06:12
1 Answers
A tool is the BigQuery Migration Service . It includes tools that help with migration, including assessment and planning, SQL translation, data transfer, and data validation. The batch SQL translator and the interactive SQL translator can be used to prepare SQL queries and scripts to work in BigQuery. The batch and interactive SQL translators support translation from a wide range of SQL dialects. Suggestions:
- Learn the differences between TSQL and BigQuery SQL: BigQuery uses a different SQL dialect than TSQL, so it's important to understand the differences between the two.
- Utilize the BigQuery Migration Service.
- Optimize queries for BigQuery: This can include using partitioning and clustering to reduce query costs and perhaps using denormalization to reduce the number of joins.
- Use BigQuery's native data types: BigQuery has its own set of native data types, so use them - when relevant.
- Test thoroughly: Before migrating TSQL code to BigQuery, it's important to test thoroughly to ensure that the code works as expected.
Here's a phony TSQL query as an example for conversion, note the various difference in functions:
SELECT o.OrderID, c.CustomerName, CONVERT(varchar, o.OrderDate, 103) AS OrderDate, p.ProductName
FROM Orders o
OUTER APPLY (
SELECT TOP 1 c.CustomerName
FROM Customers c
WHERE c.CustomerID = o.CustomerID
) c
OUTER APPLY (
SELECT p.ProductName
FROM OrderDetails od
INNER JOIN Products p ON od.ProductID = p.ProductID
WHERE od.OrderID = o.OrderID
ORDER BY od.Quantity DESC
OFFSET 0 ROWS FETCH NEXT 1 ROWS ONLY
) p
WHERE JSON_VALUE(o.OrderDetails, '$.isShipped') = 'true'
This query retrieves data from the Orders table, joining it with the Customers and OrderDetails tables using OUTER APPLY. It also uses the JSON_VALUE function to extract a value from a JSON column in the Orders table, and CONVERT(varchar, date_column, 103) to format the OrderDate column as dd/mm/yyyy.
The first OUTER APPLY retrieves the CustomerName from the Customers table for each order. The second OUTER APPLY retrieves the ProductName with the highest quantity for each order from the OrderDetails and Products tables.
The WHERE clause filters the results to only include orders that have been shipped (isShipped is true in the JSON column).
Here's an equivalent query for BigQuery based on the previous T-SQL query:
SELECT o.OrderID, c.CustomerName, FORMAT_DATE('%d/%m/%Y', o.OrderDate) AS OrderDate, p.ProductName
FROM Orders o
LEFT JOIN (
SELECT CustomerID, CustomerName
FROM Customers
) c ON c.CustomerID = o.CustomerID
LEFT JOIN (
SELECT OrderID, ProductName
FROM (
SELECT od.OrderID, p.ProductName, ROW_NUMBER() OVER (PARTITION BY od.OrderID ORDER BY od.Quantity DESC) AS rn
FROM OrderDetails od
JOIN Products p ON od.ProductID = p.ProductID
) t
WHERE t.rn = 1
) p ON p.OrderID = o.OrderID
WHERE JSON_EXTRACT(o.OrderDetails, '$.isShipped') = 'true'
This query uses LEFT JOIN instead of OUTER APPLY to join the Customers and OrderDetails tables with the Orders table. It also uses FORMAT_DATE('%d/%m/%Y', date_column) instead of CONVERT(varchar, date_column, 103) to format the OrderDate column as dd/mm/yyyy.
The first LEFT JOIN retrieves the CustomerName from the Customers table for each order. The second LEFT JOIN retrieves the ProductName with the highest quantity for each order from the OrderDetails and Products tables using a subquery with ROW_NUMBER().
The WHERE clause filters the results to only include orders that have been shipped (isShipped is true in the JSON column).

- 33,002
- 3
- 32
- 51