You can use SELECT DISTINCT to deduplicate your table.
CREATE OR REPLACE TABLE
`project.dataset.table` -- this will overwrite your table
AS
SELECT DISTINCT
record_id,
session_id
FROM
`project.dataset.table`
Create a new table
Replace the table to be replaced or created with a new name. This will create a new table. Running the query again will recreate/overwrite the new table.
CREATE OR REPLACE TABLE
`project.dataset.table_deduped` -- change to a new table
AS
SELECT DISTINCT
record_id,
session_id
FROM
`project.dataset.table`
Some potential drawbacks:
- you are storing your data twice. (The duplicated and deduplicated tables)
- You have to recreate the table each time you get new data in your original table, however you can instead of recreating it, insert new data via a merge statement
Create a VIEW
A view is a virtual table based on a query. The data is not stored in a new table. You query it the same way you query a table. Can be useful if the underlying table or tables gets updated and you always want an up to date 'view' of the data.
CREATE VIEW
`project.dataset.table_deduped` -- change to a new table
AS
SELECT DISTINCT
record_id,
session_id
FROM
`project.dataset.table`
Create a temporary table that you can reference inside the same query.
WITH
deduped_table AS (
SELECT DISTINCT
record_id,
session_id
FROM
`project.dataset.table`
)
SELECT
*
FROM
deduped_table
All the options have their pro's and cons. Depending on what you want to achieve, how your data is structured and how much you need to care about the amount of data your are querying/storing. If you only have the two columns it's probably not that big of a deal but worth noting your options in any case I rate.