0

I am trying to update a column in a table when they are duplicated rows found. Initially, I tried to partition by all columns but later found that FLOAT dtypes are not allowed.

update table tabale_name
set column_to_update =1 
where exists(SELECT *,COUNT(*) OVER (PARTITION BY column1,column2,column3) AS duplicate_count FROM `forward-cacao-351005.dq_temp.dq_sample_data`) WHERE duplicate_count > 1)

Error:-Partitioning by expressions of type FLOAT64 is not allowed at [1:120]

After this I tried to find the duplicates using google solution which is responding back with the correct records but when updating the table it updates the whole table but not where only duplicate records are present.

update `table_name` 
SET column_to_update = 1
WHERE EXCEPT(select 1 from (SELECT u.*
FROM `table_name`  u
WHERE TRUE
QUALIFY ROW_NUMBER() OVER (PARTITION BY TO_JSON_STRING(u)) > 1)
)

The above query updates all the records but not the duplicated records.

What is the best way to update the column of the same table which are having the duplicated records?

data en
  • 431
  • 1
  • 2
  • 9
  • Can you elaborate on your requirements? Is your question about removing duplicate rows from the table? – kiran mathew Aug 25 '23 at 07:33
  • No, I want to update a column with value 1 where ever there are duplicate records – data en Aug 25 '23 at 07:44
  • Hi @data en,I have posted an answer. I hope it will help you. Do consider upvoting and accepting if it helps, otherwise let me know so that I can improve my answer. – kiran mathew Aug 25 '23 at 12:09
  • Hi @data en, did you try the `UPDATE my_table SET column_to_update = 1 WHERE id IN ( SELECT id FROM my_table GROUP BY id HAVING COUNT(*) > 1 )`query I mentioned in my answer? Is that helpful? Instead of creating a new column in the table, you can update the value of any column by just changing the name of the ‘column_to_update’. – kiran mathew Aug 28 '23 at 13:45

1 Answers1

1

For your requirement,you can consider the below query as an example:

UPDATE `my_table`
SET column_to_update = 1
WHERE id IN (
  SELECT id
  FROM `my_table` GROUP BY id
  HAVING COUNT(*) > 1
);

Result:

image

To solve the FLOAT64 is not allowed error, you can consider using a cast function. For more details you can refer to this stack thread.

kiran mathew
  • 1,882
  • 1
  • 3
  • 10