I'm importing data into a ClickHouse table from CSV files with the following command:
cat data.csv | clickhouse-client --config-file=config.xml --query="INSERT INTO data_pool FORMAT CSVWithNames"
The CSV files often contain rows that already exist in the ClickHouse table. What is the most efficient way to insert only the new rows from a CSV file, skipping those that are already in the table?