There is a table with three columns id
, VNS
, geometry
and several tens of thousands of records.
The values in geometry are a string of letters and numbers (mostly floats with 9 decimal places) with a length of 5-20 thousand characters.
The whole idea is to find matches in geometry
and output the corresponding VNS
values as a group.
I used
Select group_concat(VNS),
`geometry` ,
count(*) AS count
from geo_table
Group by geometry
HAVING count > 1
order by count DESC;
But I ran into the problem of inaccurate comparison of these lines. The search for matches works well, but not very accurately, the impression is that it compares only a part of the file, maybe half or 70%, but not 100%. I thought that the problem was in floats and tried replacing the dots with another symbol - it did not help. I tried to encode base64_encode and compare - the effect is the same. Maybe there is some more correct way to compare long strings or the settings in the database (MariaDB) are wrong?
-- phpMyAdmin SQL Dump
-- version 5.1.1
-- Хост: 127.0.0.1
-- Час створення: Жов 04 2021 р., 10:15
-- Версія сервера: 10.4.21-MariaDB
-- Версія PHP: 8.0.10
SET SQL_MODE = "NO_AUTO_VALUE_ON_ZERO";
START TRANSACTION;
SET time_zone = "+00:00";
CREATE TABLE `geo_table` (
`id` int(11) NOT NULL,
`VNS` varchar(30) NOT NULL,
`geometry` mediumtext NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `geo_tab_vut` (`id`, `VNS`, `geometry`) VALUES
(1, '902.12.162-01', 'I35UVElORk9fRU5EDQojfjMxDQpQDQoxDQowIzAwMDAwMDAwMDE2IzUyNTA2NDUzMDAjMDAwMD..
I cann't post full size insert query because it is too long.