How can I split Arabic words into their connected ligatures (groups of joined letters) in SQL Server? For example:
أخبارى
أ - خبا - ر - ى
أخذتهم
أ - خذ - تهم
I have tried many solutions, but they all split on spaces or some other delimiter. In my case there are no spaces.
This is very rudimentary and should only be used as a starting point.
It searches for each ligature and replaces it with the same ligature followed by a space.
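DECLARE @word NVARCHAR(100) = N'أخبارى';
-- Original word and its length
SELECT LEN(@word), @word;
-- Append a space after each ligature (أ, ى, ر, خبا) to mark the split points
SELECT REPLACE(REPLACE(REPLACE(REPLACE(@word, N'أ', N'أ '), N'ى', N'ى '), N'ر', N'ر '), N'خبا', N'خبا ');
-- Length after splitting; note that LEN ignores the trailing space
SELECT LEN(REPLACE(REPLACE(REPLACE(REPLACE(@word, N'أ', N'أ '), N'ى', N'ى '), N'ر', N'ر '), N'خبا', N'خبا '));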
You can create a table with all possible ligatures and query it using dynamic SQL, following the above pattern. I will provide an example to show what I mean.
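As a rough, table-driven sketch (not the dynamic SQL version described above, and not a definitive solution): both examples in the question break right after a letter that never joins to the letter that follows it, so a lookup table of those letters may be enough and a full table of ligatures may not be needed. The table variable `@breakAfter`, its `letter` column, and the list of letters are assumptions made for illustration. The list covers common Arabic letters only and would need extending for other alphabets that use the Arabic script. The binary collation is there so that hamza forms such as أ and ا are compared exactly.

```sql
-- Sketch only: split a word after every letter that does not join to the next one.
DECLARE @word NVARCHAR(100) = N'أخذتهم';

-- Hypothetical lookup table of non-connecting letters (assumed, not exhaustive).
DECLARE @breakAfter TABLE (letter NCHAR(1) COLLATE Latin1_General_BIN2 PRIMARY KEY);
INSERT INTO @breakAfter (letter)
VALUES (N'ا'), (N'أ'), (N'إ'), (N'آ'), (N'د'), (N'ذ'),
       (N'ر'), (N'ز'), (N'و'), (N'ؤ'), (N'ة');

DECLARE @result NVARCHAR(400) = N'';
DECLARE @i INT = 1;
DECLARE @ch NCHAR(1);

WHILE @i <= LEN(@word)
BEGIN
    SET @ch = SUBSTRING(@word, @i, 1);
    SET @result = @result + @ch;

    -- Add a separator after a non-connecting letter, except at the end of the word.
    IF @i < LEN(@word)
       AND EXISTS (SELECT 1 FROM @breakAfter WHERE letter = @ch)
        SET @result = @result + N' - ';

    SET @i += 1;
END

SELECT @result AS SplitWord;
```

For the two sample words this should give the groupings shown in the question. To apply it to a column rather than a single variable, the loop could be wrapped in a scalar function, at some cost in performance on large tables.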