I have data like 'John is my name; Ram is my name; Adam is my name'
.
My rule is such that every first letter that comes after ;
should be capital letter.
How do I select all values that satisfy the rule?
I have data like 'John is my name; Ram is my name; Adam is my name'
.
My rule is such that every first letter that comes after ;
should be capital letter.
How do I select all values that satisfy the rule?
You might split this with the XML trick like this
DECLARE @YourString VARCHAR(100)='John is my name; Ram is my name; Adam is my name';
WITH Splitted AS
(
SELECT CAST('<x>' + REPLACE((SELECT REPLACE(@YourString,'; ','$$SplitHere$$') AS [*] FOR XML PATH('')),'$$SplitHere$$','</x><x>')+ '</x>' AS XML) AS Casted
)
,DerivedTable AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS PartNr
,x.value(N'text()[1]',N'nvarchar(max)') AS Part
FROM Splitted
CROSS APPLY Casted.nodes(N'/x') AS X(x)
)
SELECT PartNr
,Part
,CASE WHEN ASCII(LEFT(Part,1)) BETWEEN ASCII('A') AND ASCII('Z') THEN 1 ELSE 0 END AS FirstIsCapital
FROM DerivedTable;
The result
Nr Part FirstLetterIsCaptial
----------------------------------------
1 John is my name 1
2 Ram is my name 1
3 Adam is my name 1
I do not know what your final goal is... Find parts, where the first letter is not capital? Make sure, that your rule is fullfilled?
However:
Best was, to use this to correct your design and place these parts in a 1:n
related side table.
The other answers show how to transform rows into something that matches your pattern.
If you just want to select
the rows that match the pattern you are describing, you can use patindex()
or like
with a case sensitive collation (or use collate
to apply one).
This assumes that in addition to the rule that every letter that follows a semicolon must be a capital letter, that the very first letter should also be capital. If that is not the case, just remove the first clause in the where
.
select *
from t
where patindex('[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%', val collate latin1_general_cs_as) = 1
and patindex('%; [^ABCDEFGHIJKLMNOPQRSTUVWXYZ]%', val collate latin1_general_cs_as) = 0
select *
from t
where val collate latin1_general_cs_as like '[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%'
and val collate latin1_general_cs_as not like '%; [^ABCDEFGHIJKLMNOPQRSTUVWXYZ]%'
test setup:
create table t (id int not null identity(1,1),val varchar(256))
insert into t values
('John is my name; Ram is my name; Adam is my name')
,('john is my name; ram is my name; adam is my name')
rextester demo: http://rextester.com/DBGIS10645
Both of the above return:
+----+--------------------------------------------------+
| id | val |
+----+--------------------------------------------------+
| 1 | John is my name; Ram is my name; Adam is my name |
+----+--------------------------------------------------+
Little bit of ugly solution but you can give a try...
Declare @str nvarchar(max) = 'John is my name; Ram is my name; Adam is my name'
Declare @xml as xml
Set @xml = cast(('<X>'+replace(@str,';' ,'</X><X>')+'</X>') as xml)
Select * from (
Select RowN = Row_Number() over (order by (SELECT NULL)), LTrim(RTrim(N.value('.', 'nvarchar(MAX)'))) as value FROM @xml.nodes('X') as T(N) -- this is to split if you are using sql server 2016 you can use string_Split
) a
Where unicode(substring(a.[value],1,1)) = unicode(upper(substring(a.[value],1,1)))
Idea is to split the string and check with unicode value to see whether it is upper or not
you can create a function like this.
Create FUNCTION SPLITTER (
@textData NVARCHAR(MAX),
@Delimeter NVARCHAR(MAX) ) RETURNS @RtnValue TABLE (
Data NVARCHAR(MAX) ) AS BEGIN
DECLARE @index INT DECLARE @data nvarchar(1000) DECLARE @firstCharacter char
SET @index = CHARINDEX(@Delimeter,@textData)
WHILE (@index>0)
BEGIN
set @data = LTRIM(RTRIM(SUBSTRING(@textData, 1, @index - 1))) set @firstCharacter = SUBSTRING(@data,1,1);
if UNICODE(@firstCharacter) = UNICODE(upper(@firstCharacter)) begin INSERT INTO @RtnValue (data) SELECT @data end;
SET @textData = SUBSTRING(@textData, @index + DATALENGTH(@Delimeter) / 2, LEN(@textData))
SET @index = CHARINDEX(@Delimeter, @textData)
END
set @data = @textData set @firstCharacter = SUBSTRING(@data,1,1);
if UNICODE(@firstCharacter) = UNICODE(upper(@firstCharacter)) begin INSERT INTO @RtnValue (data) SELECT @data end;
RETURN END
use it like this
select * from SPLITTER ('John is my name; Ram is my name; Adam is my name', ';')
You can grab a copy of NGrams8K and do this:
-- note that I made the 3rd item start with lower-case
DECLARE @YourString VARCHAR(100)='John is my name; Ram is my name; adam is my name';
WITH D(n) AS
(
SELECT 0 UNION ALL SELECT position
FROM dbo.NGrams8k(@yourstring,1) WHERE token = ';'
),
TOKEN(token) AS
(
SELECT LTRIM(SUBSTRING(@YourString, N+1,
ISNULL(NULLIF(CHARINDEX(';', @YourString, N+1),0), 101)-(N+1)))
FROM D
)
SELECT token,
FirstLetterIsCaptial = IIF(ASCII(SUBSTRING(token,1,1)) BETWEEN 65 AND 90, 1, 0)
FROM TOKEN;
Results
token FirstLetterIsCaptial
------------------ --------------------
John is my name 1
Ram is my name 1
adam is my name 0