0

I've got a column in a database table that is stored like the below that I'm trying to split out into columns to be able to report on it a lot easier:

  Question 1 : ["Yes","","b1"];  Question 2 : ["","No","b2"];  Question 3: ["Yes","","b3"];  Question 4: ["","No",""];  Question 5: ["Yes","","b5"];  Question 6: ["","No","b6"];  Question 7: ["Yes","","b7"];  

From the below table, I've got this so far:

SELECT
    *,
    CASE 
        WHEN LEN(ItemValues) > 1 THEN 
            LEFT(ItemValues, charindex(':', ItemValues) - 1)
    END as Question,
    '' as Answer,
    '' as Comment
FROM 
(
    SELECT 
        ID,
        TRIM(value) as ItemValues
    FROM 
        #StackQuestion
    CROSS APPLY STRING_SPLIT(Response,';')
) t1
where
    LEN(ItemValues) > 1

What I'm really struggling with is populating the Answer and Comment columns. The Answer column should contain "Yes" or "No", and the Comment column should contain the final part which is b1 for example.

Any ideas?

Create Table #StackQuestion
(
    ID int IDENTITY(1,1), 
    Response varchar(2000) 
)

insert into #StackQuestion
(
    Response
)
select
'
Question 1 : ["Yes","","b1"]; 
Question 2 : ["","No","b2"]; 
Question 3: ["Yes","","b3"]; 
Question 4: ["","No",""]; 
Question 5: ["Yes","","b5"]; 
Question 6: ["","No","b6"]; 
Question 7: ["Yes","","b7"]; 
'
union all
select
'
Question 1 : ["","No","comment1"]; 
Question 2 : ["","No","c2"]; 
Question 3: ["Yes","","c3"]; 
Question 4: ["Yes","","c4"]; 
Question 5: ["Yes","","b5"]; 
Question 6: ["","No","b6"]; 
Question 7: ["Yes","","b7"]; 
'
Philip
  • 2,460
  • 4
  • 27
  • 52

2 Answers2

2

I had a quick play around with some additional string manipulation, perhaps something like the following will work for you?

select ID, 
  Trim(Left(ItemValues, CharIndex(':', ItemValues) - 1)) as Question,
  case when ItemValues like '%"Yes"%' then 'Yes' else 'No' end as Answer,
  comment
from #StackQuestion
cross apply (
  select value as ItemValues, Reverse(Left(c, CharIndex('"',c) -1)) comment
  from String_Split(Response, ';')
  cross apply(values(Reverse(Replace(value, '"]',''))))v(c)
  where Len(value) > 3
)r;

Note that although in practice you will probably be ok, using string split is documented to not be relied upon for ordering.

Stu
  • 30,392
  • 6
  • 14
  • 33
  • Cheers Stu - that delivers the perfect output except for one minor point. Do you happen to know why the Question column has these rogue spaces at the front, even though you have TRIM? – Philip Jul 21 '22 at 22:20
  • 1
    probably the carriage returns in your sample data. – Stu Jul 21 '22 at 22:22
  • 1
    Quite right, a replace clears that up. – Philip Jul 21 '22 at 22:25
  • "not be relied upon for ordering" that's the reason I am using own CLR based function returning additional column with word number – DueGe Jul 21 '22 at 22:33
  • A small issue with this solution: if the Answer is "No" but the Comments is "Yes" then you would get "Yes" as answer. It would be easy to get around though. – tinazmu Jul 21 '22 at 22:39
  • Thanks tinazmu - looks like @Stu modified his answer :) – Philip Jul 21 '22 at 22:48
  • @DueGe - your suggested answer doesn't appear or maybe you deleted it. – Philip Jul 21 '22 at 22:49
  • @Philip If you accept the solution using the CLR - pls let me know or start with e.g. this: [link](https://stackoverflow.com/questions/2451656/identity-column-in-sql-clr-split-udf) – DueGe Jul 22 '22 at 10:04
1

Since the answers list to each of the questions is actually valid JSON syntax, you can use the OPENJSON function, which will output the answers as different records. The PIVOT operator can be used to get the values on the same line again.

Here is an adaptation of the code that does the job. Note that I replaced the len(ItemValues) > 1 condition with charindex(':', ItemValues) >= 1 in order to get rid of errors on white space strings that were longer than 1 character.

with prep as (
    SELECT
        *,
        LEFT(ItemValues, charindex(':', ItemValues) - 1) as Question,
        substring(ItemValues, charindex(':', ItemValues) + 1, 999) as Answers
    FROM 
    (
        SELECT 
            ID
            ,
            TRIM(value) as ItemValues
            --ca1.ItemValues
        FROM 
            #StackQuestion
        CROSS APPLY STRING_SPLIT(Response,';') -- ca1(ItemValues)
    ) t1
    where
        charindex(':', ItemValues) >= 1
)
select
    answer = [0] + [1],
    comment = [2]
from (
    SELECT
        * 
    from prep
    cross apply OPENJSON(Answers)
) as answers
pivot ( 
    max([value]) 
    for [key] in ([0], [1], [2])
) piv

Output:

answer comment
Yes b1
No b2
Yes b3
No
Yes b5
No b6
Yes b7
No comment1
No c2
Yes c3
Yes c4
Yes b5
No b6
Yes b7
shortski
  • 68
  • 7