-1

i'm pretty new to the sql world and i've seem to run into a roadblock. Multiple conditions seems to be a regular question but I couldn't find a thread on my current problem. I'm using google data studio and have a dataset in google sheets that includes multiple terms and I want to know how many times each term shows up. Here's and example of my dataset and current written statement

Sample Dataset from the Column labeled "Answer"

  1. mini
  2. profanity
  3. mini, profanity
  4. mini, recorded
  5. credit bureau
  6. Suspicious
CASE
    WHEN CONTAINS_TEXT(Answer,"Credit Bureau") THEN "Credit" 
    WHEN CONTAINS_TEXT(Answer,"Mini") THEN "Mini"
    WHEN CONTAINS_TEXT(Answer,"Profanity") THEN "Profanity"
    WHEN CONTAINS_TEXT(Answer,"Bankruptcy") THEN "Bank"
    WHEN CONTAINS_TEXT(Answer,"Recorded") THEN "Recorded"
    WHEN CONTAINS_TEXT(Answer,"Suspicious") THEN "SAR"
    ELSE "Other" 
END

This statement works perfectly for cells that ONLY contain these terms. However the problem is that in the dataset there are multiple occurrences of the terms showing up in the same cell. Unfortunately the statement will only count the first term and ignore the others, giving me inaccurate totals. For example, in the mini dataset i provided, even though you see profanity twice, it would only be counted once. Does anyone know any potential workarounds or suggestions? Any help would be appreciated.

J Jo
  • 11
  • 2
  • Could you elaborate by providing a publicly editable Google Data Studio Report (additionally, a Google Sheet if it's the data set) of the scenario (using sample data that shows 1) Input values (~10 rows) 2) Expected output 3) An attempt at solving the issue)? It would help users visualise the issue and test out suggestions on a specific use case with objective right / wrong answers. Without a [Minimal Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) it would be difficult to pinpoint a suggestion and the issue, e.g. Data Set, Data Source, Report, Fields, Chart – Nimantha Feb 14 '22 at 22:47

1 Answers1

0

Count them in separate columns for each category:

credit: case when contains_text(answer, 'credit bureau') then 1 else 0 end a
mini: case when contains_text(answer, 'mini') then 1 else 0 end),
profanity: case when contains_text(answer, 'profanity') then 1 else 0 end)
bank: case when contains_text(answer, 'bankruptcy') then 1 else 0 end)
recorded: case when contains_text(answer, 'recorded') then 1 else 0 end)
sar: case when contains_text(answer, 'suspicious') then 1 else 0 end)
other: case when not contains_text(answer, 'credit bureau')
      and not contains_text(answer, 'mini')
      and not contains_text(answer, 'profanity')
      and not contains_text(answer, 'bankruptcy')
      and not contains_text(answer, 'recorded')
      and not contains_text(answer, 'suspicious') then 1 else 0 end

Then sum each of those columns to get independent totals for each type.

Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • Thank you for taking the time to respond!. Sorry for not providing all the details initially but with me using google data studio, "Select" does not seem to be a usable function in their case statements. Unless i'm missing something, I can only start it with "Case" and build out from there – J Jo Jan 11 '22 at 14:30
  • @JJo There is no “select in a case statement” in my code. You’ve tagged your question with `sql` and this is standard SQL. Did you try my query? – Bohemian Jan 11 '22 at 15:26
  • Hi, not sure if you misread or skipped over it but i mentioned that there is NOT a select option in google data studio. Your query doesn't work as written because GDS does not recognize any codes/formulas starting with "select". If you're not familiar with GDS, then thanks for your attempt to assist. I removed sql from my tags so i don't confuse anyone else going forward – J Jo Jan 11 '22 at 18:49
  • Nope still didn't work. Your query is correct (tested it on a different system) but GDS just doesn't have the right functionality to handle it. It gives me an error telling me that "multiple columns are not allowed". I messed around with it, and pretty much anytime "End" is added, it stops there. In other words, I'd have to create a new calculated field for each line of your written code because GDS can only handle one at a time. I tried multiple variations without the "end" in there, and it just doesn't look like GDS is capable of doing what I need it to. I appreciate all your help though – J Jo Jan 12 '22 at 20:27
  • @JJo yes, that's what I intended: you must create a new field for each `case`, and if you want the count, try wrapping each `case ... end` with `sum()`. What did you want if not a field for each category? Please edit you question with the actual result data you want. – Bohemian Jan 13 '22 at 02:37
  • Are you familiar with GDS? you can't put more than one `end` in a case statement in there. So with your query as is, GDS doesn't consider it a valid formula. Even this one line `case when contains_text(answer, 'credit bureau') then 1 else 0 end as credit` is not considered valid in there. I know how to calculate them separately. That's not the problem. The problem is you can't do one lump statement that will calculate all the correct totals in one chart without them excluding the terms that show up in the same cell, which is what i'm trying to do. – J Jo Jan 13 '22 at 20:35
  • @JJo I ask again: what would the single field value be that can show the totals of all 7 categories? Would it be JSON? Please explain. – Bohemian Jan 13 '22 at 22:05