0

I'm writing a scripted calculation view on HANA using SQL. Looking for some performance booster alternatives for the logic that I have implemented in a while loop. Simplified version of code is below.

It is trying to get similar looking vendors in table B for vendors from table A.

Please bear with me for inaccurate syntax.

v = select vendor, vendorname from A;

while --set a counter here
    vendorname = capture the record from v for row number represented by counter here
    t = select vendor, vendorname from v where (read single vendor from counter row)
        union all
        select vendor, vendorname from B where contains(vendorname,:vendorname,fuzzy(0.3))
        union all
        select vendor, vendorname from t
endwhile

This query dies when there are thousands of records in both the tables. So after reading few blogs, I realized that I'm going in wrong direction that is using loop.

To make this little faster, I came across something called CTE.

When I tried to implement the same code using CTE I'm not allowed to do so. Sample code I'm trying to write is below. Can anybody please help me get this right? The syntax is not accepted by system.

        t = with mytab ("Vendor", "VendorName")     
             AS ( select "Vendor", "VendorName" from "A" WHERE ( "Updated_Date" >= :From_Date AND "Updated_Date"  <= :To_Date ) )
             select * from "B" WHERE CONTAINS ("VendorName", mytab."VendorName",FUZZY(0.3));

The SQL error for this syntax is: SQL: invalid identifier: MYTAB

I would like to know:

  1. Whether such operation with CTE is allowed. If yes, what is the correct syntax in HANA SQL?

  2. If No, how do I achieve the desired result without looping through one table?

Thanks,

Anup

Anup Kulkarni
  • 13
  • 1
  • 6

1 Answers1

2

CTE's are allowed in SAP HANA - you might want to check the HANA SQL reference if you're looking for syntax.

But as you're in a SQLScript context anyhow, you might as well use table variables instead.

What I'm not sure about is, what you are actually trying to do. Provide a description of your usage scenario, if possible.


Ok, based on your comments, the following approach could work for you. Note, in my example I use a copy of the USERS system table, so you will have to fit the query to your tables.

do
begin
declare user_names nvarchar(5000);

    select string_agg(user_name,' ') into user_names 
    from cusers
    where user_name in ('SYS', 'SYSTEM');

    select * 
    from cusers
    where contains (user_name, :user_names, fuzzy(0.3));

end;    

What I do here is to get all the potential names for which I want to do a fuzzy lookup into a variable user_names (separated by a space). For this I use the STRING_AGG() aggregation function.
After the first statement is finished, :user_names contains SYSTEM SYS in my example. Now, CONTAINS allows to search multiple columns for multiple search terms at once (you may want to re-check the reference documentation for details here), so

 CONTAINS (<column_name>, 'term1 term2 term3') 

looks for all three terms in the column .

With that we feed the string SYS SYSTEM into the second query and the CONTAINS clause.

That works fine for me, avoids a join and runs over the table to be searched only once.

BTW: no idea where you get that statement about table variables in read-only procedures from - it's wrong. Of course you can use table variables, in fact it's recommended to make use of them.

Lars Br.
  • 9,949
  • 2
  • 15
  • 29
  • Thank you Lars. In the new query,**mytab."VendorName"** is not correct syntax. I would like to know whether CTEs allow fields of mytab in where condition. Functionally, all I am looking for is a vlookup from table A into table B through a fuzzy search (which means more than one result in table B for each entry in table A) and map the results from table B against each entry in table A. Can you please advice how I can do it without while loop? – Anup Kulkarni Apr 06 '17 at 09:42
  • And as far as *table variables* are concerned, these cannot be used in 'read-only' procedures like calculation views. Even with **table variables**, I doubt whether I can fulfil my scenario without using loops. So what is the best way to improve performance? – Anup Kulkarni Apr 06 '17 at 13:17
  • Thanks, this helps partially but with this approach I cannot map search result to a particular user. Let's say there are two results (SY and SYST) matching SYS, and one result (SYSTEMS) matching SYSTEM, all I get with this query is SY, SYST and SYSTEMS. But I can't map which one matches with which one unless I loop through the users in **:user_names**. – Anup Kulkarni Apr 07 '17 at 09:27
  • Regarding my comments about table variables, I confused it with 'temporary table'. **CREATE TEMPORARY TABLE** is surely not allowed in calculation views. Sorry for that. – Anup Kulkarni Apr 07 '17 at 09:30
  • So, now you've added a requirement: matching the search results against the different search terms. This is obviously more complex as search results can relate to multiple search terms. For the SQL interface of the text search there is no "why found" functionality available, so you might end up having to loop over your search terms. – Lars Br. Apr 10 '17 at 05:05
  • Thanks a lot Lars. – Anup Kulkarni Apr 10 '17 at 08:06