I'm trying to get the details of an organization from the company's official website using the query below. It consistently times out. I need all of the fields below. Is there a way to optimize it? I understand that OPTIONAL is equivalent to an INNER JOIN and is the cause of the timeouts, but is there any other way of getting these fields?

I'm using the Python API, and setting a timeout of 5 minutes doesn't help either. The timeout value doesn't get set.

SELECT distinct
          (GROUP_CONCAT( DISTINCT ?official_name; separator=";") AS ?official_name) 
          (GROUP_CONCAT( DISTINCT ?isin; separator=";") AS ?isin) 
          ?item 
          ?itemLabel
          ?stock_exchange 
          ?stock_exchangeLabel
          (GROUP_CONCAT( DISTINCT ?ticker; separator=";") AS ?ticker)
          (GROUP_CONCAT( DISTINCT ?other_name; separator=";") AS ?other_name)
          (GROUP_CONCAT(DISTINCT ?parent_orgLabel; SEPARATOR = ";") AS ?parent_orgLabel) 
          (GROUP_CONCAT(DISTINCT ?owned_byLabel; SEPARATOR = ";") AS ?owned_byLabel) 
          (GROUP_CONCAT(DISTINCT ?instance_of; SEPARATOR = ";") AS ?instance_of)
          (GROUP_CONCAT(DISTINCT ?instance_ofLabel; SEPARATOR = ";") AS ?instance_ofLabel)
          (GROUP_CONCAT(DISTINCT ?domains; SEPARATOR = ";") AS ?domains)
          (GROUP_CONCAT(DISTINCT ?subsidiaryLabel; SEPARATOR = ";") AS ?subsidiaryLabel)
          (GROUP_CONCAT(DISTINCT ?owner_ofLabel; SEPARATOR = ";") AS ?owner_ofLabel)
          (GROUP_CONCAT(DISTINCT ?part_ofLabel; SEPARATOR = ";") AS ?part_ofLabel)
        WHERE {
          SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
          {
                    { ?item p:P856 [ ps:P856 <https://www.amazon.com> ]}}
          OPTIONAL {  
            ?item p:P856 ?web_domains.
            ?web_domains ps:P856 ?domains . 
          }    
          OPTIONAL { ?item wdt:P1448 ?official_name. }
          OPTIONAL { ?item wdt:P946 ?isin. }  
          OPTIONAL {  
            ?item p:P414 ?SE . 
            ?SE ps:P414 ?stock_exchange . 
            ?SE pq:P249 ?ticker .
           } 
          OPTIONAL { ?item skos:altLabel ?other_name. FILTER (LANG (?other_name) = "en") }
          OPTIONAL {
            ?item wdt:P361 ?part_of.
            ?part_of rdfs:label ?part_ofLabel. 
            filter(lang(?part_ofLabel)="en")
          }

          OPTIONAL {
            ?item wdt:P749 ?parent_org.
            ?parent_org rdfs:label ?parent_orgLabel. 
            filter(lang(?parent_orgLabel)="en")
          }
          OPTIONAL {
            ?item wdt:P127 ?owned_by.
            ?owned_by rdfs:label ?owned_byLabel. 
            filter(lang(?owned_byLabel)="en")
          }        
          OPTIONAL {
            ?item wdt:P31 ?instance_of.
            ?instance_of rdfs:label ?instance_ofLabel. 
            filter(lang(?instance_ofLabel)="en")
          }
          OPTIONAL {
            ?item wdt:P355 ?subsidiary.
            ?subsidiary rdfs:label ?subsidiaryLabel. 
            filter(lang(?subsidiaryLabel)="en")
          }
          OPTIONAL {
            ?item wdt:P1830 ?owner_of.
            ?owner_of rdfs:label ?owner_ofLabel. 
            filter(lang(?owner_ofLabel)="en")
          }
        }
        GROUP BY ?item ?itemLabel ?stock_exchange ?stock_exchangeLabel
Stanislav Kralin
Srini
  • *" I understand that OPTIONAL is equivalent to an INNER JOIN"* - that's wrong. It's basically a left-outer join. But yes, it's expensive obviously – UninformedUser Mar 01 '19 at 13:26
  • Sorry for the mistake. Yes, LEFT OUTER JOIN. – Srini Mar 01 '19 at 13:30
  • I don't see any room for optimization without changing the semantics of the query. You can check the query execution plan to see why it takes so much time. You could also check the Blazegraph query optimizer options, but I don't see what could force the optimizer to perform better here. Don't forget, the public endpoint is a shared medium. You could easily load the data into your local triple store. – UninformedUser Mar 01 '19 at 13:30
  • Here are the docs for the Blazegraph optimizer stuff: https://wiki.blazegraph.com/wiki/index.php/QueryOptimization – UninformedUser Mar 01 '19 at 13:31
  • Any pointers on how loading data into a local triple store works? – Srini Mar 01 '19 at 13:41
  • If I don't use `SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }`, the query doesn't time out. Is that good enough? – rickhg12hs Mar 04 '19 at 01:20
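
The last comment's suggestion (dropping `SERVICE wikibase:label` and reading labels through `rdfs:label` instead) could be tried from Python roughly as follows. This is only a sketch under assumptions, not a tested fix for the full query: it uses the standard library rather than whichever Python API the question refers to, it covers only a few of the requested fields, and the function names, the result aliases (`?isins`, `?parent_orgs`) and the `User-Agent` string are all made up for illustration. Also note that, as far as I know, the public Wikidata endpoint enforces its own server-side limit of about 60 seconds, so a larger client-side timeout cannot extend it.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(website: str) -> str:
    """Build a reduced query that avoids the label service.

    Labels are fetched with rdfs:label plus a language filter, the same
    pattern the original query already uses for ?parent_orgLabel.
    """
    # Basic sanity check so the URL can be embedded safely as an IRI.
    if not website.startswith(("http://", "https://")) or any(
        c in website for c in '<>" \n\t'
    ):
        raise ValueError(f"not a usable website IRI: {website!r}")
    return f"""
SELECT ?item ?itemLabel
       (GROUP_CONCAT(DISTINCT ?isin; separator=";") AS ?isins)
       (GROUP_CONCAT(DISTINCT ?parent_orgLabel; separator=";") AS ?parent_orgs)
WHERE {{
  ?item p:P856 [ ps:P856 <{website}> ] .
  OPTIONAL {{ ?item rdfs:label ?itemLabel . FILTER(LANG(?itemLabel) = "en") }}
  OPTIONAL {{ ?item wdt:P946 ?isin . }}
  OPTIONAL {{
    ?item wdt:P749 ?parent_org .
    ?parent_org rdfs:label ?parent_orgLabel .
    FILTER(LANG(?parent_orgLabel) = "en")
  }}
}}
GROUP BY ?item ?itemLabel
"""


def run_query(query: str, timeout_s: float = 60.0) -> list:
    """Send the query over HTTP GET and return the result bindings.

    timeout_s is a client-side socket timeout only; it does not change
    any limit enforced by the server.
    """
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            # Hypothetical identifier; replace with your own contact details.
            "User-Agent": "wdqs-example/0.1 (hypothetical contact)",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(build_query("https://www.amazon.com"))` would then return a list of binding dictionaries. The remaining OPTIONAL blocks from the question can be added back one at a time to see which of them pushes the query past the limit.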
