I am trying to implement full text search on AWS Neptune (engine 1.0.4.2) with AWS OpenSearch.

A GET amazon_neptune/_search on OpenSearch returns:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 100000,
      "relation": "gte"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "amazon_neptune",
        "_type": "_doc",
        "_id": "1234567890",
        "_score": 1.0,
        "_source": {
          "entity_id": "aeeA6GHI6ZvK4zOP",
          "document_type": "rdf-resource",
          "predicates": {
            "https://schema.org/description": [
              {
                "value": "value"
              }
            ]
          }
        }
      }
    ]
  }
}

Now, when I try to execute a federated query using this SPARQL:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX neptune-fts: <http://aws.amazon.com/neptune/vocab/v01/services/fts#>
PREFIX hint: <http://aws.amazon.com/neptune/vocab/v01/QueryHints#>

SELECT ?res WHERE {
  SERVICE neptune-fts:search {
    neptune-fts:config neptune-fts:endpoint 'vpc-{url}.eu-central-1.es.amazonaws.com' .
    neptune-fts:config neptune-fts:queryType 'term' .
    neptune-fts:config neptune-fts:field 'Neptune#fts.document_type' .
    neptune-fts:config neptune-fts:query "rdf-resource" .
    neptune-fts:config neptune-fts:return ?res .
  }
}

...or Lucene queries like:

SELECT * WHERE {
  SERVICE neptune-fts:search {
    neptune-fts:config neptune-fts:endpoint 'https://vpc-{url}.eu-central-1.es.amazonaws.com' .
    neptune-fts:config neptune-fts:queryType 'simple_query_string' .
    neptune-fts:config neptune-fts:query "predicates.\\schema\\description.value:value" .
    neptune-fts:config neptune-fts:return ?res .
  }
}

No matter what query I use, I end up with this error:

{
  "code": "BadRequestException",
  "detailedMessage": "An IOException happened while fetching data from ES",
  "requestId": "{id}"
}

I have tried different variations of the SPARQL federated query, and I always end up with the same "An IOException happened while fetching data from ES" error.

So, what's going on here?

Thanks in advance.

Kelvin Lawrence
realpac

1 Answer

This error message can indicate a misconfiguration in networking or in IAM authentication.

When the SPARQL query runs, Neptune attempts to federate a query to OpenSearch. It is at this point that the communication is failing.

Are you using IAM?

If so, please ensure that the components that need to communicate, namely OpenSearch and Neptune, have the correct roles associated with them (https://docs.aws.amazon.com/neptune/latest/userguide/iam-auth.html).

One option is to create a new stack from our CloudFormation templates. This ensures that all the relevant networking configuration is in place, so that you can test more easily: https://docs.aws.amazon.com/neptune/latest/userguide/full-text-search-cfn-create.html

Charles
  • Thanks for the answer. IAM is disabled in both ES/OpenSearch and Neptune and correct policies are attached. I was able to `!curl https://vpc-{url}.eu-central-1.es.amazonaws.com` in the Jupyter Notebook (where I am also running the SPARQL queries). So this must mean that the connection is not the problem, right? Only federated queries to ES fail with IOException. – realpac May 06 '22 at 10:19
  • It turns out the cause was a missing security group egress rule allowing traffic from Neptune to ES. Adding that rule resolved the error. Thanks for the suggestions. – realpac May 09 '22 at 08:52
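
For anyone hitting the same error: a successful curl from the notebook only shows that the notebook host can reach the domain, not that the Neptune cluster can. The kind of egress-rule check that would have caught the problem above can be sketched in Python. This is a rough sketch, not an official tool. It assumes the rules are dictionaries shaped like the `IpPermissionsEgress` entries that boto3's `describe_security_groups` returns; the function name `egress_allows` and the group id `sg-es` are made up for illustration.

```python
import ipaddress


def egress_allows(egress_rules, port, dest_ip=None, dest_group_id=None):
    """Return True if any egress rule permits outbound TCP on `port`
    to the given destination IP or destination security group.

    `egress_rules` is assumed to look like the IpPermissionsEgress list
    of a security group (IpProtocol, FromPort, ToPort, IpRanges,
    UserIdGroupPairs).
    """
    for rule in egress_rules:
        proto = rule.get("IpProtocol")
        if proto not in ("-1", "tcp"):
            continue  # "-1" means all protocols; otherwise we need TCP
        if proto == "tcp":
            # TCP rules carry an inclusive port range
            if not (rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)):
                continue
        # Destination expressed as another security group
        if dest_group_id is not None and any(
            pair.get("GroupId") == dest_group_id
            for pair in rule.get("UserIdGroupPairs", [])
        ):
            return True
        # Destination expressed as a CIDR block
        if dest_ip is not None and any(
            ipaddress.ip_address(dest_ip)
            in ipaddress.ip_network(rng["CidrIp"], strict=False)
            for rng in rule.get("IpRanges", [])
        ):
            return True
    return False


# Hypothetical rule set for the Neptune security group: only the Neptune
# port is open, so HTTPS to the OpenSearch domain would be blocked.
neptune_egress = [
    {"IpProtocol": "tcp", "FromPort": 8182, "ToPort": 8182,
     "IpRanges": [{"CidrIp": "10.0.0.0/16"}], "UserIdGroupPairs": []},
]
print(egress_allows(neptune_egress, 443, dest_group_id="sg-es"))

# After adding an HTTPS rule that targets the OpenSearch security group
neptune_egress.append(
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
     "IpRanges": [], "UserIdGroupPairs": [{"GroupId": "sg-es"}]}
)
print(egress_allows(neptune_egress, 443, dest_group_id="sg-es"))
```

The check is run against the security group attached to the Neptune cluster, with the OpenSearch domain's security group (or a private IP in its subnet) as the destination. The OpenSearch side also needs a matching inbound rule on port 443, which this sketch does not cover.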