I would like an HTTP request to originate from outside of AWS and reach a Neptune Graph.

Since a Neptune Graph can exist only within a VPC (and within a private subnet, at that), I'm exploring options for non-AWS traffic to reach the graph. The intent is not to circumvent security or create data exposure risks, but rather to find one (or more) functional ways that a query can originate from a non-AWS resource, be evaluated by Neptune, and have response data transferred back to the client.

This was a promising lead, but it looks to be outdated: it was last updated 3 years ago and does not align with what I see in the AWS console today. I'm unable to create either an ALB or NLB directly in front of a Neptune endpoint (specifically, the "target group" is the point where the next step is unclear).

There are several potential ways forward, including:

  • create a Lambda function in front of the graph (possibly following an architecture like this)
  • create an Elastic IP and somehow associate it with Neptune, then create an NLB or ALB with a target group tied to that IP
  • create an EC2 instance running HAProxy, then NLB or ALB routing traffic to that EC2 instance (which in turn routes to the graph)

I'm not sure which of those might be better or worse (and for what reasons), or if there are other solutions that might work as well (or better).

So, how can this work?

EDIT: since Kelvin raised this in their answer below, I'll add that actual traffic would likely come from non-Lambda cloud functions (such as Cloudflare Workers). No ad hoc queries would be allowed from clients; instead, pre-defined queries would be wrapped in a deployed function, with optional parameters.

Kaan

1 Answer

Many of the approaches that you mentioned are commonly used (ALB, NLB, Lambda). You could also create a REST API using API Gateway (that invokes a Lambda that talks to Neptune) or create a GraphQL API using AppSync. In total there are probably more than a dozen ways that you can establish such a connection.

In part, it depends on where your clients will be: on the public Internet or inside an enterprise network, for example. It also depends on whether you wish to allow people to send actual queries against the database, or to provide an API that users call and that prevents them from issuing queries directly.
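
To make the "API instead of direct queries" option concrete, here is a minimal sketch of a pre-defined query with optional parameters. It assumes the code runs somewhere with a network path to the cluster (for example a Lambda in the same VPC), that IAM authentication is off (otherwise the request would also need SigV4 signing, which is not shown), and that Neptune's openCypher HTTPS endpoint is used with its `query` and `parameters` form fields. The query text, the `friends_of` name, the `QUERIES`/`DEFAULTS` tables, and `build_request` are all hypothetical names for illustration.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical allow-list: clients choose a query by name and never send query text.
QUERIES = {
    "friends_of": "MATCH (p:Person {name: $name})-[:KNOWS]->(f) RETURN f.name LIMIT $limit",
}
# Hypothetical defaults for the optional parameters of each query.
DEFAULTS = {"friends_of": {"limit": 10}}


def build_request(endpoint: str, query_name: str, params: dict) -> urllib.request.Request:
    """Build a POST to the openCypher endpoint for one pre-defined query."""
    if query_name not in QUERIES:
        raise KeyError(f"unknown query: {query_name}")
    # Caller-supplied values override the defaults.
    merged = {**DEFAULTS.get(query_name, {}), **params}
    body = urllib.parse.urlencode(
        {"query": QUERIES[query_name], "parameters": json.dumps(merged)}
    ).encode("utf-8")
    return urllib.request.Request(
        f"https://{endpoint}:8182/openCypher", data=body, method="POST"
    )
```

Inside the function that fronts the graph, the request would be sent with `urllib.request.urlopen(build_request(...))`; because values travel in `parameters` rather than being spliced into the query string, callers cannot change the shape of the query.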

Kelvin Lawrence
  • Thanks for the answer. Actual traffic would likely be in the form of non-Lambda cloud functions (such as Cloudflare Workers). No ad hoc queries allowed from clients, but instead pre-defined queries wrapped in a deployed function, with optional parameters. So many options.. tough to evaluate how best to approach this. – Kaan May 25 '22 at 00:41
  • Exposing through an ALB/NLB is likely the easiest method, as it will allow all traffic to flow to the Neptune cluster. With that, you'll likely want to enable IAM authentication on the Neptune cluster to keep anyone else from gaining access. The API Gateway and AppSync approaches give you more control. With API Gateway, you can directly map a route to one of the Neptune API endpoints (/gremlin, /opencypher, /sparql, /loader, etc.) or you can use API Gateway with Lambda to create custom APIs. API Gateway offers things like throttling and parameter overrides as well. – Taylor Riggan May 25 '22 at 12:54
  • Thanks Taylor. API Gateway sounds worth exploring as an alternative to a load balancer, added to the list. – Kaan May 25 '22 at 22:53
  • This feels like a dumb question, but.. is it (still?) possible to create either an ALB, NLB or API Gateway in front of Neptune? Perhaps it was possible, or at least easier, at some point in the past? I've found no way to reach the graph other than EC2 instance or Lambda function. If actually possible for NLB/ALB or API Gateway, there's some mix of "user error" on my part, or a highly complex problem domain - concerning either way. This many obstacles gives me pause to pursue Neptune usage any further. – Kaan May 26 '22 at 23:28
  • Yes, you create a Target Group assigned to the NLB/ALB with the IP address of the cluster endpoint. To get the initial IP address of the endpoint, just ping the endpoint to see what IP it resolves to. The reference architectures that we have on GitHub also show a Lambda function that keeps the Target Group up to date with the latest IP address. The alternative is to use HAProxy (or similar) as a proxy that can take the DNS endpoints for Neptune. – Taylor Riggan May 27 '22 at 15:35
  • All the options discussed remain valid and should be quite simple to configure unless there is something specific in place (within your account/organization) that is blocking some access paths. Are you able to open an AWS support case? That would allow us to work with you more directly. If you are not able to open a case we can still help here. – Kelvin Lawrence May 27 '22 at 15:36
  • Thank you both for the suggestions. I'm pausing this effort. My overall intent was to "test drive" Neptune in comparison with other graph databases. However, instead of spending time with the graph itself, my time was spent on various attempts to simply access the graph, as well as reading+learning about plenty of non-graph things (NLBs, ALBs, API Gateways, VPCs, subnets, target groups, etc). If this were a car, it would be like requiring customers to understand the role of the camshaft, cams, timing, octane, etc before taking a test drive. – Kaan May 31 '22 at 17:11
  • Hello, did you create an HTTP API in API Gateway with integration to NLB on target group on port 8182? When I try to access it from outside, the message is always "Service Unavailable". The Security group of Neptune is allowing 8182 from SG of NLB. I created a route /sparql on method ANY and added an integration of NLB with a VPCLink. Can you please suggest what can be wrong here? – Nisarg Aug 14 '23 at 18:28
  • Typically if you use an ALB/NLB you would not also need API GW unless you want to "hide" the DB behind your own API. The ALB/NLB would route the request to the Neptune endpoint. Perhaps start a new question with any additional requirements of your use case. – Kelvin Lawrence Aug 14 '23 at 20:34
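
The Target Group approach described in the comments above (register the IP that the cluster endpoint resolves to, and keep it current) could be sketched roughly as follows. This is not the Lambda from the AWS reference architectures, only an illustration under stated assumptions: the target group uses the `ip` target type, `elbv2` is a boto3 ELBv2 client passed in by the caller, and `resolve_ips`, `plan_changes`, and `sync_target_group` are hypothetical helper names.

```python
import socket

NEPTUNE_PORT = 8182  # Neptune's default port, as mentioned in the thread


def resolve_ips(hostname: str) -> set:
    """Resolve the cluster endpoint to its current IPv4 addresses."""
    infos = socket.getaddrinfo(hostname, NEPTUNE_PORT, socket.AF_INET, socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


def _as_targets(ips: set) -> list:
    """Convert IPs to the Targets shape used by register/deregister calls."""
    return [{"Id": ip, "Port": NEPTUNE_PORT} for ip in sorted(ips)]


def plan_changes(registered: set, resolved: set) -> tuple:
    """Return (to_register, to_deregister) given registered and resolved IPs."""
    return _as_targets(resolved - registered), _as_targets(registered - resolved)


def sync_target_group(elbv2, target_group_arn: str, hostname: str) -> None:
    """Make the target group match what the endpoint currently resolves to.

    `elbv2` is assumed to be a boto3 client, e.g. boto3.client("elbv2").
    """
    health = elbv2.describe_target_health(TargetGroupArn=target_group_arn)
    registered = {d["Target"]["Id"] for d in health["TargetHealthDescriptions"]}
    to_add, to_remove = plan_changes(registered, resolve_ips(hostname))
    if to_add:
        elbv2.register_targets(TargetGroupArn=target_group_arn, Targets=to_add)
    if to_remove:
        elbv2.deregister_targets(TargetGroupArn=target_group_arn, Targets=to_remove)
```

Run on a schedule (for instance from a Lambda inside the VPC), `sync_target_group` would re-point the load balancer after a failover changes the endpoint's IP. Keeping the diff in `plan_changes` separate from the AWS calls makes the logic easy to check without an AWS account.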