I'm worried that users could execute malicious code in my Elasticsearch cluster, e.g. delete the index or bring down the server by executing expensive queries. According to this answer, the problem does exist in theory.

Our Elasticsearch cluster is only accessible from our .NET backend server, and we use the Elasticsearch NEST client to execute our queries. Currently, the user input is passed into our queries unchanged. For example:

var result = await nestClient.SearchAsync<Product>(search => search
    .From(offset)
    .Size(limit)
    .Query(query => query
        .MultiMatch(multiMatch => multiMatch
            .Fields(fields => fields
                .Field(product => product.Name)
                .Field(product => product.Description))
            .Operator(Operator.Or)
            .Query(MALICIOUS_USER_INPUT)
        )
    )
);

I would expect that the NEST client (or the low level Elasticsearch.Net client) takes care of sanitizing the user input.

  • Is this assumption correct? (A link to supporting documentation or source code would be highly appreciated.)
  • If the assumption is incorrect: what measures do I need to take to prevent users from injecting malicious code into my Elasticsearch queries?
Faber
  • Why would you expect that any client would implicitly sanitise input? The client's responsibility is to execute provided input and return results. If you are letting arbitrary users execute arbitrary ES queries, it's your responsibility to ensure those queries aren't malicious. Most likely, however, your design of allowing users to input raw ES queries is unnecessary and wrong. – Ian Kemp Oct 02 '20 at 10:22
  • Well, I thought that libraries used to connect to SQL databases do that (e.g. `PreparedStatement` in Java), so I'm wondering how this can be done for Elasticsearch. And no, we don't let users execute _arbitrary_ queries on our Elasticsearch, but of course we need to use the search terms provided by the user in our prepared queries to get a useful search result (see example). – Faber Oct 02 '20 at 11:22
  • Prepared statements are merely a mechanism for substituting parameter values into a pre-sanitised SQL statement template defined by the programmer. The fact that the resultant statement appears to be sanitised is merely a side effect of the preparation coercing values to the parameters' declared datatypes. This is completely different from passing in and executing raw SQL, which is what you are effectively doing. – Ian Kemp Oct 02 '20 at 11:55
  • I will reiterate: almost certainly you should not be asking "how do I sanitise raw input", instead you should redesign your system so that it does not allow raw input. No matter how clever you try to be in sanitising, someone somewhere will find a way around it (hence why client libraries don't even bother to try). – Ian Kemp Oct 02 '20 at 11:58
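
To make the "template plus parameter value" idea from the comments concrete for a JSON request body, here is a minimal sketch. It does not use NEST at all: `SearchBodySketch` and `BuildSearchBody` are hypothetical names, the field names `name` and `description` are assumed from the `Product` example in the question, and the body is built by hand with `System.Text.Json`. It only illustrates that a value placed into a fixed structure and then serialized stays a plain JSON string; whether NEST's own serializer behaves the same way is an assumption to verify against its source.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class SearchBodySketch
{
    // Hypothetical helper: the query structure is fixed by the programmer,
    // and the user input only ever fills the "query" string slot.
    public static string BuildSearchBody(string userInput, int offset, int limit)
    {
        var body = new Dictionary<string, object>
        {
            ["from"] = offset,
            ["size"] = limit,
            ["query"] = new Dictionary<string, object>
            {
                ["multi_match"] = new Dictionary<string, object>
                {
                    ["query"] = userInput,                          // data, never structure
                    ["fields"] = new[] { "name", "description" },   // assumed field names
                    ["operator"] = "or"
                }
            }
        };

        // The serializer escapes quotes and braces inside string values.
        return JsonSerializer.Serialize(body);
    }

    public static void Main()
    {
        // Input that tries to close the JSON string and add a new top-level key.
        var hostile = "shoes\"}}, \"script\": {\"source\": \"...\"";

        var json = BuildSearchBody(hostile, 0, 10);

        using var doc = JsonDocument.Parse(json);
        var roundTripped = doc.RootElement
            .GetProperty("query")
            .GetProperty("multi_match")
            .GetProperty("query")
            .GetString();

        // The hostile text comes back unchanged as a string value, and the
        // document has no extra top-level "script" property.
        Console.WriteLine(roundTripped == hostile);
        Console.WriteLine(doc.RootElement.TryGetProperty("script", out _));
    }
}
```

This covers only structural injection into the request body. It says nothing about expensive queries, and it would not help with query types that parse their own operator syntax from the text (such as `query_string`), so limits on input length and on `Size` still have to be enforced by the backend.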
