0

I created an API where an SQL statement can be passed as a parameter using JSON-RPC. The API provides a table and column names dictionary so actual database table and column names are not known to the client, and can be accessed using the GET method. The actual API functionality can be accessed using the POST method.

I use token, and regular expression validation to only allow requests with a valid token, allow only certain characters to prevent SQL injection, and prevented the use of "drop, insert, into, update". Only "select" statements are possible, but other statements can be allowed for specified users when needed.

My question is, is this a safe method of procedure call? If so, is it a common practice? I tried accessing the API using a browser and Postman, and it seems to work. I am primarily concerned with security though.

Below is a sample Postman JSON-RPC request body.

{"jsonrpc": "2.0", "method": "rpc_ql", "params": {"token" : "12345", "query" : "select person_id, person_name, person_gender from persons where person_id = 123"}, "id": 1}
/**
 * RPC-QL    - RPC-QL is a query language based on SQL and accessed
 *           - over remote procedure call (RPC) API.
 *
 * @package  RPC-QL (SQL on JSON-RPC)
 * @author   Raymund John Ang <raymund@open-nis.org>
 * @license  MIT License
 */

// Process POSTed JSON data using JSON-RPC 2.0 specification
$data = json_decode( file_get_contents('php://input'), true );
$jsonrpc = $data['jsonrpc'];
$method = $data['method'];
$params = $data['params'];
$id = $data['id'];

// Remote API function
function rpc_ql()
{

  global $jsonrpc;
  global $method;
  global $params;
  global $id;

  // Table and column fields dictionary
  $dictionary = [['Table' => 'persons',
                'Fields' => ['person_id' => 'integer - ID number of the person',
                  'person_name' => 'string - Name of the person',
                  'person_gender' => 'string - Male or Female',
                  'person_birth_date' => 'string - format YYYY-MM-DD']],
                ['Table' => 'places',
                'Fields' => ['place_id' => 'integer - ID number of the place',
                  'place_name' => 'string - Name of the place',
                  'place_state' => 'string - State where the place is located']]];

  // Show dictionary with GET request.
  if ( $_SERVER['REQUEST_METHOD'] == 'GET' ) echo json_encode($dictionary);

  // Access API functionality with POST request.
  if ( $_SERVER['REQUEST_METHOD'] == 'POST' ) {

    // Require parameters based on JSON-RPC 2.0 and SQL
    if ( empty($jsonrpc) || empty($method) || empty($params['token']) || empty($params['query']) || empty($id) ) exit('Please set "jsonrpc", "method", "token" and "query" parameters, and request "ID".');
    $token = $params['token'];
    $query = $params['query'];

    // Token validation - token should be 12345
    if ( ! hash_equals( hash('sha256', 12345), hash('sha256', $token) ) ) exit('Token authentication failed');

    // Query validation - query should be alphanumeric with a few accepted characters and blacklisted SQL commands.
    if ( preg_match('/^[a-zA-Z0-9 _,*=()\']+$/i', $query) && ! preg_match('/(drop|insert|into|update)/i', $query) ) {

      // Table and column fields converter
      $mod_query = str_replace( 'persons', 'db_persons', $query ); // Persons table; Initial conversion of $query.
      $mod_query = str_replace( 'person_id', 'db_person_id', $mod_query ); // Succeeding conversion of $mod_query.
      $mod_query = str_replace( 'person_name', 'db_person_name', $mod_query );
      $mod_query = str_replace( 'person_gender', 'db_person_gender', $mod_query );
      $mod_query = str_replace( 'person_birth_date', 'db_person_birthdate', $mod_query );
      $mod_query = str_replace( 'places', 'db_places', $mod_query ); // Places table
      $mod_query = str_replace( 'place_id', 'db_place_id', $mod_query );
      $mod_query = str_replace( 'place_name', 'db_place_name', $mod_query );
      $mod_query = str_replace( 'place_state', 'db_place_state', $mod_query );

      // Set $error to null if $mod_query contains a query.
      if ( $mod_query !== false ) $error = null;

      $response = ['jsonrpc' => $jsonrpc, 'result' => $mod_query, 'error' => $error, 'id' => $id];
      echo json_encode($response);

    } else {

      // Failed validation
      $error_message = "The query is not valid. It should only contain alphanumeric characters, spaces, '_', ',', '*', '=', '(', ')' and '''. Only SELECT statements are allowed.";
      $response = ['jsonrpc' => $jsonrpc, 'result' => null, 'error' => ['code' => 32600, 'message' => $error_message], 'id' => $id];
      echo json_encode($response);

    }

  }

}

// Execute method if function exists
if ( function_exists($method) ) {
  return $method();
} else {
  $error_message = 'Sorry. The included method does not exist.';
  $response = ['jsonrpc' => $jsonrpc, 'result' => null, 'error' => ['code' => 32602, 'message' => $error_message], 'id' => $id];
  echo json_encode($response);
}
Ray
  • 11
  • 4
  • 1
    I wouldn't say so. Unless you encrypt your traffic, anyone who intercepts your JSON is going to reveal all your DB tables and schema which is nothing but an information leak. If you use it between your frontend and backend (WAS) server that's fine but still very awkward – mangusta Aug 30 '19 at 20:40
  • Hi @mangusta. Traffic should be HTTPS. The table and column field names that will be used by the client are not the same as with the database. There is a converter that converts field names keyed in by the client to the actual database table and column names in the API server. The client only has access to the whitelist dictionary for reference, not the actual fields names in the database. – Ray Aug 31 '19 at 01:17

1 Answers1

2

In short: no.

Even if you make it as safe as you can, this approach would still present a security risk. A very simple example of how your current code will break:

truncate table persons

This will effectively delete all records in the table db_persons.

As long as you allow any input from the client to be directly included in the database queries, your system will be broken, sooner or later.

Sergey Kudriavtsev
  • 10,328
  • 4
  • 43
  • 68
  • Thanks @Sergey. I refactored the code and used whitelisting, instead of blacklisting. The client consuming the API can only use the terms in the dictionary now, and use query_data parameters to execute PHP PDO with ? as placeholder. The revised code can be accessed at: https://github.com/ray-ang/rpc-ql/blob/master/rpc-ql.php – Ray Aug 31 '19 at 01:13
  • You can use this JSON data in the request body of a POST request to test how the API will handle the request. The generated SQL statement and query_data parameters can then be integrated in a PDO statement. I hope there is better security in this method than the prior. `{"jsonrpc": "2.0", "method": "rpc_ql", "params": {"token" : "12345", "query" : "SELECT person_id, person_name, person_gender, person_birthdate FROM persons WHERE person_id LIKE '%?%' AND person_name LIKE '%?%'", "query_data" : [100, "ete"]}, "id": 1}` – Ray Aug 31 '19 at 05:37
  • Hi @Ray. I've checked your code and it does look better (meaning, I don't immediately see a way to break it). However, my point is that the more time and effort you invest in making your approach safe, the closer you will actually get to what the modern database drivers _already_ do with e.g. parametrized queries. So why spend double time reimplementing already existing features? – Sergey Kudriavtsev Sep 02 '19 at 08:31
  • Hi @Sergey. You would still need to use PDO and '?' as placeholder, specifically on line 110, and use $params[query_data] using JSON-RPC v2.0 to hold values from the client. Then change result in line 112 `'result' => $mod_query` to the actual result variable. I wanted to implement a query language, just like GraphQL, where the API client can safely use SQL statements rather than learning new statement conventions from GraphQL. That's my goal with the script. Thank you for highlighting the dangers of blacklist validation. – Ray Sep 03 '19 at 00:15
  • Hi @Ray, in this case - good luck with your project! If you have a plan to create a fully universal library for queries, I'd also advise to check the language parser generator tools, such as https://www.gnu.org/software/bison/ - they might allow you to create very sophisticated validation and parsing routines much easier. – Sergey Kudriavtsev Sep 03 '19 at 08:44
  • Hi @Sergey. Thank you for the link. I will explore it. – Ray Sep 04 '19 at 12:48