I am working on an application whose performance is critical.

In this application I have a lot of messages (several thousand) that need to be signed (and verified, of course) separately with the same private key/public key. I am using the OpenSSL library.

A naive approach with the DSA functions (see below) takes tens of seconds to sign, which is not good enough. I tried to use the DSA_sign_setup() function to speed things up, but I can't figure out the correct way to use it.

I also tried ECDSA, but I got lost trying to work out the correct configuration.

What is the proper way to do this if I really care about efficiency?

#include <openssl/dsa.h>
#include <openssl/engine.h>
#include <stdio.h>
#include <openssl/evp.h>

int N = 3000;

int main()
{
    DSA *set = DSA_new();
    int a;

    /* Generate 1024-bit DSA parameters and a key pair */
    a = DSA_generate_parameters_ex(set, 1024, NULL, 1, NULL, NULL, NULL);
    printf("%d\n", a);
    a = DSA_generate_key(set);
    printf("%d\n", a);

    unsigned char msg[] = "I am watching you!I am watching you!";
    unsigned char sign[256];
    unsigned int size;

    /* Sign the first 32 bytes (a stand-in for a 256-bit hash) N times */
    for (int i = 0; i < N; i++)
        a = DSA_sign(1, msg, 32, sign, &size, set);
    printf("%d %d\n", a, size);

    return 0;
}
Ruiyu Zhu
  • Do you really need to generate a new key-pair for every message? What about just once per recipient at the beginning of the connection? Couldn't the keys also persist between sessions? – Galik Jun 07 '18 at 21:46
  • I can use the same key-pair for all the messages. The documentation says "DSA_sign_setup" can be used to speed things up, but I can't follow the instructions on its usage. Manual: https://www.openssl.org/docs/man1.0.2/crypto/DSA_sign.html – Ruiyu Zhu Jun 07 '18 at 22:14
  • How long are the messages? – user207421 Jun 08 '18 at 00:07
  • The length varies. But it is sufficient to sign the hashes (256 bits). – Ruiyu Zhu Jun 08 '18 at 00:21
  • Varies from what to what? If they are large, it is customary to secure-hash them and sign the hash. This is a lot quicker. – user207421 Jun 08 '18 at 02:07
  • From several KB to several MB. Thus I think signing the hash is the correct approach. – Ruiyu Zhu Jun 08 '18 at 02:37

3 Answers

I have decided to delete this answer because it compromises the efforts of the OpenSSL team to make their software safe.

The code I posted is still visible if you look at the edit but DO NOT USE IT, IT IS NOT SAFE. If you do, you risk exposing your private key.

Please don't say you haven't been warned. In fact, treat it as a warning if you are using DSA_sign_setup() in your own code, because you shouldn't be. Romen's answer above has more details about this. Thank you.

Paul Sanders
  • Thanks! I tried this approach, and it says "error: member access into incomplete type 'DSA' (aka 'dsa_st') set->kinv = kinv;". This seems to be something about openssl https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857621 – Ruiyu Zhu Jun 08 '18 at 16:01
  • Let me look, you probably just need to #include some header file or other. – Paul Sanders Jun 08 '18 at 16:27
  • https://stackoverflow.com/questions/40549318/error-invalid-use-of-incomplete-type-rsa-aka-struct-rsa-st-in-openssl-1-1 This post brought up a similar issue. – Ruiyu Zhu Jun 08 '18 at 16:36
  • How tedious. Still looking. – Paul Sanders Jun 08 '18 at 16:39
  • Found it! It's defined in `openssl/crypto/dsa/dsa_locl.h`. Try #including that, see if it works. If so, let me know and I'll update my answer. If it throws up a bunch of new compiler errors we can probably sort that out somehow. – Paul Sanders Jun 08 '18 at 16:52
  • It doesn't seem to help. Error message: "no member named 'kinv' in 'dsa_st'". Is [this](https://github.com/openssl/openssl/blob/master/crypto/dsa/dsa_locl.h) the file you are looking at? I begin to suspect that DSA_sign_setup is outdated... – Ruiyu Zhu Jun 08 '18 at 19:38
  • I directly include that file with full path. Otherwise it says the file is not found. – Ruiyu Zhu Jun 08 '18 at 19:40
  • Sorry, I only glanced inside the file. I should have looked for those members explicitly. And yes, you might be right, maybe it is. I'll take another look tomorrow. – Paul Sanders Jun 08 '18 at 19:42
  • @Ruiyu I believe I have found a solution that will work for you, provided you are willing to go back to OpenSSL 1.0.2. I have updated my answer accordingly, hope it helps. No need to `#include openssl/crypto/dsa/dsa_locl.h` now, that was probably always a bad idea! – Paul Sanders Jun 09 '18 at 13:17
  • I got it! Thank you. `DSA_sign_setup` is not maintained in the update. That saves me a day! – Ruiyu Zhu Jun 10 '18 at 18:24
  • Yeah, they dropped the ball. I will let them know. Thx for vote, appreciated. – Paul Sanders Jun 10 '18 at 18:26
  • Or maybe not, seems this change was deliberate, see my edit to my post, shows what I know. But they certainly need a better documentation department. – Paul Sanders Jun 11 '18 at 22:39
  • I see the thing. Thanks for updating. – Ruiyu Zhu Jun 12 '18 at 18:18
  • No problem. How does the code @romen posted perform compared to your original (and what platform are you on)? – Paul Sanders Jun 12 '18 at 22:14
  • His code signs 3000 times in almost no time (0.175 s, to be precise). My original code takes 9 s to run on my MacBook Air. romen's code, however, can't compile on the MacBook for some reason. Surprisingly, the performance of my original code improves dramatically (running time shrinks to 1 s) when I move to a Linux server. It might be something about the OS that affects the performance. – Ruiyu Zhu Jun 12 '18 at 22:38

Using DSA_sign_setup() in the way proposed above is actually completely insecure, and luckily the OpenSSL developers made the DSA structure opaque precisely so that applications cannot force their way around it.

DSA_sign_setup() generates a new random nonce (essentially an ephemeral per-signature key). It should never be reused under the same long-term secret key. Never.

You could in theory still be relatively safe reusing the same nonce for the same message, but as soon as the same combination of private key and nonce is reused for two different messages, you reveal all the information an attacker needs to recover your secret key (see the Sony fail0verflow attack, which came down to the same mistake of reusing the nonce, in that case with ECDSA).
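
To see concretely why this is fatal, here is a sketch of the standard key-recovery computation in textbook DSA notation (nothing OpenSSL-specific; h_i = H(m_i) are the message hashes, x the private key, k the reused nonce and r the shared first signature component):

    s_i = k^{-1}(h_i + x r) \mod q, \quad i = 1, 2
    k = (h_1 - h_2)(s_1 - s_2)^{-1} \mod q
    x = (s_1 k - h_1) r^{-1} \mod q

So two signatures made with the same k are enough to recover both the nonce and the private key with a handful of modular operations.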

Unfortunately DSA is slow, especially now that longer keys are required: to speed up your application you could try ECDSA (e.g. with curve NIST P-256, still with no nonce reuse) or Ed25519 (which uses a deterministic nonce).


Proof of concept using the EVP_DigestSign API

Update: here is a proof of concept of how to programmatically generate signatures with OpenSSL. The preferred way is to use the EVP_DigestSign API as it abstracts away which kind of asymmetric key is being used.

The following example expands the PoC in this OpenSSL wiki page: I tested that it works with a DSA or NIST P-256 private key, against OpenSSL 1.0.2, 1.1.0 and 1.1.1-pre6.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <openssl/pem.h>
#include <openssl/err.h>
#include <openssl/evp.h>

#define KEYFILE "private_key.pem"
#define N 3000
#define BUFFSIZE 80

EVP_PKEY *read_secret_key_from_file(const char * fname)
{
    EVP_PKEY *key = NULL;
    FILE *fp = fopen(fname, "r");
    if(!fp) {
        perror(fname); return NULL;
    }
    key = PEM_read_PrivateKey(fp, NULL, NULL, NULL);
    fclose(fp);
    return key;
}

int do_sign(EVP_PKEY *key, const unsigned char *msg, const size_t mlen,
            unsigned char **sig, size_t *slen)
{
    EVP_MD_CTX *mdctx = NULL;
    int ret = 0;

    /* Create the Message Digest Context */
    if(!(mdctx = EVP_MD_CTX_create())) goto err;

    /* Initialise the DigestSign operation - SHA-256 has been selected
     * as the message digest function in this example */
    if(1 != EVP_DigestSignInit(mdctx, NULL, EVP_sha256(), NULL, key))
        goto err;

    /* Call update with the message */
    if(1 != EVP_DigestSignUpdate(mdctx, msg, mlen)) goto err;

    /* Finalise the DigestSign operation */
    /* First call EVP_DigestSignFinal with a NULL sig parameter to
     * obtain the length of the signature. Length is returned in slen */
    if(1 != EVP_DigestSignFinal(mdctx, NULL, slen)) goto err;
    /* Allocate memory for the signature based on size in slen */
    if(!(*sig = OPENSSL_malloc(*slen))) goto err;
    /* Obtain the signature */
    if(1 != EVP_DigestSignFinal(mdctx, *sig, slen)) goto err;

    /* Success */
    ret = 1;

err:
    if(ret != 1)
    {
        /* Do some error handling */
    }

    /* Clean up */
    if(*sig && !ret) OPENSSL_free(*sig);
    if(mdctx) EVP_MD_CTX_destroy(mdctx);

    return ret;
}

int main()
{
    int ret = EXIT_FAILURE;
    const char *str = "I am watching you!I am watching you!";
    unsigned char *sig = NULL;
    size_t slen = 0;
    unsigned char msg[BUFFSIZE];
    size_t mlen = 0;

    EVP_PKEY *key = read_secret_key_from_file(KEYFILE);
    if(!key) goto err;

    for(int i=0;i<N;i++) {
        if ( snprintf((char *)msg, BUFFSIZE, "%s %d", str, i+1) < 0 )
            goto err;
        mlen = strlen((const char*)msg);
        if (!do_sign(key, msg, mlen, &sig, &slen)) goto err;
        OPENSSL_free(sig); sig = NULL;
        printf("\"%s\" -> siglen=%zu\n", msg, slen);
    }

    printf("DONE\n");
    ret = EXIT_SUCCESS;

err:
    if (ret != EXIT_SUCCESS) {
        ERR_print_errors_fp(stderr);
        fprintf(stderr, "Something broke!\n");
    }

    if (key)
        EVP_PKEY_free(key);

    exit(ret);
}

Generating a key:

# Generate a new NIST P-256 private key
openssl ecparam -genkey -name prime256v1 -noout -out private_key.pem
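
The same program was also tested with a DSA private key; if you want to try that path, the stock openssl command line tool should be able to produce one along these lines (dsaparam.pem is just a scratch file name used here):

# Generate DSA parameters, then derive a DSA private key from them
openssl dsaparam -out dsaparam.pem 2048
openssl gendsa -out private_key.pem dsaparam.pem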

Performance/Randomness

I ran both your original example and my code on my (Intel Skylake) machine and on a Raspberry Pi 3, and in neither case does your original example take tens of seconds. Given that you apparently see a huge performance improvement from the insecure DSA_sign_setup() approach in OpenSSL 1.0.2 (which internally consumes fresh randomness, in addition to some moderately expensive modular arithmetic), I suspect you actually have a problem with the PRNG that is slowing down the generation of new random nonces and has a bigger impact than the modular arithmetic operations. If that is the case, you might well benefit from using Ed25519, where the nonce is deterministic rather than random (it is derived from the private key and the message using secure hash functions). Unfortunately that means you will need to wait until OpenSSL 1.1.1 is released (hopefully during this summer).

On Ed25519

To use Ed25519 (which will be supported natively starting with OpenSSL 1.1.1) the above example needs to be modified: OpenSSL 1.1.1 has no support for Ed25519ph, so instead of the Init/Update/Final streaming API you need to call the one-shot EVP_DigestSign() interface (see the documentation); a sketch follows below.
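
As a rough sketch of what the one-shot variant could look like once 1.1.1 is available (untested here, and assuming `key` already holds an Ed25519 EVP_PKEY loaded the same way as in the PoC above), do_sign() would become something like:

/* Sketch only: requires OpenSSL >= 1.1.1 and an Ed25519 key in `key`. */
int do_sign_ed25519(EVP_PKEY *key, const unsigned char *msg, const size_t mlen,
                    unsigned char **sig, size_t *slen)
{
    EVP_MD_CTX *mdctx = EVP_MD_CTX_new();
    int ret = 0;

    if(!mdctx) goto err;
    /* The digest argument must be NULL: Ed25519 hashes internally */
    if(1 != EVP_DigestSignInit(mdctx, NULL, NULL, NULL, key)) goto err;
    /* One-shot API: first call with a NULL sig buffer to obtain the length... */
    if(1 != EVP_DigestSign(mdctx, NULL, slen, msg, mlen)) goto err;
    if(!(*sig = OPENSSL_malloc(*slen))) goto err;
    /* ...then call again to produce the signature */
    if(1 != EVP_DigestSign(mdctx, *sig, slen, msg, mlen)) goto err;
    ret = 1;

err:
    if(ret != 1 && *sig) { OPENSSL_free(*sig); *sig = NULL; }
    if(mdctx) EVP_MD_CTX_free(mdctx);
    return ret;
}

The rest of the program (key loading, the signing loop) stays the same.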

Full disclaimer: the next paragraph is a shameless plug for my libsuola research project, as I would definitely benefit from other users testing it against real-world applications.

Alternatively, if you cannot wait, I am the developer of an OpenSSL ENGINE called libsuola that adds support for Ed25519 in OpenSSL 1.0.2, 1.1.0 (and also 1.1.1 using alternative implementations). It's still experimental, but it uses third-party implementations (libsodium, HACL*, donna) for the crypto part and so far my testing (for research purposes) has not yet revealed outstanding bugs.

Benchmarking comparison of OP original example and mine

To address some of the comments, I compiled and executed the OP's original example, a slightly modified version that fixes some bugs and memory leaks, and my example of how to use the EVP_DigestSign API, all built against OpenSSL 1.1.0h (compiled as a shared library from the release tarball with default configuration parameters and installed to a custom prefix).

The full details can be found in this gist, which includes the exact versions I benchmarked, the Makefile with all the details of how the examples were compiled and how the benchmark was run, and details about my machine (briefly, it's a quad-core i5-6500 @ 3.20GHz, with frequency scaling/Turbo Boost disabled both in software and in the UEFI).

As can be seen from make_output.txt:

Running ./op_example
time ./op_example >/dev/null
0.32user 0.00system 0:00.32elapsed 100%CPU (0avgtext+0avgdata 3452maxresident)k
0inputs+0outputs (0major+153minor)pagefaults 0swaps

Running ./dsa_example
time ./dsa_example >/dev/null
0.42user 0.00system 0:00.42elapsed 100%CPU (0avgtext+0avgdata 3404maxresident)k
0inputs+0outputs (0major+153minor)pagefaults 0swaps

Running ./evp_example
time ./evp_example >/dev/null
0.12user 0.00system 0:00.12elapsed 99%CPU (0avgtext+0avgdata 3764maxresident)k
0inputs+0outputs (0major+157minor)pagefaults 0swaps

This shows that using ECDSA over NIST P-256 through the EVP_DigestSign API is 2.66x faster than the OP's original example and 3.5x faster than the corrected version.

As a late additional note, the code in this answer also computes the SHA256 digest of the input plaintext, while OP's original code and the "fixed" version skip it. Therefore the speedup demonstrated by the ratios reported above is even more significant!


TL;DR: The proper way to use digital signatures efficiently in OpenSSL is through the EVP_DigestSign API: trying to use DSA_sign_setup() in the way proposed above is ineffective in OpenSSL 1.1.0 and 1.1.1, and is outright wrong (as in: it completely breaks the security of DSA and reveals the private key) in 1.0.2 and earlier. I completely agree that the DSA API documentation is misleading and should be fixed; unfortunately DSA_sign_setup() cannot simply be removed, as minor releases must retain binary compatibility, so the symbol needs to stay even in the upcoming 1.1.1 release (but it is a good candidate for removal in the next major release).

romen
  • thanks very much for taking the time and trouble to post this. I'm sure the OP will be interested in what you have to say and your example code (which seems to work fine). I will update my answer. Regarding run times, my original benchmarks were with a debug build as I was just interested in the potential speedup offered by `DSA_sign_setup()`. I benchmarked both the code above and the original code posted by the OP again in a release build (just wall clock time) and the OP's code is about 3 times faster. I am running on Windows, I wouldn't know if that has PRNG implications. – Paul Sanders Jun 11 '18 at 12:26
  • I guess the slowdown you are observing comes from the extra `printf`s for each signature and because my code is allocating memory for each signature using malloc rather than reusing the same buffer. I will also soon update the code above as I'm never actually calling free on the returned signatures! – romen Jun 11 '18 at 18:12
  • Don't think so. I took the `printf` out and `malloc` isn't _that_ expensive :) Can you run a benchmark on Windows? Maybe it's that. – Paul Sanders Jun 11 '18 at 18:26
  • Also, I have edited my post as you guys said I should and posted back to that bug report, just FYI. – Paul Sanders Jun 11 '18 at 22:53
  • @PaulSanders I updated the answer to include benchmark data, I actually see the inverse of what you report (mine is ~3x faster than OP's). I don't have access to a Windows machine for testing, and without reproducing your results I cannot really pinpoint why you see that difference. I'm also in the process of creating pull requests to amend the DSA documentation in 1.1.1, 1.1.0 and 1.0.2, and I'll open an issue to suggest the removal of `DSA_sign_setup` as a public function for the next major release. – romen Jun 12 '18 at 01:07
  • OK, cool. I'll repeat my benchmarks tomorrow to see if I made a mistake and take a closer look at the profiler output to see if I can see where the time is going. I'll let you know if I discover anything useful. – Paul Sanders Jun 12 '18 at 02:00
  • I ran the original code on my Mac and it does take roughly 10 seconds. The example code you posted helps a lot. – Ruiyu Zhu Jun 12 '18 at 18:27
  • @RuiyuZhu I'm glad to hear it was useful. I'm actually curious about your timings as ~10 seconds to run your original code seems too much: have you tried comparing it with the timing to run my example? Also I will update the answer, in the benchmark section, to add a note about the fact that the speedup using ECDSA through the EVP interface over the original code is even more remarkable than a mere 3x as the example code in my answer also computes the SHA256 digest of the plaintext message before signing, while the original code skipped that part. – romen Jun 12 '18 at 19:00
  • @romen I repeated my Windows benchmarks and got the same results. More details [here](https://github.com/openssl/openssl/issues/6478). Saw your tl;dr, I have deleted my answer. – Paul Sanders Jun 13 '18 at 09:32
  • @romen After discussing this with OpenSSL member dot-asm on Github, I rebuilt my OpenSSL libraries and now I get the same timings as you. I have no idea what was wrong with the first set I built since I followed the same procedure but the problem is now resolved. Sorry if this ate up a lot of your time. – Paul Sanders Jun 13 '18 at 14:15
  • @PaulSanders thank you for coming back on the matter and spending time to understand the issue. I really think your answer and the effort you put into this were highly beneficial for everyone: I definitely think the OpenSSL documentation has a lot of room for improvement (and I wish I had the time and the expertise on all the different parts to actually work on improving it) and that, given the way it was written, it was only natural for developers not well-versed in cryptography to use `DSA_sign_setup` to precompute once and then generate a batch of signatures. – romen Jun 13 '18 at 14:39
  • It is probably also worth proposing that OpenSSL add disclaimers in the documentation for the low-level APIs (`DSA_*`, `ECDSA_*`, `RSA_*`), at least for the signature-related functions, pointing developers toward the `EVP_*` API instead, as that is the intended preferred access point for applications; empirical evidence suggests that the available documentation fails to effectively communicate this to external developers. – romen Jun 13 '18 at 14:43
  • @PaulSanders is it ok if I edit my answer to explicitly mention your answer when pointing at the `DSA_sign_setup()` example? Of course I wouldn't be doing that for "shaming"/"blaming" it (I wouldn't anyway!!) but because I think it is highly informative for whoever is interested in OP's question: it clearly shows what even experienced developers might end up doing (actually misguided by the wording of the existing documentation) and how it completely breaks the security of the cryptosystem. I ask because I'm quite new on SO, and am not entirely familiar with the local netiquette. – romen Jun 13 '18 at 14:51
  • @romen Yes that's fine. You can get a link to it by clicking 'share' just underneath the body of the text there. Feel free to edit it if you want to add or change anything (your edit will be peer-reviewed and almost certainly approved). – Paul Sanders Jun 13 '18 at 15:47
  • @romen The interesting thing is that your code doesn't compile on my MacBook for some reason. Thus I can't verify its performance on OS X. However, when I move to a Linux platform, the performance of my code and yours agrees with what you posted. I guess this might be something about the OS. – Ruiyu Zhu Jun 14 '18 at 14:15
  • @romen compile error: Undefined symbols for architecture x86_64: "_EVP_MD_CTX_free", referenced from: do_sign(evp_pkey_st*, unsigned char const*, unsigned long, unsigned char**, unsigned long*) in test-2ba0da.o "_EVP_MD_CTX_new", referenced from: do_sign(evp_pkey_st*, unsigned char const*, unsigned long, unsigned char**, unsigned long*) in test-2ba0da.o ld: symbol(s) not found for architecture x86_64 – Ruiyu Zhu Jun 14 '18 at 14:15
  • @Ruiyu that's strange, looks like the linker cannot find those symbols in the library, even though if you reached the linking stage they were defined in the headers. – romen Jun 14 '18 at 17:30
  • @Ruiyu check the arguments you are passing to the compiler/linker, it's possible you are using headers that don't match the version of the library you're linking against! – romen Jun 14 '18 at 17:32
  • The same code/compile command works on Linux but not OS X. OpenSSL versions are the same. I am not sure what's wrong. I used `gcc test.cpp -lcrypto` to compile. – Ruiyu Zhu Jun 15 '18 at 17:37
  • OpenSSL version: 1.0.2 – Ruiyu Zhu Jun 15 '18 at 17:44

If the messages are large, it is customary to secure-hash them and sign the hash. This is a lot quicker. You need to transmit the message, the hash, and the signature, of course, and the checking process must include re-hashing the message, checking the result for equality with the transmitted hash, and verifying the digital signature.
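
For completeness, here is a minimal sketch of that verification step using OpenSSL's EVP interface, mirroring the signing example in romen's answer (it assumes the verifier has loaded the signer's public key into an EVP_PKEY, e.g. with PEM_read_PUBKEY(), and that SHA-256 was the digest used for signing):

/* Returns 1 if `sig` is a valid signature over `msg` for the public key in `key`,
 * 0 otherwise. `sig` is non-const to match the OpenSSL 1.0.2 prototype. */
int do_verify(EVP_PKEY *key, const unsigned char *msg, size_t mlen,
              unsigned char *sig, size_t slen)
{
    EVP_MD_CTX *mdctx = EVP_MD_CTX_create();
    int ok = 0;

    if(!mdctx) return 0;
    /* Re-hash the message with SHA-256 and verify the signature over the digest */
    if(1 == EVP_DigestVerifyInit(mdctx, NULL, EVP_sha256(), NULL, key) &&
       1 == EVP_DigestVerifyUpdate(mdctx, msg, mlen) &&
       1 == EVP_DigestVerifyFinal(mdctx, sig, slen))
        ok = 1;

    EVP_MD_CTX_destroy(mdctx);
    return ok;
}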

user207421
  • I am already signing hash-sized messages (see the code). But it is still very slow (hundreds of signatures per second; it would be good if it could be sped up by an order of magnitude). Am I doing something wrong? – Ruiyu Zhu Jun 08 '18 at 03:21
  • When I asked you how large the messages were, you told me, and I quote, 'from several KB to several MB', and you didn't say anything about already signing a hash, or a hash-sized message. Nor is there anything about it in your code. Please make up your mind. – user207421 Jun 08 '18 at 03:39
  • Sorry, I haven't made everything clear. In the example code I posted, the message was only several bytes long, yet the performance is still not good. The core of the question is how to speed up signing itself (especially for short messages). – Ruiyu Zhu Jun 08 '18 at 15:37