0

I am developing a program that copies strings, and I compared its performance with glibc. I downloaded the glibc source with this command:

apt-get source glibc

I compared the following two implementations:

  1. /glibc-2.19/string/strcpy.c
  2. strcpy() from #include <string.h>

I expected the two to perform similarly. However, the results were completely different.

I tried several gcc optimization options such as -O1, -O2, and -O3, but the results were similar.

Is there some kind of magic that makes it faster? I would like to know the reason.

Here is the code:

// test for performance.

/******************************************************************************/

#include <stdio.h>
#include <time.h>
#include <string.h>
#include <stddef.h>


/******************************************************************************/
char *
strcpy_glibc (dest, src)
     char *dest;
     const char *src;
{
  char c;
  char *s = (char *) src;
  const ptrdiff_t off = dest - s - 1;

  do
    {
      c = *s++;
      s[off] = c;
    }
  while (c != '\0');

  return dest;
}

/******************************************************************************/
void test(int iLoop, int iLen,
    char *szFuncName, char*(*func)(char *s1, const char *s2))
{
    time_t          tm1, tm2;
    int             i;
    char   s1[512];
    char   s2[512];

    // initialize the test string.
    for(i = 0; i < iLen; i++) {
        s1[i] = '@';
    }
    s1[iLen] = '\0';

    /**************************************************************************/
    printf("test(): %s() started, iLoop = %d, iLen = %d.\n",
        szFuncName, iLoop, iLen);

    tm1 = time(NULL);

    for(i = 0; i < iLoop; i++) {
        func(s2, s1);
        func(s1, s2);
        func(s2, s1);
        func(s1, s2);
        func(s2, s1);

        func(s1, s2);
        func(s2, s1);
        func(s1, s2);
        func(s2, s1);
        func(s1, s2);
    }

    tm2 = time(NULL);

    printf("test(): %s() terminated in %d [sec].\n", szFuncName, (int)(tm2 - tm1));
    printf("test(): %s() answer s1[0] = %c.\n", szFuncName, s1[0]);
}

/******************************************************************************/
int main(int argc, char *argv[])
{
    printf("main(): Started.\n");

    test(100000000, 511, "strcpy_glibc", strcpy_glibc);
    test(100000000, 511, "strcpy", strcpy);
    test(100000000, 511, "strcpy_glibc", strcpy_glibc);
    test(100000000, 511, "strcpy", strcpy);

    printf("main(): Terminated.\n");
    return 0;
}

/******************************************************************************/
/* EOF */

And here is the result:

************************$ ./strcpy_test_3
main(): Started.
test(): strcpy_glibc() started, iLoop = 100000000, iLen = 511.
test(): strcpy_glibc() terminated in 238 [sec].
test(): strcpy_glibc() answer s1[0] = @.
test(): strcpy() started, iLoop = 100000000, iLen = 511.
test(): strcpy() terminated in 56 [sec].
test(): strcpy() answer s1[0] = @.
test(): strcpy_glibc() started, iLoop = 100000000, iLen = 511.
test(): strcpy_glibc() terminated in 238 [sec].
test(): strcpy_glibc() answer s1[0] = @.
test(): strcpy() started, iLoop = 100000000, iLen = 511.
test(): strcpy() terminated in 55 [sec].
test(): strcpy() answer s1[0] = @.
main(): Terminated.
************************$

This means that strcpy() is about 4 times faster than strcpy_glibc(), even though I expected their code to be the same.

I'm very confused...

bolov
Amane
  • 3
  • 3
  • 3
    The library code has internal knowledge of the system and can therefore e.g. copy a whole machine word (could be 4 chars on a typical 32 bit architecture) at a time. You *can't* do that as long as you write well-defined, portable code. –  Apr 23 '18 at 06:35
  • 3
    Btw, glibc is open source, so, have a look (but don't copy implementation specific code in your application code). What you already found seems to be the generic "fallback", but I assure you there **are** more specific implementations in glibc and it will use the one that's optimal for the target platform. –  Apr 23 '18 at 06:37
  • Thank you for your response. After reading your comment, I grepped the glibc source and found more optimized code (almost all of it is assembly). I thought the "fallback" was the actual code, so I thought I should optimize it further. But that is not needed now. – Amane Apr 23 '18 at 07:11
  • strcpy and various other common functions have architecture-specific assembly; the C code is for architectures that don't have assembly versions yet – technosaurus Apr 23 '18 at 07:40
  • Thanks to your comment, I understand why the simple C source exists. – Amane Apr 23 '18 at 07:56

1 Answer

-1

You can't directly copy libc code into your application and expect it to match the library's performance, because libc and the OS contain a lot of platform-specific code and internal knowledge, so a performance difference is expected.

Try this:

static __inline__ __attribute__((always_inline))
char * strcpy_glibc(char * __restrict to, const char * __restrict from)
{
    char *save = to;

    for (; (*to = *from); ++from, ++to);
    return(save);
}

Instead of calling through a function pointer, try inlining the function in your application, as long as it is not called too frequently. You should get much better performance, but note that this code doesn't handle corner cases or perform any checks.

ntshetty
  • 1,293
  • 9
  • 20
  • 1
    Nope, it uses architecture specific assembly. – technosaurus Apr 23 '18 at 07:42
  • What do you mean by assembly? – ntshetty Apr 23 '18 at 07:52
  • Thanks for your response. I understand that you are suggesting using an inline function when appropriate. I was not accustomed to reading the glibc source code, so I mistook the simple C source for the actual implementation. But now I know the real one is highly optimized. My question is solved. Thank you, everyone. – Amane Apr 23 '18 at 08:48
  • Your answer is very helpful. I like it. Actually, I only joined Stack Overflow today, so I still do not know much about this site. However, can I do something for you? I don't think I touched the downvote button. I just pushed the green check button, because I thought this question had been resolved. – Amane Apr 23 '18 at 10:35
  • @Amane My intention is to help, not to get anything in return. If it was downvoted, that means it was not helpful for the question – ntshetty Apr 24 '18 at 02:20
  • Please let me ask an additional question. My point is: would improving the "fallback" be meaningful for the glibc community? I also compared strncpy. The performance was the same, or sometimes the ranking was reversed. That was strange, so I checked the performance again in a different environment. That time, glibc was much faster than the version I built myself. – Amane Apr 24 '18 at 07:31
  • That means the "fallback" may be used in some cases. The reason I bring this up is that my strncpy is faster than the "fallback". I mean, can I do something for glibc? Or is it not very meaningful? – Amane Apr 24 '18 at 07:32