
Here is what I am doing: sorting an array of dynamically generated C-strings. Each string is a random combination of the characters in "abc", and its length is at most 5 for the sake of brevity. What is confusing/interesting is how to write the compare function so that it won't compare the C-strings by memory address.

srand ( time(NULL) );
char alpha[] = "abc";
const unsigned int N = 5; // or 1000
char** CString = new char*[N];
unsigned int j=0;
for (unsigned int i=0; i<N;i++) {
    int ran = rand() % 5 + 2; // buffer size: 1 to 5 characters plus the terminator
    CString[i] = new char[ran];
    for(j=0;j<ran-1;j++){
        CString[i][j] = alpha[rand() % (sizeof(alpha) - 1)];
    }
    CString[i][ran-1] = '\0';
}

std::sort(CString,CString+N,SortCompare);

for(int i=0;i<5;i++){
    std::cout << *(CString+i) << " at " << CString+i << std::endl;
}

Now I have three candidate bodies for the compare function:

int SortCompare(char* a,  char* b){
    //return a<b;
    //return *a<*b;
    //return strcmp(a,b);
}

and the printout was

return strcmp(a,b):
CRASHED! //bummed because I had high hope for this

return a<b:
(when 5 C-strings):                        (when 1000 C-strings):
abba at 001F3248                           cbccb at 00544388 
bcb at 001F324C                            caac at 0054438C
cbb at 001F3250                            bcbc at 00544390
c at 001F3254                              ac at 00544394
ca at 001F3258                             a at 00544398
//conclusion: it's sorted by addresses. so turning head to the other one

return *a<*b:
(when 5 C-strings):                        (when 1000 C-strings):
abba at 001F3248                           cbccb at 00544388
bcb at 001F324C                            caac at 0054438C
cbb at 001F3250                            bcbc at 00544390
c at 001F3254                              ac at 00544394
ca at 001F3258                             a at 00544398
//I assumed it's value-sorted              //seriously hurt, belief has been destroyed seeing the memory addresses line up so neatly

So which one is the correct version to sort by value? Or am I totally on the wrong track? I need a lifeguard! Thanks

Cong Hui
  • You're using the stl::sort yet you're not using std::vector to store your strings or even std::vector to store each string. Why? – Borgleader Sep 25 '12 at 00:30
  • Well, I tried vector and it worked, but I was like "here is an idea", and now here I am. By the way, since the STL is generic, I assumed it should work with this configuration, right? – Cong Hui Sep 25 '12 at 00:32
  • Or better yet, using `std::string`... why are you using raw C-style strings? – Cornstalks Sep 25 '12 at 00:32
  • Hmm. Sometimes I have to use raw C-strings for performance, at least that's what my professor demanded.. :( – Cong Hui Sep 25 '12 at 00:34
  • @ClintHui: performance argument is dubious, plus it doesn't matter how much faster it is if it doesn't work! – Joe Sep 25 '12 at 00:35
  • @Joe, I was hoping this could work out... and it'd better be – Cong Hui Sep 25 '12 at 00:39
  • @ClintHui: Why did the `strcmp` version crash? Because it shouldn't ([I just hacked your code into a program](http://ideone.com/VAsc6)). Use a debugger to see why it crashes. – Cornstalks Sep 25 '12 at 00:41
  • @Cornstalks Hmm, surprised to see that, but it did crash on my g++ 4.7 and VS2012... – Cong Hui Sep 25 '12 at 00:48
  • I assume the crash was actually an assertion informing you that your sorter did not produce a strict weak ordering criterion (both positive and negative numbers will produce `true` whereas only negative numbers should). Once you call `strcmp` _correctly_, I suspect the "crash" will go away. :-] – ildjarn Sep 25 '12 at 00:56
  • @ildjarn, yeah, **Assertion Failed!** I'm ashamed to say that I considered that a crash because I haven't used VS for long. – Cong Hui Sep 25 '12 at 00:57
  • @Clint : Assertions are debug-mode checks (that may be slow) that check for valid preconditions and fail if they're not met. If you go to the source of the assertion in a debugger, you'll often find comments in the code telling you what the precondition is and/or why it was not met. – ildjarn Sep 25 '12 at 23:04
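
To illustrate the strict weak ordering point raised in the comments above, here is a minimal sketch of why returning the raw `strcmp` result misbehaves: `std::sort` treats the comparator's result as a boolean, so any nonzero value (negative or positive) reads as "a comes before b". The names `BrokenCompare`, `FixedCompare`, `ClaimsBothOrders`, and `SelfCheck` are hypothetical and not part of the original post.

```cpp
#include <cassert>
#include <cstring>

// Mirrors the question's third attempt: the int from strcmp is used as the
// answer, so both negative and positive results count as "true".
int BrokenCompare(const char* a, const char* b) {
    return std::strcmp(a, b);
}

// Only a negative strcmp result means "a sorts before b".
bool FixedCompare(const char* a, const char* b) {
    return std::strcmp(a, b) < 0;
}

// Hypothetical helper: true if cmp says a < b AND b < a, which a valid
// comparator for std::sort must never do.
template <typename Compare>
bool ClaimsBothOrders(Compare cmp, const char* a, const char* b) {
    return cmp(a, b) && cmp(b, a);
}

void SelfCheck() {
    assert(ClaimsBothOrders(BrokenCompare, "ab", "ca"));  // contradiction
    assert(!ClaimsBothOrders(FixedCompare, "ab", "ca"));  // consistent
    assert(!FixedCompare("ab", "ab"));                    // equal is not "less"
}
```

A debug-mode standard library may detect this kind of contradiction and raise an assertion, which would match the "Assertion Failed!" dialog described above.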

1 Answer


If you never have any NULL pointers:

bool SortCompare(char const* const a, char const* const b)
{
    return std::strcmp(a, b) < 0;
}

If you do have NULL pointers, it's only slightly more verbose:

bool SortCompare(char const* const a, char const* const b)
{
    return a && (!b || std::strcmp(a, b) < 0);
}
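
As a rough end-to-end sketch, the first comparator can be plugged into `std::sort` like this. Note also that the question prints `CString+i`, which is the address of the pointer slot inside the array; those slot addresses are consecutive no matter how the strings are ordered. Casting the element itself to `void const*` shows where each string actually lives. `SortAndPrint` and `Demo` are hypothetical helpers, and the fixed word list is an assumption standing in for the randomly generated strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iostream>

bool SortCompare(char const* const a, char const* const b)
{
    return std::strcmp(a, b) < 0;
}

// Hypothetical helper: sorts `count` C-string pointers by content, then
// prints each string followed by the address of its character data.
void SortAndPrint(char const** strings, std::size_t count)
{
    std::sort(strings, strings + count, SortCompare);
    for (std::size_t i = 0; i < count; ++i) {
        std::cout << strings[i] << " at "
                  << static_cast<void const*>(strings[i]) << '\n';
    }
}

void Demo()
{
    // Fixed sample data (an assumption for illustration).
    char const* words[] = {"cbb", "abba", "c", "bcb", "ca"};
    SortAndPrint(words, 5);
    assert(std::strcmp(words[0], "abba") == 0);
    assert(std::strcmp(words[4], "cbb") == 0);
}
```

With value-based sorting the printed string addresses will generally no longer be in ascending order, which is a quick sanity check that the sort is not address-based.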
ildjarn
  • OMG! Would you care to shed some light on the arguments? Why is that? Thank you – Cong Hui Sep 25 '12 at 00:43
  • @ClintHui : I'm not sure what needs explaining -- `std::sort` requires a comparator that induces a strict weak ordering. For simple lexicographical sorting, a simple `a < b` criterion suffices, and `strcmp` returns less than 0 if its first argument is less than its second. – ildjarn Sep 25 '12 at 00:49
  • I meant the argument, **(char const* const a, char const* const b)**, what is the second **const** doing? why is it there? Thanks – Cong Hui Sep 25 '12 at 00:50
  • @ClintHui : The second `const` is not strictly required, but I put it there for matters of basic [const-correctness](http://www.parashift.com/c++-faq/const-correctness.html). What it's doing is declaring (for human readers) and enforcing (for the compiler) that you will not change the values of `a` or `b`. – ildjarn Sep 25 '12 at 00:51
  • Changing the values of `a` or `b` would have no effect, since they're copies anyway, so it doesn't really do anything in this case, or even in most cases. It's just like taking `int const a` as an argument. But when you're writing a complicated function that has a `return a` on line 137, it's useful to know that you're returning exactly the same value that was passed in, and that's what `int const a` or `char const * const a` guarantees. – abarnert Sep 25 '12 at 01:01
  • @ildjarn: you say "std::sort requires a comparitor that induces a strict weak ordering", and "have NULL pointers, it's only slightly more verbose" - but `a && (!b || std::strcmp(a, b) < 0);` for two NULLs is always false - it doesn't achieve strict weak ordering.... – Tony Delroy Sep 25 '12 at 02:16
  • @Tony : I think it does -- two nulls are always equal, and a null is never less than a non-null. How is that different from e.g. `std::vector v(10); std::sort(begin(v), end(v));` wherein the comparator will also always return `false`? – ildjarn Sep 25 '12 at 02:18
  • @ildjarn: I'll go back to sleep :-(. Cheers. – Tony Delroy Sep 25 '12 at 02:34
  • Is the logic in the comparison not a bit unnatural? `SortCompare(0,"Hi")` --> `false` seems to be a bit counterintuitive (NULL is more than "Hi"). – David Rodríguez - dribeas Sep 25 '12 at 03:43
  • @David : Personally, when I'm doing a lexicographical sort with data that can potentially be NULL, I like the "real" data at the beginning and the NULL data at the end so I can focus on the real data. But your point is well taken – in the 'intuitive' case where NULL should be less than anything non-NULL, one can instead use `return b && (!a || std::strcmp(a, b) < 0);`. – ildjarn Sep 25 '12 at 03:59
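
The two NULL-handling variants discussed in the comments above can be compared side by side. This is only a sketch: the comparator bodies are taken from the answer and the last comment, while the names `NullsLast`, `NullsFirst`, `SortedWith`, and `Demo` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <vector>

// From the answer: NULL pointers sort after every real string.
bool NullsLast(char const* const a, char const* const b)
{
    return a && (!b || std::strcmp(a, b) < 0);
}

// From the follow-up comment: NULL pointers sort before every real string.
bool NullsFirst(char const* const a, char const* const b)
{
    return b && (!a || std::strcmp(a, b) < 0);
}

// Hypothetical helper: returns a sorted copy using the given comparator.
std::vector<char const*> SortedWith(std::vector<char const*> items,
                                    bool (*compare)(char const*, char const*))
{
    std::sort(items.begin(), items.end(), compare);
    return items;
}

void Demo()
{
    std::vector<char const*> const input = {"b", nullptr, "a"};

    std::vector<char const*> const last = SortedWith(input, NullsLast);
    assert(std::strcmp(last[0], "a") == 0 && last[2] == nullptr);

    std::vector<char const*> const first = SortedWith(input, NullsFirst);
    assert(first[0] == nullptr && std::strcmp(first[2], "b") == 0);
}
```

Both variants return `false` when both arguments are NULL, so two NULLs compare as equivalent and neither ordering contradicts itself.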