0

I have a function that is called heavily in my app. It is basically a csv parser, which 'pops' the first value off and changes the input string to the remaining string.

function StripString(mask: string; var modifiedstring: string): string;
var
  index,len: integer;
  s: string;
begin
    Index := pos(Mask, ModifiedString);
    len := Length(ModifiedString);
    if index <> 0 then
    begin
        if Index <> 1 then
        begin
            s := LeftStr(ModifiedString, index - 1);
            ModifiedString := RightStr(ModifiedString, len-index);
        end else begin
            if Length(ModifiedString)>1 then
              ModifiedString := Copy(ModifiedString,2,len)
            else
              ModifiedString := '';
            s := '';
        end;
    end else begin
        s := ModifiedString;
        ModifiedString := '';
    end;
    result := s
end;

I want to try and optimize this routine, using PChars. So i came up with this method, but unfortunately i get weird characters in the resultant output. Im guessing its because of incorrect pointers.

//faster method - uses PChars
function StripStringEx(mask: char; var modifiedstring: string): string;
var
  pSt,pCur,pEnd : Pchar;
begin
    pEnd := @modifiedString[Length(modifiedString)];
    pSt := @modifiedString[1];
    pCur := pSt;
    while pCur <= pEnd do
    begin
         if pCur^ = mask then break;
         inc(pCur);
    end;
    SetString(Result,pSt,pCur-pSt);
    SetString(ModifiedString,pCur+1,pEnd-pCur);
end;

Anyone "point":) me in the right direction?

Simon
  • 9,197
  • 13
  • 72
  • 115
  • 1
    Maybe you should process the string, breaking it all in only one step, returning a list of strings. – Clóvis Valadares Junior Apr 19 '11 at 12:17
  • 1
    Step 1: Write a bunch of unit tests to ensure that your routine does what you want it to do and that every corner case you can think of is covered and tested. Step 2: Refactor. Step 3: Avoid premature optimization. – Nick Hodges Apr 19 '11 at 12:55

2 Answers2

3

Even if you get your pointer version to work I don't see why it would be faster.

Calling Pos is faster than your loop since Pos is reasonably optimized. The allocation patterns are the same with both versions, two heap allocations and a heap deallocation. I'd stick with the version that works.

You can get rid of the local variable s and assign direct to Result to skip a bit of reference counting.

David Heffernan
  • 601,492
  • 42
  • 1,072
  • 1,490
  • 2
    He can also save on reference counting by making mask a const: `const mask: string` – The_Fox Apr 19 '11 at 09:00
  • @The_Fox That's a good point. It does more than that though, it saves on a Try/Finally stack push/pop which is somewhat expensive on x86. – David Heffernan Apr 19 '11 at 09:01
  • 1
    a try/finally block is not so expensive to make a difference with such a complex function. But it always makes sense to always define string parameters as const, when possible. – Arnaud Bouchez Apr 19 '11 at 09:50
1

I think you get strange results because of the SetString on ModifiedString. SetString first sets the length of the string, and then copies the content from the buffer to the newly created string. But in your case the buffer is the destination, and the length of the buffer just got adjusted.

Just follow David's advise and do not use PChars.

If you want, you can make it somewhat shorter:

function StripString(const Mask: string; var ModifiedString: string): string;
var
  Index: Integer;
begin
  Index := Pos(Mask, ModifiedString);
  if Index <> 0 then
  begin
    Result := LeftStr(ModifiedString, Index - 1);
    Delete(ModifiedString, Index);
  end
  else
  begin
    Result := ModifiedString;
    ModifiedString := '';  
  end;
end;
The_Fox
  • 6,992
  • 2
  • 43
  • 69