I have been working on a project in Lazarus and have decided to move it to Delphi XE for the time being (due to some limitations).

A brief overview of what is going on:

At runtime I am loading external files into streams. The streams belong to several different classes that descend from one main object (TObject). Instances of these classes are added to a TList owned by the main object; each class has its own stream property and is a child of the main object.

In this main object I have a save and load procedure:

When saving the object it also saves all the stream data from the other classes to file by using string to stream. The output string here must be base64 encoded as I am saving to XML.

When opening the file, the idea is to decode the base64 string and move it back into the streams just as if it were the original file before it was base64 encoded.

In Lazarus it works, and here is the important code (note, some of it was not written by me).

const
  Keys64 = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz+/';

function Encode64String(S: string): string;
function Decode64String(S: string): string;
function Encode64StringToStream(const Input: TStream; var Output: string): Boolean;
procedure Decode64StringToStream(const Input: string; Output: TStream);
procedure StringToStream(Stream: TStream; const S: string);
function StreamToString(MS: TMemoryStream): string;

implementation

function Encode64String(S: string): string;
var
  i: Integer;
  a: Integer;
  x: Integer;
  b: Integer;
begin
  Result := '';
  a := 0;
  b := 0;
  for i := 1 to Length(s) do
  begin
    x := Ord(s[i]);
    b := b * 256 + x;
    a := a + 8;
    while a >= 6 do
    begin
      a := a - 6;
      x := b div (1 shl a);
      b := b mod (1 shl a);
      Result := Result + Keys64[x + 1];
    end;
  end;
  if a > 0 then
  begin
    x := b shl (6 - a);
    Result := Result + Keys64[x + 1];
  end;
end;

function Decode64String(S: string): string;
var
  i: Integer;
  a: Integer;
  x: Integer;
  b: Integer;
begin
  Result := '';
  a := 0;
  b := 0;
  for i := 1 to Length(s) do
  begin
    x := Pos(s[i], Keys64) - 1;
    if x >= 0 then
    begin
      b := b * 64 + x;
      a := a + 6;
      if a >= 8 then
      begin
        a := a - 8;
        x := b shr a;
        b := b mod (1 shl a);
        x := x mod 256;
        Result := Result + chr(x);
      end;
    end
    else
      Exit;
  end;
end;

function Encode64StringToStream(const Input: TStream; var Output: string): Boolean;
var
  MS: TMemoryStream;
begin
  Result := False;

  MS := TMemoryStream.Create;
  try
    Input.Seek(0, soFromBeginning);
    MS.CopyFrom(Input, Input.Size);
    MS.Seek(0, soFromBeginning);
    Output := Encode64String(StreamToString(MS));
  finally
    MS.Free;
  end;

  Result := True;
end;

procedure Decode64StringToStream(const Input: string; Output: TStream);
var
  MS: TMemoryStream;
begin
  try
    MS := TMemoryStream.Create;
    try
      StringToStream(MS, Decode64String(Input));

      MS.Seek(0, soFromBeginning);
      Output.CopyFrom(MS, MS.Size);
      Output.Position := 0;
    finally
      MS.Free;
    end;

  except on E: Exception do
    raise Exception.Create('stream decode error - ' + E.Message);
  end;
end;

procedure StringToStream(Stream: TStream; const S: string);
begin
  Stream.Write(Pointer(S)^, Length(S));
end;

function StreamToString(MS: TMemoryStream): string;
begin
  SetString(Result, PChar(MS.Memory), MS.Size div SizeOf(Char));
end;

I am 99% sure the problem here is Unicode related. It's a shame, because I believe Lazarus/Free Pascal has always been Unicode but Delphi has not, so the two use different string types, which makes it almost impossible for less experienced users like myself to solve!

To be honest I think all the code above is a bit of a mess, and it feels like I am just trying to guess what to change the strings to without really knowing what I am doing.

My first thought was to change everything from String to AnsiString. This nearly worked one time, but when trying to use Decode64StringToStream I got zero data back. Other times the data was not saved as properly base64-encoded text, and sometimes I even got errors along the lines of "TStream.Seek not implemented".

PS: I have read the guides, and there are plenty around, such as the white papers on how to migrate old Delphi projects to newer Unicode versions, but to be honest I am still at a loss. I thought replacing string with AnsiString would have been enough, but it seems it isn't.

Any tips, pointers, general advice or clues would be greatly appreciated. Thanks.

David Heffernan
  • What is the question? If you want to support Unicode, why would you use `AnsiString`? If you want to use base64, use the built in base64 library, `EncdDecd`. If you want to encode a string to base64, convert it to UTF-8, and then base64 encode. And obviously reverse the process in the other direction. Do you just want to know how to do that? – David Heffernan Oct 01 '13 at 10:03
  • The bottom line here is that you need to understand what you are doing and have a clear picture of what you are trying to achieve. I can see lots of code, and lots of text, but no statement of the precise transformation you wish to apply. You cannot handle text encodings by trial and error. Step one for you is to gain understanding. – David Heffernan Oct 01 '13 at 10:13
  • @DavidHeffernan the question, I guess, is that I am at a loss and all these different Unicode and string types are beyond me. You have given me some tips in your comments, so I want to have another go with this information and see what I can do myself... –  Oct 01 '13 at 10:13
  • Also, I seem to recall that FreePascal used UTF-8 for its string type not Unicode. But it's been a while since I dabbled. – shunty Oct 01 '13 at 10:17

1 Answer

I think what you want to do is:

  1. Convert the Unicode string to UTF-8 encoding. This is often the most space efficient format for Unicode text.
  2. Encode the string using base64.

Then to decode you just reverse the steps.

The code looks like this:

function Encode(const Input: string): AnsiString;
var
  utf8: UTF8String;
begin
  utf8 := UTF8String(Input);
  Result := EncdDecd.EncodeBase64(PAnsiChar(utf8), Length(utf8));
end;

function Decode(const Input: AnsiString): string;
var
  bytes: TBytes;
  utf8: UTF8String;
begin
  bytes := EncdDecd.DecodeBase64(Input);
  SetLength(utf8, Length(bytes));
  Move(Pointer(bytes)^, Pointer(utf8)^, Length(bytes));
  Result := string(utf8);
end;
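
If what is being saved is raw stream content rather than text, a variation that never passes the data through a string type may be closer to what the question describes. One thing that stands out in the question's code is that `StringToStream` writes `Length(S)` bytes, while in Delphi XE each `Char` is two bytes wide, so only half of a string's data would reach the stream. The sketch below is an assumption about the intent, not a tested drop-in: it uses `EncodeStream` and `DecodeStream` from the same `EncdDecd` unit, and `StreamToBase64` and `Base64ToStream` are hypothetical helper names introduced here for illustration.

```pascal
program StreamBase64Demo;

{$APPTYPE CONSOLE}

uses
  Classes, SysUtils, EncdDecd;

// Hypothetical helper: base64-encodes the whole of Source and returns the text.
function StreamToBase64(Source: TStream): string;
var
  Encoded: TStringStream;
begin
  Encoded := TStringStream.Create('');
  try
    Source.Position := 0;              // EncodeStream reads from the current position
    EncodeStream(Source, Encoded);
    Result := Encoded.DataString;      // base64 text is plain ASCII
  finally
    Encoded.Free;
  end;
end;

// Hypothetical helper: decodes base64 text and writes the raw bytes to Target.
procedure Base64ToStream(const Base64: string; Target: TStream);
var
  Encoded: TStringStream;
begin
  Encoded := TStringStream.Create(Base64);
  try
    DecodeStream(Encoded, Target);
    Target.Position := 0;              // rewind so the caller can read the data
  finally
    Encoded.Free;
  end;
end;

var
  Original, Restored: TMemoryStream;
  Data: array[0..3] of Byte;
  Encoded: string;
begin
  // Bytes that would not survive a careless string conversion.
  Data[0] := $00; Data[1] := $FF; Data[2] := $10; Data[3] := $80;

  Original := TMemoryStream.Create;
  Restored := TMemoryStream.Create;
  try
    Original.WriteBuffer(Data, SizeOf(Data));
    Encoded := StreamToBase64(Original);
    Base64ToStream(Encoded, Restored);

    // Compare the decoded bytes with the originals.
    if (Restored.Size = Original.Size) and
       CompareMem(Original.Memory, Restored.Memory, Original.Size) then
      Writeln('round trip ok')
    else
      Writeln('round trip failed');
  finally
    Restored.Free;
    Original.Free;
  end;
end.
```

The encoded string can then be written to the XML file as ordinary text, and the stream contents are rebuilt from it on load without any `AnsiString` casts. For longer inputs the encoded text may contain line breaks, which is worth keeping in mind when storing it in XML.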
David Heffernan
  • This is no good, nor is the question in all honesty. You are absolutely correct in your earlier comments, I just seem to be randomly changing values in the hope that it works without any real understanding of what I am doing, or what I am changing. This probably explains why nearly all my questions are a bit difficult to understand, because I am asking them in a `confused state`. Although I never solved my problem despite hours and hours of trying myself, I will accept this answer. I need to take a step back, although I am comfortable with some things I must get a grip on the rest. Thanks. –  Oct 03 '13 at 11:44
  • Well, "no good" is a bit strong. I mean the code does what I say it does. But yeah, perhaps I've not understood what you are actually asking. – David Heffernan Oct 03 '13 at 12:18
  • Sorry, I did not mean your answer was no good; I meant my whole approach and attitude to programming is wrong. –  Oct 03 '13 at 15:20
  • No worries. Text and encodings and conversions can be bewildering. No shame to get confused with this stuff. I'm not sure I understand it all that well! – David Heffernan Oct 03 '13 at 15:27