49

Is there a Delphi equivalent of this .NET method:

Url.UrlEncode()

Note
I haven't worked with Delphi for several years now. Reading through the answers, I notice several remarks on and alternatives to the currently marked answer. I haven't had the opportunity to test them, so I'm basing my accepted answer on the most upvoted one.
For your own sake, do check the later answers and, after deciding, upvote the best one so everybody can benefit from your experience.

Boris Callens

13 Answers

112

Look at Indy's IdURI unit; the TIdURI class has two static methods for encoding and decoding a URL.

uses
  IdURI;

..
begin
  S := TIdURI.URLEncode(str);
//
  S := TIdURI.URLDecode(str);
end;
Mohammed Nasman
  • 6
    boris, come on, accept this answer, I just gave it a point for being totally helpful :) – Peter Perháč Apr 23 '09 at 13:59
  • Nice, I did not know this. Very helpful. – Hein du Plessis Mar 01 '11 at 09:00
  • 3
    @Peter Heh, I didn't check this question as I'm not working with Delphi anymore. But here you go anyway ;) – Boris Callens Sep 07 '11 at 13:15
  • 15
    But note the warnings in Marc Durdin's blog article "Indy, TIdURI.PathEncode, URLEncode and ParamsEncode and more" at http://marc.durdin.net/2012/07/indy-tiduripathencode-urlencode-and.html – Jan Doggen Sep 19 '12 at 12:50
  • 6
    Indy is not working properly so YOU NEED TO SEE THIS ARTICLE: http://marc.durdin.net/2012/07/indy-tiduripathencode-urlencode-and.html – Gabriel Apr 17 '14 at 13:02
  • 4
    Since Delphi XE7 you can use TNetEncoding.URL.Encode(), which is a smarter way and independent of the Indy components – Enny Mar 03 '16 at 08:30
  • 1
    The link to the mentioned article has changed. Here it is: https://marc.durdin.net/2012/07/indy-tiduri-pathencode-urlencode-and-paramsencode-and-more/ – ralfiii Feb 26 '19 at 13:16
  • If you have crosstalk then you can also code in c# in Visual Studio and have Delphi run via dll .. C# code being HttpUtility.UrlEncode(str) and HttpUtility.UrlDecode(str); – Allan F Mar 02 '23 at 02:59
30

Another simple way of doing this is to use the HTTPEncode function in the HTTPApp unit - very roughly

Uses 
  HTTPApp;

function URLEncode(const s : string) : string;
begin
  result := HTTPEncode(s);
end;

HTTPEncode is deprecated in Delphi 10.3 with the message 'Use TNetEncoding.URL.Encode':

Uses
  NetEncoding;

function URLEncode(const s : string) : string;
begin
  result := TNetEncoding.URL.Encode(s);
end;
Jesse Lee
Alister
  • 1
    TNetEncoding.url.encode doesn't encode '@' properly and a couple of other symbols - be careful with it – fewrandom Jul 24 '20 at 18:21
  • 1
    Also there is `System.Net.URLClient` unit, which includes class function TURI.UrlEncode `class function TURI.URLEncode(const AValue: string; SpacesAsPlus: Boolean): string;` – vhanla Aug 17 '21 at 06:39
15

Since Delphi XE7 you can use TNetEncoding.URL.Encode().

V0d01ey
Enny
14

I wrote this function to encode everything except the characters that are really safe; I had problems with + in particular. Be aware that you cannot encode a whole URL with this function; encode only the parts that must carry no special meaning, typically the values of the variables.

function MyEncodeUrl(source: string): string;
var
  i: integer;
begin
  // Note: Ord(source[i]) is the raw character code, so characters outside
  // ASCII are not converted to UTF-8 bytes first.
  result := '';
  for i := 1 to length(source) do
    if not (source[i] in ['A'..'Z', 'a'..'z', '0'..'9', '-', '_', '~', '.']) then
      result := result + '%' + inttohex(ord(source[i]), 2)
    else
      result := result + source[i];
end;
Radek Hladík
  • 1
    This should be the accepted answer. (not sure how it handles UTF-8 though) – Barry Staes Apr 24 '14 at 12:09
  • 1
    It has a problem with Unicode characters, e.g. %633%6CC%628 is the result for the Unicode string 'سیب', which will be decoded to 'c3lCb8' – Mahoor13 May 21 '15 at 05:52
  • 1
    Great answer. Surely, this and all the custom coded solutions on this page should only encode dangerous characters, rather than excluding safe characters. Only space, and characters that have special meaning in URIs need to be encoded. E.g. [Emb DokWiki](http://docwiki.embarcadero.com/Libraries/Tokyo/en/System.NetEncoding.TURLEncoding) says "TURLEncoding only encodes spaces (as plus signs: +) and the following reserved URL encoding characters: ;:&=+,/?%#[]." – Reversed Engineer Jun 05 '17 at 15:02
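
Following the Unicode remarks in the comments above, here is a minimal sketch of a UTF-8-aware variant of the same idea. It assumes a Unicode Delphi (2009 or later, with unit scope names); the name EncodeUrlComponentUtf8 is hypothetical and not part of any library. It converts the string to UTF-8 bytes first and then percent-encodes every byte outside the same safe set.

```pascal
program EncodeDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: percent-encodes every byte of the UTF-8 form of
// ASource except letters, digits and the characters - _ ~ .
function EncodeUrlComponentUtf8(const ASource: string): string;
var
  LBytes: TBytes;
  LByte: Byte;
begin
  Result := '';
  // Convert to UTF-8 first so that non-ASCII characters become byte sequences.
  LBytes := TEncoding.UTF8.GetBytes(ASource);
  for LByte in LBytes do
    if LByte in [Ord('A')..Ord('Z'), Ord('a')..Ord('z'), Ord('0')..Ord('9'),
                 Ord('-'), Ord('_'), Ord('~'), Ord('.')] then
      Result := Result + Char(LByte)
    else
      Result := Result + '%' + IntToHex(LByte, 2);
end;

begin
  Writeln(EncodeUrlComponentUtf8('a b+c'));  // prints a%20b%2Bc
end.
```

As with the original function, this is meant for individual components such as parameter values, not for a whole URL.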
13

Another option is to use the Synapse library, which has a simple URL encoding method (as well as many others) in the SynaCode unit.

uses
  SynaCode;
..
begin
  s := EncodeUrl( str );
//
  s := DecodeUrl( str );
end;
skamradt
12

Update 2018: the code shown below seems to be outdated. See Remy's comment.

class function TIdURI.ParamsEncode(const ASrc: string): string;
var
  i: Integer;
const
  UnsafeChars = '*#%<> []';  {do not localize}
begin
  Result := '';    {Do not Localize}
  for i := 1 to Length(ASrc) do
  begin
    if CharIsInSet(ASrc, i, UnsafeChars) or (not CharIsInSet(ASrc, i, CharRange(#33,#128))) then begin {do not localize}
      Result := Result + '%' + IntToHex(Ord(ASrc[i]), 2);  {do not localize}
    end else begin
      Result := Result + ASrc[i];
    end;
  end;
end;

From Indy.

However, Indy's encoding does not work properly, so be sure to read this article:
http://marc.durdin.net/2012/07/indy-tiduri-pathencode-urlencode-and-paramsencode-and-more/

Gabriel
  • 8
    Altar and Marc Durdin are right. TIdURI is broken. Unit REST.Utils provides a function, URIEncode, that works properly. – James Roscoe Apr 16 '14 at 12:35
  • 1
    FYI, the code shown above is OLD. That is not what `TIdURI.ParamsEncode()` looks like anymore. In the latest version, the `UnsafeChars` has many more chars in it, Unicode is encoded correctly, and pre-existing `%HH` sequences are not double-encoded. – Remy Lebeau Nov 01 '18 at 17:15
  • @RemyLebeau the fact that pre-existing %HH sequences are not encoded is a bug, IMHO. If I ask to ENCODE a string, it should be encoded anyway, regardless of it being already (partially) encoded or not. The string 'ABC%DE', for example, doesn't encode correctly in TIdURI.Encode, as it's returned as-is, while it should become 'ABC%25DE'. – Bozzy Nov 10 '20 at 14:53
6

In recent versions of Delphi (tested with XE5), use the URIEncode function in the REST.Utils unit.

James Roscoe
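
The REST.Utils suggestion above might be used roughly as follows. This is only a sketch: it assumes the declaration is function URIEncode(const S: string): string, and the input string is made up for illustration.

```pascal
program UriEncodeDemo;

{$APPTYPE CONSOLE}

uses
  REST.Utils;

begin
  // Encode a single value before placing it into a query string.
  Writeln(URIEncode('John Smith & Co'));
end.
```

Check the output against the characters your server expects, since the answers on this page disagree about which characters each routine encodes.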
6

In a non-.NET environment, the WinInet unit provides access to the Windows WinINet function for this: InternetCanonicalizeUrl

Stijn Sanders
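
A minimal sketch of the InternetCanonicalizeUrl approach above, for Windows only and assuming a Unicode Delphi with unit scope names (XE2 or later). The wrapper name CanonicalizeUrl and the fixed buffer size of 4096 characters are assumptions made for illustration.

```pascal
program CanonicalizeDemo;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows, Winapi.WinInet, System.SysUtils;

// Hypothetical wrapper (the name is an assumption, not an RTL routine).
function CanonicalizeUrl(const AUrl: string): string;
var
  LSize: DWORD;
begin
  // Assumed fixed capacity in characters; the call fails if the
  // canonical form does not fit into the buffer.
  SetLength(Result, 4096);
  LSize := Length(Result);
  if not InternetCanonicalizeUrl(PChar(AUrl), PChar(Result), LSize, 0) then
    RaiseLastOSError;
  // On success LSize holds the number of characters written,
  // not counting the terminating null.
  SetLength(Result, LSize);
end;

begin
  Writeln(CanonicalizeUrl('http://example.com/a path/file name.txt'));
end.
```

Unlike the component encoders elsewhere on this page, this function works on a complete URL.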
4

I was facing the same issue (Delphi 4).

I resolved it with the function below:

function fnstUrlEncodeUTF8(stInput : widestring) : string;
  const
    hex : array[0..255] of string = (
     '%00', '%01', '%02', '%03', '%04', '%05', '%06', '%07',
     '%08', '%09', '%0a', '%0b', '%0c', '%0d', '%0e', '%0f',
     '%10', '%11', '%12', '%13', '%14', '%15', '%16', '%17',
     '%18', '%19', '%1a', '%1b', '%1c', '%1d', '%1e', '%1f',
     '%20', '%21', '%22', '%23', '%24', '%25', '%26', '%27',
     '%28', '%29', '%2a', '%2b', '%2c', '%2d', '%2e', '%2f',
     '%30', '%31', '%32', '%33', '%34', '%35', '%36', '%37',
     '%38', '%39', '%3a', '%3b', '%3c', '%3d', '%3e', '%3f',
     '%40', '%41', '%42', '%43', '%44', '%45', '%46', '%47',
     '%48', '%49', '%4a', '%4b', '%4c', '%4d', '%4e', '%4f',
     '%50', '%51', '%52', '%53', '%54', '%55', '%56', '%57',
     '%58', '%59', '%5a', '%5b', '%5c', '%5d', '%5e', '%5f',
     '%60', '%61', '%62', '%63', '%64', '%65', '%66', '%67',
     '%68', '%69', '%6a', '%6b', '%6c', '%6d', '%6e', '%6f',
     '%70', '%71', '%72', '%73', '%74', '%75', '%76', '%77',
     '%78', '%79', '%7a', '%7b', '%7c', '%7d', '%7e', '%7f',
     '%80', '%81', '%82', '%83', '%84', '%85', '%86', '%87',
     '%88', '%89', '%8a', '%8b', '%8c', '%8d', '%8e', '%8f',
     '%90', '%91', '%92', '%93', '%94', '%95', '%96', '%97',
     '%98', '%99', '%9a', '%9b', '%9c', '%9d', '%9e', '%9f',
     '%a0', '%a1', '%a2', '%a3', '%a4', '%a5', '%a6', '%a7',
     '%a8', '%a9', '%aa', '%ab', '%ac', '%ad', '%ae', '%af',
     '%b0', '%b1', '%b2', '%b3', '%b4', '%b5', '%b6', '%b7',
     '%b8', '%b9', '%ba', '%bb', '%bc', '%bd', '%be', '%bf',
     '%c0', '%c1', '%c2', '%c3', '%c4', '%c5', '%c6', '%c7',
     '%c8', '%c9', '%ca', '%cb', '%cc', '%cd', '%ce', '%cf',
     '%d0', '%d1', '%d2', '%d3', '%d4', '%d5', '%d6', '%d7',
     '%d8', '%d9', '%da', '%db', '%dc', '%dd', '%de', '%df',
     '%e0', '%e1', '%e2', '%e3', '%e4', '%e5', '%e6', '%e7',
     '%e8', '%e9', '%ea', '%eb', '%ec', '%ed', '%ee', '%ef',
     '%f0', '%f1', '%f2', '%f3', '%f4', '%f5', '%f6', '%f7',
     '%f8', '%f9', '%fa', '%fb', '%fc', '%fd', '%fe', '%ff');
 var
   iLen,iIndex : integer;
   stEncoded : string;
   ch : widechar;
 begin
   iLen := Length(stInput);
   stEncoded := '';
   for iIndex := 1 to iLen do
   begin
     ch := stInput[iIndex];
     if (ch >= 'A') and (ch <= 'Z') then
       stEncoded := stEncoded + ch
     else if (ch >= 'a') and (ch <= 'z') then
       stEncoded := stEncoded + ch
     else if (ch >= '0') and (ch <= '9') then
       stEncoded := stEncoded + ch
     else if (ch = ' ') then
       stEncoded := stEncoded + '+'
     else if ((ch = '-') or (ch = '_') or (ch = '.') or (ch = '!') or (ch = '*')
       or (ch = '~') or (ch = '''')  or (ch = '(') or (ch = ')')) then
       stEncoded := stEncoded + ch
     else if (Ord(ch) <= $07F) then
       stEncoded := stEncoded + hex[Ord(ch)]
     else if (Ord(ch) <= $7FF) then
     begin
        stEncoded := stEncoded + hex[$c0 or (Ord(ch) shr 6)];
        stEncoded := stEncoded + hex[$80 or (Ord(ch) and $3F)];
     end
     else
     begin
        stEncoded := stEncoded + hex[$e0 or (Ord(ch) shr 12)];
        stEncoded := stEncoded + hex[$80 or ((Ord(ch) shr 6) and ($3F))];
        stEncoded := stEncoded + hex[$80 or ((Ord(ch)) and ($3F))];
     end;
   end;
   result := (stEncoded);
 end;

Source: Java source code

Abhineet
  • This code (and its Java origin) couldn't be more inefficient - which programmer would ever define such an array instead of computing it? – AmigoJack Feb 17 '21 at 20:10
3

I have made my own function. It converts spaces to %20, not to a plus sign. I needed it to convert a local file path to a path for the browser (with the file:/// prefix). Most importantly, it handles UTF-8 strings. It was inspired by Radek Hladik's solution above.

function URLEncode(s: string): string;
var
  i: integer;
  source: PAnsiChar;
begin
  // The PAnsiChar cast assumes s is an AnsiString (pre-Unicode Delphi)
  // that already holds UTF-8 bytes.
  result := '';
  source := pansichar(s);
  for i := 1 to length(source) do
    if not (source[i - 1] in ['A'..'Z', 'a'..'z', '0'..'9', '-', '_', '~', '.', ':', '/']) then
      result := result + '%' + inttohex(ord(source[i - 1]), 2)
    else
      result := result + source[i - 1];
end;
GAD ZombiE
2

AFAIK you need to make your own.

Here is an example.

Toby Allen
Ólafur Waage
1

TIdURI and HTTPEncode have problems with Unicode character sets. The function below does the encoding correctly.

function EncodeURIComponent(const ASrc: string): UTF8String;
const
  HexMap: UTF8String = '0123456789ABCDEF';

  function IsSafeChar(ch: Integer): Boolean;
  begin
    if (ch >= 48) and (ch <= 57) then Result := True    // 0-9
    else if (ch >= 65) and (ch <= 90) then Result := True  // A-Z
    else if (ch >= 97) and (ch <= 122) then Result := True  // a-z
    else if (ch = 33) then Result := True // !
    else if (ch >= 39) and (ch <= 42) then Result := True // '()*
    else if (ch >= 45) and (ch <= 46) then Result := True // -.
    else if (ch = 95) then Result := True // _
    else if (ch = 126) then Result := True // ~
    else Result := False;
  end;
var
  I, J: Integer;
  ASrcUTF8: UTF8String;
begin
  Result := '';    {Do not Localize}

  ASrcUTF8 := UTF8Encode(ASrc);
  // UTF8Encode call not strictly necessary but
  // prevents implicit conversion warning

  I := 1; J := 1;
  SetLength(Result, Length(ASrcUTF8) * 3); // space to %xx encode every byte
  while I <= Length(ASrcUTF8) do
  begin
    if IsSafeChar(Ord(ASrcUTF8[I])) then
    begin
      Result[J] := ASrcUTF8[I];
      Inc(J);
    end
    else if ASrcUTF8[I] = ' ' then
    begin
      Result[J] := '+';
      Inc(J);
    end
    else
    begin
      Result[J] := '%';
      Result[J+1] := HexMap[(Ord(ASrcUTF8[I]) shr 4) + 1];
      Result[J+2] := HexMap[(Ord(ASrcUTF8[I]) and 15) + 1];
      Inc(J,3);
    end;
    Inc(I);
  end;

  SetLength(Result, J-1);
end;
Alpay Abay
  • 1
    I believe this is the proper credit for this bit of code: https://marc.durdin.net/2012/07/indy-tiduri-pathencode-urlencode-and-paramsencode-and-more/ And an updated version that also works on mobile platforms: https://marc.durdin.net/2015/08/an-update-for-encodeuricomponent/ – jep Nov 14 '18 at 16:35
  • 1
    It should also be noted in this code (as on the website it came from), space is incorrectly encoded as `+`. That's not how encodeURIComponent should work. It should encode it as %20 instead: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent It's fixed in the mobile-friendly version, though. – jep Nov 14 '18 at 17:06
0

I'd like to point out that if you care much more about correctness than about efficiency, the simplest thing you can do is hex-encode every byte, even where it's not strictly necessary.

Just today I needed to encode a few parameters for a basic HTML login form submission. After going through all the options, each with their own caveats, I decided to write this naive version that works perfectly:

function URLEncode(const AStr: string): string;
var
  LBytes: TBytes;
  LIndex: Integer;
begin
  Result := '';
  LBytes := TEncoding.UTF8.GetBytes(AStr);
  for LIndex := Low(LBytes) to High(LBytes) do
    Result := Result + '%' + IntToHex(LBytes[LIndex], 2);
end;
Thijs van Dien