
We have been using a UDF from cflib.org, and while it's a wonderful UDF that makes strings easy on the eye, we can't figure out how to make it allow a trailing "." at the end of the string if one is entered.

<cffunction name="friendlyURL" output="false" access="public" returntype="string" hint="returns a URL safe string">
<cfargument name="string" default="" type="string">
<cfscript>
var returnString = arguments.string;
var InvalidChars = "à,ô,d,?,ë,š,o,ß,a,r,?,n,a,k,s,?,n,l,h,?,ó,ú,e,é,ç,?,c,õ,?,ø,g,t,?,e,c,s,î,u,c,e,w,?,u,c,ö,è,y,a,l,u,u,s,g,l,ƒ,ž,?,?,å,ì,ï,?,t,r,ä,í,r,ê,ü,ò,e,ñ,n,h,g,d,j,ÿ,u,u,u,t,ý,o,â,l,?,z,i,ã,g,?,o,i,ù,i,z,á,û,þ,ð,æ,µ,e,À,Ô,D,?,Ë,Š,O,A,R,?,N,A,K,S,?,N,L,H,?,Ó,Ú,E,É,Ç,?,C,Õ,?,Ø,G,T,?,E,C,S,Î,U,C,E,W,?,U,C,Ö,È,Y,A,L,U,U,S,G,L,ƒ,Ž,?,?,Å,Ì,Ï,?,T,R,Ä,Í,R,Ê,Ü,Ò,E,Ñ,N,H,G,Ð,J,Ÿ,U,U,U,T,Ý,O,Â,L,?,Z,I,Ã,G,?,O,I,Ù,I,Z,Á,Û,Þ,Ð,Æ,?,E";
var ValidChars   = "a,o,d,f,e,s,o,ss,a,r,t,n,a,k,s,y,n,l,h,p,o,u,e,e,c,w,c,o,s,o,g,t,s,e,c,s,i,u,c,e,w,t,u,c,oe,e,y,a,l,u,u,s,g,l,f,z,w,b,a,i,i,d,t,r,ae,i,r,e,ue,o,e,n,n,h,g,d,j,y,u,u,u,t,y,o,a,l,w,z,i,a,g,m,o,i,u,i,z,a,u,th,dh,ae,u,e,A,O,D,F,E,S,O,A,R,T,N,A,K,S,Y,N,L,H,P,O,U,E,E,C,W,C,O,S,O,G,T,S,E,C,S,I,U,C,E,W,T,U,C,Oe,E,Y,A,L,U,U,S,G,L,F,Z,W,B,A,I,I,D,T,R,Ae,I,R,E,Ue,O,E,N,N,H,G,D,J,Y,U,U,U,T,Y,O,A,L,W,Z,I,A,G,M,O,I,U,I,Z,A,U,TH,Dh,Ae,U,E";
// trim the string
returnString = Trim(returnString);
returnString = StripCR(returnString);
// replace known characters with the corresponding safe characters
returnString = ReplaceList(returnString,InvalidChars,ValidChars);
// replace any remaining characters outside the x00-x7F range with x's
returnString = returnString.ReplaceAll('[^\x00-\x7F]','x');
// Replace one or many comma with a dash
returnString = returnString.ReplaceAll(',+', '-');
// Other substitutions
returnString = Replace(returnString, "%", "percent","ALL");
returnString = Replace(returnString, "&amp;", " and ","ALL");
returnString = Replace(returnString, "&", " and ","ALL");
returnString = returnString.ReplaceAll('[:,/]', '-');
// Replace one or more whitespace characters with a dash
returnString = returnString.ReplaceAll('[\s]+', '-');
// And everything else simply has to go
returnString = returnString.ReplaceAll('[^A-Za-z0-9\/-]','');
// finally replace multiple dash characters with just one
returnString = returnString.ReplaceAll('-+','-');
// we're done
return returnString;
</cfscript>
</cffunction>

My regex isn't very good. I've already spent a few hours tinkering with it, but I still can't figure out how to make it keep a trailing "." if one is entered.

user125264

2 Answers

// And everything else simply has to go
returnString = returnString.ReplaceAll('[^A-Za-z0-9\/-]','');

This replaces any character that isn't part of the set with an empty string. Try adding the . character to the set, just before the final -:

 returnString = returnString.ReplaceAll('[^A-Za-z0-9.-]','');

Note: As pointed out by @PeterBoughton, every / has already been replaced by - at this point, so / doesn't have to be part of the character set.

That change also allows . characters in the middle of the string. Since you want to allow them only at the end, you then have to remove any . character that is not at the end:

returnString = returnString.replaceAll('\.(?!$)', '');
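
Because `replaceAll` on a ColdFusion string is Java's `String.replaceAll`, the two steps above can be tried in plain Java. The following is a minimal sketch, not the UDF itself: the class and method names are hypothetical, and it mirrors only the last few replacements of `friendlyURL`.

```java
class TrailingDotDemo {
    // Hypothetical helper covering only the final steps of friendlyURL.
    public static String keepTrailingDot(String s) {
        // Drop everything except letters, digits, '.', and '-'.
        s = s.replaceAll("[^A-Za-z0-9.-]", "");
        // Remove every '.' that is not the last character of the string.
        s = s.replaceAll("\\.(?!$)", "");
        // Collapse runs of dashes, as the original function does.
        return s.replaceAll("-+", "-");
    }

    public static void main(String[] args) {
        System.out.println(keepTrailingDot("my.file-name.")); // myfile-name.
        System.out.println(keepTrailingDot("v1.2--beta"));    // v12-beta
    }
}
```

The negative lookahead `(?!$)` makes the match fail only for a period sitting at the very end, so a single trailing period survives while inner ones are dropped.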

Otherwise, you could match a . that is followed by any character and replace both with that character using a backreference. Because replaceAll is Java's String method, the group is written $1 in the replacement string, not \1:

returnString = returnString.replaceAll('\.(.)', '$1');
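
Since this call goes through Java's `String.replaceAll`, the replacement string follows Java's rules, where a captured group is referenced as `$1`. The sketch below (hypothetical class and method names) compares `$1` with `\1` so the difference is easy to check.

```java
class BackrefDemo {
    // Java replacement syntax: $1 refers to the first captured group.
    public static String dollarGroup(String s) {
        return s.replaceAll("\\.(.)", "$1");
    }

    // In a Java replacement string a backslash only escapes the next
    // character, so \1 produces a literal 1 instead of the group.
    public static String backslashGroup(String s) {
        return s.replaceAll("\\.(.)", "\\1");
    }

    public static void main(String[] args) {
        System.out.println(dollarGroup("a.b.c."));    // abc.
        System.out.println(backslashGroup("a.b.c.")); // a11.
        System.out.println(dollarGroup("a..b"));      // a.b
    }
}
```

The last call shows a limitation of this variant: the pattern consumes the character after each period, so in a run of consecutive periods one of them is kept. The lookahead version does not have this problem.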
plalx
  • You've made the same mistake I did. You want `'[^A-Za-z0-9\/.-]'` instead of `'[^A-Za-z0-9\/-.]'`. The `-` must be the last character in the set. – pburka Oct 22 '13 at 02:10
  • Right ;) Hadn't noticed that there was a `-` char hehe. Fixed! – plalx Oct 22 '13 at 02:18
  • Of course, if the hyphen were escaped the positioning wouldn't be an issue. Also, the `/` does _not_ need escaping, although given that an earlier line changes all `/` to `-` it's redundant anyway, so it could just be `[^A-Za-z0-9\-.]` – Peter Boughton Oct 22 '13 at 08:56
  • And yes, CF does support lookaheads (but not lookbehinds) - however, calling the string.replaceAll method is using Java's regex (which supports both). – Peter Boughton Oct 22 '13 at 08:57
  • @PeterBoughton Good catch. I had not really looked at what other replacements were doing ;) – plalx Oct 22 '13 at 15:26

This line

    // And everything else simply has to go
    returnString = returnString.ReplaceAll('[^A-Za-z0-9\/-]','');

removes everything which isn't A-Z, a-z, 0-9, /, or -.

(The [] brackets enclose a list of possible characters to match, and ^ means "not").

You should just be able to add . to the list:

    // And everything else simply has to go
    returnString = returnString.ReplaceAll('[^A-Za-z0-9\/.-]','');

(As @Leigh alludes to below, the . must be before the final - or it will be treated as a range.)
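
The effect of the hyphen's position can be checked directly against Java's regex engine, which is what `ReplaceAll` uses here. This is a small sketch with a hypothetical helper, `compiles`, that only reports whether a pattern is accepted.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CharClassDemo {
    // Hypothetical helper: true if Java accepts the pattern.
    public static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "/-." is read as a range from '/' down to '.', which is invalid.
        System.out.println(compiles("[^A-Za-z0-9\\/-.]")); // false
        // With '-' last it is a literal hyphen, so the class is valid.
        System.out.println(compiles("[^A-Za-z0-9\\/.-]")); // true
        // An escaped hyphen is literal wherever it appears.
        System.out.println(compiles("[^A-Za-z0-9\\-.]"));  // true
    }
}
```

The first pattern fails because `/` has a higher code point than `.`, so the range runs backwards.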

pburka
  • Adding `.` to the end does not *"allow a trailing period at the **end** of the string"*. Also, the expression above does not even compile. – Leigh Oct 22 '13 at 01:56
  • It certainly does "allow a trailing period at the end of the string". – pburka Oct 22 '13 at 02:01
  • Well, it is not doing anything right now as it does not compile ;) But my take was the OP wanted to allow it *only* at the end of the string. {Shrug} I could be wrong though. (Edit) Never mind, I see it was edited to fix the compilation error. – Leigh Oct 22 '13 at 02:05