
I have this function to generate slugs in ColdFusion:

<cffunction name="generateSlug" output="false" returnType="string">
    <cfargument name="str">
    <cfargument name="spacer" default="-">

    <cfset var ret = "" />

    <cfset str = lCase(trim(str)) />
    <cfset str = reReplace(str, "[àáâãäå]", "a", "all") />
    <cfset str = reReplace(str, "[èéêë]", "e", "all") />
    <cfset str = reReplace(str, "[ìíîï]", "i", "all") />
    <cfset str = reReplace(str, "[òóôö]", "o", "all") />
    <cfset str = reReplace(str, "[ùúûü]", "u", "all") />
    <cfset str = reReplace(str, "[ñ]", "n", "all") />
    <cfset str = reReplace(str, "[^a-z0-9-]", "#spacer#", "all") />
    <cfset ret = reReplace(str, "#spacer#+", "#spacer#", "all") />

    <cfif left(ret, 1) eq "#spacer#">
        <cfset ret = right(ret, len(ret)-1) />
    </cfif>
    <cfif right(ret, 1) eq "#spacer#">
        <cfset ret = left(ret, len(ret)-1) />
    </cfif>

    <cfreturn ret />
</cffunction>

I then call it like this:

<cfset stringToBeSlugged = "This is a string abcde àáâãäå èéêë ìíîï òóôö ùúûü ñ año ñññññññññññññ" />
<cfset slug = generateSlug(stringToBeSlugged) />
<cfoutput>#slug#</cfoutput>

But it outputs this slug:

this-is-a-string-abcde-a-a-a-a-a-a-e-e-e-e-i-i-i-i-o-o-o-o-u-u-u-u-n-a-no-n-n-n-n-n-n-n-n-n-n-n-n-n

It seems that all the accented characters are replaced correctly, but the function inserts a '-' after replacing each one. Why?

Where is the error?

P.S. I am expecting this output:

this-is-a-string-abcde-aaaaaa-eeee-iiii-oooo-uuuu-n-ano-nnnnnnnnnnnnn 

Thanks.

walolinux
  • What output are you expecting from the above function? – Keshav jha Apr 26 '16 at 06:11
  • then remove ` ` part – rock321987 Apr 26 '16 at 06:34
  • i mimicked your example in python and I don't think there is need of that part if what you expect as output is correct – rock321987 Apr 26 '16 at 06:36
  • @rock321987 sorry but it is not working. Same output, not as expected :-( – walolinux Apr 26 '16 at 07:23
  • Working as is on CF10, CF11 and CF2016. Tested on trycf.com. http://trycf.com/gist/4f861b82a8c700e2d9dbefb896abb56e/acf?theme=monokai – Pankaj Apr 26 '16 at 08:13
  • Yes but not in cf8, that is the version of my server :-( – walolinux Apr 26 '16 at 09:01
  • When in doubt, look at your data. Specifically, output the str variable at the start of the function and then every time you do something to it. – Dan Bracuk Apr 26 '16 at 12:22
  • *output the str variable at the start of the function* ... and after each replace to find out which reReplace statement is the issue. In other words, add some debugging code to troubleshoot the issue. – Leigh Apr 26 '16 at 14:39
  • not that it'll help, but i tried it on coldfusion 9 and it worked fine – luke Apr 26 '16 at 15:56
  • What happens if you add another character to the replacement, like make `a` into `aA`? Just for debugging of course. – Laurel Apr 26 '16 at 16:45
  • Look at my pastebin: http://pastebin.com/ZWudrrsN I am going crazy with CF8 and I cannot upgrade my server... – walolinux Apr 26 '16 at 18:00
  • Good example! For future reference, you might get weird results like that if the string is interpreted with a different/wrong encoding. – Leigh Apr 27 '16 at 00:54
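
The debugging and encoding suggestions in the comments above can be combined into one small probe. This is a sketch, not a confirmed diagnosis: it assumes the template is saved as UTF-8 and that CF8 is reading it with a different default encoding. Under that assumption each accented letter would reach the function as two characters. One of them matches the character class and becomes a plain letter; the other falls through to the `[^a-z0-9-]` catch-all and becomes the spacer, which is consistent with the `a-a-a-...` output. The variable name `probe` is hypothetical.

```cfm
<cfprocessingdirective pageencoding="utf-8">
<!--- Assumption: this .cfm file really is saved as UTF-8; the directive must match the file's actual encoding. --->

<!--- Hypothetical probe string: two accented letters, so a correct read gives a length of 2. --->
<cfset probe = "àñ" />

<cfoutput>
    Length: #len(probe)#<br>
    <!--- Print the code of each character the server actually sees. --->
    <cfloop from="1" to="#len(probe)#" index="i">
        #asc(mid(probe, i, 1))#
    </cfloop>
</cfoutput>
```

When the file is read correctly, the length is 2 and the codes are 224 and 241. Running the same probe without the `cfprocessingdirective` line and getting a length greater than 2 would suggest that the template's encoding is being misread, in which case adding the directive to the template that contains `generateSlug` may be enough.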

1 Answer

Does this work for you? (I've adapted a similar script that we use internally.) I believe we used it with ColdFusion 8, and we still use it with CF9.

<cffunction name="generateSlug" output="false" returnType="string">
    <cfargument name="str" default="">
    <cfargument name="spacer" default="-">
    <cfset var ret = replace(arguments.str,"'", "", "all")>
    <cfset ret = trim(ReReplaceNoCase(ret, "<[^>]*>", "", "ALL"))>
    <cfset ret = ReplaceList(ret, "À,Á,Â,Ã,Ä,Å,Æ,È,É,Ê,Ë,Ì,Í,Î,Ï,Ð,Ñ,Ò,Ó,Ô,Õ,Ö,Ø,Ù,Ú,Û,Ü,Ý,à,á,â,ã,ä,å,æ,è,é,ê,ë,ì,í,î,ï,ñ,ò,ó,ô,õ,ö,ø,ù,ú,û,ü,ý,&nbsp;,&amp;", "A,A,A,A,A,A,AE,E,E,E,E,I,I,I,I,D,N,O,O,O,O,O,O,U,U,U,U,Y,a,a,a,a,a,a,ae,e,e,e,e,i,i,i,i,n,o,o,o,o,o,o,u,u,u,u,y, , ")>
    <cfset ret = trim(rereplace(ret, "[[:punct:]]"," ","all"))>
    <cfset ret = rereplace(ret, "[[:space:]]+","!","all")>
    <cfset ret = ReReplace(ret, "[^a-zA-Z0-9!]", "", "ALL")>
    <cfset ret = trim(rereplace(ret, "!+", arguments.Spacer, "all"))>
    <cfreturn ret>
</cffunction>

<cfset stringToBeSlugged = "This is a string abcde àáâãäå èéêë ìíîï òóôö ùúûü ñ año ñññññññññññññ" />
<cfoutput>"#stringToBeSlugged#" = #generateSlug(stringToBeSlugged)#</cfoutput>

Support for more international characters

If you want to widen your support for international characters, you could use ICU4J (Java) and Paul Hastings' Transliterator.CFC to transliterate all of the characters, and then replace any remaining spaces, dashes, slashes, etc. with dashes.

https://gist.github.com/JamoCA/ec4617b066fc4bb601f620bc93bacb57

http://site.icu-project.org/download

After installing both, you can convert non-Latin characters by specifying the language ID (the target to convert to) and passing the string to be converted:

<cfset Transliterator = CreateObject("component","transliterator")>

<cfoutput>
<cfloop array="#TestStrings#" index="TestString">
<h3>TestString = "#TestString#"</h3>
<blockquote>
    <div>CFC-1 = #Transliterator.transliterate('Latin-ASCII', TestString)#</div>
    <div>CFC-2 = #Transliterator.transliterate('any-NFD; [:nonspacing mark:] any-remove; any-NFC', TestString)#</div>       
</blockquote>
<hr>
</cfloop>
</cfoutput>

<h2>Available Language IDs</h2>
<cfdump var="#Transliterator.getAvailableIDs()#" label="Language IDs">
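
If installing ICU4J is not an option, the second transform above (decompose, drop the nonspacing marks, recompose) can be approximated with the JDK's own `java.text.Normalizer`. The following is a minimal sketch, not part of Transliterator.CFC: it assumes the server's JVM is Java 6 or later and that the template is read as UTF-8, and the function name `stripAccents` is hypothetical.

```cfm
<cfprocessingdirective pageencoding="utf-8">

<!--- Hypothetical helper: decomposes accented letters (NFD) and removes the combining marks. --->
<cffunction name="stripAccents" output="false" returntype="string">
    <cfargument name="str" type="string" required="true">
    <!--- Assumption: java.text.Normalizer is available (Java 6 or later). --->
    <cfset var normalizer = createObject("java", "java.text.Normalizer")>
    <cfset var nfdForm = createObject("java", "java.text.Normalizer$Form").NFD>
    <cfset var decomposed = normalizer.normalize(javaCast("string", arguments.str), nfdForm)>
    <!--- Drop the combining diacritical marks left behind by the decomposition. --->
    <cfreturn decomposed.replaceAll("\p{InCombiningDiacriticalMarks}+", "")>
</cffunction>

<!--- Usage: strip accents, then collapse everything else into single dashes. --->
<cfset slug = lCase(trim(stripAccents("año ñññ àáâãäå")))>
<cfset slug = reReplace(slug, "[^a-z0-9]+", "-", "all")>
<cfset slug = reReplace(slug, "^-+|-+$", "", "all")>
<cfoutput>#slug#</cfoutput>
```

Letters with no canonical decomposition, such as Æ, Ø and Ð, pass through this helper unchanged, so they would still need an explicit mapping like the `ReplaceList` call in the function above, or the ICU4J `Latin-ASCII` transform.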
James Moberg