Problem with anchor links using resolveurl

Question

I'm using <cfhttp> to pull in content from another site (ColdFusion) with resolveurl="true" so that all the links work. The problem is that resolveurl also turns the anchor links (href="#search") into absolute links, which breaks them. Is there a way to make resolveurl="true" skip anchor links?

Loop through the result and run a REPLACE on relevant links to trim them back down to anchors? https://helpx.adobe.com/coldfusion/cfml-reference/coldfusion-functions/functions-m-r/rematchnocase.html — TRose, Oct 09 '19 at 17:25
Thanks... my ColdFusion skills are limited and I'm not sure how to code that out. This is the link: href="https://www.ccri.edu:443/_resources-2019/includes/#search". This is what it should be: href="#search" — dbaker6, Oct 09 '19 at 19:05
*as well breaking them* Breaking them HOW? An absolute url with an anchor is perfectly valid. — SOS, Oct 14 '19 at 17:07

TRose · Answer 1 · 2019-10-09T20:05:02.590

For starters, let's use the tutorial code from Adobe.com posted in the comments. You'll want to do something similar.

<cfhttp url="https://www.adobe.com" 
 method="get" result="httpResp" timeout="120">
    <cfhttpparam type="header" name="Content-Type" value="application/json" />
</cfhttp>

<cfscript>
    // Find all the absolute URLs in a web page retrieved via cfhttp.
    // The search is case sensitive. The last group keeps any fragment
    // (the anchor) attached to the URL, and the path allows hyphens.
    result = REMatch("https?://([-\w\.]+)+(:\d+)?(/([-\w/\.]*(\?[^\s""'<>##]+)?)?)?(##[-\w]+)?", httpResp.Filecontent);
</cfscript>

<!--- Now loop through those URLs --->

<cfoutput>
<cfloop array="#result#" item="item" index="index">
<cfif find("##", item)>
<!--- Your logic if the URL carries an anchor --->
<cfelse>
<!--- Your logic if it's a plain link --->
</cfif>
<br/>
</cfloop>
</cfoutput>

If resolveurl="true" puts a full URL in front of the anchor, as you describe (I've been getting inconsistent results with it), use this to grab only the anchor part.

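<cfoutput>
<cfloop array="#result#" item="item" index="index">
<!--- Only URLs that contain an anchor; print the anchor with its leading hash --->
<cfif find("##", item)>
###listLast(item, "##")#<br/>
</cfif>
</cfloop>
</cfoutput>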

This code grabs all the URLs and parses them for anchors.

You'll have to decide what to do next inside your loop. Maybe preserve the values and add them to a new array, so you can save the content somewhere with the links fixed?

Without knowing more about your setup, I can't say which approach fits best.

score 0 · Answer 2 · answered Oct 10 '19 at 13:26

There does not appear to be a way to prevent CF from resolving the hashes. In our usage, the current behavior is actually beneficial, since when we present content from another site we usually want the user to be sent there.

Here is a way to use regular expressions to replace a link's href value with just the anchor when one is present. I'm sure it could trip up on badly malformed HTML.

<cfsavecontent variable="testcontent">
    <strong>test</strong>
    <a href="http://google.com">go to google</a>
    <a href="http://current.domain/thispage#section">go to section</a>
</cfsavecontent>

<!--- Escape the dots so the domain is matched literally --->
<cfset domain = replace("current.domain", ".", "\.", "all") />
<!--- Groups: \1 = href= plus opening quote, \4 = the anchor, \5 = closing quote --->
<cfset match = "(href\s*=\s*(""|'))\s*(http://#domain#[^##'""]+)(##[^##'""]+)\s*(""|')" />
<cfset result = reReplaceNoCase(testcontent, match, "\1\4\5", "all") />

<cfoutput><pre>#encodeForHTML(result)#</pre></cfoutput>

Output

    <strong>test</strong>
    <a href="http://google.com">go to google</a>
    <a href="#section">go to section</a>

Another option, if you are displaying the content in a normal page with JS/jQuery available, is to run through each link on display and update it to just the anchor. This is less likely to fail on malformed HTML. Let me know if you have any interest in that approach.
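
For the client-side route, a minimal sketch could look something like this. It uses plain JavaScript rather than jQuery. The helper names `toBareAnchor` and `localizeAnchors` are made up for illustration, and the base URL is assumed from the link the asker posted in the comments, so adjust it to whatever address you pass to cfhttp.

```javascript
// Sketch only: toBareAnchor and localizeAnchors are hypothetical helpers,
// and the base URL below is an assumption taken from the asker's comment.

// Return "#fragment" when href points at baseUrl itself plus a fragment;
// otherwise return href unchanged.
function toBareAnchor(href, baseUrl) {
  let base, target;
  try {
    base = new URL(baseUrl);
    target = new URL(href, base);
  } catch (e) {
    return href; // leave values that cannot be parsed alone
  }
  const samePage =
    target.origin === base.origin &&
    target.pathname === base.pathname &&
    target.search === base.search;
  return samePage && target.hash.length > 1 ? target.hash : href;
}

// Rewrite every link under root that resolveurl turned into
// "fetched page + anchor" back into a bare anchor.
function localizeAnchors(root, baseUrl) {
  root.querySelectorAll("a[href]").forEach(function (link) {
    const raw = link.getAttribute("href");
    const fixed = toBareAnchor(raw, baseUrl);
    if (fixed !== raw) {
      link.setAttribute("href", fixed);
    }
  });
}

// In a browser, run once the included content is on the page.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", function () {
    localizeAnchors(document, "https://www.ccri.edu/_resources-2019/includes/");
  });
}
```

Comparing parsed URLs instead of raw strings means the explicit `:443` that resolveurl adds does not matter, because the URL parser drops the default port for https. Links to other pages or other sites are left untouched.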
