Our application imports binaries (mostly PDF) from a legacy system and stores them on a page together with some metadata.

If there was a change, the page is activated automatically. We see the replication events in the replication log, and an invalidate event is also logged on the dispatcher. But there is no eviction entry, and thus the old binary is still cached.

We also have HTML pages next to these container pages for the binaries, and they work as expected. Here are the two log entries for the successful HTML page and the unsuccessful PDF:

OK:

[Thu Jul 03 09:26:33 2014] [D] [27635(24)] Found farm website for localhost:81 
[Thu Jul 03 09:26:33 2014] [D] [27635(24)] checking [/dispatcher/invalidate.cache] 
[Thu Jul 03 09:26:33 2014] [I] [27635(24)] Activation detected: action=Activate [/content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/test] 
[Thu Jul 03 09:26:33 2014] [I] [27635(24)] Touched /app/C2Z/dyn/c2zcqdis/docroot/.stat 
[Thu Jul 03 09:26:33 2014] [I] [27635(24)] Evicted /app/C2Z/dyn/c2zcqdis/docroot/content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/test.html 
[Thu Jul 03 09:26:33 2014] [D] [27635(24)] response.status = 200 
[Thu Jul 03 09:26:33 2014] [D] [27635(24)] response.headers[Server] = "Communique/2.6.3 (build 5221)" 
[Thu Jul 03 09:26:33 2014] [D] [27635(24)] response.headers[Content-Type] = "text/html" 
[Thu Jul 03 09:26:33 2014] [D] [27635(24)] cache flushed 
[Thu Jul 03 09:26:33 2014] [I] [27635(24)] "GET /dispatcher/invalidate.cache" 200 13 2ms

Not OK:

[Thu Jul 03 09:30:45 2014] [D] [27635(24)] Found farm website for localhost:81 
[Thu Jul 03 09:30:45 2014] [D] [27635(24)] checking [/dispatcher/invalidate.cache] 
[Thu Jul 03 09:30:45 2014] [I] [27635(24)] Activation detected: action=Activate [/content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/as2p_vvm_ch_gl_fix_chf__pdf] 
[Thu Jul 03 09:30:45 2014] [I] [27635(24)] Touched /app/C2Z/dyn/c2zcqdis/docroot/.stat 
[Thu Jul 03 09:30:45 2014] [D] [27635(24)] response.status = 200 
[Thu Jul 03 09:30:45 2014] [D] [27635(24)] response.headers[Server] = "Communique/2.6.3 (build 5221)" 
[Thu Jul 03 09:30:45 2014] [D] [27635(24)] response.headers[Content-Type] = "text/html" 
[Thu Jul 03 09:30:45 2014] [D] [27635(24)] cache flushed 
[Thu Jul 03 09:30:45 2014] [I] [27635(24)] "GET /dispatcher/invalidate.cache" 200 13 1ms

The PDF in this case is stored in a node called 'download' directly below the jcr:content node. Its HTML container is never requested directly and is thus not available on the dispatcher. So a user requests the file directly: /content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/as2p_vvm_ch_gl_fix_chf__pdf/jcr%3acontent/download/file.res/as2p_vvm_ch_gl_fix_chf_.pdf

In the dispatcher.any we flush all HTML pages on activation, but not the binaries. For testing, we added an allow rule for *.pdf, but this didn't help either.

/invalidate
{
/0000
  {
  /glob "*"
  /type "deny"
  }
/0001
  {
  /glob "*.html"
  /type "allow"
  }
}

In my opinion, the invalidate call should just delete the whole folder: /content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/as2p_vvm_ch_gl_fix_chf__pdf

Any ideas why our binaries do not get flushed?

UPDATE: Another post mentions the statfileslevel property in the dispatcher.any. In our environment this is commented out. Could this be the problem? Sadly, I don't fully understand how it is supposed to work. Is the level counted from the wwwroot or from the page that is activated?

Thomas

2 Answers

It looks like your problem with dispatcher flushing is that the path the file is being served from is using jcr%3acontent when it should use _jcr_content.

Dispatcher flushing deletes the folder _jcr_content under the path that is being flushed. It does not delete jcr%3acontent (urldecoded as jcr:content). So you should instead serve the pdf using this URL: /content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/as2p_vvm_ch_gl_fix_chf__pdf/_jcr_content/download/file.res/as2p_vvm_ch_gl_fix_chf_.pdf

This would then cache the pdf file under: {CACHEROOT}/content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/as2p_vvm_ch_gl_fix_chf__pdf/_jcr_content/download/file.res/as2p_vvm_ch_gl_fix_chf_.pdf
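If the download links are generated by your own code, the change amounts to a path substitution. A minimal sketch in Python, assuming a hypothetical helper `to_dispatcher_friendly` (not part of AEM or the dispatcher) that replaces a `jcr:content` path segment, URL-encoded or not, with `_jcr_content`:

```python
import re

# Matches the content node written as "jcr:content" or URL-encoded as
# "jcr%3acontent" / "jcr%3Acontent", but only as a whole path segment.
_JCR_CONTENT = re.compile(r"(?<=/)jcr(?::|%3a)content(?=/|$)", re.IGNORECASE)


def to_dispatcher_friendly(path: str) -> str:
    """Hypothetical helper: rewrite the jcr:content segment to _jcr_content."""
    return _JCR_CONTENT.sub("_jcr_content", path)


if __name__ == "__main__":
    old = ("/content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/"
           "as2p_vvm_ch_gl_fix_chf__pdf/jcr%3acontent/download/file.res/"
           "as2p_vvm_ch_gl_fix_chf_.pdf")
    print(to_dispatcher_friendly(old))
```

Paths that already use `_jcr_content` are left unchanged, so the helper can be applied to every generated link.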

Then, when this path is flushed, the dispatcher deletes the subdirectory _jcr_content under the flushed path: /content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/as2p_vvm_ch_gl_fix_chf__pdf

To go into more detail, when you issue a flush request for the path above, the following files and directories are deleted:

  • /content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/as2p_vvm_ch_gl_fix_chf__pdf.* where * is a wildcard
  • /content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/as2p_vvm_ch_gl_fix_chf__pdf/_jcr_content
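The two rules above can be modelled in a few lines to see why the `jcr%3acontent` directory survives a flush. This is only a sketch of the behaviour described in this answer, not dispatcher code; the function name `evicted_by_flush` and the shortened example paths are made up for illustration:

```python
from typing import Iterable, List


def evicted_by_flush(handle: str, cached_paths: Iterable[str]) -> List[str]:
    """Return the cached paths (relative to the cache root) that a flush of
    `handle` would remove under the two rules described above."""
    jcr_dir = handle + "/_jcr_content"
    evicted = []
    for path in cached_paths:
        if path.startswith(handle + "."):
            # Rule 1: <handle>.* (for example <handle>.html)
            evicted.append(path)
        elif path == jcr_dir or path.startswith(jcr_dir + "/"):
            # Rule 2: everything inside <handle>/_jcr_content
            evicted.append(path)
    return evicted


if __name__ == "__main__":
    handle = "/content/site/doc_pdf"  # hypothetical page path
    cache = [
        handle + ".html",
        handle + "/_jcr_content/download/file.res/doc.pdf",
        handle + "/jcr%3acontent/download/file.res/doc.pdf",
    ]
    for path in evicted_by_flush(handle, cache):
        print("evicted:", path)
```

In this model the file cached under `jcr%3acontent` is never matched by either rule, which corresponds to the missing "Evicted" entry in the log from the question.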

See slide 23 in this presentation for details on how flushing works: http://www.slideshare.net/andrewmkhoury/aem-cq-dispatcher-caching-webinar-2013

Andrew Khoury

I'm not sure this is the root cause, but I suspect you need to go to localhost:4503/etc/replication/agents.publish.html. (Note that this is a publish instance. You could do it on the author and replicate the replication agents, but for the purposes of a proof of concept, just do it directly on the publisher.)

Then go to your dispatcher flush agent, and click on edit settings.

Go to the triggers panel.

Make sure that the "On Receive" trigger is checked. This enables chain replication: when an asset is published, it is deleted directly from the dispatcher cache, causing a miss on the next request, so that a fresh copy is pulled from the publisher.

Note that this kind of flushing is distinct from statfileslevel-based invalidation, which invalidates a directory rather than a fully qualified path to the asset.


By the way, the cause is not the statfileslevel setting. If it is commented out, it defaults to 0, which invalidates anything below. What you seem to be looking for is an active delete from the cache. This is possible, as Dave just outlined to me for an unrelated problem in this post: Is it possible to recursively flush directories in the CQ5/AEM apache dispatcher?
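To illustrate how I understand the level to be counted: it starts at the docroot (level 0), not at the activated page. The sketch below is a simplified model, not dispatcher code; the function name `touched_stat_files` is made up, and the rule it encodes (touch the .stat files from the docroot down to the configured level or the folder of the flushed resource, whichever is shallower) is my reading of the documentation:

```python
from typing import List


def touched_stat_files(docroot: str, handle: str, statfileslevel: int) -> List[str]:
    """Model which .stat files are touched when `handle` is invalidated."""
    segments = [s for s in handle.strip("/").split("/") if s]
    parent_dirs = segments[:-1]  # folders that contain the flushed resource
    depth = min(statfileslevel, len(parent_dirs))

    current = docroot.rstrip("/")
    touched = [current + "/.stat"]  # level 0: the docroot itself
    for segment in parent_dirs[:depth]:
        current += "/" + segment
        touched.append(current + "/.stat")
    return touched


if __name__ == "__main__":
    docroot = "/app/C2Z/dyn/c2zcqdis/docroot"
    handle = "/content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/test"
    for level in (0, 2):
        print(level, touched_stat_files(docroot, handle, level))
```

With level 0 the model touches only the .stat file in the docroot, which is what the "Touched" line in your log shows.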

One approach would be to create a flush interceptor, essentially a custom servlet on the publisher. You would then configure the normal flush agent to call this local servlet on the publisher.

The servlet then detects whether it needs to delete the directory or any particular files within it. It can transform the flush path into the required path and, instead of a FLUSH action, use a DELETE action.

It would still be very important to send the flush to the normal dispatcher location.
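As a rough idea of what the interceptor would send on to the dispatcher, here is a sketch that only builds the request and does not send it. The `/dispatcher/invalidate.cache` endpoint is the one from your log, and `CQ-Action` and `CQ-Handle` are the headers a flush agent normally sets; the action value `Delete`, the helper name `build_delete_request`, and the host in the example are assumptions you would need to verify against your dispatcher version:

```python
from typing import Dict, Tuple

INVALIDATE_PATH = "/dispatcher/invalidate.cache"  # endpoint seen in the dispatcher log


def build_delete_request(dispatcher_base: str, page_handle: str) -> Tuple[str, Dict[str, str]]:
    """Hypothetical helper: build URL and headers for a delete-style flush."""
    url = dispatcher_base.rstrip("/") + INVALIDATE_PATH
    headers = {
        "CQ-Action": "Delete",     # assumption: used instead of "Activate"
        "CQ-Handle": page_handle,  # the (possibly transformed) path to remove
        "Content-Length": "0",     # the flush request carries no body
    }
    return url, headers


if __name__ == "__main__":
    url, headers = build_delete_request(
        "http://localhost:81",  # host and port taken from the log in the question
        "/content/offering/s2p/en/offerings/documents/Swiss_Mandate_Line/Review/"
        "as2p_vvm_ch_gl_fix_chf__pdf",
    )
    print(url)
    for name, value in headers.items():
        print(f"{name}: {value}")
```

The servlet would issue this request in addition to, not instead of, the regular flush.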

Hope this helps.

Bayani Portier
  • Thanks for the answer, but the "On Receive" is checked and as I mentioned we don't have DAM Assets here, but binaries directly attached to a page. And with that of course only the page gets activated, expecting that the whole folder structure generated on the dispatcher gets evicted as well, which sadly isn't the case. – Thomas Jul 25 '14 at 06:20
  • Added another approach and explained the stats file level. – Bayani Portier Jul 25 '14 at 12:47