I'm using PHP's XMLReader to validate and parse XML data. The XML is validated against an XSD schema loaded from a local file via XMLReader::setSchema, plus remote XSD schemas pulled in over http:// via xsd:import/include. Everything works fine, but the remote schemas are fetched from the network and the local schema is read from disk on every call.

So my question is:

Is there a way to cache remote XSD schemas in local RAM? For local schema files I think tmpfs on Linux will work fine, but is there another way to cache local XSD schema files?

Solution

Thanks to VolkerK for pointing out the XML catalog system. It works fine with libxml/PHP XMLReader. On Linux, just edit the file /etc/xml/catalog (on Fedora it comes from the xml-common package) and add entries like these, for example:

<rewriteURI uriStartString="http://schemas.xmlsoap.org/soap/envelope/" rewritePrefix="/etc/xml/SOAP-Envolope.xsd"/>
<rewriteURI uriStartString="http://schemas.xmlsoap.org/soap/encoding/" rewritePrefix="/etc/xml/SOAP-Encoding.xsd"/>

Then manually download each schema (e.g. http://schemas.xmlsoap.org/soap/encoding/ -> /etc/xml/SOAP-Encoding.xsd). After that, PHP XMLReader works as expected when parsing SOAP messages.
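
For context, here is a rough sketch of how those entries might sit inside a complete catalog file. The two rewriteURI entries and the local paths are the ones from my example above. The wrapping catalog element and its namespace follow the OASIS XML Catalogs format. If /etc/xml/catalog already exists on your system, the entries should go inside its existing catalog element rather than in a new file.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Each remote schema URL is mapped to a manually downloaded local copy -->
  <rewriteURI uriStartString="http://schemas.xmlsoap.org/soap/envelope/" rewritePrefix="/etc/xml/SOAP-Envolope.xsd"/>
  <rewriteURI uriStartString="http://schemas.xmlsoap.org/soap/encoding/" rewritePrefix="/etc/xml/SOAP-Encoding.xsd"/>
</catalog>
```

Note that rewriteURI replaces the matching prefix of the requested URI, so these entries assume the schemas are imported using exactly the URLs shown.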

hdang
  • Is this question "only" about `setSchema('localfile') vs setSchema('http://something')` or do you xsd:import/include other schema files that you also want to cache? – VolkerK Jul 26 '11 at 07:22
  • Of course it's xsd:import/include. For setSchema('http://') it's easy to implement my own cache system. – hdang Jul 26 '11 at 07:31
  • Here's the dummy-guide on adding the xml catalog: 1. create _/etc/catalog_. 2. `xmlcatalog --create --noout --add "rewriteURI" "http://schemas.xmlsoap.org/soap/envelope/" \ "file:///etc/xml/soap-envelope-1.1.xsd" \ /etc/xml/catalog` 3. restart apache. – Christof Oct 20 '11 at 13:23

1 Answer

PHP's XMLReader uses libxml, and libxml supports XML catalogs:

What is a catalog? Basically it's a lookup mechanism [...]
It is basically used for 3 things:
[...]
  • providing a local cache mechanism allowing to load the entities associated to public identifiers or remote resources, this is a really important feature for any significant deployment of XML or SGML since it allows to avoid the aleas and delays associated to fetching remote resources.

Haven't tried it but I guess it's worth a test run.

VolkerK
  • That sounds very good. If PHP XMLReader works with XML catalogs, it seems this would resolve not only the caching issue but also my schema version issue: as I understand it, I wouldn't have to call setSchema explicitly to load my own schema (the problem being that I don't know which version of the schema to load). I could just define it in the catalog and let libxml look it up automatically. Is that correct? I will try it and report back soon, thank you! – hdang Jul 26 '11 at 09:17