I have two approaches for validating my XML against an XSD that is stored in the resources of my legacy application. Validation runs 1000+ times a day, and the code runs 24*7.
Approach 1: create a static SchemaFactory
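import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class XmlValidator {
    private static final SchemaFactory schemaFactory;
    static {
        // Name the JDK's built-in factory explicitly because of a conflict between
        // Xerces from an external jar and the one bundled with Java.
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI,
                "com.sun.org.apache.xerces.internal.jaxp.validation.XMLSchemaFactory",
                ClassLoader.getSystemClassLoader());
        try {
            factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        } catch (SAXException e) {
            // setProperty declares checked exceptions, which a static initializer cannot propagate
            throw new ExceptionInInitializerError(e);
        }
        schemaFactory = factory;
    }
    public boolean validateXmlWithXsd(String inputXml, String xsd) {
        try (InputStream stream = new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8));
             StringReader reader = new StringReader(inputXml)) {
            Source schemaFile = new StreamSource(stream);
            Schema schema = schemaFactory.newSchema(schemaFile);
            Validator validator = schema.newValidator();
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
            Source source = new StreamSource(reader);
            validator.validate(source);
            return true; // Validation successful
        } catch (Exception e) {
            // Handle validation errors here
            e.printStackTrace();
            return false; // Validation failed
        }
    }
}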
Approach 2: the same class without the static block; only the method validateXmlWithXsd changes
public boolean validateXmlWithXsd(String inputXml, String xsd) {
    try (InputStream stream = new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8))) {
        // A new factory is created on every call
        SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI,
                "com.sun.org.apache.xerces.internal.jaxp.validation.XMLSchemaFactory",
                ClassLoader.getSystemClassLoader());
        // Restrict external access before compiling the schema, as in Approach 1
        schemaFactory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        schemaFactory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        Source schemaFile = new StreamSource(stream);
        Schema schema = schemaFactory.newSchema(schemaFile);
        Validator validator = schema.newValidator();
        validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        Source source = new StreamSource(new StringReader(inputXml));
        validator.validate(source);
        return true; // Validation successful
    } catch (Exception e) {
        e.printStackTrace();
        return false; // Validation failed
    }
}
My thoughts: Calling SchemaFactory.newInstance with the system class loader on every validation (1000+ times a day in a long-running, 24*7 process) could lead to performance issues, mainly because of class-loading overhead, and does not look like the most efficient approach. So I prefer Approach 1. I also think a static SchemaFactory helps with memory:
Single Instance: When I declare a static SchemaFactory, there is only one instance, shared across all instances of the class. The SchemaFactory is created only once, and every subsequent call to the XML validation method uses the same factory.
Resource Sharing: The SchemaFactory is relatively heavy to create, and it can be configured with various properties. Making it static avoids recreating and reconfiguring it each time we validate XML, which saves memory and CPU cycles.
Approach 2: If we create the SchemaFactory inside the validateXmlWithXsd() method, it becomes eligible for garbage collection once the method exits, and the memory it occupied may be freed. So this approach might not cause significant memory issues either.
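To put rough numbers on the overhead I am guessing at above, here is a minimal single-threaded timing sketch, not a proper benchmark. It makes several assumptions that are not part of my real code: the class name FactoryReuseTiming, the tiny inline XSD and XML, and the one-argument SchemaFactory.newInstance overload (my application uses the three-argument overload shown in both approaches).

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

public class FactoryReuseTiming {
    // Hypothetical minimal schema and document, used only for this sketch.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"note\" type=\"xs:string\"/>"
        + "</xs:schema>";
    static final String XML = "<note>hello</note>";

    // Assumption: JDK default factory lookup instead of the explicit class name.
    static SchemaFactory newFactory() throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        return factory;
    }

    // Compiles the schema and validates the document; returns false on any error.
    static boolean validate(SchemaFactory factory, String xml, String xsd) {
        try {
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    // Approach 1 style: one factory reused for every validation.
    static long timeSharedFactory(int iterations) throws Exception {
        SchemaFactory shared = newFactory();
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            validate(shared, XML, XSD);
        }
        return System.nanoTime() - start;
    }

    // Approach 2 style: a new factory for every validation.
    static long timeFreshFactory(int iterations) throws Exception {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            validate(newFactory(), XML, XSD);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        int iterations = 1000;
        timeSharedFactory(100); // crude warm-up so JIT and class loading skew the numbers less
        timeFreshFactory(100);
        System.out.println("shared factory: " + timeSharedFactory(iterations) / 1_000_000 + " ms");
        System.out.println("fresh factory:  " + timeFreshFactory(iterations) / 1_000_000 + " ms");
    }
}
```

The absolute numbers will depend on the JVM, the schema size, and warm-up, so I would treat the output only as a rough indication of whether factory creation matters at 1000+ validations a day.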
Can anyone please tell me whether Approach 1 has any disadvantages compared with Approach 2?