Namespace-aware XPath expression in Java with Saxon


In one program I am working on, I need to execute several XPath expressions from Java code. I was using the built-in JAXP reference implementation (RI) that ships with the JDK. The XML documents I write the expressions for have a default namespace, and a quick way to get this working with JAXP was to disable namespace awareness for the expressions. As it turned out, however, I also need XPath 2.0 support, and the JAXP RI only supports XPath 1.0.

Fortunately, a mature and free XPath 2.0 (and more) implementation is available: Saxon HE. Using it is fairly simple. You download the jar file, put it on your classpath and create your factories from Saxon instead of the RI. However, Saxon doesn’t seem to offer a way to disable namespace awareness, which meant that all my existing XPath expressions returned zero nodes. Interestingly, attribute selection still worked fine. Still, I had to make my expressions and the code that executes them namespace-aware.

Although the setup is quite simple, it took me some time to work out, which is why I wrote this post: a very simple example of using namespace-aware XPath expressions with Saxon. Here is a main class that does this:

import java.io.ByteArrayInputStream;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFactoryConfigurationException;

import net.sf.saxon.lib.NamespaceConstant;

import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class SaxonTest {

	public static void main(String[] args)
			throws XPathFactoryConfigurationException,
			XPathExpressionException, ParserConfigurationException,
			SAXException, IOException {
		System.setProperty("javax.xml.xpath.XPathFactory:"
				+ NamespaceConstant.OBJECT_MODEL_SAXON,
				"net.sf.saxon.xpath.XPathFactoryImpl");

		DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
		docFactory.setNamespaceAware(true); // build a DOM whose nodes carry their namespace URI
		DocumentBuilder builder = docFactory.newDocumentBuilder();
		Document doc = builder.parse(new ByteArrayInputStream(
				"<book xmlns=\"http://my-ns.com\" name=\"foobar\">test</book>"
						.getBytes(java.nio.charset.StandardCharsets.UTF_8)));

		XPathFactory factory = XPathFactory
				.newInstance(NamespaceConstant.OBJECT_MODEL_SAXON);
		XPath xpath = factory.newXPath();
		xpath.setNamespaceContext(new MyNamespaceContext());

		XPathExpression expr = xpath.compile("//myNs:book");
		Object result = expr.evaluate(doc);
		System.out.println(result);

	}

}

The System.setProperty call at the top of main and the XPathFactory.newInstance(NamespaceConstant.OBJECT_MODEL_SAXON) call further down are the code that is necessary to use Saxon instead of the RI. The lines in between construct a small XML document with a default namespace. The XML is read from a String and not from an external file to keep the example self-contained. The two lines after the factory call create a new XPath object and set its namespace context via a custom class (see code below). The last three lines of main compile and evaluate a namespace-qualified XPath expression that selects all occurrences of the book element in that namespace, and print the result. There is nothing XPath 2.0-specific about the expression, but thanks to Saxon you could use the additional XPath 2.0 functions there. If everything is set up properly, the println call prints the content of the only node (which is “test”) to the console.
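To make it clearer why unprefixed expressions stop matching once namespaces are taken into account, while attribute selection keeps working, here is a small sketch. It deliberately uses only the JDK's built-in XPath 1.0 engine on a namespace-aware DOM, not Saxon, so it runs without any extra jar. The class name DefaultNamespaceDemo and its helper methods are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical demo class; uses the JDK's default XPath factory, not Saxon.
class DefaultNamespaceDemo {

	// Parses the sample document from this post into a namespace-aware DOM.
	static Document parseSample() {
		try {
			DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
			docFactory.setNamespaceAware(true);
			String xml = "<book xmlns=\"http://my-ns.com\" name=\"foobar\">test</book>";
			return docFactory.newDocumentBuilder().parse(
					new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
		} catch (Exception e) {
			throw new IllegalStateException("could not parse the sample document", e);
		}
	}

	// Returns how many nodes the given expression selects in the document.
	static int count(Document doc, String expression) {
		try {
			XPath xpath = XPathFactory.newInstance().newXPath();
			NodeList nodes = (NodeList) xpath.evaluate(expression, doc,
					XPathConstants.NODESET);
			return nodes.getLength();
		} catch (XPathExpressionException e) {
			throw new IllegalStateException("could not evaluate " + expression, e);
		}
	}

	public static void main(String[] args) {
		Document doc = parseSample();
		// An unprefixed element name means "no namespace", so nothing matches.
		System.out.println(count(doc, "//book")); // prints 0
		// The unprefixed attribute is in no namespace, so it is still found.
		System.out.println(count(doc, "//@name")); // prints 1
		// Matching on the local name only ignores the namespace altogether.
		System.out.println(count(doc, "//*[local-name()='book']")); // prints 1
	}
}
```

The default namespace applies to the book element but not to its unprefixed name attribute, which is why only the element expressions need a prefix and a namespace context.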

The interesting part is the one that sets the custom namespace (the call to setNamespaceContext). You can achieve this by writing your own NamespaceContext; that is, by implementing the NamespaceContext interface. Here is a very basic implementation that works for the example:

import java.util.Iterator;

import javax.xml.namespace.NamespaceContext;

public class MyNamespaceContext implements NamespaceContext {

	@Override
	public String getNamespaceURI(String prefix) {
		if ("myNs".equals(prefix)) {
			return "http://my-ns.com";
		}
		return null;
	}

	@Override
	public String getPrefix(String namespaceURI) {
		return null;
	}

	@SuppressWarnings("rawtypes")
	@Override
	public Iterator getPrefixes(String namespaceURI) {
		return null;
	}

}

The only thing you definitely need in this example is a way to resolve the namespace from its prefix (the one that is used in the XPath expression). This is implemented in the getNamespaceURI(String prefix) method. More complex use cases might also require the reverse lookup, and in that case the other methods can be useful. Interestingly, the getPrefixes() method returns a non-generified Iterator, which results in a warning for this method. I cannot think of a single reason why the interface is not generified; maybe it has been overlooked. Anyway, the warning can easily be suppressed using the @SuppressWarnings("rawtypes") annotation.
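For use cases that need the reverse lookup as well, a map-based context is one possible way to go. The following is only a sketch under my own assumptions: the class name MapNamespaceContext and its bind method are made up for this post and are not part of JAXP or Saxon, and the predefined xml and xmlns prefixes that the interface documentation also mentions are left out for brevity.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

// Hypothetical helper class; not part of JAXP or Saxon.
class MapNamespaceContext implements NamespaceContext {

	// Keeps the bindings in registration order.
	private final Map<String, String> uriByPrefix = new LinkedHashMap<>();

	// Registers a prefix binding and returns this so that calls can be chained.
	MapNamespaceContext bind(String prefix, String namespaceURI) {
		uriByPrefix.put(prefix, namespaceURI);
		return this;
	}

	@Override
	public String getNamespaceURI(String prefix) {
		if (prefix == null) {
			throw new IllegalArgumentException("prefix must not be null");
		}
		String uri = uriByPrefix.get(prefix);
		// The interface documentation asks for "" instead of null for unbound prefixes.
		return uri != null ? uri : XMLConstants.NULL_NS_URI;
	}

	@Override
	public String getPrefix(String namespaceURI) {
		// Returns the first registered prefix, or null if the URI is unbound.
		Iterator<String> prefixes = getPrefixes(namespaceURI);
		return prefixes.hasNext() ? prefixes.next() : null;
	}

	// Declaring the narrower return type Iterator<String> is a legal override
	// of a raw Iterator, so no @SuppressWarnings annotation is needed here.
	@Override
	public Iterator<String> getPrefixes(String namespaceURI) {
		if (namespaceURI == null) {
			throw new IllegalArgumentException("namespaceURI must not be null");
		}
		List<String> prefixes = new ArrayList<>();
		for (Map.Entry<String, String> entry : uriByPrefix.entrySet()) {
			if (entry.getValue().equals(namespaceURI)) {
				prefixes.add(entry.getKey());
			}
		}
		return prefixes.iterator();
	}
}
```

With this class, the main class from above could register its prefix with xpath.setNamespaceContext(new MapNamespaceContext().bind("myNs", "http://my-ns.com")).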
