2018-09-17

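XML default namespaces

The simplest XML documents declare no namespace at all. In that case an XPath expression can refer to elements by their plain node names. Take the following XML: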

<?xml version="1.0" encoding="UTF-8"?>
<books>
	<book>
	  <title>book1</title>
	  <author>zhangsan</author>
	</book>
	<book>
	  <title>book2</title>
	  <author>wanger</author>
	</book>
	<book>
	  <title>book3</title>
	  <author>zhangsan</author>
	</book>
</books>

Parsing this with dom4j and XPath is straightforward. For example, to find every book whose author is zhangsan:

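// Listing 1
import java.util.List;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;
import org.dom4j.XPath;

// str holds the XML text shown above
Document document = DocumentHelper.parseText(str);

// Method 1: evaluate the expression directly on the document
List<Node> list = document.selectNodes("//book/author[text()=\"zhangsan\"]");

// Method 2: build an XPath object first, then evaluate it (use instead of method 1)
// XPath x = document.createXPath("//book/author[text()=\"zhangsan\"]");
// List<Node> list = x.selectNodes(document);

for (Node node : list) {
	// Each match is an <author> element; print the <title> of its parent <book>
	System.out.println(node.getParent().elementText("title"));
}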

Whether you use method 1 or method 2, the output is

book1
book3
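
As a side note, the walk from <author> back up to its parent can be folded into the XPath expression itself. The sketch below is only an illustration: it uses the JDK's built-in javax.xml.xpath API instead of dom4j so that it runs without extra dependencies, and the class and method names (TitlesByAuthor, titlesBy) are made up for this example. The expression itself is plain XPath 1.0, so it should behave the same way when passed to dom4j's selectNodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original listings.
final class TitlesByAuthor {
    // The namespace-free sample document from above, without whitespace.
    static final String XML =
        "<books>"
        + "<book><title>book1</title><author>zhangsan</author></book>"
        + "<book><title>book2</title><author>wanger</author></book>"
        + "<book><title>book3</title><author>zhangsan</author></book>"
        + "</books>";

    // Returns the titles of all books by the given author, in document order.
    // Assumes the author name contains no single quote.
    static List<String> titlesBy(String xml, String author) {
        // Select the <title> elements directly instead of walking up from <author>.
        String expr = "//book[author='" + author + "']/title";
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent());
            }
            return titles;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expr, e);
        }
    }

    public static void main(String[] args) {
        // Prints book1 and book3, one per line.
        titlesBy(XML, "zhangsan").forEach(System.out::println);
    }
}
```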

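If the books element declares namespaces and the child elements all use one of the declared prefixes, things are still easy to handle. The modified XML looks like this: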

<?xml version="1.0" encoding="UTF-8"?>
<books xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
	<xsi:book>
	  <xsi:title>book1</xsi:title>
	  <xsi:author>zhangsan</xsi:author>
	</xsi:book>
	<xsi:book>
	  <xsi:title>book2</xsi:title>
	  <xsi:author>wanger</xsi:author>
	</xsi:book>
	<xsi:book>
	  <xsi:title>book3</xsi:title>
	  <xsi:author>zhangsan</xsi:author>
	</xsi:book>
</books>

Here the XPath expression only needs a small change: put the prefix in front of each node name and you get the same result. Sample code:

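// Listing 2
Document document = DocumentHelper.parseText(str);

// Method 1: the xsi prefix is declared in the document itself
List<Node> list = document.selectNodes("//xsi:book/xsi:author[text()=\"zhangsan\"]");

// Method 2: same expression through an XPath object (use instead of method 1)
// XPath x = document.createXPath("//xsi:book/xsi:author[text()=\"zhangsan\"]");
// List<Node> list = x.selectNodes(document);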

However, if the books element declares a default namespace and the child elements carry no prefix, things get a bit more awkward. Sample XML:

<?xml version="1.0" encoding="UTF-8"?>
<books xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
	<book>
	  <title>book1</title>
	  <author>zhangsan</author>
	</book>
	<book>
	  <title>book2</title>
	  <author>wanger</author>
	</book>
	<book>
	  <title>book3</title>
	  <author>zhangsan</author>
	</book>
</books>

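Now an XPath expression without a namespace, as in Listing 1, finds nothing: the unprefixed elements inherit the default namespace from books, while an unprefixed name in XPath 1.0 always means "no namespace". Listing 2 does not help either, because the document gives the default namespace no prefix to write in the expression. The way out is to bind a prefix of your own choosing to the default namespace URI. There are two ways to do this, and both take a Map whose keys are prefixes and whose values are namespace URIs: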

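// Listing 3
// Additionally requires java.util.HashMap, java.util.Map and org.dom4j.DocumentFactory
// "p" is an arbitrary prefix; only the URI has to match the document
Map<String, String> nsContext = new HashMap<String, String>();
nsContext.put("p", "http://www.springframework.org/schema/beans");

// Method 1: register the mapping on the shared DocumentFactory
// (it then applies to every XPath created through that factory)
DocumentFactory.getInstance().setXPathNamespaceURIs(nsContext);
Document document = DocumentHelper.parseText(str);
List<Node> list = document.selectNodes("//p:book/p:author[text()=\"zhangsan\"]");

// Method 2: register the mapping on a single XPath object (use instead of method 1)
// Document document = DocumentHelper.parseText(str);
// XPath x = document.createXPath("//p:book/p:author[text()=\"zhangsan\"]");
// x.setNamespaceURIs(nsContext);
// List<Node> list = x.selectNodes(document);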

Run the code again and it produces the same result as Listing 1 and Listing 2.
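
To see the three cases side by side without dom4j on the classpath, here is a rough sketch using the JDK's own javax.xml.xpath API. It is an illustration of the same idea, not the dom4j code above: the class name DefaultNamespaceDemo and the select helper are invented for this example, and the local-name() expression is an extra alternative that the listings above do not use. The unprefixed expression matches nothing, while the expression with a self-chosen prefix bound to the default namespace URI finds both titles.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; all names here are illustrative.
final class DefaultNamespaceDemo {
    static final String NS = "http://www.springframework.org/schema/beans";

    // Sample document: a default namespace on <books>, no prefixes on children.
    static final String XML =
        "<books xmlns=\"" + NS + "\">"
        + "<book><title>book1</title><author>zhangsan</author></book>"
        + "<book><title>book2</title><author>wanger</author></book>"
        + "<book><title>book3</title><author>zhangsan</author></book>"
        + "</books>";

    // Binds the single, arbitrarily chosen prefix "p" to NS.
    static final NamespaceContext CONTEXT = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return "p".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
        }
        @Override public String getPrefix(String uri) {
            return NS.equals(uri) ? "p" : null;
        }
        @Override public Iterator<String> getPrefixes(String uri) {
            return NS.equals(uri)
                ? Collections.singletonList("p").iterator()
                : Collections.<String>emptyIterator();
        }
    };

    // Evaluates expr against XML and returns the text of each selected node.
    static List<String> select(String expr) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // so that elements carry their namespace URI
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(CONTEXT);
            NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expr, e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed names mean "no namespace", so nothing matches: []
        System.out.println(select("//book[author='zhangsan']/title"));
        // Prefix "p" bound to the default namespace URI: [book1, book3]
        System.out.println(select("//p:book[p:author='zhangsan']/p:title"));
        // Alternative without any prefix map, matching on local names only
        System.out.println(select(
            "//*[local-name()='book'][*[local-name()='author']='zhangsan']/*[local-name()='title']"));
    }
}
```

The local-name() form avoids the prefix map but ignores the namespace entirely, so it would also match a book element from any other namespace. Binding a prefix is the more precise choice when that matters.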
