After parsing the following XML,
<html>
<body>
<a>
<div>
<span>foo</span>
</div>
</a>
</body>
</html>
querying the resulting Document with javax.xml.xpath indicates the following:
div is the parent node of a
a is the parent node of span
Why is this, and how can I properly parse this XML?
Here is the code I am using, followed by its output.
String myxml = ""
        + "<html>"
        + "<body>"
        + "<a>"
        + "<div>"
        + "<span>foo</span>"
        + "</div>"
        + "</a>"
        + "</body>"
        + "</html>";
Document doc = HttpDownloadUtilities.getWebpageDocument_fromSource(myxml);
XPath xPath = XPathFactory.newInstance().newXPath();

// Find the element whose text is 'foo' and print its parent and grandparent.
Node node = (Node) xPath.compile("//*[text() = 'foo']").evaluate(doc, XPathConstants.NODE);
System.out.println(" node tag: " + node.getNodeName());
System.out.println(" parent tag: " + node.getParentNode().getNodeName());
System.out.println("grandparent tag: " + node.getParentNode().getParentNode().getNodeName());

// Print every element together with its first child, if it has one.
Set<Node> nodes = H.getSet((NodeList) xPath.compile("//*").evaluate(doc, XPathConstants.NODESET));
for (Node n : nodes) {
    System.out.println();
    try {
        System.out.println("node: " + n.getNodeName());
    } catch (Exception e) {
        // ignored
    }
    try {
        System.out.println("child: " + n.getChildNodes().item(0).getNodeName());
    } catch (Exception e) {
        // ignored: item(0) is null when the node has no children
    }
}
Output:
node tag: span
parent tag: a
grandparent tag: div

node: html
child: head

node: head

node: body
child: html

node: html
child: body

node: body
child: a

node: a

node: div
child: a

node: a
child: span

node: span
child: #text
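
For comparison, here is a minimal sketch that parses the same string with the JDK's own DocumentBuilder, a strict XML parser, instead of the HttpDownloadUtilities helper (whose internals are not shown above). The class name StrictXmlParentCheck and the methods parse and ancestry are made up for this illustration. The head element and the nested html/body in the output above suggest the helper may be running an HTML-repairing parser, but that is only an assumption. With a plain XML parse the document order is kept as written, so span reports div as its parent and a as its grandparent.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class for illustration only; not part of the original code.
public class StrictXmlParentCheck {

    // Parses the string with the JDK's XML parser; no HTML clean-up is applied.
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Returns "node < parent < grandparent" for the first element whose text is 'foo'.
    public static String ancestry(Document doc) throws Exception {
        XPath xPath = XPathFactory.newInstance().newXPath();
        Node node = (Node) xPath.evaluate("//*[text() = 'foo']", doc, XPathConstants.NODE);
        Node parent = node.getParentNode();
        Node grandparent = parent.getParentNode();
        return node.getNodeName() + " < " + parent.getNodeName() + " < " + grandparent.getNodeName();
    }

    public static void main(String[] args) throws Exception {
        String myxml = "<html><body><a><div><span>foo</span></div></a></body></html>";
        System.out.println(ancestry(parse(myxml))); // prints "span < div < a"
    }
}
```

If this prints the expected nesting while the helper does not, the reordering is happening inside getWebpageDocument_fromSource rather than in the XPath evaluation.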