Java XPath Example Tutorial

Welcome to the Java XPath example tutorial. XPath is a query language for selecting parts of an XML document. Given an XPath expression as a query string, we can find the nodes in any XML document that match it.

Java XPath

The javax.xml.xpath package provides XPath support in Java. The XPath API provides factory methods for creating an XPathExpression.


XPathFactory xpathFactory = XPathFactory.newInstance();
XPath xpath = xpathFactory.newXPath();
XPathExpression expr = xpath.compile(XPATH_EXPRESSION_STRING);
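// item is the context to evaluate against (for example a Document);
// returnType is one of the QName constants from XPathConstants
Object result = expr.evaluate(item, returnType);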

The return types supported by Java XPath are defined in the XPathConstants class.

  1. XPathConstants.STRING
  2. XPathConstants.NUMBER
  3. XPathConstants.BOOLEAN
  4. XPathConstants.NODE
  5. XPathConstants.NODESET
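
To illustrate the return types listed above, here is a minimal sketch that evaluates one expression per XPathConstants type. The class name XPathReturnTypesSketch, the helper methods parse and summarize, and the two-employee inline XML are assumptions made for illustration only; they are not part of the tutorial's program.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathReturnTypesSketch {

    // Hypothetical inline document so the sketch runs without a file on disk.
    static final String XML =
          "<Employees>"
        + "<Employee id=\"1\"><age>29</age><name>Pankaj</name></Employee>"
        + "<Employee id=\"2\"><age>35</age><name>Lisa</name></Employee>"
        + "</Employees>";

    // Parse an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluate one expression per return type and describe the results.
    static String summarize(Document doc) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();

        // STRING: string value of the first matching node
        String name = (String) xpath.evaluate(
                "/Employees/Employee[@id='2']/name", doc, XPathConstants.STRING);

        // NUMBER: returned as a Double, even for whole numbers
        Double count = (Double) xpath.evaluate(
                "count(/Employees/Employee)", doc, XPathConstants.NUMBER);

        // BOOLEAN: a non-empty node-set converts to true
        Boolean anyOver30 = (Boolean) xpath.evaluate(
                "/Employees/Employee[age > 30]", doc, XPathConstants.BOOLEAN);

        // NODE: the first matching node
        Node first = (Node) xpath.evaluate(
                "/Employees/Employee[1]", doc, XPathConstants.NODE);

        // NODESET: all matching nodes as an org.w3c.dom.NodeList
        NodeList names = (NodeList) xpath.evaluate(
                "/Employees/Employee/name", doc, XPathConstants.NODESET);

        return "name=" + name + ", count=" + count + ", anyOver30=" + anyOver30
                + ", firstId=" + ((Element) first).getAttribute("id")
                + ", names=" + names.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(summarize(parse(XML)));
    }
}
```

The casts mirror what each constant yields: String, Double, Boolean, org.w3c.dom.Node, and org.w3c.dom.NodeList. The sketch uses the convenience method xpath.evaluate(expression, item, returnType), which compiles and evaluates in one call; compiling once with xpath.compile is the better fit when the same expression is evaluated repeatedly.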

Java XPath Example

In this Java XPath example tutorial, we use the following sample XML file.

employees.xml


<?xml version="1.0" encoding="UTF-8"?>
<Employees>
	<Employee id="1">
		<age>29</age>
		<name>Pankaj</name>
		<gender>Male</gender>
		<role>Java Developer</role>
	</Employee>
	<Employee id="2">
		<age>35</age>
		<name>Lisa</name>
		<gender>Female</gender>
		<role>CEO</role>
	</Employee>
	<Employee id="3">
		<age>40</age>
		<name>Tom</name>
		<gender>Male</gender>
		<role>Manager</role>
	</Employee>
	<Employee id="4">
		<age>25</age>
		<name>Meghna</name>
		<gender>Female</gender>
		<role>Manager</role>
	</Employee>
</Employees>

We will implement the following functions in our Java XPath example program.

  1. Return the name of the employee with a given ID.
  2. Return the names of employees older than a given age.
  3. Return the names of all female employees.

Here is the final implementation class for the Java XPath example program.

XPathQueryExample.java


package com.journaldev.xml;


import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;


public class XPathQueryExample {

    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder;
        Document doc = null;
        try {
            builder = factory.newDocumentBuilder();
            doc = builder.parse("/Users/pankaj/employees.xml");

            // Create XPathFactory object
            XPathFactory xpathFactory = XPathFactory.newInstance();

            // Create XPath object
            XPath xpath = xpathFactory.newXPath();

            String name = getEmployeeNameById(doc, xpath, 4);
            System.out.println("Employee Name with ID 4: " + name);

            List<String> names = getEmployeeNameWithAge(doc, xpath, 30);
            System.out.println("Employees with 'age>30' are:" + Arrays.toString(names.toArray()));

            List<String> femaleEmps = getFemaleEmployeesName(doc, xpath);
            System.out.println("Female Employees names are:" +
                    Arrays.toString(femaleEmps.toArray()));

        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }

    }


    private static List<String> getFemaleEmployeesName(Document doc, XPath xpath) {
        List<String> list = new ArrayList<>();
        try {
            //create XPathExpression object
            XPathExpression expr =
                xpath.compile("/Employees/Employee[gender='Female']/name/text()");
            //evaluate expression result on XML document
            NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++)
                list.add(nodes.item(i).getNodeValue());
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        }
        return list;
    }


    private static List<String> getEmployeeNameWithAge(Document doc, XPath xpath, int age) {
        List<String> list = new ArrayList<>();
        try {
            XPathExpression expr =
                xpath.compile("/Employees/Employee[age>" + age + "]/name/text()");
            NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++)
                list.add(nodes.item(i).getNodeValue());
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        }
        return list;
    }


    private static String getEmployeeNameById(Document doc, XPath xpath, int id) {
        String name = null;
        try {
            XPathExpression expr =
                xpath.compile("/Employees/Employee[@id='" + id + "']/name/text()");
            name = (String) expr.evaluate(doc, XPathConstants.STRING);
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        }

        return name;
    }

}

When we run the above Java XPath example program, it produces the following output.


Employee Name with ID 4: Meghna
Employees with 'age>30' are:[Lisa, Tom]
Female Employees names are:[Lisa, Meghna]

Notice that the first few lines read the XML file into a Document. We then reuse the Document and XPath objects in all the methods. The program above shows NODESET and STRING as result types.
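
The methods above build their expressions by concatenating values into the query string. That is fine for int parameters, but with free text (for example, a value typed by a user) a quote character can change the meaning of the query. One possible alternative is an XPath variable resolved through XPathVariableResolver. The following is a minimal sketch; the class name XPathVariableSketch, the method namesByRole, and the trimmed inline copy of employees.xml are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathVariableSketch {

    // Trimmed, hypothetical copy of employees.xml (name and role only).
    static final String XML =
          "<Employees>"
        + "<Employee id=\"1\"><name>Pankaj</name><role>Java Developer</role></Employee>"
        + "<Employee id=\"2\"><name>Lisa</name><role>CEO</role></Employee>"
        + "<Employee id=\"3\"><name>Tom</name><role>Manager</role></Employee>"
        + "<Employee id=\"4\"><name>Meghna</name><role>Manager</role></Employee>"
        + "</Employees>";

    // Parse an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Return the names of employees whose role equals the given text.
    static List<String> namesByRole(Document doc, String role) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Variable values are looked up by local name, e.g. $role -> "role".
        Map<String, Object> vars = new HashMap<>();
        vars.put("role", role);

        // Register the resolver before compiling, to be safe: the compiled
        // expression may capture the resolver that is set at compile time.
        xpath.setXPathVariableResolver(qname -> vars.get(qname.getLocalPart()));

        // $role is passed as a value; it is never parsed as XPath syntax.
        XPathExpression expr =
                xpath.compile("/Employees/Employee[role=$role]/name/text()");

        NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeValue());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println(namesByRole(doc, "Manager"));
        // Quote characters in the input are compared literally.
        System.out.println(namesByRole(doc, "' or '1'='1"));
    }
}
```

Because the role text is supplied as a variable value, an input such as ' or '1'='1 is compared as a literal string and matches no employee, whereas the same input concatenated into the expression would rewrite the predicate.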

That’s all for a quick roundup on Java XPath.

Comments

  1. Parithy says:

    How can I get the XPath if there is no unique ID, element, or class in the page?

  2. sathish says:

    I want to extract the data of a span tag from a URL. How can I get it through Java?

    For example, if I search for a college on Google, I want to extract the address and contact details when I pass the particular college or organisation from the Java code.

  3. Sarita says:

    Which Maven dependency should I use if I’m creating a Mavenized project?

    regards,
    Sarita

  4. Samar says:

    Thank you, this was useful

    I want to ask how I can find a value based on text entered by the user.

    I need your help please for my project

    I would appreciate your help!

    regards,

    Samar

  5. Praveen says:

    Hi Pankaj,

    I need to get the XPath of XML using Java for the below example.

    Output: Class/Student/firstname
    Class/Student/lastname

    XML:

    dinkar
    kad
    dinkar
    85

    Vaneet
    Gupta
    vinni
    95

    jasvir
    singh
    jazz
    90

  6. Karim Dani says:

    Thanks for sharing…

    I am working on a project where I need to get the XPath for elements from HTML. Do you have any idea about it, or can you suggest something? For example, when there is a text field or a label on a page, I want the XPath for both elements.

    I would appreciate your inputs.

    Thanks,
    Karim

  7. BioHazard says:

    Thank you for sharing!! 😀

  8. priyA says:

    How do I convert HTML to XML?

    1. Pankaj says:

      Both are separate technologies. You might use a parser to read the HTML data and then use that to set XML element values…

  9. raghav says:

    Hi Pankaj,

    Thanks for the information on your blog.
    I would like to know if there is any faster and more memory-efficient way of parsing huge XML files (over 1 GB in size) and extracting only selected information based on XPath expressions. As I understand it, XPath goes along with DOM only, and building a tree structure using DOM causes memory problems. I would appreciate your inputs on this challenge.

    Thanks
    raghav

    1. devendra says:

      Please use VTD XML parser which is fast and memory efficient.
