Python XML Parser – ElementTree

Filed Under: Python

Python XML parser provides us an easy way to read the XML file and extract useful data. Today we will look into python ElementTree XML API and learn how to use it to parse XML file as well as modify and create XML documents.

Python XML Parser – Python ElementTree

python xml parser, python elementtree
Python ElementTree is one of the most efficient APIs to extract, parse and transform XML data using Python programming language. In this post, we will have good look at how to create, read, parse and update XML data in files and programmatically.

Let’s get started with Python XML parser examples using ElementTree.

Python ElementTree examples

We will start with a very simple example to create an XML file programmatically and then, we will move to more complex files.

Creating XML file

In this example, we will create a new XML file with an element and a sub-element. Let’s get started straight away:


import xml.etree.ElementTree as xml

def createXML(filename):
    # Start with the root element
    root = xml.Element("users")
    children1 = xml.Element("user")
    root.append(children1)

    tree = xml.ElementTree(root)
    with open(filename, "wb") as fh:
        tree.write(fh)


if __name__ == "__main__":
    createXML("test.xml")

Once we run this script, a new file will be created in the same directory with file named as test.xml with following contents:


<users><user /></users>

There are two things to notice here:

  • While writing the file, we used wb mode instead of w as we need to write the file in binary mode.
  • The child user tag is a self-closing tag as we haven’t put any sub-elements in it.

Adding values to XML elements

Let’s improve the program by adding values to the XML elements:


import xml.etree.ElementTree as xml

def createXML(filename):
    # Start with the root element
    root = xml.Element("users")
    children1 = xml.Element("user")
    root.append(children1)

    userId1 = xml.SubElement(children1, "id")
    userId1.text = "123"

    userName1 = xml.SubElement(children1, "name")
    userName1.text = "Shubham"

    tree = xml.ElementTree(root)
    with open(filename, "wb") as fh:
        tree.write(fh)


if __name__ == "__main__":
    createXML("test.xml")

Once we run this script, we will see that new elements are added added with values. Here are the content of the file:


<users>
    <user>
        <id>123</id>
        <name>Shubham</name>
    </user>
</users>

This is perfectly valid XML and all tags are closed. Please note that I formatted the XML myself as the API writes the complete XML in a single fine which is kind a bit, incompleteness!

Now, let’s start with editing files.

Editing XML data

We will use the same XML file we showed above. We just added some more data into it as:


<users>
    <user>
        <id>123</id>
        <name>Shubham</name>
        <salary>0</salary>
    </user>
    <user>
        <id>234</id>
        <name>Pankaj</name>
        <salary>0</salary>
    </user>
    <user>
        <id>345</id>
        <name>JournalDev</name>
        <salary>0</salary>
    </user>
</users>

Let’s try and update salaries of each user:


import xml.etree.ElementTree as xml

def updateXML(filename):
    # Start with the root element
    tree = xml.ElementTree(file=filename)
    root = tree.getroot()

    for salary in root.iter("salary"):
        salary.text = '1000'
 
    tree = xml.ElementTree(root)
    with open("updated_test.xml", "wb") as fh:
        tree.write(fh)

if __name__ == "__main__":
    updateXML("test.xml")

It is worth noticing that if you try to update an elements value to an integer, it won’t work. You will have to assign a String, like:


salary.text = '1000'

instead of doing:


salary.text = 1000

Python XML Parser Example

This time, let’s try to parse the XML data present in the file and print the data:


import xml.etree.cElementTree as xml
 
def parseXML(file_name):
    # Parse XML with ElementTree
    tree = xml.ElementTree(file=file_name)
    print(tree.getroot())
    root = tree.getroot()
    print("tag=%s, attrib=%s" % (root.tag, root.attrib))
 
    # get the information via the children!
    print("-" * 40)
    print("Iterating using getchildren()")
    print("-" * 40)
    users = root.getchildren()
    for user in users:
        user_children = user.getchildren()
        for user_child in user_children:
            print("%s=%s" % (user_child.tag, user_child.text))
 
if __name__ == "__main__":
    parseXML("test.xml")

When we run above script, below image shows the output produced.
python elementtree example, python xml parser example

In this post, we studied how to extract, parse, and transform XML files. ElementTree is one of the most efficient APIs to do these tasks. I would suggest you try some more examples of XML parsing and modifying different values in XML files.

Reference: API Doc

Leave a Reply

Your email address will not be published. Required fields are marked *

close
Generic selectors
Exact matches only
Search in title
Search in content
Search in posts
Search in pages