Read XML file in R programming language


XML stands for Extensible Markup Language. An XML file looks much like an HTML document: it uses tags to define objects, and it can also serve as a text-based database. In this article, we will focus on how to read an XML file in the R language.

First things first, install the required package. The “XML” package, available from CRAN, lets you read XML files in R.

install.packages('XML')

How to read an XML file in R?

Now, we can create an XML file and then read the same file using the XML library. The example file used here, r_test.xml, holds three CD records.

The XML file stores each value of a record between markup tags. If you are ready with the XML file, then let’s read it in R.
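
If you do not have a file at hand, here is a minimal sketch that writes one with base R. The values come from the output shown in this article, but only the CD tag name is confirmed by that output. The other tag names (CATALOG, TITLE, ARTIST, COUNTRY, COMPANY, PRICE, YEAR) are assumptions made for illustration, so your file may use different ones.

```r
# Write a small XML file that resembles the one read in this article.
# ASSUMPTION: only the <CD> tag is confirmed by the article's output;
# CATALOG, TITLE, ARTIST, COUNTRY, COMPANY, PRICE and YEAR are illustrative names.
xml_lines <- c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<CATALOG>',
  '  <CD>',
  '    <TITLE>Dil diya galla</TITLE>',
  '    <ARTIST>Arijith Singh</ARTIST>',
  '    <COUNTRY>India</COUNTRY>',
  '    <COMPANY>Tseries</COMPANY>',
  '    <PRICE>10.90</PRICE>',
  '    <YEAR>2018</YEAR>',
  '  </CD>',
  '  <CD>',
  '    <TITLE>Saiyara</TITLE>',
  '    <ARTIST>Atif Aslam</ARTIST>',
  '    <COUNTRY>Uk</COUNTRY>',
  '    <COMPANY>Records</COMPANY>',
  '    <PRICE>9.90</PRICE>',
  '    <YEAR>2015</YEAR>',
  '  </CD>',
  '  <CD>',
  '    <TITLE>Khairiyat</TITLE>',
  '    <ARTIST>Sonu Nigam</ARTIST>',
  '    <COUNTRY>India</COUNTRY>',
  '    <COMPANY>Radio</COMPANY>',
  '    <PRICE>9.90</PRICE>',
  '    <YEAR>2019</YEAR>',
  '  </CD>',
  '</CATALOG>'
)

# Save the lines to r_test.xml in the current working directory
writeLines(xml_lines, "r_test.xml")
```

The file lands in the current working directory, which is also where a relative path such as 'r_test.xml' is looked up when you parse it.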

# Load the XML package
library(XML)

# Load the methods package, which provides the S4 class support the XML package relies on
library(methods)

# Parse the XML file
output <- xmlParse(file = 'r_test.xml')

# Display the parsed document
print(output)
Dil diya galla
Arijith Singh
India
Tseries
10.90
2018

Saiyara
Atif Aslam
Uk
Records
9.90
2015

Khairiyat
Sonu Nigam
India
Radio
9.90 
2019

Finding the Size of the node

Now, in this section, we will find how many nodes the XML file contains. The XML library has several functions for inspecting a parsed document.

Let’s use the “xmlRoot” and “xmlSize” functions. “xmlRoot” returns the top-level (root) node of the parsed document, and “xmlSize” reports how many child nodes that node has.

# Load the XML package
library(XML)

# Load the methods package (S4 class support)
library(methods)

# Parse the XML file
output <- xmlParse(file = 'r_test.xml')

# Get the root node of the document
node <- xmlRoot(output)

# Count the child nodes of the root
size <- xmlSize(node)

# Display the result
print(size)
3

Well, we got the size: the root has three child nodes, one per record. You can confirm it against the file contents printed in the section above. With just a couple of lines of code, you can step into an XML file and loop through its nodes in R.
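
Looping through the nodes could look like the sketch below. To stay self-contained it parses a short inline XML string instead of r_test.xml, and the tag names TITLE, PRICE and YEAR are assumptions chosen for illustration. It uses “xmlSApply” to apply a function to every child node, and a plain for loop that indexes the root by position.

```r
library(XML)

# Inline stand-in for r_test.xml.
# ASSUMPTION: tag names other than CD are illustrative, not taken from the original file.
xml_text <- paste0(
  "<CATALOG>",
  "<CD><TITLE>Dil diya galla</TITLE><PRICE>10.90</PRICE><YEAR>2018</YEAR></CD>",
  "<CD><TITLE>Saiyara</TITLE><PRICE>9.90</PRICE><YEAR>2015</YEAR></CD>",
  "<CD><TITLE>Khairiyat</TITLE><PRICE>9.90</PRICE><YEAR>2019</YEAR></CD>",
  "</CATALOG>"
)

# asText = TRUE tells xmlParse the argument is XML content, not a file name
doc  <- xmlParse(xml_text, asText = TRUE)
root <- xmlRoot(doc)

# Number of records directly under the root
n_records <- xmlSize(root)

# Number of fields inside each record
fields_per_record <- xmlSApply(root, xmlSize)

# Loop over the records by position and collect each title
titles <- character(n_records)
for (i in seq_len(n_records)) {
  titles[i] <- xmlValue(root[[i]][["TITLE"]])
}

print(titles)
```

The for loop makes the position of each record explicit, while “xmlSApply” is the shorter choice when you only need one value per node.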

Reading the first node in R

The “xmlRoot” function returns the root node, and indexing that node gives you its children: node[1] holds the first child node together with all of its data. You can specify any other node number to get the data from that specific node in the XML file.

Let’s see how it works.

# Load the XML package
library(XML)
# Load the methods package (S4 class support)
library(methods)
# Parse the XML file
output <- xmlParse(file = 'r_test.xml')
# Get the root node of the document
node <- xmlRoot(output)
# Display the first child node
print(node[1])
$CD

Dil diya galla
Arijith Singh
India
Tseries
10.90
2018

attr(,"class")
[1] "XMLInternalNodeList" "XMLNodeList" 

In the same way, you can specify any other node number to get the data present in that particular node. As simple as that!
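
Here is a small sketch of picking out a specific node and a single field inside it. It again parses an inline XML string so it runs on its own, and the tag names TITLE, PRICE and YEAR are assumptions for illustration. Single brackets return a list of nodes (as in the output above), while double brackets return the node itself, which you can then query with “xmlName” and “xmlValue”.

```r
library(XML)

# Inline stand-in for r_test.xml.
# ASSUMPTION: tag names other than CD are illustrative, not taken from the original file.
xml_text <- paste0(
  "<CATALOG>",
  "<CD><TITLE>Dil diya galla</TITLE><PRICE>10.90</PRICE><YEAR>2018</YEAR></CD>",
  "<CD><TITLE>Saiyara</TITLE><PRICE>9.90</PRICE><YEAR>2015</YEAR></CD>",
  "<CD><TITLE>Khairiyat</TITLE><PRICE>9.90</PRICE><YEAR>2019</YEAR></CD>",
  "</CATALOG>"
)

doc  <- xmlParse(xml_text, asText = TRUE)
root <- xmlRoot(doc)

# Single brackets: a list that contains the first child node
first_as_list <- root[1]

# Double brackets: the second child node itself
second_record <- root[[2]]

# Tag name of that node
record_name <- xmlName(second_record)

# Text of one field inside the node, looked up by tag name
second_title <- xmlValue(second_record[["TITLE"]])

# xmlValue returns text, so convert numeric fields explicitly
second_price <- as.numeric(xmlValue(second_record[["PRICE"]]))

print(second_title)
print(second_price)
```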

Convert XML file to Dataframe in R

Yes, you heard it right. The XML library includes a function named “xmlToDataFrame”, which converts the data in an XML file to a data frame in no time.

Come, let’s do it!

# Load the XML package
library(XML)

# Load the methods package (S4 class support)
library(methods)

# Convert the XML data to a data frame (note the capital F in xmlToDataFrame)
dataframe <- xmlToDataFrame('r_test.xml')

# Display the result
print(dataframe)

Hey! We got the data frame, thanks to the XML package. It offers functions to parse an XML file, report the number of nodes, and convert the XML data to a data frame.
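
One thing to watch for: the values in an XML file are text, so with stringsAsFactors = FALSE the columns typically come back as character vectors. The sketch below converts the numeric columns before using them. It works on an inline XML string rather than r_test.xml, and the column names TITLE, PRICE and YEAR follow from tag names that are assumed for illustration.

```r
library(XML)

# Inline stand-in for r_test.xml.
# ASSUMPTION: tag names other than CD are illustrative, not taken from the original file.
xml_text <- paste0(
  "<CATALOG>",
  "<CD><TITLE>Dil diya galla</TITLE><PRICE>10.90</PRICE><YEAR>2018</YEAR></CD>",
  "<CD><TITLE>Saiyara</TITLE><PRICE>9.90</PRICE><YEAR>2015</YEAR></CD>",
  "<CD><TITLE>Khairiyat</TITLE><PRICE>9.90</PRICE><YEAR>2019</YEAR></CD>",
  "</CATALOG>"
)

doc <- xmlParse(xml_text, asText = TRUE)

# One row per child of the root, one column per field
df <- xmlToDataFrame(doc, stringsAsFactors = FALSE)

# Convert the text columns that hold numbers
df$PRICE <- as.numeric(df$PRICE)
df$YEAR  <- as.integer(df$YEAR)

# Inspect the column types
str(df)

# The numeric columns can now be used in comparisons
expensive <- df[df$PRICE > 10, ]
print(expensive)
```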

Wrapping Up

The XML package makes it easy to read XML files in R. It includes functions to parse an XML file and to read the data and size of its nodes. The best part is that you can convert the XML file straight to a data frame.

Further reading: R documentation
