Scala XML Processing – Literals, Serialization, Parsing, Save and Load Examples


XML (eXtensible Markup Language) is a form of semi-structured data organized as a tree. Semi-structured data is helpful when you serialize program data to save it in a file or ship it across a network. XML defines a standardized document format that is easy to read and interpret.

XML consists of two basic elements: text and tags. Text is a sequence of characters. A start tag consists of a less-than sign, an alphanumeric label and a greater-than sign. An end tag is the same as a start tag except that it has a slash just before the label. A start tag and its end tag must have the same label.

For example;


<school>
<standard>4</standard>
</school>

The above is valid XML, as every start tag has a matching end tag.


<school><standard>6</standard> 7

The above is invalid XML because the end tag for school is missing.


<school><standard>8 </school></standard>

The above XML is also invalid because standard, the child element, must be closed before its parent school.

Since tags have to be matched, XML is structured as nested elements. A start tag and its end tag form a pair that delimits an element, and elements can be nested within each other. In the valid example above, standard is nested inside school.

A shorthand notation combines the start and end tag: a single tag with a slash just before the greater-than sign denotes an empty element.

For instance, in the XML below standard is an empty element.


<school> <standard /> </school>

Start tags can have attributes. An attribute is a name-value pair with an equal sign in the middle. The attribute value is surrounded by double quotes or single quotes.

For instance


<standard section ="A" strings = "true"></standard>
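
The matching rules above can be checked from Scala itself. The following is a minimal sketch, assuming the scala-xml module is on the classpath (the first line is a Scala CLI directive, and the version in it is an assumption). The helper name isWellFormed is hypothetical; it simply reports whether XML.loadString parses the string without throwing.

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.2.0"
import scala.util.Try
import scala.xml.XML

object WellFormedSketch {
  // Hypothetical helper: true when the parser accepts the string as a document.
  def isWellFormed(s: String): Boolean = Try(XML.loadString(s)).isSuccess

  def main(args: Array[String]): Unit = {
    println(isWellFormed("<school><standard>4</standard></school>"))  // true
    println(isWellFormed("<school><standard>6</standard> 7"))         // false: school is never closed
    println(isWellFormed("<school><standard>8 </school></standard>")) // false: tags closed in the wrong order
    println(isWellFormed("<school> <standard /> </school>"))          // true: empty element shorthand
  }
}
```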

Now that we have a brief knowledge of XML, let’s look over different things we can do in Scala for XML processing.

Scala XML Literals

Scala lets you type XML directly as a literal anywhere an expression is valid. Type a start tag and continue writing the XML content; the compiler reads the XML until it sees the matching end tag.

For example, open the Scala REPL and execute the following:


<a>Scala is a functional Programming language</a>

A Scala expression can be evaluated inside an XML literal by enclosing it in curly braces. For example;


<a> {"hi"+",Reena"} </a>

Output: res1: scala.xml.Elem = <a> hi,Reena </a>

A brace escape can include arbitrary Scala content, including further XML literals. For example;


val marks = 78

<a> { if ( marks < 80) <marks> {marks} </marks> else xml.NodeSeq.Empty } </a>

Output: res3: scala.xml.Elem = <a> <marks> 78 </marks> </a>

The code inside the curly braces evaluates to an XML node or a sequence of XML nodes. In the above example, if marks is less than 80 it is added to the <a> element; otherwise nothing is added.

An expression inside the braces that does not evaluate to an XML node is converted to a string and inserted as text.


<a> {9+40} </a>

Output: res4: scala.xml.Elem = <a> 49 </a>

The <, >, and & characters in the text will be escaped if you print the node.


<a> {"</a>Hello Scala<a>"} </a>

Output: res5: scala.xml.Elem = <a> &lt;/a&gt;Hello Scala&lt;a&gt; </a>
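
A brace escape can also produce a whole sequence of nodes, for instance by mapping over a collection. The sketch below assumes the scala-xml module is available (the Scala CLI directive and its version are assumptions), and the subjects list is hypothetical sample data, not part of the examples above.

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.2.0"
import scala.xml.Elem

object BraceEscapeSketch {
  // Hypothetical sample data.
  val subjects = List("Maths", "Physics")

  // The brace escape evaluates to a list of nodes: one <subject> per item.
  val school: Elem = <school>{subjects.map(s => <subject>{s}</subject>)}</school>

  // Text inserted through a brace escape is escaped when the node is printed.
  val escaped: Elem = <a>{"1 < 2 & 3"}</a>

  def main(args: Array[String]): Unit = {
    println(school)  // <school><subject>Maths</subject><subject>Physics</subject></school>
    println(escaped) // <a>1 &lt; 2 &amp; 3</a>
  }
}
```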

[Image: Scala-XML-Literals, the XML literal examples above run in the Scala shell]

Serialization in Scala

Serialization converts an internal data structure to XML so that the data can be stored, transmitted or reused. XML literals and brace escapes are all you need for this: define a toXML method on the class that builds the XML from the object's fields.

For example, first we define a Student class and then create an instance of it.


scala> abstract class Student {
  val name: String
  val id: Int
  val marks: Int
  override def toString = name

  def toXML =
    <student>
      <name>{name}</name>
      <id>{id}</id>
      <marks>{marks}</marks>
    </student>
}

scala> val stud = new Student {
  val name = "Rob"
  val id = 12
  val marks = 90
}

scala> stud.toXML
res7: scala.xml.Elem =
<student>
      <name>Rob</name>
      <id>12</id>
      <marks>90</marks>
    </student>

[Image: Scala-XML-Serialization, the serialization example above run in the Scala shell]

Scala XML Parsing

The XML classes provide many methods. Let us look at a few of the most useful ones, which extract text, sub-elements and attributes.

Extracting Text
The text method on the XML node retrieves the text within that node. For example;


scala> <a>Scala is a <p>programming</p> language </a>.text
Output: res8: String = "Scala is a programming language "

Here the tags are excluded from the output.

Extracting sub-elements

Sub-elements are extracted by calling \ or \\ followed by the tag name. \ searches only the direct children of the node, while \\ searches the node itself and all of its descendants. For example;


scala> <school><standard><section>C</section></standard></school> \\ "section"
Output: res21: scala.xml.NodeSeq = NodeSeq(<section>C</section>)

scala> <school><standard><section>C</section></standard></school> \\ "school"
Output: res22: scala.xml.NodeSeq = NodeSeq(<school><standard><section>C</section></standard></school>)
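
The two search operators behave differently on nested elements, which is easy to miss. The sketch below reuses the school document from the examples above and assumes the scala-xml module is available (the Scala CLI directive and its version are assumptions).

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.2.0"
import scala.xml.{Elem, NodeSeq}

object ChildVsDescendant {
  val school: Elem = <school><standard><section>C</section></standard></school>

  // Single backslash: direct children only. <section> is a grandchild, so nothing matches.
  val direct: NodeSeq = school \ "section"

  // Double backslash: the node itself and all descendants, so the nested <section> is found.
  val deep: NodeSeq = school \\ "section"

  // Chaining single backslashes walks down one level at a time.
  val chained: NodeSeq = school \ "standard" \ "section"

  def main(args: Array[String]): Unit = {
    println(direct.isEmpty) // true
    println(deep.text)      // C
    println(chained.text)   // C
  }
}
```

Preferring \ where the structure is known avoids accidentally matching a same-named element deeper in the tree.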

[Image: Scala-XML-Parsing, the XML parsing examples above run in the Scala shell]

Scala Extracting XML attributes

Tag attributes are extracted using the same \ and \\ methods with an at sign (@) before the attribute name. For example;


scala> val adam = <student
  name = "Adam"
  id = "12"
  marks = "65" />
Output: adam: scala.xml.Elem = <student name="Adam" id="12" marks="65"/>

scala> adam \\ "@name"
Output: res3: scala.xml.NodeSeq = NodeSeq(Adam)

scala> adam \\ "@id"
Output: res5: scala.xml.NodeSeq = NodeSeq(12)
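
Attribute lookups return a NodeSeq, so in practice the result is usually converted with text. The following sketch, assuming the scala-xml module is available (the Scala CLI directive and its version are assumptions), shows that conversion and what happens for an attribute that does not exist; the grade attribute name is hypothetical.

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.2.0"
import scala.xml.{Elem, NodeSeq}

object AttributeSketch {
  val adam: Elem = <student name="Adam" id="12" marks="65"/>

  // The attributes belong to the element itself, so a single backslash is enough.
  val name: String = (adam \ "@name").text
  val id: Int = (adam \ "@id").text.toInt

  // A missing attribute yields an empty NodeSeq rather than an exception.
  val missing: NodeSeq = adam \ "@grade"

  def main(args: Array[String]): Unit = {
    println(name)            // Adam
    println(id + 1)          // 13
    println(missing.isEmpty) // true
    println(missing.text)    // prints an empty line
  }
}
```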

[Image: Scala-XML-Attribute-Parsing]

Scala De-serialization example

De-serialization converts the XML back to the internal data structure for the program to use. For example;

We reuse the Student class defined in the serialization section, together with its toXML method, and add a fromXML function that builds a Student from an XML node.


scala> def fromXML(node: scala.xml.Node): Student =
  new Student {
    val name = (node \ "name").text
    val id = (node \ "id").text.toInt
    val marks = (node \ "marks").text.toInt
  }

Output: fromXML: (node: scala.xml.Node)Student

Now create the stud instance as in the serialization example.


scala> val stud = new Student {
  val name = "Rob"
  val id = 12
  val marks = 90
}

Now invoke the toXML method as;


scala> val st = stud.toXML
st: scala.xml.Elem =
<student>
      <name>Rob</name>
      <id>12</id>
      <marks>90</marks>
    </student>

Call the fromXML method as;


scala> fromXML(st)
Output: res17: Student = Rob
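
Because the Student class above only overrides toString, the REPL output shows just the name, and it is hard to tell whether id and marks survived the round trip. One way to check this is a case-class variant, sketched below. This is a hypothetical alternative to the abstract class used in this post, and it assumes the scala-xml module is available (the Scala CLI directive and its version are assumptions).

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.2.0"
import scala.xml.{Elem, Node}

object RoundTripSketch {
  // Hypothetical case-class variant of Student: equality compares all fields.
  final case class Student(name: String, id: Int, marks: Int) {
    def toXML: Elem =
      <student>
        <name>{name}</name>
        <id>{id}</id>
        <marks>{marks}</marks>
      </student>
  }

  // Rebuild a Student from the child elements of a <student> node.
  def fromXML(node: Node): Student =
    Student(
      name = (node \ "name").text,
      id = (node \ "id").text.trim.toInt,
      marks = (node \ "marks").text.trim.toInt
    )

  def main(args: Array[String]): Unit = {
    val rob = Student("Rob", 12, 90)
    println(fromXML(rob.toXML) == rob) // true
  }
}
```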

[Image: Scala-XML-Deserialization]

Scala XML Saving into file and Loading from file

The XML.save method writes an XML node to a file. The first argument is the name of the file to which the node is saved, the second is the node, the third is the character encoding, the fourth is whether to write an XML declaration at the top that includes the character encoding, and the fifth is the document type (null for none).

For example;


scala> scala.xml.XML.save("stud.xml",st,"UTF-8",true,null)

We are using the st node created above in the de-serialization example.

Now open the stud.xml file which stores the following contents:


<?xml version='1.0' encoding='UTF-8'?>
<student>
      <name>Rob</name>
      <id>12</id>
      <marks>90</marks>
    </student>

Now for loading the file we can use the load method as;


scala> val s1 = xml.XML.load("stud.xml")
s1: scala.xml.Elem =
<student>
      <name>Rob</name>
      <id>12</id>
      <marks>90</marks>
    </student>
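
The save and load steps can be combined into one round trip. The sketch below writes to a temporary file instead of stud.xml so that it cleans up after itself; the helper name roundTrip is hypothetical, and the code assumes the scala-xml module is available (the Scala CLI directive and its version are assumptions).

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.2.0"
import java.nio.file.Files
import scala.xml.{Elem, XML}

object SaveLoadSketch {
  val st: Elem = <student><name>Rob</name><id>12</id><marks>90</marks></student>

  // Hypothetical helper: save the node to a temporary file, load it back, delete the file.
  def roundTrip(node: Elem): Elem = {
    val path = Files.createTempFile("stud", ".xml")
    try {
      // Same argument order as above: file name, node, encoding, XML declaration, doctype.
      XML.save(path.toString, node, "UTF-8", true, null)
      XML.loadFile(path.toString)
    } finally Files.deleteIfExists(path)
  }

  def main(args: Array[String]): Unit = {
    val loaded = roundTrip(st)
    println((loaded \ "name").text)        // Rob
    println((loaded \ "marks").text.toInt) // 90
  }
}
```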

[Image: Scala-XML-Save-Load]

That’s all for XML processing in Scala. We will look into more Scala features in coming posts.
