Java Convert String to XML Document and XML Document to String

Sometimes while programming in Java, we get a String that is actually XML, and to process it we need to convert it to an XML Document (org.w3c.dom.Document). Also, for debugging or to pass it to another function, we might need to convert a Document object back to a String.

Here I am providing two utility methods.

  1. Document convertStringToDocument(String xmlStr): This method takes a String as input, converts it to a DOM Document and returns it. We will use InputSource and StringReader for this conversion.
  2. String convertDocumentToString(Document doc): This method takes a Document as input and converts it to a String. We will use Transformer, StringWriter and StreamResult for this purpose.

package com.journaldev.xml;

import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class StringToDocumentToString {

    public static void main(String[] args) {
        final String xmlStr = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"+
                                "<Emp id=\"1\"><name>Pankaj</name><age>25</age>\n"+
                                "<role>Developer</role><gen>Male</gen></Emp>";
        Document doc = convertStringToDocument(xmlStr);
        
        String str = convertDocumentToString(doc);
        System.out.println(str);
    }

    private static String convertDocumentToString(Document doc) {
        TransformerFactory tf = TransformerFactory.newInstance();
        try {
            Transformer transformer = tf.newTransformer();
            // uncomment the line below to remove the XML declaration
            // transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            // serialize the DOM tree into the in-memory writer
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            e.printStackTrace();
        }
        // reached only if the transformation failed
        return null;
    }

    private static Document convertStringToDocument(String xmlStr) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            // wrap the String in a Reader so the parser can consume it
            return builder.parse(new InputSource(new StringReader(xmlStr)));
        } catch (Exception e) {
            e.printStackTrace();
        }
        // reached only if parsing failed; check the stack trace for the cause
        return null;
    }

}
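
Because convertStringToDocument prints the stack trace and returns null when parsing fails, a malformed input shows up later as a null Document. The sketch below is one possible variant, not part of the original program: the class name StrictStringToDocument and the method parseOrThrow are hypothetical, and wrapping the failure in IllegalArgumentException is an assumption about how a caller might want to see it.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class StrictStringToDocument {

    // Hypothetical variant of convertStringToDocument: instead of printing
    // the stack trace and returning null, it rethrows the failure so the
    // caller sees why the String could not be parsed.
    public static Document parseOrThrow(String xmlStr) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xmlStr)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML string", e);
        }
    }

    public static void main(String[] args) {
        Document ok = parseOrThrow("<Emp id=\"1\"><name>Pankaj</name></Emp>");
        System.out.println(ok.getDocumentElement().getNodeName()); // prints Emp

        try {
            // mismatched tags: <name> is never closed
            parseOrThrow("<Emp><name>Pankaj</Emp>");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getCause().getMessage());
        }
    }
}
```

With this shape, a null Document can no longer reach the calling code. The parser's own message, which includes the line and column of the problem, travels with the exception.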

When we run the above program, we get back the XML String that we used to create the DOM Document.


<?xml version="1.0" encoding="UTF-8"?><Emp id="1"><name>Pankaj</name><age>25</age>
<role>Developer</role><gen>Male</gen></Emp>
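
The XML declaration at the start of this output can be dropped with the OutputKeys.OMIT_XML_DECLARATION property that is commented out in convertDocumentToString. Here is a minimal sketch of that option in isolation. The class name DocumentToStringOptions and the method toStringWithoutDeclaration are hypothetical names used only for illustration, and the method simply declares throws Exception to stay short.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DocumentToStringOptions {

    // Hypothetical helper: serializes the Document without the
    // leading <?xml ...?> declaration.
    public static String toStringWithoutDeclaration(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        String xmlStr = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<Emp id=\"1\"><name>Pankaj</name></Emp>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xmlStr)));
        // the printed String starts directly with the <Emp> element
        System.out.println(toStringWithoutDeclaration(doc));
    }
}
```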

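You can use replaceAll("\n|\r", "") to remove newline characters from the String and get it in compact format. Java Strings are immutable, so use the value that replaceAll returns, for example return output.replaceAll("\n|\r", ""); in the convertDocumentToString method.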
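
A small sketch of the replaceAll call in isolation may help, since calling it without using the result leaves the original String untouched. The class name CompactXmlString and the helper removeLineBreaks are hypothetical names chosen for this example.

```java
public class CompactXmlString {

    // Hypothetical helper: strips line breaks from an XML String.
    // replaceAll returns a new String and never modifies the one
    // it is called on.
    public static String removeLineBreaks(String xml) {
        return xml.replaceAll("\n|\r", "");
    }

    public static void main(String[] args) {
        String str = "<Emp id=\"1\"><name>Pankaj</name><age>25</age>\n"
                + "<role>Developer</role><gen>Male</gen></Emp>";

        str.replaceAll("\n|\r", "");              // result discarded, str is unchanged
        System.out.println(str.contains("\n"));   // prints true

        str = removeLineBreaks(str);              // keep the returned value
        System.out.println(str.contains("\n"));   // prints false
    }
}
```

Note that this removes every line break in the String, including any that appear inside element text, so it is best suited to output used for logging or comparison.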

Comments

  1. Akshay A says:

    Can you provide me an example for “Document convertStringToDocument(String xmlStr)”? I am currently working on a project where I need to convert a string to XML format.

  2. Carlos Barrantes says:

    Works great for me, thanks!

    1. Ram says:

      \n

      I am unable to parse the above string. I am getting the error below.
      [Fatal Error] :1:15: The value following “version” in the XML declaration must be a quoted string.

  3. Nik says:

    Thanks a lot for sharing this. :).

  4. Raj Ashtaputre says:

    Why is this article still up and why is the author not responding to so many people saying it doesn’t work. The code does not work.

    1. Pankaj says:

      The code works fine, please check again. Let me know what issue you are facing.

      1. Suso says:

        The code does not work for me either. Even for the simplest string representing xml node, eg: “”.
        Has anyone found out why, please?

  5. Mahadev says:

    Why is the doc value null?
    Then how can we get our XML data?

    1. Pankaj says:

      Can you please specify the code you are talking about? If you are referring to “return null” in the convertStringToDocument method, please note that this happens only when an exception is thrown.

  6. Mehul Kishor Fatnani says:

    Hi all,
    I am also getting the same error on parsing… A null pointer exception gets generated. Any help is highly appreciated.

    1. Pankaj says:

      Please post your code and exception stack trace.

  7. nekonutchi says:

    Where is the replaceAll() method supposed to be used?
    I was thinking it should be placed on the string str before printing it out, like so:
    String str = convertDocumentToString(doc);
    str.replaceAll(“\n|\r”, “”);
    System.out.println(str);

    But the output doesn’t change…

    1. Shailesh says:

      Same problem it doesn’t work…!

      1. Pankaj says:

        return output.replaceAll("\n|\r", ""); in convertDocumentToString method. Come on guys, use some brains yourself too.

  8. mmonikm says:

    The variable doc always returns null

    1. Sridhar Raj says:

      Even for me 🙁

      1. Pankaj says:

        Please check the method carefully, that’s only in case of an exception.

  9. ahmad says:

    Successfully executed, but I did not find it useful. I want to convert a doc file into XML.

    1. sunil says:

      Hi, can you please tell me how to convert a doc file into XML using Java code?

      1. sunil says:

        package com.avankia.sunil;

        import java.io.ByteArrayInputStream;
        import java.io.File;
        import java.io.FileInputStream;
        import java.io.FileOutputStream;
        import java.util.logging.Level;
        import java.util.logging.Logger;

        import javax.xml.parsers.DocumentBuilder;
        import javax.xml.parsers.DocumentBuilderFactory;
        import javax.xml.transform.Transformer;
        import javax.xml.transform.TransformerFactory;
        import javax.xml.transform.dom.DOMResult;
        import javax.xml.transform.dom.DOMSource;

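        // assumed: the cleaner types used below come from the HtmlCleaner library (org.htmlcleaner)
        import org.htmlcleaner.CleanerProperties;
        import org.htmlcleaner.DomSerializer;
        import org.htmlcleaner.HtmlCleaner;
        import org.htmlcleaner.TagNode;
        import org.w3c.dom.Document;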
        public class DocToXmlResumeConvertor {

        // get path of xsl file
        private static String styleSheetPath = SystemManager.getInstance()
        .getConfigUrl().getPath()
        + "xhtml2fo.xsl";
        // static String styleSheetPath = null;
        static java.util.logging.Logger logger = Logger
        .getLogger(DocToXmlResumeConvertor.class.getName());

        private static Document xml2FO(Document xml, String styleSheetPath)
        throws Exception
        {
        DOMSource xmlDomSource = new DOMSource(xml);
        DOMResult domResult = new DOMResult();
        Transformer transformer = getTransformer(styleSheetPath);
        if (transformer == null)
        {
        throw new Exception("Error in creating transformer");
        }

        try
        {
        transformer.transform(xmlDomSource, domResult);
        }
        catch (javax.xml.transform.TransformerException e)
        {
        logger.log(Level.INFO, "Error in transforming xml to xsl-fo: "
        + e.getMessage());
        return null;
        }

        return (Document) domResult.getNode();
        }

        private static Transformer getTransformer(String styleSheetPath)
        {
        try
        {
        TransformerFactory tFactory = TransformerFactory.newInstance();
        DocumentBuilderFactory dFactory = DocumentBuilderFactory
        .newInstance();
        dFactory.setNamespaceAware(true);
        DocumentBuilder dBuilder = dFactory.newDocumentBuilder();
        Document xslDoc = dBuilder.parse(new File(styleSheetPath));
        logger.log(Level.INFO, xslDoc.getTextContent());
        DOMSource xslDomSource = new DOMSource(xslDoc);
        return tFactory.newTransformer(xslDomSource);
        }
        catch (javax.xml.transform.TransformerException | java.io.IOException
        | javax.xml.parsers.ParserConfigurationException | org.xml.sax.SAXException e)
        {
        logger.log(Level.SEVERE, "", e);
        return null;
        }
        }

        /*
        private static byte[] fo2PDF(Document foDocument)
        {
        FopFactory fopFactory = FopFactory.newInstance();
        try
        {

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, out);
        TransformerFactory tFactory = TransformerFactory.newInstance();
        Transformer transformer = tFactory.newTransformer();
        Source src = new DOMSource(foDocument);
        Result res = new SAXResult(fop.getDefaultHandler());
        transformer.transform(src, res);
        return out.toByteArray();
        }
        catch (Exception ex)
        {
        logger.log(Level.SEVERE, "", ex);
        return null;
        }
        }
        */

        public static byte[] getXmlResumeBytes(byte[] bytes) throws Exception
        {
        byte[] XmlBytes = null;
        ByteArrayInputStream input = new ByteArrayInputStream(bytes);
        final HtmlCleaner cleaner = new HtmlCleaner();
        CleanerProperties props = cleaner.getProperties();
        DomSerializer doms = new DomSerializer(props, true);
        Document xmlDoc = null;

        try
        {
        TagNode node = cleaner.clean(input, "UTF-8");
        xmlDoc = doms.createDOM(node);
        // System.out.println(xmlDoc.getFirstChild().getTextContent());
        }
        catch (Exception e)
        {
        throw e;
        }
        Document foDoc = null;
        try
        {
        foDoc = xml2FO(xmlDoc, styleSheetPath);
        // System.out.println(foDoc.getFirstChild().getTextContent());
        }
        catch (Exception e)
        {
        logger.log(Level.INFO, "ERROR: " + e.getMessage());
        throw e;
        }
        //XmlBytes = fo2PDF(foDoc);
        input.close();
        if (XmlBytes != null)
        {
        logger.log(Level.INFO, "your doc has been converted into xml");
        }
        else
        {
        String errorString = "doc File is not converted into xml properly";
        XmlBytes = errorString.getBytes();
        }
        return XmlBytes;
        }

        public static byte[] readBytes(String fileName)
        {

        FileInputStream fileInputStream = null;
        byte[] bytes = null;
        try
        {
        File file = new File(fileName);
        System.out.println(fileName);
        bytes = new byte[(int) file.length()];
        fileInputStream = new FileInputStream(file);
        fileInputStream.read(bytes);
        fileInputStream.close();
        return bytes;

        }
        catch (Exception ie)
        {
        bytes = null;
        logger.log(Level.SEVERE, "", ie);
        return bytes;
        }
        }

        public static void main(String[] args) {
        // TODO Auto-generated method stub

        String htmlFileName = "C://Users//raktim//Downloads//ava.doc";
        styleSheetPath = "D:/WORKAREA/AVANKIA/ResumeParser/src/www/WEB-INF/conf/xhtml2fo.xsl";
        File htmlFile = new File(htmlFileName);
        byte[] XmlBytes = new byte[(int) htmlFile.length()];
        File XmlFile = new File(htmlFileName.replace(".doc", ".Xml"));
        FileOutputStream fop = null;
        try
        {
        // read the source file into memory
        byte[] pdfBytes = readBytes(htmlFileName);
        fop = new FileOutputStream(XmlFile);
        byte[] newBytes = DocToXmlResumeConvertor
        .getXmlResumeBytes(pdfBytes);
        fop.write(newBytes);
        fop.flush();
        fop.close();
        System.out.println("Done");
        }
        catch (Exception e)
        {
        logger.log(Level.SEVERE, "", e);
        }

        }

        }

  10. Anuj says:

    Getting the null from builder.parse( new InputSource( new StringReader( xmlStr ) ) ); .. I validated my xml, it’s valid

  11. RInu says:

    Getting below error for

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

    Exception in thread “main” javax.xml.parsers.FactoryConfigurationError: Provider for class javax.xml.parsers.DocumentBuilderFactory cannot be created
    at javax.xml.parsers.FactoryFinder.findServiceProvider(Unknown Source)
    at javax.xml.parsers.FactoryFinder.find(Unknown Source)
    at javax.xml.parsers.DocumentBuilderFactory.newInstance(Unknown Source)

  12. ragu says:

    Hi,

    I am using the above code example, but the document is being returned as null.

  13. Deepu says:

    Thanks Pankaj you r a lifesaver

  14. German says:

    Hi, this line
    Document doc = builder.parse( new InputSource( new StringReader( xmlStr ) ) );

    Gives me an error when i’m running…

    Fatal Error: XML document structures must start and end within the same entity.

    My xmlStr = ”
    1
    2
    3
    “;

    Is there something I’m not doing right? Thank you for your help and the article!

    1. Pankaj says:

      Your string is not valid XML.

  15. simran says:

    I need to convert XML String to XML SAX document..how can that be done?

  16. Rishi Naik says:

    I am using the same code, but the StringWriter output is truncated before the entire XML is written to the string. Can you help me understand why this is happening?

    1. Pankaj says:

      Is your XML very long?
      Are you running in Eclipse or on the command line? Try writing it to a file and check whether the full content is written.
