Performance: LINQ to XML vs XmlDocument vs XmlReader on Desktop and Windows Phone

    Not long ago I had to build a Windows Phone application that works with xml files. Everything went fine until a file reached about 100,000 records, at which point reading it took a very long time. So I decided to compare the performance of the different ways of reading xml data available on the .NET platform.



    Equipment


    To put the results in context, here is the hardware the tests ran on. The "Desktop" tests were performed on my home computer:
    • Processor: Pentium Dual-Core T4300, 2100 MHz
    • RAM: DDR2, 2048 MB

    Tests on Windows Phone were performed on HTC 7 Mozart.

    Test preparation


    For testing, a simple xml file was used: a root element containing item elements, each with an id attribute. The IDs were randomly generated, and the number of records depended on the test: 1, 10, 100, 1,000, 10,000 or 100,000.
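    A test file of this shape (one root element with item children, each carrying a random id attribute) could be produced with a sketch like the one below. The article does not show its generator, so the class name TestFileGenerator, the root element name "items" and the fixed seed are all assumptions made for illustration.

```csharp
using System;
using System.Xml;

public static class TestFileGenerator
{
    // Writes `count` <item id="..."/> elements under a single root element.
    // The root name "items" is an assumption; the article does not state it.
    public static void Generate(string filename, int count, int seed = 42)
    {
        var random = new Random(seed);
        var settings = new XmlWriterSettings { Indent = true };

        using (XmlWriter writer = XmlWriter.Create(filename, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("items");
            for (int i = 0; i < count; i++)
            {
                writer.WriteStartElement("item");
                writer.WriteAttributeString("id", random.Next().ToString());
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

    Calling TestFileGenerator.Generate with the same name and a different count for each test size would give the six input files; the fixed seed only keeps the runs repeatable.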

    To reduce measurement error, each test was run 100 times and the results were averaged. To simulate some work on each record, the empty ProcessId(id) method was called.
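    The averaging described above can be sketched with Stopwatch. The article does not include its measurement code, so this is only one plausible way to do it; the Benchmark class and the AverageMilliseconds name are hypothetical, and ProcessId is shown as the empty stand-in the text mentions.

```csharp
using System;
using System.Diagnostics;

public static class Benchmark
{
    // Runs `action` `runs` times and returns the mean wall-clock time in ms.
    public static double AverageMilliseconds(Action action, int runs = 100)
    {
        if (runs <= 0) throw new ArgumentOutOfRangeException(nameof(runs));

        double totalMs = 0;
        for (int i = 0; i < runs; i++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            action();
            watch.Stop();
            totalMs += watch.Elapsed.TotalMilliseconds;
        }
        return totalMs / runs;
    }

    // Empty stand-in for the per-record work performed in the tests.
    public static void ProcessId(string id) { }
}
```

    A reader method would then be timed with a call such as Benchmark.AverageMilliseconds(() => XmlReaderReader(path)), where path points at one of the test files.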

    XmlDocument.Load


    In my opinion, reading data this way is the simplest and easiest to understand. But, as we will see at the end, that simplicity comes at a very high cost (an implementation of this method without XPath is given at the end of the article, but its results are not very different). The method code is as follows:
    private static void XmlDocumentReader(string filename)
    {
      var doc = new XmlDocument();
      doc.Load(filename);
      XmlNodeList nodes = doc.SelectNodes("//item");
      if (nodes == null)
        throw new ApplicationException("invalid data");

      foreach (XmlNode node in nodes)
      {
        string id = node.Attributes["id"].Value;
        ProcessId(id);
      }
    }



    LINQ to XML


    LINQ to XML also keeps the implementation quite simple and straightforward.
    private static void XDocumentReader(string filename)
    {
      XDocument doc = XDocument.Load(filename);
      if (doc == null || doc.Root == null)
        throw new ApplicationException("invalid data");

      foreach (XElement child in doc.Root.Elements("item"))
      {
        XAttribute attr = child.Attribute("id");
        if (attr == null)
          throw new ApplicationException("invalid data");

        string id = attr.Value;
        ProcessId(id);
      }
    }



    XmlReader


    And finally, the last way to read data from XML is XmlTextReader, a concrete implementation of XmlReader. This approach is the hardest to understand: you move through the xml file from top to bottom (with no way of going back), and at each step you have to check whether the current node holds data you need. Accordingly, the method code looks like this:
    private static void XmlReaderReader(string filename)
    {
      using (var reader = new XmlTextReader(filename))
      {
        while (reader.Read())
        {
          if (reader.NodeType == XmlNodeType.Element)
          {
            if (reader.Name == "item")
            {
              reader.MoveToAttribute("id");
              string id = reader.Value;
              ProcessId(id);
            }
          }
        }
      }
    }


    * For simplicity, some checks were omitted in the methods.
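    As an illustration of what such a check might look like for the XmlReader variant, here is a minimal sketch that uses GetAttribute, which returns null when the attribute is missing. It is not the code that was benchmarked: the class name CheckedXmlReaderExample, the ReadIds method and the choice of InvalidDataException are assumptions, and the method takes a TextReader and returns the ids so that it can be tried without a file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class CheckedXmlReaderExample
{
    // Returns the id of every <item> element, throwing if one has no id.
    public static List<string> ReadIds(TextReader input)
    {
        var ids = new List<string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element || reader.Name != "item")
                    continue;

                // GetAttribute returns null when the attribute is absent
                // and leaves the reader positioned on the element.
                string id = reader.GetAttribute("id");
                if (id == null)
                    throw new InvalidDataException("item element without an id attribute");

                ids.Add(id);
            }
        }
        return ids;
    }
}
```

    The extra null check costs one comparison per element, so it should not change the overall picture, although that was not measured here.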

    Results for Desktop


    The test results are below. Each run was timed separately and the times were then averaged. All times in the table are in milliseconds.
    Records        1         10        100       1,000     10,000     100,000
    XmlDocument    0.59 ms   0.5 ms    0.67 ms   2.49 ms   21.73 ms   398.91 ms
    XmlReader      0.51 ms   0.47 ms   0.55 ms   1.31 ms   8.62 ms    79.65 ms
    LINQ to XML    0.57 ms   0.59 ms   0.64 ms   2.09 ms   15.6 ms    192.66 ms



    As the table shows, on large xml files (100,000 records) XmlReader is 2.42 times faster than LINQ to XML and about 5 times faster than XmlDocument!

    Testing on Windows Phone


    Now it is time to run the tests on the phone. Note that Windows Phone ships a reduced, Silverlight-based profile of the .NET Framework rather than the full desktop one, so the method using XmlDocument.Load is not available there, and the XmlReader code had to be rewritten slightly:
    private static void XmlReaderReader(string filename)
    {
      using (var reader = XmlReader.Create(filename)) {
        while (reader.Read()) {
          if (reader.NodeType == XmlNodeType.Element) {
            if (reader.Name == "item") {
              reader.MoveToAttribute("id");
              string id = reader.Value;
              ProcessId(id);
            }
          }
        }
      }
    }



    Results for Windows Phone


    Predictably, XmlReader turned out to be faster on the phone as well. But the size of its advantage on large files differs from the desktop: on the phone XmlReader is 1.91 times faster than LINQ to XML, compared with 2.42 times on the desktop.
    Records        1         10        100       1,000      10,000      100,000
    XmlReader      1.67 ms   1.74 ms   3.19 ms   19.5 ms    173.84 ms   1736.18 ms
    LINQ to XML    1.73 ms   2.21 ms   4.75 ms   31.39 ms   314.39 ms   3315.13 ms


    Figure: The difference in the speed of reading 100 elements from a file on Desktop and Windows Phone.

    Figure: The difference in the speed of reading 100,000 elements from a file on Desktop and Windows Phone.

    As you can see, the gap between the phone and the desktop is not constant but grows with the amount of data: with XmlReader the phone is about 5.8 times slower than the desktop at 100 records and about 21.8 times slower at 100,000. It would be interesting to know why.
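    The ratios quoted in this article can be recomputed from the averaged times in the two tables. The helper below is only a convenience for that arithmetic; the Ratios class and the Speedup name are hypothetical.

```csharp
using System;

public static class Ratios
{
    // How many times larger `slower` is than `faster`, rounded to 2 decimals.
    public static double Speedup(double slower, double faster)
    {
        if (faster <= 0) throw new ArgumentOutOfRangeException(nameof(faster));
        return Math.Round(slower / faster, 2);
    }
}
```

    With the 100,000-record figures, Ratios.Speedup(192.66, 79.65) gives 2.42 (LINQ to XML against XmlReader on the desktop) and Ratios.Speedup(3315.13, 1736.18) gives 1.91 (the same pair on the phone). Comparing platforms for XmlReader, Ratios.Speedup(3.19, 0.55) gives 5.8 at 100 records and Ratios.Speedup(1736.18, 79.65) gives 21.8 at 100,000.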

    Conclusion


    As we have seen, the fastest way to read data from xml is XmlReader, regardless of platform. Its drawback is a rather cumbersome way of fetching data: at every step we have to check which node the reader is positioned on.

    If performance is not critical for you and clarity and ease of maintenance matter more, LINQ to XML is the best fit. You should also try to avoid XmlDocument.Load in production projects because of its poor performance.
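    A possible middle ground, which was not part of the tests above and whose speed was therefore not measured here, is to walk the file with XmlReader and materialize only one item at a time as an XElement through XNode.ReadFrom. The sketch below assumes the same item/id layout as the test file; the StreamingItems class and the Stream method are hypothetical names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class StreamingItems
{
    // Lazily yields each <item> as an XElement without building the full tree.
    public static IEnumerable<XElement> Stream(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
                {
                    // ReadFrom consumes the element and leaves the reader on
                    // the node that follows it, so no extra Read() is needed.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

    Each yielded element can then be handled with the familiar LINQ to XML calls, for example (string)element.Attribute("id"), while only one item is held in memory at a time.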

    P.S. It is worth mentioning that this article inspired me to write all this.

    Update: at the suggestion of alex_rus, I made a test for XmlDocument without XPath. The results were better, but this method still remained the slowest.

    Table No. 3. Comparison of XmlDocument performance with and without XPath.
    Records                       1         10        100       1,000     10,000     100,000
    XmlDocument (with XPath)      0.59 ms   0.5 ms    0.67 ms   2.49 ms   21.73 ms   398.91 ms
    XmlDocument (without XPath)   0.56 ms   0.5 ms    0.65 ms   2.24 ms   19.47 ms   362.75 ms


    As can be seen from the table (and figure), performance improved by only about 10%, although some had suggested the gain would be much larger.

    The code for XmlDocument without XPath is below. I hope that knowledgeable people will point out where I went wrong, given that the processing speed increased by only about 10% rather than several times over.
    private static void XmlDocumentReader2(string filename)
    {
      var doc = new XmlDocument();
      doc.Load(filename);

      XmlElement root = doc.DocumentElement;
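      // The XmlElement cast assumes the root has only element children;
      // a comment or text node here would throw InvalidCastException.
      foreach (XmlElement el in root.ChildNodes)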
      {
        if (el.Name != "item") continue;

        string id = el.Attributes["id"].Value;
        ProcessId(id);
      }
    }

