
Generate OfficeOpenXML documents in 5 minutes
An ASP.NET application often needs to generate a report on the server in the OpenXML format.
There are several common ways to do this:
- "Found it, plugged it in, used it" - go to Google, look for a library that generates docx or xlsx, reference it, figure it out, generate. This is the familiar route, but it takes a long time.
- "Ugh" - use COM. This is not recommended: it requires Microsoft Office installed on the server, it is not really thread-safe, it does not get along with x64, and it is generally old-fashioned.
- "Hardcore" - learn the format and assemble the XML and the zip by hand. Brutal.
- "The Microsoft way" - this is the method described below.
A short introduction
Office Open XML (OOXML) is the format Word and Excel save to by default: docx and xlsx. Each file is a zip archive. You can rename it to .zip, open it with an archiver, and look at what is inside.
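The same inspection can be done from code. Below is a minimal sketch that lists the entries of a package, assuming .NET 4.5 or later for System.IO.Compression (newer than what the SDK 2.0 itself requires). The class name PackageInspector is made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of the Open XML SDK.
public static class PackageInspector
{
    // Returns the paths of all entries in an OOXML package (docx/xlsx),
    // treating the file as the plain zip archive that it is.
    public static List<string> ListParts(Stream packageStream)
    {
        var names = new List<string>();
        using (var zip = new ZipArchive(packageStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                names.Add(entry.FullName);
            }
        }
        return names;
    }

    // Convenience overload for a file on disk.
    public static List<string> ListParts(string path)
    {
        using (FileStream file = File.OpenRead(path))
        {
            return ListParts(file);
        }
    }
}
```

For a Word document the list would normally include [Content_Types].xml and word/document.xml, among others.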

Users readily understand reports in OOXML and can edit them with familiar tools. I would not recommend limiting a serious application to this format alone, but I do advise supporting it.
Preparation
We will need:
- Microsoft OpenXML SDK: www.microsoft.com/downloads/en/details.aspx?FamilyId=C6E744E5-36E9-45F5-8D8C-331DF206E0D0&displaylang=en (download the larger file)
- Microsoft Word
- The simplest C# application in Visual Studio

Let's go
We launch the Open XML SDK 2.0 Productivity Tool:

This tool is very simple and can do two small but important operations:
- Generate code from a document
- Compare documents at the XML level
Code generation
We load our document into the program and click “Reflect Code”:

On the left we see the structure of the document: the same files that are in the archive, along with a view of their contents.
Nodes in the tree can be selected: on the right you see the contents of the node as XML, plus the code that generates this particular piece. In my example, one paragraph from the body of the document is visible. It lives in word/document.xml.
If you select the root of the tree (the document itself), you get the code for the entire document.
Now let's use this code
- Create a project in Visual Studio. A simple C# console application will do
- Add a reference to the DocumentFormat.OpenXml assembly. I have it in the GAC. If you do not want to install it there, you can reference the file itself. It can be downloaded separately from the same page as OpenXMLSDKTool, via the OpenXMLSDKv2.msi link
- Add a reference to WindowsBase
- Add the file “GeneratedClass.cs”
- Copy the code from the tool's Reflected Code window into it
- Save and close the file, then go to Program.cs
- Write the body of the Main method:
new GeneratedCode.GeneratedClass().CreatePackage(@"D:\Temp\Output.docx");
- Run it
What's inside?
What is inside the generated class?
First, there is a single public method:
public void CreatePackage(string filePath) {
    using (WordprocessingDocument package = WordprocessingDocument.Create(filePath, WordprocessingDocumentType.Document)) {
        CreateParts(package);
    }
}
This is where the text that will be in the document is inserted:
private void GenerateMainDocumentPart1Content(MainDocumentPart mainDocumentPart1) {
    Run run2 = new Run() { RsidRunProperties = "00184031" };
    Text text2 = new Text();
    // o.O what was Yandex smoking? (auto-generated filler text; roughly: "Predicate calculus,
    // by definition, philosophically derives structuralism, altering familiar reality.")
    text2.Text = "Исчисление предикатов, по определению, философски выводит структурализм, изменяя привычную реальность.";
}
As the names of the private methods show, an OpenXml document consists of parts, and a separate method is generated for each part.
The most curious readers, of course, grinned wickedly and inserted a picture into the document.
Pictures are stored directly in this file as base64, here:
#region Binary Data
//...
#endregion
Finishing touches
Refactoring pictures and replacing static content with dynamic content will be left to the reader as an exercise.
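As a possible starting point for that exercise, here is a minimal hand-written sketch (not the code the Productivity Tool generates) that builds a document with one paragraph per input string. It assumes references to DocumentFormat.OpenXml and WindowsBase; the class name DynamicReport and its Build method are invented for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper: the names are illustrative, not produced by the tool.
public static class DynamicReport
{
    // Builds a docx in memory with one paragraph per input line.
    public static byte[] Build(IEnumerable<string> lines)
    {
        using (var stream = new MemoryStream())
        {
            using (WordprocessingDocument package =
                WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
            {
                MainDocumentPart mainPart = package.AddMainDocumentPart();
                var body = new Body();
                foreach (string line in lines)
                {
                    // Paragraph -> Run -> Text mirrors the XML seen in word/document.xml.
                    body.AppendChild(new Paragraph(new Run(new Text(line))));
                }
                mainPart.Document = new Document(body);
            } // disposing the package saves it into the stream
            return stream.ToArray();
        }
    }
}
```

In a real report you would more likely keep the generated styles and formatting and swap only the Text values for data, but the structure is the same.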
And here is a method that produces a byte array instead of a file, so the document can be returned to the client from ASP.NET without temporary files:
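public byte[] CreatePackageAsBytes() {
    using (var mstm = new MemoryStream()) {
        using (WordprocessingDocument package = WordprocessingDocument.Create(mstm, WordprocessingDocumentType.Document)) {
            CreateParts(package);
        } // disposing the package writes it into the stream
        // No explicit Flush/Close needed: the using block disposes the stream,
        // and ToArray returns a copy of everything written so far
        return mstm.ToArray();
    }
}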
That's it: the code for generating the report in docx format is ready.
What remains is to replace the static content with dynamic content (we did not do all this just to serve the same document every time, right?) and to add a "Download in Word format" link to the page.
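One way the download link could be served from classic ASP.NET is sketched below, assuming a System.Web application. DocxDownload and its members are hypothetical names; only the HttpResponse calls and the docx MIME type are standard.

```csharp
using System.Web;

// Hypothetical helper for sending a generated docx to the browser.
public static class DocxDownload
{
    // Registered MIME type for .docx files.
    public const string ContentType =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

    // Builds the Content-Disposition value. Assumes a plain ASCII file name;
    // double quotes are dropped so the header stays well-formed.
    public static string ContentDisposition(string fileName)
    {
        return "attachment; filename=\"" + fileName.Replace("\"", "") + "\"";
    }

    // Writes the bytes to the response as a file download and ends the request.
    public static void Write(HttpResponse response, byte[] content, string fileName)
    {
        response.Clear();
        response.ContentType = ContentType;
        response.AddHeader("Content-Disposition", ContentDisposition(fileName));
        response.BinaryWrite(content);
        response.End();
    }
}
```

From a page or handler this might be called as DocxDownload.Write(Response, new GeneratedCode.GeneratedClass().CreatePackageAsBytes(), "Report.docx").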
Document Comparison
So, we generated code from the document, added a lot of data to it, refactored it, and shipped it to production. Now we need to change the font and the text in the report. How do we do that? There is a lot of code, and searching through it would take a long time.
It turns out to be very simple: the document comparison feature will help us:
- Put the old and new documents side by side
- Open the Open XML Productivity Tool and select "Compare files ...":
- Open the files and click OK.
Here is the result of the comparison. You can click the lines with the file names to see exactly what the differences are:
In More Options you choose what to ignore when comparing.
View Part Code shows the code for the part whose XML you are looking at.
From there, matching the XML to the code is no trouble.
By the way, this feature is also very handy if you are just getting acquainted with the OpenXML format: add something to the document and see what has changed. It will help those who chose the do-it-by-hand approach (assembling the XML and the zip yourself) mentioned at the beginning of the article.
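For a quick check outside the tool, the idea can be approximated in code. The sketch below is a much cruder substitute for the Productivity Tool's XML-level diff: it compares two packages byte by byte, entry by entry, and only reports which parts differ. It assumes .NET 4.5 or later for System.IO.Compression; PackageDiff is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper: a byte-level comparison, not the tool's XML diff.
public static class PackageDiff
{
    // Reads every zip entry into memory, keyed by its path inside the package.
    private static Dictionary<string, byte[]> ReadParts(Stream package)
    {
        var parts = new Dictionary<string, byte[]>(StringComparer.Ordinal);
        using (var zip = new ZipArchive(package, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                using (Stream source = entry.Open())
                using (var buffer = new MemoryStream())
                {
                    source.CopyTo(buffer);
                    parts[entry.FullName] = buffer.ToArray();
                }
            }
        }
        return parts;
    }

    // Returns the sorted names of parts that were added, removed, or changed.
    public static List<string> ChangedParts(Stream oldPackage, Stream newPackage)
    {
        Dictionary<string, byte[]> oldParts = ReadParts(oldPackage);
        Dictionary<string, byte[]> newParts = ReadParts(newPackage);
        var changed = new List<string>();
        foreach (string name in oldParts.Keys.Union(newParts.Keys))
        {
            byte[] before, after;
            bool inOld = oldParts.TryGetValue(name, out before);
            bool inNew = newParts.TryGetValue(name, out after);
            if (!inOld || !inNew || !before.SequenceEqual(after))
            {
                changed.Add(name);
            }
        }
        changed.Sort(StringComparer.Ordinal);
        return changed;
    }
}
```

This only narrows down which part to look at; to see the actual XML differences and the matching code, the tool's comparison view is still the better option.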
Facts
- It works with xlsx just as it does with docx
- If the docx contains a graph or a chart, everything still works
- This is just a strongly-typed wrapper over the System.IO.Packaging library
- The server does not need anything except this library
- No problems with x64
- High performance
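To illustrate the first point, here is a minimal hand-written sketch of a workbook with one sheet and one row of text cells. As with Word, in practice you would reflect the code from a real xlsx with the Productivity Tool. The class name MinimalWorkbook is invented, and the sketch assumes references to DocumentFormat.OpenXml and WindowsBase.

```csharp
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper: the names are illustrative, not produced by the tool.
public static class MinimalWorkbook
{
    // Builds an xlsx in memory with a single sheet holding one row of text cells.
    public static byte[] Build(string sheetName, string[] cellTexts)
    {
        using (var stream = new MemoryStream())
        {
            using (SpreadsheetDocument package =
                SpreadsheetDocument.Create(stream, SpreadsheetDocumentType.Workbook))
            {
                WorkbookPart workbookPart = package.AddWorkbookPart();
                workbookPart.Workbook = new Workbook();

                // The cell data lives in its own part, just like word/document.xml.
                WorksheetPart worksheetPart = workbookPart.AddNewPart<WorksheetPart>();
                var row = new Row();
                foreach (string text in cellTexts)
                {
                    row.AppendChild(new Cell
                    {
                        DataType = CellValues.InlineString,
                        InlineString = new InlineString(new Text(text))
                    });
                }
                worksheetPart.Worksheet = new Worksheet(new SheetData(row));

                // Register the sheet in the workbook by relationship id.
                Sheets sheets = workbookPart.Workbook.AppendChild(new Sheets());
                sheets.AppendChild(new Sheet
                {
                    Id = workbookPart.GetIdOfPart(worksheetPart),
                    SheetId = 1,
                    Name = sheetName
                });
            } // disposing the package saves it into the stream
            return stream.ToArray();
        }
    }
}
```

Inline strings keep the sketch short; code reflected from a file saved by Excel will typically use a shared string table instead.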
Conclusions
I believe that using DocumentFormat.OpenXml to generate reports in web applications is the right choice. The tooling that ships with the SDK saves you from wasting time.
What to read
About OpenXML SDK: msdn.microsoft.com/en-us/library/bb448854(office.14).aspx
About OpenXML (if anyone is not familiar with it): en.wikipedia.org/wiki/Office_Open_XML
Good luck! Thanks for your attention.