When HTML is stored in a DB table...

Mar 29, 2012 at 3:11 PM

Thank you to all who have contributed to this code set. This has really helped my project out and I wanted to contribute a few things I have found.

My product stores reports in HTML format as bytes in a database table. My end users wanted to work on those reports in Word, which is what brings me here. I wanted to share how I used HTML to OpenXml in case others have the same need.

In C#, I first pull the contents of the HTML out of my database into a byte[]. Then I run it through the following:

using System;
using System.IO;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using NotesFor.HtmlToOpenXml; //Namespace of HTML to OpenXml; it may differ between library versions.

        public byte[] ConvertReport(byte[] content)
        {
            try
            {
                //First, I pull a blank .docx I use as a template (templatePath is defined elsewhere in my class).
                byte[] template = File.ReadAllBytes(templatePath);
                using (MemoryStream generatedDocument = new MemoryStream())
                {
                    //I copy this template into the new generated document.
                    generatedDocument.Write(template, 0, template.Length);
                    //Then I use "HTML to OpenXml" to convert the HTML.
                    using (WordprocessingDocument package = WordprocessingDocument.Open(generatedDocument, true))
                    {
                        MainDocumentPart mainPart = package.MainDocumentPart;
                        if (mainPart == null)
                        {
                            mainPart = package.AddMainDocumentPart();
                            new Document(new Body()).Save(mainPart);
                        }

                        HtmlConverter converter = new HtmlConverter(mainPart);
                        converter.ImageProcessing = ImageProcessing.ManualProvisioning;
                        //converter_ProvisionImage is my own image handler (not shown here).
                        converter.ProvisionImage += converter_ProvisionImage;

                        Body body = mainPart.Document.Body;

                        var paragraphs = converter.Parse(Encoding.UTF8.GetString(content));
                        for (int i = 0; i < paragraphs.Count; i++)
                        {
                            //Append each converted element to the document body.
                            body.Append(paragraphs[i]);
                        }

                        mainPart.Document.Save();
                    }

                    //I convert it back to a byte[], which is then used in my export to file code.
                    byte[] outputData = generatedDocument.ToArray();

                    return outputData;
                }
            }
            catch (Exception ex)
            {
                //This is where your error trapping code would go.
                throw; //Rethrow (or return a fallback value) so that every code path returns or throws.
            }
        }

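One thing that may be worth checking if you copy this: Encoding.UTF8.GetString(content) does not strip a UTF-8 byte order mark, so if the stored bytes begin with one, the string handed to the converter starts with an invisible U+FEFF character. Below is a minimal sketch of a BOM-aware decode, assuming the reports are stored as UTF-8 unless a byte order mark says otherwise. The names HtmlBytes and Decode are hypothetical helpers of my own choosing, not part of HTML to OpenXml.

```csharp
using System.IO;
using System.Text;

public static class HtmlBytes
{
    // Hypothetical helper (not part of HTML to OpenXml).
    // Decodes stored HTML bytes to a string. StreamReader consumes a leading
    // byte order mark if one is present; without a BOM it falls back to UTF-8,
    // which is an assumption about how the reports were saved.
    public static string Decode(byte[] content)
    {
        using (var stream = new MemoryStream(content))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With something like this in place, the Parse call could take HtmlBytes.Decode(content) instead of Encoding.UTF8.GetString(content). If your reports never carry a byte order mark, the original call behaves the same.
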
I hope this helps anybody in a similar situation.