Multiple nested <table> elements produce invalid OpenXML

Sep 15, 2012 at 10:21 PM
Edited Sep 15, 2012 at 10:26 PM

I have an HTML document that has multiple nested tables in the body, similar to the following:

        <table align="center">
            <tbody>
                <tr>
                    <td bgcolor="white"><font size="2"><b><u>Test</u></b></font><font size="3"
                        face="Times New Roman">
</font>
                        <p>
                            <br>
                            <table>
                                <tbody>
                                    <tr valign="top">
                                        <td>
                                            <table width="100%">
                                                <tbody>
                                                    <tr valign="top">
                                                        <td width="100%"><font size="3" face="Times New Roman">Test<br>
<span style="WHITE-SPACE: nowrap" class="baec5a81-e4d6-4674-97f3-e9220f0136c1">Test<a style="BORDER-BOTTOM: medium none; POSITION: static !important; BORDER-LEFT: medium none; MARGIN: 0px; WIDTH: 16px; BOTTOM: 0px; DISPLAY: inline; WHITE-SPACE: nowrap; FLOAT: none; HEIGHT: 16px; VERTICAL-ALIGN: middle; OVERFLOW: hidden; BORDER-TOP: medium none; CURSOR: hand; RIGHT: 0px; BORDER-RIGHT: medium none; TOP: 0px; LEFT: 0px" title="Test" href="#"><img style="BORDER-BOTTOM: medium none; POSITION: static !important; BORDER-LEFT: medium none; MARGIN: 0px; WIDTH: 16px; BOTTOM: 0px; DISPLAY: inline; WHITE-SPACE: nowrap; FLOAT: none; HEIGHT: 16px; VERTICAL-ALIGN: middle; OVERFLOW: hidden; BORDER-TOP: medium none; CURSOR: hand; RIGHT: 0px; BORDER-RIGHT: medium none; TOP: 0px; LEFT: 0px" title="Call: (201) 451-5200" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAIGNIUk0AAHolAACAgwAA&#43;f8AAIDpAAB1MAAA6mAAADqYAAAXb5JfxUYAAAKLSURBVHjadJPfS5NhFMe/21xvuhXRyJAZroiSrJnbRdT7vrAf5HBaK5RABmEEwQIvkpZ/QRcWXdSFw5soKaF0F7qZeLO13mGBDpQsf5CoxVKHOt0Pctp2uvEdrzG/V&#43;c553w/54HnPDIiQiGpPMETABoB2AAYd9MRAMMAvGmX&#43;RcAyAoBVJ7gZQDtABworH4AHWmX&#43;bOMZdkjCoXiUzabvcAwzPSsob5p/VTNY9GcdpnxdmYZ9wJThSCtCr1e/4XjuNPd3d1KjUZzaGbI27ysqzGQoggAsLa1A7ehArrDxfDNr0oBlQB&#43;wmKxbJFEL968SxoamsjkHaPU9l9piUo6A0RE1DG2QCWdASrpDAzJM5kMI8XecdjVxfEl&#43;K9dxFgsgUvvR6HyBKHyBAEATyKLeGSsENuNcqk5kUjEGm7fzcYqr0ClVODl99&#43;YXEvl6&#43;c1amjVe&#43;ahiGGYaUEQKnmeh91uL43rqheixjpdmzCL11er0PcjhrTLvMfUJsyKYUSeyWQ6enp6tgCgrKxsfbP8bB8AdE1G89cOReMAgOv&#43;Cag8QXRNRkXAsDwcDr&#43;am5tLCYKA3t7eo2dG&#43;1vVK/MfpRPtA&#43;MIReMYaKj&#43;/xm9MiICx3EmpVL5wefzFavValis1u1vvHMkdfykCQC0kSGUTo&#43;Ajmnx1dSC7IGD&#43;UUCEYGIwLKsyWazrSeTSSIiMpnNf7Ttz5&#43;ec96fr7/VnE0mk&#43;QfHMzV3WjcKH/4rEr05QGFIA6HY4llWRLPRER&#43;v3/HYrFMFQSIkNra2tVQKJSlfcSyLO0LECFWq3XF6XRGA4HAptTsdrsXeZ6fEHtl&#43;31nAOA4rkUulz/I5XL63dQGgHEAN8Ph8AYA/BsAt4ube4GblQIAAAAASUVORK5CYII="></a></span></font>
                                                        </td>
                                                    </tr>
                                                </tbody>
                                            </table>
                                            <br>
                                        </td>
                                    </tr>
                                </tbody>
                            </table>
                            <br>
                        </p>
                    </td>
                </tr>
            </tbody>
        </table>

After converting to OpenXML, I am unable to open the document and get the error "Possible missing paragraph element. <p> elements are required before every </tc>". I have narrowed this down to the generated OpenXML not having an empty paragraph between the innermost table's closing </w:tbl> and the closing </w:tc> of the cell that contains it. I have fixed this locally by modifying the ProcessClosingTable method of the HtmlConverter class. I removed the if condition from the last statement of the method, shown here as it was before my change:

if (!tables.HasContext)
    this.AddParagraph(currentParagraph = htmlStyles.Paragraph.NewParagraph());
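
The rule behind the error above can also be checked on the generated XML itself, independently of the converter. The following is only a rough sketch and is not part of HtmlConverter: it assumes the requirement is simply "every w:tc must end with a w:p", it works on the raw markup with System.Xml.Linq rather than the OpenXML SDK, and the names TableCellFixer and EnsureTrailingParagraphs are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the library discussed in this thread.
public static class TableCellFixer
{
    // Main WordprocessingML namespace (the "w:" prefix).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Appends an empty <w:p/> to every <w:tc> whose last child element
    // is not already a paragraph. Returns the number of cells patched.
    public static int EnsureTrailingParagraphs(XElement root)
    {
        // Copy the cells to a list first so the tree is not modified
        // while it is still being enumerated.
        var cells = root.DescendantsAndSelf(W + "tc").ToList();
        int patched = 0;

        foreach (var cell in cells)
        {
            XElement last = cell.Elements().LastOrDefault();

            // Covers both an empty cell and a cell that ends with
            // something else, such as a nested <w:tbl>.
            if (last == null || last.Name != W + "p")
            {
                cell.Add(new XElement(W + "p"));
                patched++;
            }
        }

        return patched;
    }
}
```

Running this over the document body after conversion should have the same effect as the unconditional AddParagraph call for the nested table case, although it has only been reasoned through against the structure described above and not against every kind of cell content.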

As with all of my postings here, please let me know if this is more of an issue than a solution, since I don't have a deep understanding of the underpinnings of OpenXML.