
Trouble with tables with complex spans

Apr 4, 2016 at 11:35 PM
Hi there--

First of all, thank you for your work on this! It's come in very handy.

However, we're having trouble with tables that have complex rowspans, such as:
<html>
    <body>
        <table border="1">
            <tr>
                <td rowspan="3"><p>a1</p></td>
                <td><p>a2</p></td>
                <td rowspan="3"><p>a3</p></td>
                <td><p>a4</p></td>
            </tr>
            <tr>
                <td rowspan="2"><p>b1</p></td>
                <td><p>b2</p></td>
            </tr>
            <tr>
                <td><p>c1</p></td>
            </tr>
        </table>
    </body>
</html>
In a browser, this looks like

[Image: the table as rendered in a browser]

After parsing this, the table in the generated document looks like

[Image: the table in the generated document]

This happens with both the 1.6 release and the most recent snapshot (commit 90904). I think the issue is in how HtmlConverter.ProcessClosingTableRow decides which columns the empty cells get inserted into. I was wondering whether an alternative approach might work: first collect the column indices that should end up empty (taking colspans into account), and then insert empty cells into the current row at those indices from left to right?
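
To make the idea concrete, here is a rough sketch in Python rather than the library's C#. It is not based on the actual HtmlConverter code: the tuple-based cell model and the place_rows name are made up for illustration, and it assumes spans never overlap. It tracks which columns are still covered by a rowspan from an earlier row, then walks each row left to right, emitting an empty cell at every covered column.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical cell model for illustration only (not HtmlToOpenXml types).
Cell = Tuple[str, int, int]              # (text, rowspan, colspan)
Placed = Tuple[int, Optional[str], int]  # (column index, text or None, colspan)


def place_rows(rows: List[List[Cell]]) -> List[List[Placed]]:
    """Assign a column index to every cell; None marks an inserted empty cell."""
    # start column -> (rows still to cover, colspan) for spans opened above
    pending: Dict[int, Tuple[int, int]] = {}
    result: List[List[Placed]] = []
    for cells in rows:
        carried = pending  # spans inherited from earlier rows
        pending = {}       # spans that continue into the next row
        placed: List[Placed] = []
        queue = list(cells)
        col = 0
        # Keep going while real cells remain or a carried span lies to the right.
        while queue or any(start >= col for start in carried):
            if col in carried:
                # Column is covered by a rowspan from above: emit an empty cell.
                remaining, colspan = carried[col]
                placed.append((col, None, colspan))
                if remaining > 1:
                    pending[col] = (remaining - 1, colspan)
            elif queue:
                # Free column: place the next real cell from the HTML row.
                text, rowspan, colspan = queue.pop(0)
                placed.append((col, text, colspan))
                if rowspan > 1:
                    pending[col] = (rowspan - 1, colspan)
            else:
                # No cell left but a span still lies further right: pad the gap.
                colspan = 1
                placed.append((col, None, colspan))
            col += colspan
        result.append(placed)
    return result


# The table from this post, as (text, rowspan, colspan) tuples.
rows = [
    [("a1", 3, 1), ("a2", 1, 1), ("a3", 3, 1), ("a4", 1, 1)],
    [("b1", 2, 1), ("b2", 1, 1)],
    [("c1", 1, 1)],
]
for row in place_rows(rows):
    print([text for _, text, _ in row])
# ['a1', 'a2', 'a3', 'a4']
# [None, 'b1', None, 'b2']
# [None, None, None, 'c1']
```

With this ordering, b1 and b2 end up in the second and fourth columns and c1 in the fourth, which matches what the browser shows.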

Thanks,
Sam