Project Description
A library that converts simple or advanced HTML to a plain OpenXml document.
Supported Html tags
Refer to w3schools’ tag list to see their meaning.
- <a>
- <h1> to <h6>
- <abbr> and <acronym>
- <b>, <i>, <u>, <s>, <del>, <ins>, <em>, <strike>, <strong>
- <br> and <hr>
- <img>
- <table>, <td>, <tr>, <th>, <tbody>, <thead> and <caption>
- <cite>
- <div>, <span>, <font> and <p>
- <pre>
- <sub> and <sup>
- <ul>, <ol> and <li>
- <dd> and <dt>
- <q> and <blockquote> (since 1.5)
JavaScript (<script>), CSS (<style>), <meta> and other unsupported tags do not generate an error; they are simply ignored.
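To illustrate what "ignored" can mean in practice, here is a minimal Python sketch. It is not the library's code: the `SUPPORTED` set is only a subset of the list above, `drop_unsupported` is a hypothetical helper, and the choice to discard the content of <script> and <style> while keeping the inner text of other unknown tags is an assumption, not documented behavior.

```python
import re

# Illustrative subset of the supported tags listed above (not exhaustive).
SUPPORTED = {
    "a", "b", "i", "u", "s", "em", "strong", "p", "div", "span", "font",
    "br", "hr", "img", "pre", "sub", "sup", "ul", "ol", "li",
    "table", "tr", "td", "th", "tbody", "thead", "caption",
    "h1", "h2", "h3", "h4", "h5", "h6",
}

# Assumption: <script> and <style> are dropped together with their content.
RAW_BLOCK_RE = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL
)
# Any opening or closing tag; group 1 captures the tag name.
ANY_TAG_RE = re.compile(r"</?\s*([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def drop_unsupported(html):
    """Remove unsupported tags silently instead of raising an error."""
    html = RAW_BLOCK_RE.sub("", html)

    def keep_or_drop(match):
        # Keep the tag only if its lower-cased name is in the supported set.
        return match.group(0) if match.group(1).lower() in SUPPORTED else ""

    return ANY_TAG_RE.sub(keep_or_drop, html)


print(drop_unsupported("<p>Hi<script>alert(1)</script> <video>there</video></p>"))
```

In this sketch the unknown <video> tag disappears but its text survives, whereas the script body is removed entirely; the real converter may treat either case differently.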
Tolerance for badly formed HTML
The HTML is parsed with a custom Regex-based enumerator. The following irregularities are tolerated:
| Tolerance | Samples |
| Ignore case | <span>Some text</SPAN> |
| Missing closing tag or invalid tag position | <i>Here<b> is </i> some</b> bad formed html. |
| No need to be XHTML compliant | Both <br> and <br/> are valid |
| Color | red, #ff0000 and ff0000 all denote the color red |
| Attributes | <table id=table1> or <table id="table1"> |
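The tolerances in the table can be pictured with a small regex-based tokenizer. The Python sketch below is an illustration only, not the library's enumerator: `TAG_RE`, `ATTR_RE`, `tokenize`, `normalize_color` and the tiny `NAMED_COLORS` table are all hypothetical names introduced here. It covers case-insensitive tag names, <br> versus <br/>, quoted and unquoted attributes, and the three color spellings. It does not attempt to repair mismatched nesting such as the <i>/<b> sample; a consumer of the tokens would have to do that.

```python
import re

# Opening, closing, or self-closing tag: (slash, name, raw attributes, slash).
TAG_RE = re.compile(r"<\s*(/?)\s*([a-zA-Z][a-zA-Z0-9]*)([^<>]*?)(/?)\s*>")
# Attribute value may be double-quoted, single-quoted, or bare.
ATTR_RE = re.compile(
    r"""([a-zA-Z_:][\w:.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>/]+))"""
)

# Hypothetical, deliberately tiny table of named colors.
NAMED_COLORS = {"red": "ff0000", "green": "008000", "blue": "0000ff"}


def tokenize(html):
    """Yield (kind, value, attrs) tuples; kind is text/open/close/selfclose."""
    pos = 0
    for match in TAG_RE.finditer(html):
        if match.start() > pos:
            yield ("text", html[pos:match.start()], {})
        closing, name, raw_attrs, self_closing = match.groups()
        attrs = {
            key.lower(): dq or sq or bare
            for key, dq, sq, bare in ATTR_RE.findall(raw_attrs)
        }
        kind = "close" if closing else "selfclose" if self_closing else "open"
        # Tag names are lower-cased so <SPAN> and <span> compare equal.
        yield (kind, name.lower(), attrs)
        pos = match.end()
    if pos < len(html):
        yield ("text", html[pos:], {})


def normalize_color(value):
    """Return a six-digit lower-case hex string, or None if unrecognized."""
    value = value.strip().lower()
    if value in NAMED_COLORS:
        return NAMED_COLORS[value]
    value = value.lstrip("#")
    return value if re.fullmatch(r"[0-9a-f]{6}", value) else None


for token in tokenize('<table id=table1><SPAN>Some text</span><br/>'):
    print(token)
```

Lower-casing names at tokenization time keeps every later comparison simple, and accepting a bare attribute value as a third alternative is what lets `id=table1` and `id="table1"` produce the same result.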
Dependencies
Uses the OpenXml SDK 2.0.
Documentation
Don't forget to visit the documentation and drop me your feedback!