Better documentation & errors when facing HTML rendering limitations for `<table>` tags - close #845 (#852)
Lucas-C authored Jul 11, 2023
1 parent faca4c0 commit 40e1b7a
Showing 5 changed files with 58 additions and 3 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -22,6 +22,7 @@ This can also be enabled programmatically with `warnings.simplefilter('default',
- [`FPDF.table()`](https://pyfpdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.table): new optional parameters `gutter_height`, `gutter_width` and `wrapmode`. Links can also be added to cells by passing a `link` parameter to [`Row.cell()`](https://pyfpdf.github.io/fpdf2/fpdf/table.html#fpdf.table.Row.cell)
- [`FPDF.multi_cell()`](https://pyfpdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.multi_cell): has a new optional `center` parameter to position the cell horizontally at the center of the page
- Added Tutorial in Khmer language - thanks to @kuth-chi
- Better documentation & errors when facing HTML rendering limitations for `<table>` tags: <https://pyfpdf.github.io/fpdf2/HTML.html>
### Fixed
- [`FPDF.table()`](https://pyfpdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.table): the `colspan` setting has been fixed - [documentation](https://pyfpdf.github.io/fpdf2/Tables.html#column-span)
- [`FPDF.image()`](https://pyfpdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.image): allowing images path starting with `data` to be passed as input
17 changes: 14 additions & 3 deletions docs/HTML.md
@@ -1,4 +1,4 @@
# HTML #
# HTML

`fpdf2` supports basic rendering from HTML.

@@ -12,9 +12,9 @@ you may want to check [Reportlab](https://www.reportlab.com) (or [xhtml2pdf](htt
or [borb](https://github.com/jorisschellekens/borb-examples/#76-exporting-html-as-pdf).


## write_html usage example ##
## write_html usage example

HTML rendering require the use of `write_html` method:
HTML rendering requires the use of [`FPDF.write_html()`](https://pyfpdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html):

```python
from fpdf import FPDF
@@ -91,3 +91,14 @@ pdf.output("html.pdf")
+ `<tr>`: rows (with `align`, `bgcolor` attributes)
+ `<th>`: heading cells (with `align`, `bgcolor`, `width` attributes)
* `<td>`: cells (with `align`, `bgcolor`, `width` attributes)


## Known limitations

The `fpdf2` HTML renderer does not support many configurations of nested tags.
For example:
* `<center>` cannot be used as a parent for several elements - _cf._ [issue #640](https://github.com/PyFPDF/fpdf2/issues/640)
* `<table>` cells can contain `<td><b><em>nested tags forming a single text block</em></b></td>`, but **not** `<td><b>arbitrarily</b> nested <em>tags</em></td>` - _cf._ [issue #845](https://github.com/PyFPDF/fpdf2/issues/845)

You can also check the currently open GitHub issues with the tag `html`:
https://github.com/PyFPDF/fpdf2/issues?q=is%3Aopen+label%3Ahtml
12 changes: 12 additions & 0 deletions fpdf/fpdf.py
@@ -391,6 +391,18 @@ def write_html(self, text, *args, **kwargs):
"""
Parse HTML and convert it to PDF.
cf. https://pyfpdf.github.io/fpdf2/HTML.html
Args:
text (str): HTML content to render
image_map (function): an optional one-argument function that maps <img> "src"
to new image URLs
li_tag_indent (int): numeric indentation of <li> elements
dd_tag_indent (int): numeric indentation of <dd> elements
table_line_separators (bool): enable horizontal line separators in <table>
ul_bullet_char (str): bullet character for <ul> elements
heading_sizes (dict): font size per heading level names ("h1", "h2"...)
pre_code_font (str): font to use for <pre> & <code> blocks
warn_on_tags_not_matching (bool): control warnings production for unmatched HTML tags
"""
kwargs2 = vars(self)
# Method arguments must override class & instance attributes:
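
As an illustration of the `image_map` argument documented in the docstring above, here is a minimal sketch of a one-argument function that rewrites `<img>` "src" values. The helper name `make_image_map` and the base URL are hypothetical and not part of `fpdf2`; only the "one argument in, new URL out" contract comes from the docstring.

```python
from urllib.parse import urljoin


def make_image_map(base_url):
    """Build a one-argument function suitable for the image_map parameter."""

    def image_map(src):
        # Relative "src" values are resolved against base_url;
        # absolute URLs are returned unchanged by urljoin.
        return urljoin(base_url, src)

    return image_map


# Hypothetical base URL, used purely for illustration:
image_map = make_image_map("https://example.com/assets/")
print(image_map("logo.png"))
# With fpdf2 installed, the function could then be passed as:
#   pdf.write_html(html, image_map=image_map)
```

The commented-out `write_html` call is not executed here, so the sketch runs without `fpdf2` installed.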
15 changes: 15 additions & 0 deletions fpdf/html.py
@@ -219,6 +219,9 @@ def __init__(
dd_tag_indent (int): numeric indentation of <dd> elements
table_line_separators (bool): enable horizontal line separators in <table>
ul_bullet_char (str): bullet character for <ul> elements
heading_sizes (dict): font size per heading level names ("h1", "h2"...)
pre_code_font (str): font to use for <pre> & <code> blocks
warn_on_tags_not_matching (bool): control warnings production for unmatched HTML tags
"""
super().__init__()
self.pdf = pdf
@@ -262,6 +265,17 @@ def handle_data(self, data):
data = data.strip()
if not data:
return
if "inserted" in self.td_th:
tag = self.td_th["tag"]
raise NotImplementedError(
f"Unsupported nested HTML tags inside <{tag}> element"
)
# We could potentially support nested <b> / <em> / <font> tags
# by building a list of Fragment instances from the HTML cell content
# and then passing those fragments to Row.cell().
# However there should be an incoming refactoring of this code
# dedicated to text layout, and we should probably wait for that
# before supporting this feature.
align = self.td_th.get("align", self.tr.get("align"))
if align:
align = align.upper()
@@ -454,6 +468,7 @@ def handle_starttag(self, tag, attrs):
if not self.table_row:
raise FPDFException(f"Invalid HTML: <{tag}> used outside any <tr>")
self.td_th = {k.lower(): v for k, v in attrs.items()}
self.td_th["tag"] = tag
if tag == "th":
self.td_th["align"] = "CENTER"
self.td_th["b"] = True
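
The two `fpdf/html.py` hunks work together: `handle_starttag` records the cell's tag name, and `handle_data` refuses a second text block within the same cell. The following is a simplified, standalone sketch of that guard built on the standard library's `HTMLParser`. It is not the actual `HTML2FPDF` class: the names `CellGuardParser` and `parse_cells` are hypothetical, and the place where the `"inserted"` flag gets set is an assumption, since that part of the code is not shown in the diff.

```python
from html.parser import HTMLParser


class CellGuardParser(HTMLParser):
    """Toy parser: accepts one text block per <td>/<th>, rejects a second one."""

    def __init__(self):
        super().__init__()
        self.td_th = None  # attributes of the cell being parsed, or None
        self.cells = []  # text collected for each cell

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self.td_th = {k.lower(): v for k, v in attrs}
            self.td_th["tag"] = tag  # remembered for the error message

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self.td_th = None

    def handle_data(self, data):
        data = data.strip()
        if not data or self.td_th is None:
            return
        if "inserted" in self.td_th:
            # A text block was already emitted for this cell
            tag = self.td_th["tag"]
            raise NotImplementedError(
                f"Unsupported nested HTML tags inside <{tag}> element"
            )
        self.cells.append(data)
        # Assumption: the flag is set once the cell's first text block is handled
        self.td_th["inserted"] = True


def parse_cells(html):
    """Return the text of each table cell, or raise on unsupported nesting."""
    parser = CellGuardParser()
    parser.feed(html)
    parser.close()
    return parser.cells


supported = "<table><tr><td><b><em>single block</em></b></td></tr></table>"
unsupported = "<table><tr><td><b>arbitrarily</b> nested <em>tags</em></td></tr></table>"

print(parse_cells(supported))
try:
    parse_cells(unsupported)
except NotImplementedError as error:
    print(error)
```

In the unsupported case the parser sees three separate text chunks inside one `<td>`, so the second chunk trips the guard. This matches the distinction drawn in the "Known limitations" section of `docs/HTML.md`.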
16 changes: 16 additions & 0 deletions test/html/test_html_table.py
@@ -278,3 +278,19 @@ def test_html_table_invalid(caplog):
pdf.write_html("<tr></tr>")
assert str(error.value) == "Invalid HTML: <tr> used outside any <table>"
assert caplog.text == ""


def test_html_table_with_nested_tags():  # issue 845
    pdf = FPDF()
    pdf.set_font_size(24)
    pdf.add_page()
    with pytest.raises(NotImplementedError):
        pdf.write_html(
            """<table><tr>
            <th>LEFT</th>
            <th>RIGHT</th>
        </tr><tr>
            <td><font size=7>This is supported</font></td>
            <td>This <font size=20>is not</font> <b>supported</b></td>
        </tr></table>"""
        )
