🐛 Annotated HTML tags are not escaped in displaCy span renderer output #12816

Closed
connorbrinton opened this issue Jul 11, 2023 · 2 comments · Fixed by #12817
Labels
bug Bugs and behaviour differing from documentation feat / visualizers Feature: Built-in displaCy and other visualizers

Comments

@connorbrinton
Contributor

connorbrinton commented Jul 11, 2023

How to reproduce the behaviour

Here's a test that currently fails:

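from spacy import displacy
from spacy.tokens import Doc, Span


# en_vocab is assumed to be the pytest fixture provided by spaCy's test suite
def test_span_escaping(en_vocab) -> None: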
    """Test that displaCy's span visualizer escapes annotated HTML tags correctly."""
    # Create a doc containing an annotated word and an unannotated HTML tag
    doc = Doc(en_vocab, words=["test", "<TEST>"])
    doc.spans["sc"] = [Span(doc, 0, 1, label="test")]

    # Verify that the HTML tag is escaped when unannotated
    html = displacy.render(doc, style="span")
    assert "&lt;TEST&gt;" in html

    # Annotate the HTML tag
    doc.spans["sc"].append(Span(doc, 1, 2, label="test"))

    # Verify that the HTML tag is still escaped
    html = displacy.render(doc, style="span")
    assert "&lt;TEST&gt;" in html

The test fails on the final assertion because the displaCy span renderer does not escape the text of annotated tokens. Wrapping the token text in a call to escape_html on this line of the renderer fixes the issue:

text=token["text"],
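To illustrate the kind of change meant here, the following is a minimal sketch rather than spaCy's actual renderer code. `TOKEN_TEMPLATE` and `render_token` are hypothetical names invented for this example, and the standard library's `html.escape` stands in for spaCy's `escape_html` helper.

```python
from html import escape

# Hypothetical stand-in for a renderer template; not spaCy's actual markup.
TOKEN_TEMPLATE = '<span class="token">{text}</span>'


def render_token(text: str, escape_text: bool = True) -> str:
    """Render one token, optionally escaping its text before templating."""
    if escape_text:
        # html.escape stands in for spaCy's escape_html helper (assumption).
        text = escape(text)
    return TOKEN_TEMPLATE.format(text=text)


unescaped = render_token("<TEST>", escape_text=False)
escaped = render_token("<TEST>")

print(unescaped)  # the raw tag ends up inside the markup
print(escaped)    # the tag is emitted as literal text
```

The point of the sketch is only that escaping has to happen before the text is substituted into the template, which is what the one-line change above is meant to do.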

I ran into this issue while visualizing annotated code documents, some of which contain a literal <span>. Because the tag was not escaped, the document text and the span underlines rendered on top of each other at the beginning of the visualization. Adding escape_html as described above fixed the rendering issues.
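A rough way to see why an unescaped <span> disturbs the layout is to check how an HTML parser reads the output. The sketch below uses only the standard library, not displaCy; `TagCollector` and `start_tags` are hypothetical helpers, and the surrounding `<div>` is an invented stand-in for the renderer's wrapper markup.

```python
from html import escape
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the names of all start tags the parser encounters."""

    def __init__(self) -> None:
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def start_tags(markup: str) -> list:
    """Return the start tags found in a piece of markup, in order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


token_text = "<span>"
raw = f"<div>{token_text}</div>"           # token text inserted unescaped
safe = f"<div>{escape(token_text)}</div>"  # token text escaped first

print(start_tags(raw))   # the token text is parsed as an extra element
print(start_tags(safe))  # only the wrapper element remains
```

In the unescaped case the parser sees an additional, never-closed element, which is consistent with the overlapping text and underlines described above.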

I have a PR ready to fix this that I'll post as soon as I update the test <-> issue links in the code! Posted! 😄

Your Environment

  • spaCy version: 3.6.0
  • Platform: macOS-13.4.1-arm64-arm-64bit
  • Python version: 3.11.2
@svlandeg svlandeg added bug Bugs and behaviour differing from documentation feat / visualizers Feature: Built-in displaCy and other visualizers labels Jul 13, 2023
@svlandeg
Member

PR merged - thanks again!

@github-actions
Contributor

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 13, 2023