Clean up exporter to be in better shape for a release #14

Merged: 6 commits, Dec 7, 2023
39 changes: 25 additions & 14 deletions README.md
@@ -1,19 +1,30 @@
# Astronomy Notebooks for All
Jupyter Notebooks play a central role in modern data science workflows. Despite their importance, these notebooks are inaccessible to people with disabilities, especially those who rely on assistive technology. Impacted users must find extreme workarounds or [give up using them entirely.](https://www.freelists.org/post/program-l/Accessability-of-Jupyter-notebooks) Students with disabilities have [reported leaving their field](https://github.com/jupyterlab/jupyterlab/issues/9399#issuecomment-740524422) once they learn their chosen career’s foundational tools are inaccessible to them.
# `nbconvert-a11y`

This is a challenging problem to solve. The Notebooks for All project is taking the first steps, initially focusing on static notebooks:
- Running usability feedback sessions with impacted users who rely on a variety of assistive technology
- Capturing what makes notebooks inaccessible with assistive technology, and compiling documents that describe the issues and feedback
- Editing notebooks based on the feedback
- Organizing events to spread awareness in the scientific community about this issue
`nbconvert-a11y` contains templates for accessible notebook representations and accessibility tests for Jupyter notebook products.

## Collaborators
[Space Telescope Science Institute](https://www.stsci.edu/) produces extensive community resources and infrastructure in Jupyter. The Institute has committed to fostering an inclusive environment and has funded this project in 2022-2023 as part of the Director’s Discretionary Fund. Other collaborators include community contributions and work from STEM- and accessibility-focused organizations such as [Iota School](https://iotaschool.com/) and [Quansight Labs](https://www.quansight.com/labs).
```bash
pip install nbconvert-a11y
```

## Resources
[A Curated List of STScI notebooks](https://github.com/spacetelescope/notebooks)
[Accessibility Analysis of Jupyter Notebook HTML Output](https://www.youtube.com/watch?v=KsUF_HjA97U&t=253s)
[Astronomy Notebooks For All full proposal](resources/proposal-astronomy-notebooks-for-all.md)
`nbconvert-a11y` can be used with the [`nbconvert` command line tool](https://nbconvert.readthedocs.io/en/latest/usage.html).
it provides the `a11y` exporter with several variants. the default variant uses a flexible table representation.

```bash
jupyter nbconvert --to a11y Untitled.ipynb # flexible table navigation
jupyter nbconvert --to a11y-table Untitled.ipynb # a11y is an alias for a11y-table
jupyter nbconvert --to a11y-landmark Untitled.ipynb # cells are section landmarks
jupyter nbconvert --to a11y-list Untitled.ipynb # cells are list items
```

```python
from nbconvert_a11y.exporter import A11yExporter, SectionExporter, ListExporter
```

## History

the `nbconvert-a11y` project is forked from initial development in the [`notebook-for-all`]() repository.
this collaboration between [Space Telescope Science Institute](https://www.stsci.edu/), [Iota School](https://iotaschool.com/) and [Quansight Labs](https://www.quansight.com/labs)
brought input from blind and visually impaired notebook users as to what their most assistive experiences could be.

## License
This repository hosts mixed content types. Suitable licenses apply to each type. All of the repository except the `[user-tests](user-tests)` directory are under a [3-Clause BSD license](LICENSE). All content in the `[user-tests](user-tests)` directory is under a [CC-BY license](https://creativecommons.org/licenses/by/4.0/).
Licensed under a [3-Clause BSD license](LICENSE).
224 changes: 135 additions & 89 deletions nbconvert_a11y/a11y_exporter.py → nbconvert_a11y/exporter.py
@@ -4,19 +4,20 @@
"""

import builtins
from io import StringIO
import json
from contextlib import suppress
from datetime import datetime
from functools import lru_cache
from pathlib import Path

import bs4
from nbconvert import Exporter
import nbformat.v4
import pygments
from bs4 import BeautifulSoup
from nbconvert.exporters.html import HTMLExporter
from traitlets import Bool, CUnicode, Enum, Unicode
from traitlets.config import Config

singleton = lru_cache(1)

@@ -28,59 +29,6 @@
SCHEMA = nbformat.validator._get_schema_json(nbformat.v4)


def strip_comments(tag):
for child in getattr(tag, "children", ()):
with suppress(AttributeError):
if isinstance(child, bs4.Comment):
child.extract()
strip_comments(child)
return tag


@lru_cache
def get_markdown_renderer():
from markdown_it import MarkdownIt
from mdit_py_plugins.anchors import anchors_plugin

md = MarkdownIt("gfm-like", options_update={"inline_definitions": True, "langPrefix": ""})
md.use(anchors_plugin)
md.options.update(highlight=highlight)
return md


def get_markdown(md, **kwargs):
return get_markdown_renderer().render("".join(md), **kwargs)


def highlight(code, lang="python", attrs=None, experimental=True):
import html

import pygments

if lang == "code":
lang = "python"
elif lang == "raw":
return ""

lang = lang or pygments.lexers.get_lexer_by_name(lang or "python")

formatter = pygments.formatters.get_formatter_by_name(
"html", debug_token_types=True, title=f"{lang} code", wrapcode=True
)
try:
return pygments.highlight(
code, pygments.lexers.get_lexer_by_name(lang or "python"), formatter
)
except BaseException as e:
print(code, e)

return f"""<pre><code>{html.escape(code)}</code></pre>"""


def get_soup(x):
return bs4.BeautifulSoup(x, features="html5lib")


THEMES = {
"a11y": "a11y-{}",
"a11y-high-contrast": "a11y-high-contrast-{}",
@@ -92,31 +40,55 @@ def get_soup(x):
}


class FormExporter(HTMLExporter):
"""an embellished HTMLExporter that allows modifications of exporting and the exported.
class PostProcess(Exporter):
"""an exporter that allows post processing after the templating step

this class introduces the `post_process_html` protocol that can be used to modify
exported html.
"""

the `nbconvert` exporter has a lot machinery for converting notebook data into strings.
this class introduces a `post_process` trait that allows modifications after creating html content.
this method allows tools like `html.parser` and `bs4.BeautifulSoup` to make modifications at the end.
def from_notebook_node(self, nb, resources=None, **kw):
html, resources = super().from_notebook_node(nb, resources, **kw)
html = self.post_process_html(html) or html
return html, resources

changes to the template and exporter machinery are foundational changes that take time.
post modifications make it possible to quick changes in manual testing scenarios or configure
def post_process_code_cell(self, cell):
pass
A/B testing with out requiring `nbconvert` or notebook knowleldge.
def post_process_html(self, body):
...
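
The `post_process_html` protocol depends on cooperative `super()` calls: because `A11yExporter` lists `PostProcess` before `HTMLExporter`, the `super().from_notebook_node` call inside `PostProcess` resolves to the HTML exporter, and the hook then runs on its output. A minimal sketch of that ordering follows. The classes `StubExporter`, `StubHTMLExporter`, `Demo`, and `PassThrough` are hypothetical stand-ins, not the real `nbconvert` classes, and the "notebook" is just a list of strings.

```python
class StubExporter:
    """hypothetical stand-in for nbconvert.Exporter."""

    def from_notebook_node(self, nb, resources=None, **kw):
        return "", resources or {}


class StubHTMLExporter(StubExporter):
    """hypothetical stand-in for HTMLExporter: one paragraph per cell."""

    def from_notebook_node(self, nb, resources=None, **kw):
        body = "".join(f"<p>{cell}</p>" for cell in nb)
        return body, resources or {}


class PostProcess(StubExporter):
    """same shape as the PostProcess class in this diff."""

    def from_notebook_node(self, nb, resources=None, **kw):
        # super() follows the MRO of the concrete class, so this reaches
        # StubHTMLExporter when PostProcess is listed first in the bases.
        html, resources = super().from_notebook_node(nb, resources, **kw)
        # a hook that returns None (or "") leaves the html untouched
        html = self.post_process_html(html) or html
        return html, resources

    def post_process_html(self, body):
        ...


class Demo(PostProcess, StubHTMLExporter):
    def post_process_html(self, body):
        return body.replace("<p>", '<p class="cell">')


class PassThrough(PostProcess, StubHTMLExporter):
    """does not override the hook, so the html comes back unchanged."""


demo_html, _ = Demo().from_notebook_node(["a", "b"])
plain_html, _ = PassThrough().from_notebook_node(["a", "b"])
print(demo_html)
print(plain_html)
```

Swapping the base order to `(StubHTMLExporter, PostProcess)` would skip the hook entirely, which is why the order of bases in `A11yExporter(PostProcess, HTMLExporter)` matters.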


class A11yExporter(PostProcess, HTMLExporter):
"""an accessible reference implementation for computational notebooks implemented for ipynb files.

this template provides a flexible screen reader experience with settings to control and customize the reading experience.
"""

template_file = Unicode("semantic-forms/table.html.j2").tag(config=True)
include_axe = Bool(False).tag(config=True)
axe_url = CUnicode(AXE).tag(config=True)
include_settings = Bool(True).tag(config=True)
include_help = Bool(True).tag(config=True)
include_toc = Bool(True).tag(config=True)
wcag_priority = Enum(["AAA", "AA", "A"], "AA").tag(config=True)
accesskey_navigation = Bool(True).tag(config=True)
include_cell_index = Bool(True).tag(config=True)
template_file = Unicode("a11y/table.html.j2").tag(config=True)
include_axe = Bool(False, help="include axe auditing tools in the rendered page.").tag(
config=True
)
axe_url = CUnicode(AXE, help="the remote source for the axe resources.").tag(config=True)
include_settings = Bool(False, help="include configurable accessibility settings dialog.").tag(
config=True
)
include_help = Bool(
False, help="include help and supplementary descriptions about notebooks and cells"
).tag(config=True)
include_toc = Bool(
True, help="collect a table of contents of the headings in the document"
).tag(config=True)
wcag_priority = Enum(
["AAA", "AA", "A"], "AA", help="the default initial wcag priority to start with"
).tag(config=True)
accesskey_navigation = Bool(
True, help="use numeric accesskeys to access the first 10 cells"
).tag(config=True)
include_cell_index = Bool(
True, help="show the ordinal cell index, which notebooks typically omit."
).tag(config=True)
exclude_anchor_links = Bool(True).tag(config=True)
code_theme = Enum(list(THEMES), "gh-high").tag(config=True)
code_theme = Enum(list(THEMES), "gh-high", help="an accessible pygments dark/light theme").tag(
config=True
)

def __init__(self, *args, **kwargs) -> None:
super().__init__(*args, **kwargs)
@@ -140,6 +112,16 @@ def __init__(self, *args, **kwargs) -> None:
datetime=datetime,
)

@property
def default_config(self):
c = super().default_config
c.merge(
{
"CSSHTMLHeaderPreprocessor": {"enabled": False},
}
)
return c

def from_notebook_node(self, nb, resources=None, **kw):
resources = resources or {}
resources["include_axe"] = self.include_axe
@@ -151,11 +133,10 @@ def from_notebook_node(self, nb, resources=None, **kw):
resources["code_theme"] = THEMES[self.code_theme]
resources["axe_url"] = self.axe_url

html, resources = super().from_notebook_node(nb, resources, **kw)
html = self.post_process_html(html)
return html, resources
return super().from_notebook_node(nb, resources, **kw)

def post_process_html(self, body):
"""a final pass at the exported html to add table of contents, heading links, and other a11y affordances."""
soup = soupify(body)
describe_main(soup)
heading_links(soup)
@@ -167,19 +148,64 @@ def post_process_html(self, body):
details.select_one("ol").attrs["aria-labelledby"] = "nb-toc"
return soup.prettify(formatter="html5")

@property
def default_config(self):
c = super().default_config
c.merge(
{
"CSSHTMLHeaderPreprocessor": {"enabled": False},
}
)
return c

class SectionExporter(A11yExporter):
template_file = Unicode("a11y/section.html.j2").tag(config=True)

class A11yExporter(FormExporter):
template_file = Unicode("a11y/table.html.j2").tag(config=True)

class ListExporter(A11yExporter):
template_file = Unicode("a11y/list.html.j2").tag(config=True)


def strip_comments(tag):
for child in getattr(tag, "children", ()):
with suppress(AttributeError):
if isinstance(child, bs4.Comment):
child.extract()
strip_comments(child)
return tag


@lru_cache
def get_markdown_renderer():
from markdown_it import MarkdownIt
from mdit_py_plugins.anchors import anchors_plugin

md = MarkdownIt("gfm-like", options_update={"inline_definitions": True, "langPrefix": ""})
md.use(anchors_plugin)
md.options.update(highlight=highlight)
return md


def get_markdown(md, **kwargs):
"""export markdown as html"""
return get_markdown_renderer().render("".join(md), **kwargs)


def highlight(code, lang="python", attrs=None, experimental=True):
"""highlight code blocks"""
import html

import pygments

if lang == "code":
lang = "python"
elif lang == "raw":
return ""

lang = lang or "python"  # fall back to the python lexer name when no language is given

formatter = pygments.formatters.get_formatter_by_name(
"html", debug_token_types=True, title=f"{lang} code", wrapcode=True
)
try:
return pygments.highlight(
code, pygments.lexers.get_lexer_by_name(lang or "python"), formatter
)
except BaseException as e:
print(code, e)

return f"""<pre><code>{html.escape(code)}</code></pre>"""
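
The fallback at the end of `highlight` is what keeps a cell readable when no lexer matches. A narrower sketch of the same idea is shown here, under two assumptions that differ from the diff: it catches only `ImportError` and `ValueError` (pygments' `ClassNotFound` is a `ValueError` subclass) instead of `BaseException`, and it uses a plain `HtmlFormatter`. The name `highlight_or_escape` is hypothetical.

```python
import html


def highlight_or_escape(code, lang="python"):
    """highlight with pygments when possible, else return escaped <pre><code>."""
    try:
        import pygments
        from pygments.formatters import HtmlFormatter
        from pygments.lexers import get_lexer_by_name

        lexer = get_lexer_by_name(lang or "python")
        return pygments.highlight(code, lexer, HtmlFormatter(wrapcode=True))
    except (ImportError, ValueError):
        # pygments is missing or the language name is unknown:
        # escape the source so it still renders as literal text
        return f"<pre><code>{html.escape(code)}</code></pre>"


fallback = highlight_or_escape("a < b", lang="no-such-language-xyz")
print(fallback)
```

Escaping in the fallback path matters for accessibility as well as safety: unescaped `<` in source code would otherwise be parsed as markup and vanish from what a screen reader announces.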


def soupify(body: str) -> BeautifulSoup:
@@ -188,6 +214,7 @@ def soupify(body: str) -> BeautifulSoup:


def mdtoc(html):
"""create a table of contents in markdown that will be converted to html"""
import io

toc = io.StringIO()
@@ -207,10 +234,12 @@ def mdtoc(html):


def toc(html):
"""create an html table of contents"""
return get_markdown(mdtoc(html))
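
The body of `mdtoc` is collapsed in this view, so the following is only a sketch of the general approach its docstring describes: collect the headings from rendered html and write them as a nested markdown list that can then be rendered back to html. It uses the standard library `html.parser` instead of `bs4`, and the names `HeadingCollector` and `markdown_toc`, as well as the two-space indent per heading level, are assumptions for illustration.

```python
import io
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """collect (level, id, text) for every h1-h6 element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # [level, id, text] while inside a heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._current = [int(tag[1]), dict(attrs).get("id"), ""]

    def handle_data(self, data):
        if self._current is not None:
            self._current[2] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == f"h{self._current[0]}":
            self.headings.append(tuple(self._current))
            self._current = None


def markdown_toc(html):
    """write the headings as a nested markdown list of links."""
    parser = HeadingCollector()
    parser.feed(html)
    out = io.StringIO()
    for level, id_, text in parser.headings:
        indent = "  " * (level - 1)
        label = text.strip()
        # headings without an id cannot be linked, so emit plain text
        out.write(f"{indent}* [{label}](#{id_})\n" if id_ else f"{indent}* {label}\n")
    return out.getvalue()


toc_md = markdown_toc('<h1 id="a">A</h1><p>x</p><h2 id="b">B <em>c</em></h2>')
print(toc_md)
```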


def heading_links(html):
"""convert headings into links"""
for header in html.select(":is(h1,h2,h3,h4,h5,h6):not([role])"):
id = header.attrs.get("id")
if not id:
@@ -229,19 +258,33 @@ def heading_links(html):
# * navigate landmarks


def count_cell_loc(cell):
lines = 0
for line in StringIO("".join(cell.source)):
if not line:
continue
if line.strip():
lines += 1
return lines


def count_loc(nb):
return sum(map(len, (x.source.splitlines() for x in nb.cells)))
"""count total significant lines of code in the document"""
return sum(map(count_cell_loc, nb.cells))
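
The switch from `len(x.source.splitlines())` to `count_cell_loc` means blank and whitespace-only lines no longer count toward the total. A small sketch comparing the two on a hypothetical two-cell notebook follows; the cells are `SimpleNamespace` objects standing in for `nbformat` nodes, and `count_cell_loc` is written as a compact equivalent of the version in the diff.

```python
from io import StringIO
from types import SimpleNamespace


def count_cell_loc(cell):
    """count the lines in a cell that contain something other than whitespace."""
    return sum(1 for line in StringIO("".join(cell.source)) if line.strip())


def count_loc(nb):
    """count total significant lines of code in the document"""
    return sum(map(count_cell_loc, nb.cells))


# hypothetical notebook: source may be a string or a list of strings
nb = SimpleNamespace(
    cells=[
        SimpleNamespace(source="import os\n\n\nprint(os.getcwd())\n"),
        SimpleNamespace(source=["x = 1\n", "   \n", "y = 2"]),
    ]
)

# the previous approach counted every line, blank or not
old_total = sum(len("".join(c.source).splitlines()) for c in nb.cells)
new_total = count_loc(nb)
print(old_total, new_total)
```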


def count_outputs(nb):
"""count total number of cell outputs"""
return sum(map(len, (x.get("outputs", "") for x in nb.cells)))


def count_code_cells(nb):
"""count total number of code cells"""
return len([None for x in nb.cells if x["cell_type"] == "code"])


def describe_main(soup):
"""add REFIDs to aria-describedby"""
x = soup.select_one("#toc > details > summary")
if x:
x.attrs["aria-describedby"] = soup.select_one("main").attrs[
@@ -250,9 +293,12 @@ def describe_main(soup):


def ordered(nb) -> str:
"""measure if the notebook is ordered"""
start = 0
for cell in nb.cells:
if cell["cell_type"] == "code":
if any("".join(cell.source).strip()):
continue
start += 1
if start != cell["execution_count"] and start:
return "executed out of order"
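
The rest of `ordered` is collapsed in this view. As a simplified sketch of the check, assuming the intent is to ignore code cells with empty source and to compare each remaining cell's `execution_count` against its position among executed cells: the version here returns a boolean rather than the descriptive strings used in `ordered`, and the `is_ordered` and `code` names and the dict-based cells are illustrative only.

```python
def is_ordered(cells):
    """True when non-empty code cells have execution counts 1, 2, 3, ... in order."""
    expected = 0
    for cell in cells:
        if cell["cell_type"] != "code":
            continue
        if not "".join(cell["source"]).strip():
            # assumption: empty cells carry no meaningful execution count
            continue
        expected += 1
        if cell["execution_count"] != expected:
            return False
    return True


def code(source, count):
    """build a hypothetical code cell."""
    return {"cell_type": "code", "source": source, "execution_count": count}


in_order = [
    code("a = 1", 1),
    {"cell_type": "markdown", "source": "# title"},
    code("", None),
    code("a + 1", 2),
]
out_of_order = [code("a = 1", 2), code("a + 1", 1)]
print(is_ordered(in_order), is_ordered(out_of_order))
```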
5 changes: 4 additions & 1 deletion pyproject.toml
@@ -66,7 +66,10 @@ build = "mkdocs build -v"
serve = "mkdocs serve -v"

[project.entry-points."nbconvert.exporters"]
a11y = "nbconvert_a11y.a11y_exporter:A11yExporter"
a11y = "nbconvert_a11y.exporter:A11yExporter"
a11y-table = "nbconvert_a11y.exporter:A11yExporter"
a11y-landmark = "nbconvert_a11y.exporter:SectionExporter"
a11y-list = "nbconvert_a11y.exporter:ListExporter"

[project.entry-points.pytest11]
axe = "nbconvert_a11y.pytest_axe"
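
These entry points are how `jupyter nbconvert --to a11y-list` resolves a name to `ListExporter` once the package is installed. A small sketch for listing what is registered under the `nbconvert.exporters` group with the standard library follows. The `exporter_entry_points` helper name is hypothetical, and the `a11y*` names only appear when `nbconvert-a11y` is installed in the current environment.

```python
from importlib.metadata import entry_points


def exporter_entry_points(group="nbconvert.exporters"):
    """map each registered exporter name to its 'module:Class' target."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns a selectable collection
        selected = eps.select(group=group)
    else:
        # older versions return a dict keyed by group name
        selected = eps.get(group, ())
    return {ep.name: ep.value for ep in selected}


registered = exporter_entry_points()
for name in sorted(registered):
    if name.startswith("a11y"):
        print(name, "->", registered[name])
```

With this PR installed, both `a11y` and `a11y-table` would be expected to point at `nbconvert_a11y.exporter:A11yExporter`, matching the alias noted in the README.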
2 changes: 2 additions & 0 deletions tests/configurations/a11y.py
@@ -2,4 +2,6 @@

c.NbConvertApp.export_format = "a11y"
c.A11yExporter.include_axe = True
c.A11yExporter.include_settings = True
c.A11yExporter.include_help = True
c.A11yExporter.wcag_priority = "AAA"