Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docs blocks (#810) #888

Merged
merged 8 commits into from
Aug 2, 2018
Merged

Docs blocks (#810) #888

merged 8 commits into from
Aug 2, 2018

Conversation

beckjake
Copy link
Contributor

Add support for docs blocks and support for a doc() Jinja macro that references them.

@beckjake beckjake requested a review from drewbanin July 31, 2018 19:57
Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really love this. Minimal comments after my first pass through the code -- everything in here looks super good. I'll have some more questions I think after playing around with this in concert with schema yml v2

def find_docs_by_name(self, name, package=None):
for unique_id, doc in self.docs.items():
parts = unique_id.split('.')
if len(parts) != 2:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

am i crazy or should parts have 3 items here? Something like docs.project_name.docs_name?

class DocumentationParser(BaseParser):
@classmethod
def load_file(cls, package_name, root_dir, relative_dirs):
"""Load and parse documentation in a lsit of projects. Returns a list
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lil typo here


# because docs are in their own graph namespace, node type doesn't
# need to be part of the unique ID.
unique_id = '{}.{}'.format(docfile.package_name, name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah! I see. Is there any benefit to keeping the format of these unique ids consistent with the rest of dbt's unique ids?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. In the nodes and macros dicts, there are potentially multiple node types so the .-separated namespacing by type makes a lot of sense. In the docs dict (and, I hope, all future manifest members) there's only one type. I would have removed resource_type from the docs entirely if I could have. Docs can only be referenced via doc() and created via {% docs ... %}, and nothing else can be referenced that way, so you never have any ambiguity (like you might with seeds vs models and ref).

That said, if there are strong feelings about consistency, it's a pretty easy change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I buy that

return manifest.find_docs_by_name(target_doc_name,
target_doc_package)

candidate_targets = [current_project, node_package, None]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is phenomenal

dbt/loader.py Outdated
root_project=root_project,
all_projects=all_projects,
root_dir=project.get('project-root'),
relative_dirs=project.get('source-paths', []))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@beckjake this code looks for the docs files in the source-paths directory. I think that's a pretty reasonable default. Do you think there's any merit to making an (optional) docs-paths dir as well? I can imagine this would be useful for docs that don't pertain to a specific model, eg. docs for widely-used columns, or the "overview" markdown block, for instance.

cc @cmcarthur

Copy link
Contributor Author

@beckjake beckjake Aug 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I didn't even consider that. I just immediately put it next to the schema.yml. But it does make some sense.

If we did so maybe we should consider something like: relative_dirs=project.get('source-paths', []) + project.get('docs-paths', [])

I actually also wouldn't be against also searching from the root directory, if that's feasible with dbt (maybe relative_dirs + ['.'] would work?).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@beckjake I was thinking the same thing around searching the root directory! I'm a little concerned thought that a random .md file could contain code that would break dbt.

Just as an example, a styleguide.md file (like this might contain {{ ref(...) }} in example sql, and I think the docs parser would probably choke on that, right?

After some testing, we do a recursive search through the dirs, so including . in the list of paths will also snag the target/ directory, which is probably not what we want here.

I think the right way to handle this is to just use docs-paths, where the default value is:

"docs-paths": ["models"]

This will let users override the dirs in their dbt_project.yml file, but will still be sort of a sane default value. You buy that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly agree, but I think we should default docs-paths to the contents of source-paths, rather than ["models"], if we can. It's a subtle difference, but the idea of "by default we search your source paths" seems really compelling to me.

Copy link
Contributor

@drewbanin drewbanin Aug 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, I buy that. Let's make sure to document that functionality in the docs for the docs :)

@@ -403,11 +407,16 @@ def parse_model(cls, model, package_name, root_dir, path, root_project,
all_projects, macros)
yield 'test', node

context = {'doc': dbt.context.parser.docs(model, docrefs)}
description = model.get('description')
Copy link
Contributor

@drewbanin drewbanin Aug 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like if the model doesn't have a description, we pass None into a ParsedNodePatch below, resulting in a validation error. Can we use an empty string default here?

  File "/Users/drew/fishtown/dbt/dbt/parser/schemas.py", line 448, in load_and_parse
    for result_type, node in v2_results:
  File "/Users/drew/fishtown/dbt/dbt/parser/schemas.py", line 364, in parse_v2_yml
    for node_type, node in iterator:
  File "/Users/drew/fishtown/dbt/dbt/parser/schemas.py", line 416, in parse_model
    patch = ParsedNodePatch(
  File "/Users/drew/fishtown/dbt/dbt/api/object.py", line 37, in __init__
    self.validate()
  File "/Users/drew/fishtown/dbt/dbt/api/object.py", line 85, in validate
    raise ValidationException(msg)
dbt.exceptions.ValidationException: Runtime Error
  Invalid arguments passed to "ParsedNodePatch" instance: description.None is not of type 'string'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes - I thought I got all of these, I'll see if I left any more.

Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

@beckjake beckjake merged commit 4a9e3ee into dev/isaac-asimov Aug 2, 2018
@beckjake beckjake deleted the docs-blocks branch August 6, 2018 12:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants