Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustdoc: "Namespace" user-written Markdown headings #91759

Open
camelid opened this issue Dec 10, 2021 · 31 comments
Open

rustdoc: "Namespace" user-written Markdown headings #91759

camelid opened this issue Dec 10, 2021 · 31 comments
Assignees
Labels
A-rustdoc-ui Area: Rustdoc UI (generated HTML) C-enhancement Category: An issue proposing an enhancement or a PR with one. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.

Comments

@camelid
Copy link
Member

camelid commented Dec 10, 2021

Tracking issue for @jsha's idea mentioned here. What follows is @jsha's summary of the idea:

When markdown like # Examples is processed, it usually turns into something like <a href="#examples" id="examples">. This is useful so you can click on the heading and get a link that will take someone else to that precise part of the docs.

Since the markdown in rustdoc is user-generated, those anchor ids may conflict with Rustdoc's own anchor ids. They may also conflict with other markdown sections within the same doc page. For instance, see:

https://doc.rust-lang.org/nightly/std/string/struct.String.html#examples
https://doc.rust-lang.org/nightly/std/string/struct.String.html#examples-1
https://doc.rust-lang.org/nightly/std/string/struct.String.html#examples-2

Right now we disambiguate these ids by added a number at the end. However, it would be better to disambiguate them by namespacing. Specifically, each time we render markdown we should provide a "prefix", and all IDs in the generated HTML should start with that prefix. In general a convenient and sensible choice for this prefix would be the id of the immediately preceding heading. So the examples linked above might become #top.examples, #method.new.examples, and #method.from_utf8.examples.

This has three advantages:

  • It systematically removes most of the cases of id conflict.
  • It makes anchor links more meaningful when someone reads the URL.
  • It makes anchor links stable across revisions.

This is a 99% solution, not a 100% one. Users can author HTML directly in their markdown, for instance <div id="foo">. But we are okay with letting the conflicts happen in those rare cases.

@camelid camelid added T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. C-enhancement Category: An issue proposing an enhancement or a PR with one. A-rustdoc-ui Area: Rustdoc UI (generated HTML) labels Dec 10, 2021
@GuillaumeGomez
Copy link
Member

I'll work on this.

@camelid
Copy link
Member Author

camelid commented Dec 17, 2021

@rfcbot fcp merge

Overall, I'm in favor of this idea; however, I'd still like to have a discussion about breakage before we merge in.

See #92043 (comment) for my concerns.

@rfcbot
Copy link

rfcbot commented Dec 17, 2021

Team member @camelid has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Dec 17, 2021
@camelid
Copy link
Member Author

camelid commented Dec 17, 2021

@rfcbot concern breakage

I'd like to discuss this and for us to all be on the same page before this is merged.

@camelid
Copy link
Member Author

camelid commented Dec 17, 2021

@rfcbot concern impls

I think we need to namespace by impl too.

For example, the primitive.pointer.html page has impls for both *const T and *mut T. The methods on each are currently disambiguated using the number suffixes (cc @scottmcm who ran into issues with this), but after this change, I think they would conflict. IIUC, the current implementation in #92043 only namespaces by the parent item (e.g., method name), but what about when there are two impls on a page with the same method names?

Note that this occurs for some user types too, not just the pointer page.

@jsha
Copy link
Contributor

jsha commented Dec 17, 2021

Good point. Here's a concrete example:

https://doc.rust-lang.org/nightly/std/any/trait.Any.html#examples-7
https://doc.rust-lang.org/nightly/std/any/trait.Any.html#examples-2

Those are the examples section for <dyn Any + 'static>::downcast_ref and <dyn Any + Send + 'static>::downcast_ref, respectively.

The links to those impls themselves have the same "incrementing number" problem: https://doc.rust-lang.org/nightly/std/any/trait.Any.html#impl and https://doc.rust-lang.org/nightly/std/any/trait.Any.html#impl-1. So it would be good to fix that at the same time we fix this.

For instance, we might make the link to the first impl #impl.dyn-any-static, and the second #impl.dyn-any-send-static. Then the examples sections could be #impl.dyn-any-static.method.downcast_ref.examples or similar.

Note: there is prior art for certain impls getting an anchor that reflects their type parameters. For instance see https://doc.rust-lang.org/nightly/std/string/struct.String.html#impl-Add%3C%26%27_%20str%3E.

This happens in the "Implementations" section of a struct page. It doesn't happen in the "Implementors" section of a trait page.

Note that the String anchor linked above is over-encoded. If we wind up tweaking generation of impl targets I'd like something with the right level of encoding. See this demo.

<a id="impl-Add<&'_ str>" href="#impl-Add%3C&'_ str%3E">target</a>

@camelid
Copy link
Member Author

camelid commented Dec 17, 2021

I agree with what you say, but I mean that headings within methods within impls need to be namespaced like #impl-Add.method.add.examples. My understanding (I may be incorrect) of Guillaume's current PR is that it would create the fragment #method.add.examples, which would conflict with other impls of Add.

@camelid camelid changed the title rustdoc: "Namespace" user-written Markdown headers rustdoc: "Namespace" user-written Markdown headings Dec 17, 2021
@jsha
Copy link
Contributor

jsha commented Dec 17, 2021

Yep, we're on the same page - I was intending to agree and expand. In order to fix the conflict between same-named methods, we need to namespace by impl plus method name; that in turn brings up the parallel issue that id assignment for impls has a very similar problem, and we should fix both at once.

@GuillaumeGomez
Copy link
Member

Both of what you suggested (keeping the whole "path" and changing the impls ID) will greatly increase the size of the generated IDs and therefore the size of the generated output. I don't think it's worth it to keep the whole path, however changing the impls ID sounds like a good idea.

@camelid
Copy link
Member Author

camelid commented Dec 17, 2021

We need to keep the whole path or we'll introduce ID conflicts.

@GuillaumeGomez
Copy link
Member

No, it's handled by the IdMap just like currently.

@camelid
Copy link
Member Author

camelid commented Dec 17, 2021

But then we have the same issue of the IDs not being stable. I think we should make our IDs simpler and "more correct" by ensuring they are truly unanmbiguous (or reduce ambiguity as much as possible). Server–browser compression will probably mitigate the page size cost, right?

@GuillaumeGomez
Copy link
Member

As I understood, the main goal was to not have conflicts with rustdoc's IDs. With the current implementation it's already quite limited.

@jsha
Copy link
Contributor

jsha commented Dec 17, 2021

Both of what you suggested (keeping the whole "path" and changing the impls ID) will greatly increase the size of the generated IDs and therefore the size of the generated output.

One possible size optimization: for types with a single impl block, omit the impl prefix and just do #method.foo.examples.

By the way, a minor problem with my proposal above about generating better ids for impl blocks: a type can have multiple impl blocks with the same type parameters (or no type parameters), so we would still need some form of collision-renaming like we have today.

@camelid
Copy link
Member Author

camelid commented Dec 17, 2021

One possible size optimization: for types with a single impl block, omit the impl prefix and just do #method.foo.examples.

Then adding an impl will still be cause link breakage. The page size savings don't seem worth it to me. IMO it'd be better for rustdoc's rules to be simple and correct.

By the way, a minor problem with my proposal above about generating better ids for impl blocks: a type can have multiple impl blocks with the same type parameters (or no type parameters), so we would still need some form of collision-renaming like we have today.

Note that that's only a problem for the impl headings themselves since the methods within must have unique names or rustc will get upset about naming conflicts.


I think this has gotten a bit off-topic, sorry about that! The goal of this proposal is to prevent conflicts and reduce instability for user-written Markdown headings.

I think we should start another proposal to reduce instability of rustdoc-generated headings (impls and associated items).

@camelid
Copy link
Member Author

camelid commented Dec 17, 2021

Ok, I've opened #92052. I haven't started FCP there yet because I think we should finish with this change first to avoid too discussing too many changes at once.

@rfcbot resolve impls

I think the current approach should probably be good enough. I think it should be overall an improvement.

@GuillaumeGomez
Copy link
Member

@rfcbot approve

@camelid
Copy link
Member Author

camelid commented Jan 6, 2022

@Nemo157 could you review this when you get a chance? If you have any concerns, I'd be happy to discuss them here.

@camelid
Copy link
Member Author

camelid commented Jan 10, 2022

Thanks Nemo! The only thing blocking this now is my placeholder "breakage" concern. Is everyone okay with the breakage this change will cause?

@rfcbot concern anchor formatting

The one other concern I might propose is the choice of formatting for the anchor prefixes. Should they (roughly) match their parent, e.g., #method.foo.my-anchor-here? Or should they be just based on namespaces, e.g., #v.foo.my-anchor-here? Or something else? We could always change this later, perhaps with a JS-powered anchor redirect for old anchors, but it'd be best to avoid doing that if possible.

@GuillaumeGomez
Copy link
Member

What is the v in v.foo...? But otherwise, either way is fine to me. The second has the advantage to be smaller though.

@Manishearth
Copy link
Member

Manishearth commented Jan 11, 2022

value, since rust has value, type, and macro namespaces. But the namespaces don't matter here, for associated stuff it's all the same namespace, I think.

Edit: nope, you can class associated items. But also, you can clash fields and methods and associated types too; I suspect we should use the expanded name since either way this isn't following the regular import namespace bucketing.

@jsha
Copy link
Contributor

jsha commented Jan 11, 2022

Should they (roughly) match their parent, e.g., #method.foo.my-anchor-here

This is my preferred approach.

@GuillaumeGomez
Copy link
Member

Let's go for the expanded name then? It's currently how it's implemented in any case so less work for me. :3

@scottmcm
Copy link
Member

Edit: nope, you can class associated items.

Yeah, there's a couple weird cases here -- see #76347 for example.

@camelid
Copy link
Member Author

camelid commented Jan 13, 2022

@rfcbot resolve anchor formatting

Seems good for now, we can probably figure out a way to change it later if necessary.

@camelid
Copy link
Member Author

camelid commented Jan 13, 2022

@rfcbot resolve breakage

No one's raised concerns about the breakage (this concern was just a placeholder), so I'll resolve this concern. The links broke very easily before anyway.

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Jan 13, 2022
@rfcbot
Copy link

rfcbot commented Jan 13, 2022

🔔 This is now entering its final comment period, as per the review above. 🔔

@jsha
Copy link
Contributor

jsha commented Jan 22, 2022

@rfcbot concern simplify

I'd like to propose a slightly simpler alternative: We could not to generate anchors for most user-generated markdown.

The headings in the top-doc are usually substantive, and worth linking to on their own, so we could namespace those. For instance https://doc.rust-lang.org/nightly/std/string/struct.String.html#utf-8 would become https://doc.rust-lang.org/nightly/std/string/struct.String.html#top-doc.utf-8.

But for other places, like https://doc.rust-lang.org/nightly/std/string/struct.String.html#examples-23, users could just as well link to the parent item, https://doc.rust-lang.org/nightly/std/string/struct.String.html#method.as_bytes. For most methods and other items outside the top-doc, having links to specific user sub-headings may not be worth the complexity we add by trying to namespace all of them.

Rustdoc currently never generates internal links to user sub-headings, so these anchors are purely an external-facing interface. Right now that interface is mostly broken - the numbered anchors are very subject to breakage. Rather than fixing that interface, maybe we should narrow it.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Jan 22, 2022
@camelid
Copy link
Member Author

camelid commented Jan 22, 2022

Hmm, seems interesting. I don't feel super strongly either way. What do other people think?

It could potentially simplify the UI as well since we'd no longer have to make non-top headings into links.

@GuillaumeGomez
Copy link
Member

This is an interesting take and would simplify rustdoc code a lot. I think it's a good idea so I'm favour for it but not feeling strongly about it either.

@camelid
Copy link
Member Author

camelid commented Jan 23, 2022

If we do end up doing this, I think we should still keep anchors for headings in impl docs. Just a couple days ago I used those to split up some docs into sections.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rustdoc-ui Area: Rustdoc UI (generated HTML) C-enhancement Category: An issue proposing an enhancement or a PR with one. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants