Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HTTP Gateway Specs #283

Merged
merged 26 commits into from
Jul 1, 2022
Merged

Add HTTP Gateway Specs #283

merged 26 commits into from
Jul 1, 2022

Conversation

lidel
Copy link
Member

@lidel lidel commented Jun 3, 2022

This PR adds initial HTTP gateway specs under ./http-gateways directory.

👀 PREVIEW

💡 Why we need this

The goal of this work is to document the current behavior (based on implementation in go-ipfs 0.13-rc1).

This is a prerequisite that will allow us to switch future gateway improvements to be specs-driven:

  • IPFS stewards and community members will be able to propose improvements as a diff against the specs
  • Different implementations will be able to adopt changes without the need to follow and reimplement whatever go-ipfs does

🔎 Needs more eyes!

I did my best to document the most important aspects of the current state, but there is a lot to cover. If you see something is missing, or needs rephrasing (i am not a native speaker, and bad writer in general), please comment on this PR.

All feedback will be apprecieted

  • 👉 is something unclear or should be expanded? please comment inline on the specific section
  • 👉 is something missing entirely? please comment below
  • 💡 got an idea how to re-organize content to present this better? please comment below too!

Next

  • submit PR draft
  • sleep
  • add CODEOWNER
  • ping potential stakeholders for feedback
  • fill gaps identified during review
    • add Content-Range in range responses
    • subdomain gw: add ?uri= for registerProtocolHandler
    • path: add CORS section to Appendix
      • we can clarify this in separate PR if needed
    • subdomain: add TLS notes about wildcards
    • dir-index-html: note that dir listing should only fetch root node and if Size column is needed for UnixFS, it should be Tsize from logical format + hint more details for child items in a dirr could be lazy-loaded via ?format=dag-json and ?format=raw
  • (meantime) move TODO blocks into separate PRs, prototype and document the light RFC process
  • merge after 2-3 weeks of addressing feedback

This adds gateway specs under ./http-gateways directory.

The aim is to document _current_ behavior (implementation in go-ipfs 0.13)
and switch the way we do the gateway work to be specs-driven.

Long term goal is to provide language and implementation agnostic
specification that anyone can use to implement compatible gateways.
@lidel lidel self-assigned this Jun 3, 2022
Comment on lines 59 to 61
Defines the DNSLink name to resolve into `/ipfs/{cid}/` prefix that should be
prepended to the `path` before the final IPFS content path resolution is
performed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we define that:

  • a dnslink that points to a /ipns path must be resolved to an immutable path first
  • a dnslink with a path component should be resolved to it's target CID

...path before appending the request path

Copy link
Member Author

@lidel lidel Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@olizilla I've added this + note about recursion limit + an example in 7be0611 – lmk if that covers all the bases

Comment on lines +324 to +326
- `Cache-Control: no-cache, no-transform` should be returned with
`application/vnd.ipld.car` responses if the block order in CAR stream is not
guaranteed to be deterministic.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth adding an informative note about this topic.

if the block order in CAR stream is not guaranteed to be deterministic

I don't really know how to parse this. By deterministic block order do we mean:

  • "block order can be recreated identically by the current implementation on a subsequent request" OR
  • "block order conforms to some canonical order defined by this other spec, see here, it's the best order, so feel free to caches these ones."

perhaps canonical order is the thing this is aiming at?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this specific spec cares about behavior on subsequent HTTP request.

This is one of the weaker parts of the spec:

  • added this paragraph as quick placeholder (second leg is Accept-Ranges (response header) which has the same caveat about block ordering)
  • afaik go-ipfs 0.13:
    • streams CAR in deterministic order because we walk the DAG the same time each time, but it is not documented anywhere, not it is a canonical thing (afaik)
    • streaming CAR means we don't know its total size without creating it in-memory first → can't set Content-Length correctly → range requests are not possible
      • lack of range requests over CAR could be a bug, or could be a feature: by not allowing range requests we could encourage use of IPLD selectors (for partial requests, resuming partial DAG downloads etc).

Anyway, if we had a spec for canonical order in CARs, mentioning it here would make things MUCH easier for implementers.

I think @alanshaw @hugomrdias worked on something like this? (can't recall the details..)

@lidel lidel mentioned this pull request Jun 7, 2022
18 tasks
Copy link
Member

@olizilla olizilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇 great work! A foundation for the future of IPFS <-> HTTP bridging 🌈✨🌐

Copy link
Member

@vasco-santos vasco-santos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great @lidel . Thanks for putting it together, it is super useful for everyone writing IPFS Gateways or building on top

http-gateways/PATH_GATEWAY.md Outdated Show resolved Hide resolved
http-gateways/PATH_GATEWAY.md Outdated Show resolved Hide resolved
resolving content paths that start with `/ipfs/` and `/ipns/`. It allows for
traversal based on link names, which provides a better user experience than
low level logical pathing from IPLD:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we recommend similar behaviour for case where given path is not valid?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vasco-santos mind elaborating?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not being clear. For instance, if I provide /ipfs/bafy...foo/this/path/does/not/exist, this path won't exist. My suggestion here would be if we should recommend a response as part of the spec so that each gateway send same response back.

I would love to have a better response then what we get as https://dweb.link/ipfs/bafkreidyeivj7adnnac6ljvzj2e3rd5xdw3revw4da7mx2ckrstapoupoq/test

Copy link
Member Author

@lidel lidel Jun 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vasco-santos added "Handling traversal errors" section here – lmk what additional info should be added in that section

http-gateways/README.md Show resolved Hide resolved
http-gateways/SUBDOMAIN_GATEWAY.md Outdated Show resolved Hide resolved
http-gateways/PATH_GATEWAY.md Show resolved Hide resolved
http-gateways/PATH_GATEWAY.md Outdated Show resolved Hide resolved
http-gateways/PATH_GATEWAY.md Outdated Show resolved Hide resolved
http-gateways/DNSLINK_GATEWAY.md Outdated Show resolved Hide resolved
http-gateways/DNSLINK_GATEWAY.md Show resolved Hide resolved
http-gateways/README.md Show resolved Hide resolved
http-gateways/SUBDOMAIN_GATEWAY.md Outdated Show resolved Hide resolved
http-gateways/SUBDOMAIN_GATEWAY.md Outdated Show resolved Hide resolved
http-gateways/SUBDOMAIN_GATEWAY.md Outdated Show resolved Hide resolved
http-gateways/SUBDOMAIN_GATEWAY.md Outdated Show resolved Hide resolved
http-gateways/SUBDOMAIN_GATEWAY.md Show resolved Hide resolved
http-gateways/PATH_GATEWAY.md Show resolved Hide resolved
- Every `.` is replaced with `-`
- DNSLink label decoding
- Every standalone `-` is replaced with `.`
- Every remaining `--` is replaced with `-`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this account for ---?

Copy link
Member Author

@lidel lidel Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Odd number of --- here does not look like a valid encoding:

  • /ipns/foo-bar.example.comfoo--bar-example-com.ipns.dweb.link
  • /ipns/foo--bar.example.comfoo----bar-example-com.ipns.dweb.link
  • /ipns/foo---bar.example.comfoo------bar-example-com.ipns.dweb.link

Is someone requests foo---bar-example-com.ipns.dweb.link the first -- will be turned into single - producing foo--bar.example.com. Content will load only if such domain name exists and DNSLink exists.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are foo-.bar.example.com or foo.-bar.example.com valid DNS domains? Those would both translate to foo---bar.example.com, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, - is not allowed at the beginning, nor end of a DNS label:

The labels must follow the rules for ARPANET host names. They must start with a letter, end with a letter or digit, and have as interior characters only letters, digits, and hyphen. There are also some restrictions on the length. Labels must be 63 characters or less.
https://datatracker.ietf.org/doc/html/rfc1035#section-2.3.1

@lidel
Copy link
Member Author

lidel commented Jun 28, 2022

PSA: I will merge this PR by the end of the week.

Feedback from key stakeholders was positive, and things added in the past few days were mostly clarifications,
which could be separate PRs that don't require IPIP.

@lidel lidel mentioned this pull request Jun 29, 2022
1 task
@lidel
Copy link
Member Author

lidel commented Jul 1, 2022

Thank you all for feedback, it made it way better than the initial version. ❤️
I am merging it.

If you were late to review this, please submit fixes or improvement proposals in new PRs.

PRs with new features such as new response types, new headers should include an Improvement Proposal document (template in ipfs/kubo#289) that explains motivation behind proposed change. It will make it easier to discuss / review. 🙏

@lidel lidel merged commit 0188a6d into main Jul 1, 2022
@lidel lidel deleted the feat/gateway-specs branch July 1, 2022 21:55
@lidel lidel restored the feat/gateway-specs branch July 1, 2022 21:56
lidel added a commit that referenced this pull request Jul 1, 2022
@lidel lidel deleted the feat/gateway-specs branch July 1, 2022 22:02
@lidel lidel restored the feat/gateway-specs branch July 2, 2022 15:04
@lidel lidel deleted the feat/gateway-specs branch July 2, 2022 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need/community-input Needs input from the wider community
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

9 participants