IPIP-322: Content Routing Hints #322
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
# Introduction | ||
|
||
Content routing maps a CID to one or more *providers*, which specify locations where the CID can be fetched. | ||
|
||
In order to fetch content-addressed data, there must be *some* location addressing involved. With IPFS, the implicit default starting point is a set of bootstrap nodes. (And perhaps some LAN nodes discovered by mDNS, which has a starting point of the local subnet.) | ||
|
||
So far, Kubo has planned to keep this “location addressing” implicit by adding new content routers to the default Kubo config (e.g. Filecoin indexers). But this only solves the problem for whatever specific records are provided by that indexer, and that set of implicit content routers has to be supported by the various implementations to maintain the facade of “pure content addressing”. There are also trust issues in automatically sending user data to indexers that users have not explicitly trusted. | |
|
||
Instead of gateways and IPFS nodes implicitly sending all requests to a set of content routers that changes over time, and the community needing to reach consensus on which default routers to use, this proposes specifying that the default implicit content router is *only* the IPFS public DHT and LAN DHT, and all additional content routers must be opted-in by users when making API requests. | |
|
||
# Specification | ||
|
||
The default implicit content router for IPFS nodes is the IPFS public DHT and LAN DHT. Any additional content routers must be opted-in by users when making API requests. | ||
|
||
Users may opt-in to additional content routers using “content routing hints”, which give *suggestions* to the IPFS node about where provider records for the given CID may be found. This can include, but is not limited to, Reframe URLs, pubsub topics, multiaddrs, etc. As hints, the IPFS node is free to decide the order and strategy for using hints. If an IPFS node implements support for a hint that is specified below, it must follow the specification for that hint type. | ||
Review comment: Allowing for content routing hints seems fine, but IMO doing this comes with some items that may need addressing:
|
||
|
||
When a node receives a request with content routing hints, it should search for provider records in the IPFS public DHT and at locations specified in the hints. | ||
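The lookup behaviour described above can be sketched as follows. This is a minimal illustration, not part of the specification: the `Router` type, the `find_providers` name, and the sequential query order are all assumptions made for the example, and a real node is free to choose a different order or strategy.

```python
from typing import Callable, Iterable, List

# Hypothetical router interface: takes a CID string, yields provider addresses.
Router = Callable[[str], Iterable[str]]


def find_providers(cid: str,
                   default_routers: List[Router],
                   hint_routers: List[Router]) -> List[str]:
    """Query the default routers (e.g. the DHT) and every hinted router.

    Hints are only suggestions, so a failing hinted router is skipped
    instead of failing the whole lookup. Duplicate providers are dropped
    while keeping first-seen order.
    """
    seen = set()
    providers: List[str] = []
    for router in [*default_routers, *hint_routers]:
        try:
            results = list(router(cid))
        except Exception:
            # An unreachable or misbehaving router must not break the lookup.
            continue
        for provider in results:
            if provider not in seen:
                seen.add(provider)
                providers.append(provider)
    return providers
```

A production implementation would likely query routers concurrently and stop early once enough providers are found; the sequential loop only keeps the sketch short.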
|
||
## Hint Types | ||
|
||
Implementations are free to support hint types that make sense for their use cases. | ||
|
||
### URI | ||
|
||
- **Reframe** | ||
- HTTPS URL that ends with `/reframe` MUST be interpreted as a Reframe hint, for example: | ||
Review comment: How do we specify different codecs like
Reply: That is up to HTTP client (who controls I think details of
We probably want to register
(and looping in libp2p team for a sanity check on whether this is the correct way of representing this) |
||
- [`https://cid.contact/reframe`](https://cid.contact/reframe) | ||
- [`https://routing.delegate.ipfs.io/reframe`](https://routing.delegate.ipfs.io/reframe) | ||
- **Magnet links (TBD, for consideration)** | ||
- “De facto standard” outside IPFS: [https://en.wikipedia.org/wiki/Magnet_URI_scheme](https://en.wikipedia.org/wiki/Magnet_URI_scheme) | ||
- **HTTP mirror AKA Web seed (TBD, for consideration)** | ||
- We could speed up data transfer of leaf nodes by making HTTP range-requests to the provided HTTP URL. | ||
- Bit out there, but could provide additional flexibility, especially when a URL of a public gateway is used. | ||
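As an illustration of the URI rules listed above, the following sketch classifies a URI hint. Only the Reframe rule (an HTTPS URL ending in `/reframe`) comes from the text; the function name, the returned labels, and the handling of the two "TBD" hint types are assumptions made for the example.

```python
from urllib.parse import urlsplit


def classify_uri_hint(hint: str) -> str:
    """Return a label for a URI-style routing hint.

    "reframe"     - HTTPS URL ending with /reframe (the rule stated in the text)
    "magnet"      - magnet: URI (TBD in the proposal, label is an assumption)
    "http-mirror" - any other http(s) URL (TBD in the proposal, assumption)
    "unknown"     - anything else; a node may simply ignore it
    """
    scheme = urlsplit(hint).scheme.lower()
    if scheme == "https" and hint.endswith("/reframe"):
        return "reframe"
    if scheme == "magnet":
        return "magnet"
    if scheme in ("http", "https"):
        return "http-mirror"
    return "unknown"
```

Because hints are suggestions, returning `"unknown"` instead of raising keeps unsupported hint types harmless for implementations that do not understand them.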
|
||
### Multiaddr (instant win) | ||
|
||
Multiaddrs alone provide a very flexible solution for routing hints. They enable control over the number of additional lookups that a client needs to make to reach the data: | ||
|
||
- `/ip4/A.B.C.D/tcp/NNN/p2p/{peerID}` | ||
- Removes need for any lookups, can try to connect directly and start data transfer | ||
- `/p2p/{PeerID}` | ||
- Saves 1 DHT lookup - we already know the potential provider’s PeerID and only need to find their addresses via `findpeer` (or similar) | |
- On gateways this is highly cacheable | ||
- `/dnsaddr/{domain}` | ||
- Requires resolving [DNSAddr TXT records](https://github.com/multiformats/multiaddr/blob/master/protocols/DNSADDR.md) via DNS, but allows big content storage services to scale and load-balance with ease, and to leverage DNS-based delegation to the closest nodes that have the data | |
- Could include fully resolved addresses, PeerIDs, or other DNSAddrs (with some sane recursion limit, which could be the same as for resolving /ipns/ paths: 32) | |
- Allows us to collapse a lot of complexity into a single DNS-based hint | ||
- 💡IDEA: we could implicitly check for DNSaddr on domains that have DNSLink | ||
- Opening `https://dweb.link/ipns/en.wikipedia-on-ipfs.org` could make the gateway implicitly check for DNSAddr for the domain at `_dnsaddr.en.wikipedia-on-ipfs.org`, which could have TXT records pointing at storage providers that have the website data (TXT record `dnsaddr=/dnsaddr/storage-provider1.com`) | |
- 💡IDEA: Since we have a valid DNS name, we could also check if `{domain}` exposes a Reframe endpoint at `/reframe` | |
- This would create a pretty elegant convention where the URL hint is short (`/dnsaddr/service.com`) and at the same time allows for multiple types of routing hints to be passed this way. | |
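The recursive `/dnsaddr/` expansion with a depth limit, as described in the list above, can be sketched like this. To stay self-contained the example reads TXT records from an in-memory dictionary instead of querying DNS; the function name, the dictionary shape, and the choice to raise `ValueError` at the limit are assumptions, and the optional `/p2p/` suffix filtering that real DNSAddr resolution supports is left out.

```python
from typing import Dict, List

MAX_DEPTH = 32  # limit suggested in the text (same as for /ipns/ path resolution)
PREFIX = "/dnsaddr/"


def resolve_dnsaddr(addr: str,
                    txt_records: Dict[str, List[str]],
                    depth: int = 0) -> List[str]:
    """Expand a /dnsaddr/{domain} multiaddr into non-dnsaddr multiaddrs.

    txt_records maps a DNS name such as "_dnsaddr.example.org" to its TXT
    values (a stand-in for real DNS lookups). Other multiaddrs are returned
    unchanged.
    """
    if depth > MAX_DEPTH:
        raise ValueError("dnsaddr recursion limit exceeded")
    if not addr.startswith(PREFIX):
        return [addr]
    domain = addr[len(PREFIX):].split("/", 1)[0]
    resolved: List[str] = []
    for record in txt_records.get("_dnsaddr." + domain, []):
        # Only TXT values of the form "dnsaddr=<multiaddr>" are relevant.
        if record.startswith("dnsaddr="):
            target = record[len("dnsaddr="):]
            resolved.extend(resolve_dnsaddr(target, txt_records, depth + 1))
    return resolved
```

The depth counter is what protects a node from records that point at each other in a loop, which is why the limit is checked before any lookup is attempted.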
|
||
### PubSub Router Topic (future, TBD) | ||
|
||
- This one is for the future and needs additional design analysis, but we already have a PoC for [IPNS over PubSub](https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md#ipns-pubsub) and a “Generic” router is implemented in https://github.com/libp2p/go-libp2p-pubsub-router | |
- We could come up with an implicit or explicit protocol for joining a specific pubsub topic for requested content. | ||
- The implicit topic name could be based on the root CID of the requested path (allowing peers browsing the same DAG to participate in the same topic) | ||
- This could happen even without `?providers=` being present, but needs analysis how feasible it is to do this by default. | ||
- Even if this type of router is disabled by default, we could leverage the fact that `?providers=/dnsaddr/{domain}` is passed and create one. | ||
- Nodes could join a topic based on the DNS name from DNSaddr, allowing peers interested in the content from the same provider to exchange data directly over PubSub, skipping DHT or centralized Reframe endpoint. | ||
- A variant of this that is especially powerful is when browsing a DNSLink website or IPNS name. A mutable pointer would ensure people having old and new version of | |
|
||
## Gateway Requests | ||
|
||
We would add support for an optional `?providers=` URL parameter ([percent-encoded](https://en.wikipedia.org/wiki/Percent-encoding), comma-separated) or an HTTP header `X-Ipfs-Providers` sent with an HTTP request to a gateway. | |
|
||
- `/ipfs/{cid}?providers=url,multiaddr,somethingelse?` | ||
- Example: `https://dweb.link/ipfs/bafy..acbd?providers=/dnsaddr/storage-provider1.com` | ||
Comment on lines +67 to +68
Review comment: In designing both delegated routing and the content routing setup, I thought the goal was to try to maintain the property we have today of content addressed data. If we have to specify the origin for the data, we've lost some of this property. I guess there's a benefit of discovery of content routers through this mechanism, but I would hope that's not the only way we learn about content routers in kubo.
Reply: We share the concern, and that is why this proposal exists. We will lose it for sure if a public gateway decides to use a specific set of indexers, and to not use others.
This proposal is not replacing By giving users the ability to pass additional routing hints, we remove the surface for undesired storage and content routing lock-in caused by the tyranny of the default. Gateway operators will be in control to set any custom peering and routing they wish, but the users should still be able to improve routing even further by providing an optional hint that the gateway/node can leverage for finding providers when the CID can't be found using conventional methods. This will be course-correcting any routing gaps that may occur. (Note to self to incorporate this into the spec) |
||
- `X-Ipfs-Providers: url,multiaddr,somethingelse` | ||
- Example: `X-Ipfs-Providers: /dnsaddr/storage-provider1.com` | ||
- Gateways will be free to leverage this hint to speed up content routing, or ignore it. | ||
- Allows public gateways to load content from services that do not announce CIDs on DHT (e.g., Pinata). | ||
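A small sketch of how a client might encode hints into the `?providers=` parameter and how a gateway might read them back from either the URL or the `X-Ipfs-Providers` header. The function names are hypothetical, and treating header values as not percent-encoded is an assumption based on the header example above.

```python
from typing import List, Mapping
from urllib.parse import quote, unquote, urlsplit


def build_gateway_url(content_url: str, hints: List[str]) -> str:
    """Append hints as a percent-encoded, comma-separated ?providers= value."""
    encoded = ",".join(quote(hint, safe="") for hint in hints)
    return f"{content_url}?providers={encoded}"


def hints_from_request(url: str, headers: Mapping[str, str]) -> List[str]:
    """Collect routing hints from the URL parameter and the HTTP header."""
    hints: List[str] = []
    for pair in urlsplit(url).query.split("&"):
        name, _, value = pair.partition("=")
        if name == "providers":
            # Split on commas first, then decode each hint, so an encoded
            # comma inside a hint is not mistaken for a separator.
            hints += [unquote(h) for h in value.split(",") if h]
    for name, value in headers.items():
        if name.lower() == "x-ipfs-providers":  # header names are case-insensitive
            hints += [h.strip() for h in value.split(",") if h.strip()]
    return hints
```

Since gateways are free to ignore hints, a gateway that does not implement this parsing still serves the request as usual.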
|
||
## API Requests | ||
|
||
We would add an optional `--providers` parameter that allows passing ad-hoc hints that are scoped to a specific command. | |
Comment on lines +74 to +76
Review comment: This is specific to the kubo HTTP API, right?
Reply: Yes, we will move this to "Notes for implementers" section as an example of CLI API. |
||
|
||
### Prior art | ||
|
||
- [https://en.wikipedia.org/wiki/Magnet_URI_scheme](https://en.wikipedia.org/wiki/Magnet_URI_scheme) | ||
- DHT hash + optional list of HTTP URLs with trackers (~indexer’s reframe endpoints) | ||
- [routing.delegate.ipfs.io/reframe](http://routing.delegate.ipfs.io/reframe) | ||
- Example: | ||
|
||
```jsx | ||
magnet:?xt=urn:ipfs:[IPFS_CID] | ||
&dn=file_name.mp4 | ||
&x.ref=[REFRAME_URL_1] | ||
&x.ref=[REFRAME_URL_2] | ||
``` | ||
|
||
|
||
- In IPFS ecosystem | ||
- [Content routing hint via DNS records #6516](https://github.com/ipfs/kubo/issues/6516) | |
- [Content routing hint via HTTP headers #6515](https://github.com/ipfs/kubo/issues/6515) | |
- [https://discuss.ipfs.tech/t/proposal-peer-hint-uri-scheme/4649/21?u=lidel](https://discuss.ipfs.tech/t/proposal-peer-hint-uri-scheme/4649/21?u=lidel) |
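To make the magnet-style example from the prior art list concrete, here is a sketch that extracts the CID and the Reframe URLs from such a link. The `urn:ipfs:` prefix and the `x.ref` parameter are taken from that example only; they are not registered or standardized, and the function name is hypothetical.

```python
from typing import List, Optional, Tuple
from urllib.parse import unquote


def parse_ipfs_magnet(uri: str) -> Tuple[Optional[str], List[str]]:
    """Return (cid, reframe_urls) from a magnet link shaped like the example.

    The line breaks in the example are assumed to be for readability only;
    this function expects the link on a single line.
    """
    scheme, _, query = uri.partition("?")
    if scheme != "magnet:":
        raise ValueError("not a magnet URI")
    cid: Optional[str] = None
    reframe_urls: List[str] = []
    for pair in query.split("&"):
        name, _, value = pair.partition("=")
        value = unquote(value)
        if name == "xt" and value.startswith("urn:ipfs:"):
            cid = value[len("urn:ipfs:"):]
        elif name == "x.ref":
            reframe_urls.append(value)
    return cid, reframe_urls
```

Unknown parameters such as `dn` are ignored, which mirrors how the proposal treats unsupported hints.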
Review comment:
This section seems independent of the rest of the specification around specifying routing hints. I suspect it's also the most controversial given prior resistance to even defining a single or collection of routing systems as the "default/standard IPFS content routing systems".
This happens to be how the most popular and oldest IPFS implementations (e.g. kubo) have been operating over the last several years, but it could reasonably change over time. Overall this seems to be related to an independent discussion on what should be "required" for an IPFS implementation and/or if/how we should label a collection of protocols and properties that some IPFS implementations have that will make systems easier to reason about (e.g. Bitswap 1.2.0, IPFS Public DHT, libp2p with some set of transports and upgraders, etc.).
I'd try and separate this long requested and likely quite useful issue from the more nebulous "what is IPFS" kinds of conversations. Although that discussion seems like a separate important one to have and document the outcome of so it can be referenced and/or modified in the future.