Gateway support for /ipfs/{cid}?format=car|raw|... #8234

Closed
2 of 3 tasks
lidel opened this issue Jun 30, 2021 · 10 comments · Fixed by #8758
Assignees
Labels
effort/weeks Estimated to take multiple weeks kind/enhancement A net-new feature or improvement to an existing feature kind/feature A new feature P1 High: Likely tackled by core team if no one steps up
Milestone

Comments

@lidel
Member

lidel commented Jun 30, 2021

@mikeal @olizilla @autonome @warpfork @aschmahmann – this is a quick memo with my initial synthesis of the ?format= idea. Bit thin on details, but want to get early feedback/temperature check before I start elaborating on this in gateway specs in the following weeks.

This is a meta issue for streamlining various feature requests and needs under a single opt-in query parameter that enables gateway users to fetch a specific representation of a specific content path.

Support for each format can be discussed/added via a separate issue/PR – this issue is just for tracking the bigger picture around the unified format parameter.

Note: if you need CARs from an ipfs gateway today, POST to /api/v0/dag/export?arg=<cid>, see: https://docs.ipfs.io/reference/http/api/#api-v0-dag-export

MVP formats

Ability to fetch every CID as full DAG in CAR or a single Block

This is the key feature to enable Verifiable Gateway Responses and "HTTP-based transport for IPFS" (mobile browsers, IoT) without introducing even more dependency on /api/v0, and giving us flexibility for adding new features in the future.

  • ?format=car – implemented in feat(gateway): Block and CAR response formats #8758
    • Returns binary stream with CAR archive for entire DAG behind the content path
    • Supersedes /api/v0/dag/export, but with better UX:
      • works on DNSLink websites that do not expose /api/v0
      • content-disposition defaults to {filename|cid}.car
  • ?format=block ?format=raw – implemented in feat(gateway): Block and CAR response formats #8758
    • Returns binary array with the root block identified by CID
    • Supersedes /api/v0/block/get, but with better UX:
      • works on DNSLink websites that do not expose /api/v0
      • content-disposition defaults to {cid}.bin
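
A minimal sketch of how a client could build these URLs and predict the default download name, assuming only what the list above states. The helper names (format_url, default_filename) and the gateway origin in the usage note are hypothetical and not part of go-ipfs.

```python
from typing import Optional
from urllib.parse import urlencode

# Formats from the MVP list above; "block" and "raw" both mean a single block.
MVP_FORMATS = {"car", "block", "raw"}


def format_url(gateway: str, cid: str, fmt: str) -> str:
    """Build {gateway}/ipfs/{cid}?format={fmt} (hypothetical helper)."""
    if fmt not in MVP_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    return f"{gateway.rstrip('/')}/ipfs/{cid}?{urlencode({'format': fmt})}"


def default_filename(cid: str, fmt: str, filename: Optional[str] = None) -> str:
    """Default Content-Disposition name: {filename|cid}.car or {cid}.bin."""
    if fmt == "car":
        # A CAR falls back to the CID when no filename is known.
        return f"{filename or cid}.car"
    # A single block is always named after its CID.
    return f"{cid}.bin"
```

For example, format_url("https://gateway.example", "bafy", "car") yields https://gateway.example/ipfs/bafy?format=car, and default_filename("bafy", "raw") yields bafy.bin.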

CBOR / JSON

Moved to #8823

Future ideas / lower priorities

Behaviors

  • ?format missing
    • if codec is dag-pb or raw return file/directory
      (current gateway behavior)
    • else (codec without default behavior), return error suggesting passing ?format=car|block|..
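
The behavior described above could be sketched roughly as follows. This is an illustration, not the gateway's actual code: the function name choose_response and the returned labels are made up for the example, and only the car/block/raw formats from the MVP list are handled.

```python
from typing import Optional

# Codecs that keep the current file/directory behavior when ?format is missing.
CODECS_WITH_DEFAULT = {"dag-pb", "raw"}
# Explicit formats that work for every CID.
EXPLICIT_FORMATS = {"car", "block", "raw"}


def choose_response(codec: str, fmt: Optional[str] = None) -> str:
    """Return a label for the representation to serve (hypothetical helper)."""
    if fmt is None:
        if codec in CODECS_WITH_DEFAULT:
            return "file-or-directory"  # current gateway behavior
        # Codec without default behavior: fail with an actionable hint.
        raise ValueError(
            f"no default representation for codec {codec!r}; "
            "try passing ?format=car|block"
        )
    if fmt not in EXPLICIT_FORMATS:
        raise ValueError(f"unsupported ?format={fmt}")
    # ?format=raw is treated as an alias of ?format=block.
    return "car" if fmt == "car" else "block"
```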
@lidel lidel added kind/enhancement A net-new feature or improvement to an existing feature P1 High: Likely tackled by core team if no one steps up kind/feature A new feature effort/weeks Estimated to take multiple weeks labels Jun 30, 2021
@lidel lidel added this to the go-ipfs 0.10 milestone Jun 30, 2021
@mikeal

mikeal commented Jul 1, 2021

I like this querystring approach a lot. You can easily imagine extending this with new parameters for partial dag queries and the like.

One thing I’d like to specify is how an implementation exposes what formats it does and does not support. Then clients can implement fallback logic in order to be more robust and servers aren’t required to implement every feature ever.

@olizilla
Member

olizilla commented Jul 1, 2021

It looks promising! Does it matter that it'd be mixing codecs (?format=dag-json) and containers (?format=car)? How do we deal with versioning, would CARv2 be ?format=carv2? It's not called out in the issue, but is the plan to also honour an Accept header if provided?

else (codec without default behavior), return error suggesting passing ?format=car|block|..

can a dag-json be returned as json without needing to specify the format? I know this has come up before, but I can't recall why that would be bad.

and can has mime/types ipld/specs#368

@ribasushi
Contributor

but I can't recall why that would be bad.

@olizilla relevant 🧵
#8037 (comment)

@lidel
Member Author

lidel commented Jul 1, 2021

  • On default behavior
    • car and block would work for every CID, other things like json or cbor will be available on a per-codec basis
    • When the requested CID is not available in the specified format (or does not have a default one), a human-readable error message suggesting appending explicit ?format=dag-json|dag-cbor|block|car is returned.
    • To facilitate things like automated fallback or format discovery via HEAD request, the list of supported formats could be included as an HTTP header such as Link from RFC 6249 (Metalink/HTTP: Mirrors and Hashes), see in-web-browsers#179:
      Link: <ipfs://bafy?format=block>; rel=describedby; type="application/octet-stream"
      Link: <ipfs://bafy?format=car>; rel=describedby; type="application/octet-stream"    
      Link: <ipfs://bafy?format=dag-json>; rel=describedby; type="application/json"
      Link: <ipfs://bafy?format=dag-cbor>; rel=describedby; type="application/cbor"
      
    • The concern about default behavior was about binary dag-cbor (thread linked above by @ribasushi).
      • For dag-cbor the default state will be error, as there is no obvious default response, and we may want to render some GUI on gateways in the future.
      • Returning application/json for dag-json by default (without explicit ?format) is ok.
  • On mixing codecs and containers: my reasoning for a single ?format= is that both codecs and containers translate to distinct response formats. An HTTP client does not care about IPFS-specific taxonomy, it requests a specific thing in a specific format (Accept or ?format=) and gets it.
    • Sidenote: I used explicit dag-cbor and dag-json just to highlight that CIDs will be traversable thanks to IPLD conventions, but we may shorten this to json and cbor to improve UX.
  • On CAR versions: AFAIK we have built-in versioning in CARs: CARv2 will include the version in the header in a backward-compatible way, so a CARv1 parser will return "unsupported version" for CARv2. Due to this I see no need for versioning here.
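
To illustrate the format discovery idea, here is a rough sketch of a client that reads the proposed Link headers from a HEAD response and picks the first format it prefers. The header shape is the one shown in the example above; the function names are hypothetical, and no gateway is known to emit these headers, so treat it as a thought experiment.

```python
from typing import Dict, Iterable, List, Optional
from urllib.parse import parse_qs, urlsplit


def parse_format_links(link_headers: Iterable[str]) -> Dict[str, str]:
    """Map each advertised ?format= value to its media type."""
    formats: Dict[str, str] = {}
    for value in link_headers:
        target, _, rest = value.strip().partition(">")
        if not target.startswith("<"):
            continue  # not a well-formed Link value
        params = {}
        for part in rest.split(";"):
            key, sep, val = part.strip().partition("=")
            if sep:
                params[key.lower()] = val.strip().strip('"')
        if params.get("rel") != "describedby":
            continue
        # target[1:] drops the leading "<" and leaves the URI itself.
        query = parse_qs(urlsplit(target[1:]).query)
        for fmt in query.get("format", []):
            formats[fmt] = params.get("type", "")
    return formats


def pick_format(preferred: List[str], supported: Dict[str, str]) -> Optional[str]:
    """Return the first preferred format the server advertises, if any."""
    for fmt in preferred:
        if fmt in supported:
            return fmt
    return None
```

A client that prefers dag-json but can also consume CARs would call pick_format(["dag-json", "car"], supported) and fall back to car when dag-json is not advertised.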

@RangerMauve

Would it make sense to also look at the Accept header for content types that the application might be expecting? e.g. if Accept contains application/json, return that for the URL. Feels like it'd be closer to what REST APIs already do and might fit well with some tooling.
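
A rough sketch of what that could look like on the server side. Everything here is an assumption made for illustration: the media-type mapping reuses the types from the Link header example earlier in the thread, an explicit ?format= is assumed to win over Accept, and negotiate_format is a made-up name.

```python
from typing import Optional

# Assumed mapping from media types to formats, taken from the Link example.
ACCEPT_TO_FORMAT = {
    "application/json": "dag-json",
    "application/cbor": "dag-cbor",
}


def negotiate_format(query_format: Optional[str], accept: Optional[str]) -> Optional[str]:
    """Pick a response format from ?format= or, failing that, from Accept."""
    if query_format:
        return query_format  # assumption: the explicit query parameter wins
    ranked = []
    for index, item in enumerate((accept or "").split(",")):
        media, *params = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        fmt = ACCEPT_TO_FORMAT.get(media.lower())
        if fmt and q > 0:
            # Highest q first; ties are broken by position in the header.
            ranked.append((-q, index, fmt))
    return min(ranked)[2] if ranked else None
```

Returning None means neither source named a known format, so the gateway would fall through to its default behavior.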

@lidel
Member Author

lidel commented Sep 16, 2021

Food for thought (cc @warpfork): mixing IPLD codec names with formats like car and block could be confusing to users.
Perhaps we should make IPLD codec override (because one is already in the CID) more explicit:

# fetch full DAG or a single block
?format=car 
?format=block

# request response parsed using implicit  IPLD lens (assume multicodec name when format is unknown)
?format=dag-json
?format=dag-cbor
?format=raw # same output as ?format=block (?)

# request response parsed using explicit  IPLD lens
?format=ipld&codec=dag-json
?format=ipld&codec=dag-cbor

# TBD - surface for IPLD selector queries
 ?format=ipld&codec=dag-json&selector={inlined_selector}
 ?format=ipld&codec=dag-cbor&selector={cid_of_a_complex_selector}

This explicit notation provides enough keywords to be self-explaining, and fairly easy to reason about their purpose without forcing users to read the docs.

For daily use, we could add porcelain in the form of a shorter notation where ?format=foo for unsupported foo will evaluate as ?format=ipld&codec=foo
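
The shorthand expansion could be sketched like this. It is only an illustration of the rule above: the set of built-in formats and the name expand_shorthand are assumptions, and raw is deliberately left out of the built-ins because its status is still an open question in the notation above.

```python
from urllib.parse import parse_qs, urlencode

# Assumed built-in formats; anything else is read as an IPLD codec name.
BUILTIN_FORMATS = {"car", "block", "ipld"}


def expand_shorthand(query: str) -> str:
    """Rewrite ?format=foo into ?format=ipld&codec=foo for unknown foo."""
    # Keep the last value when a parameter is repeated.
    params = {key: values[-1] for key, values in parse_qs(query).items()}
    fmt = params.get("format")
    if fmt is not None and fmt not in BUILTIN_FORMATS:
        params["format"] = "ipld"
        # An explicit codec= already present in the query is left untouched.
        params.setdefault("codec", fmt)
    return urlencode(params)
```

With this rule, format=dag-json expands to format=ipld&codec=dag-json, while format=car is passed through unchanged.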

@mikeal

mikeal commented Feb 1, 2022

what ever happened to this?

@lidel
Member Author

lidel commented Feb 17, 2022

@mikeal prioritization / limited bandwidth within stewards group – fleshing out details is still on the roadmap as part of gateway spec work, which I hope to get back to this quarter.

👉 If someone has bandwidth to make this happen sooner – I am all ears, happy to sync.

@lidel
Member Author

lidel commented Mar 1, 2022

Note: car code was added in multiformats/multicodec#258 and the discussion around its meaning and purpose continues in multiformats/multicodec#239 (comment)

My take / question: can we use the codec field of CIDv1 to indicate expected format/transformation when requesting data from Gateways?

[..] convention where raw and car codecs are used on HTTP Gateway as a way of requesting a single Block or a CAR with blocks for a DAG.

  • HTTP GET /ipfs/{cid-with-raw-codec} returning a raw Block
  • HTTP GET /ipfs/{cid-with-car-codec} returning a CAR with the entire DAG behind a CID

In this convention the multihash in a CID represents the root block of a DAG, and if you plan to use car [code] with a multihash that has different meaning, we should agree on that now.

We could play it safe and use ?format=car (or shorter ?as=car) for now,
but things may be more intuitive if ?format=car returns a redirect to CIDv1 with car codec, and that would return a CAR stream.
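
To make the redirect idea concrete, here is a sketch of re-encoding a CIDv1 with a different codec while keeping the multihash, which is what a gateway would need to compute the redirect target. It assumes base32 CIDv1 strings (multibase prefix b) and the multicodec codes 0x55 for raw and 0x0202 for car; the helper names are hypothetical, and the convention itself is only a proposal in this thread.

```python
import base64

RAW, CAR = 0x55, 0x0202  # multicodec codes for raw and car


def encode_varint(n: int) -> bytes:
    """Unsigned varint: 7 bits per byte, high bit set on all but the last."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int):
    """Return (value, next_position) for the varint starting at pos."""
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def b32_encode(data: bytes) -> str:
    return "b" + base64.b32encode(data).decode("ascii").lower().rstrip("=")


def b32_decode(text: str) -> bytes:
    if not text.startswith("b"):
        raise ValueError("only base32 CIDv1 strings (prefix 'b') are handled")
    body = text[1:].upper()
    return base64.b32decode(body + "=" * (-len(body) % 8))


def with_codec(cid: str, codec: int) -> str:
    """Return the same CIDv1 with its codec field replaced by `codec`."""
    raw = b32_decode(cid)
    version, pos = decode_varint(raw, 0)
    if version != 1:
        raise ValueError("expected a CIDv1")
    _old_codec, pos = decode_varint(raw, pos)
    # Everything after the codec varint is the multihash; keep it as-is.
    return b32_encode(encode_varint(1) + encode_varint(codec) + raw[pos:])
```

Under the proposed convention, a request for /ipfs/{cid}?format=car could then redirect to /ipfs/{with_codec(cid, CAR)}.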

@BigLep BigLep linked a pull request Mar 3, 2022 that will close this issue
24 tasks
@lidel lidel changed the title Gateway support for /ipfs/{cid}?format=car|block|... Gateway support for /ipfs/{cid}?format=car|raw|... Mar 10, 2022
@lidel
Member Author

lidel commented Mar 10, 2022

Block/CAR response types are implemented in #8758 – ready for review, plan is to ship it in go-ipfs 0.13.


6 participants