IPLD merkle-path improvements #62

mildred · 2016-01-08T23:40:19Z

Improves PR #37 and replaces PR #60. The idea is that there is only one kind of merkle-path, not to be confused with unixfs paths. These paths are powerful enough to access multiple properties in IPLD objects and resolve merkle links.

jbenet · 2016-01-08T23:52:45Z

merkledag/ipld.md

+
+The link will:
+
+- look up the first object `QmUmg7BZC1YP1ca66rRtWKxpXp77WgVHrnv263JtDuvs2k` that we call `root`


let's maybe call this one object0 instead of root?

jbenet · 2016-01-09T00:01:00Z

@mildred thanks, this is a strong solution possibility.

(re the unixfs paths being different, /ipfs/ links on the gateway (which are sort of unixfs paths), for example, output the concatenated data of a file, instead of the raw objects, and have directory listings, and we cannot access children links of files there)

mildred · 2016-01-09T00:06:35Z

the unixfs paths being different

That's a part I missed and it created much confusion, sorry for that.

jbenet · 2016-01-09T00:09:59Z

merkledag/ipld.md

-A _merkle-path_ is a unix-style path (e.g. `/a/b/c/d`) which initially dereferences through a _merkle-link_ and then follows _named merkle-links_ in the intermediate objects. Following a name means looking into the object, finding the _name_ and resolving the associated _merkle-link_.
+A merkle-path is a unix-style path (e.g. `/a/b/c/d`) which initially dereferences through a _merkle-link_ and allows access of elements of the referenced node and other nodes transitively.
+
+Merkle paths aren't suited to be used in filesystem representations (fuse mounts, HTTP or FTP protocols) as they describe the underlying IPLD data structure. Their use in filesystems is however well suited for debug purposes (like `/proc` on unix).


when we say this, do we mean:

these paths are not good to represent files because they allow access to the raw structures underneath

or, these paths are not good to be accessed via the web or the filesystems, because there are UI problems.

i do want to be able to use these paths via HTTP to inspect the underlying data structures. but it's ok if i cannot get a proper file or directory representation out of it.

the latter part of this statement makes me somewhat more at peace with any of:

disallowing access of link properties via the paths

disallowing transparent dereferencing without the use of $obj/link/

separating "the link" and "the link properties" (same as the previous one in a way)

i think what we really need here is more concrete datastructure examples, and see how the pathing would want us to get to the stuff. maybe this will show us which solution is the best and guide our thinking.

perhaps we could write:

an fs example (directories, sharded files, file attributes in dir entries)

a version control example (commit histories)

a social network example (users, messages, relationships) (maybe something like a simple version of foaf)

a crypto network example (keys, derived keys, signatures, certificates, attestations) (i'll write this one at least)

a blockchain example (block histories, transactions, etc)

(any other killer use case for ipld that we could model here?)

What I meant is that these paths are not good for representing files because they allow access to the raw structures underneath. That's why I used the analogy of the /proc filesystem, which is not good for storing user files but is still valuable as a filesystem. I might not have been clear enough; it's late at night in France and I need sleep :-)

There might be UI problems as the . separator is not understood as such by browsers for example. I don't think this will be a real problem though.
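
To make the trade-off in this review thread concrete, here is a minimal sketch of a merkle-path resolver with transparent dereferencing. Everything in it is an assumption made for illustration: the in-memory `STORE`, the short fake hashes, the `resolve` helper, and the `{"link": ...}` convention (borrowed from the JSON examples in this thread, not from the spec text).

```python
# Hypothetical in-memory block store: hash -> IPLD-like object.
# The hashes and the {"link": ...} convention are illustrative only.
STORE = {
    "QmRoot": {"cat.jpg": {"link": "QmCat", "mode": 777}, "name": "example"},
    "QmCat": {"data": "meow", "size": 4},
}


def resolve(path, store=STORE):
    """Resolve '/<hash>/a/b' by walking keys and following links transparently."""
    parts = [p for p in path.split("/") if p]
    node = store[parts[0]]  # initial dereference through a merkle-link
    for key in parts[1:]:
        node = node[key]  # access an element of the current node
        # Transparent dereferencing: an object holding "link" is replaced
        # by the object it points to before the next key is looked up.
        if isinstance(node, dict) and "link" in node:
            node = store[node["link"]]
    return node
```

In this sketch `resolve("/QmRoot/cat.jpg/data")` reaches into the linked object, but `resolve("/QmRoot/cat.jpg/mode")` raises `KeyError`: once links are followed transparently, the link's own properties are shadowed. That is the cost of the "disallowing access of link properties via the paths" option listed above.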

jbenet · 2016-01-09T00:13:36Z

That's a part I missed and it created much confusion, sorry for that.

not at all, they look the same right now, which is definitely confusing. we could:

leave /ipfs/... for unixfs
use /ipld/... for raw ipld data.

not sure. in the end i still have trouble with the unixfs--ipld dichotomy. im not sure how to get unixfs to play well with other things (like commits). it is likely that:

we will have a 1 + N datastructure and 1 + N path world (1 for ipld and N for every datastructure)
or a 1 + N datastructure but 1 + 1 path world (1 for ipld and 1 for the derived datastructs together).

the latter is hard but if we can get it it would be much easier to reason about. will be very difficult to think of different datastruct-specific paths :/

jbenet · 2016-01-09T00:15:17Z

i do indeed recognize that there is value to try to get everything to be one path system, for example, give up on link properties and make this:

{
  "cat.jpg": {"link": "Qmcatjpg..."},
  "foo": {"link": "Qmfoo..."},
  "@attrs": {
    "cat.jpg": {
      "mode": 777,
      "owner": "jbenet"
    }
  }
}

://////

mildred · 2016-01-09T00:20:37Z

Your example works well because it represents a file structure that is well represented with unix-like paths. But what if we try to represent anything else. What if keys contain binary data ? You need an escape mechanism.

Or perhaps I'm not thinking straight as I am tired. In any case, the path definition does not (or should not) impact the IPLD format itself. So perhaps that's an issue that can be resolved later?
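
One possible escape mechanism, sketched here only to illustrate the concern: percent-encode each key before joining it into a path. The choice of percent-encoding and the helper names `escape_key`, `unescape_segment` and `build_path` are assumptions for this example; nothing in this thread settles on a particular encoding.

```python
from urllib.parse import quote, unquote_to_bytes


def escape_key(key: bytes) -> str:
    """Percent-encode a key so it can never contain a raw '/' or binary byte."""
    return quote(key, safe="")


def unescape_segment(segment: str) -> bytes:
    """Inverse of escape_key: recover the original key bytes."""
    return unquote_to_bytes(segment)


def build_path(root_hash: str, keys) -> str:
    """Join a root hash and a list of raw keys into one path string."""
    return "/" + "/".join([root_hash] + [escape_key(k) for k in keys])
```

With this scheme a key such as `b"a/b"` becomes the single segment `a%2Fb`, so the resolver can split on `/` first and unescape each segment afterwards.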

mildred · 2016-01-09T13:25:10Z

I rephrased the paragraph you commented here to remove ambiguities. You talked about examples, but they are already there at the end of ipld.md.

If you want to unify all paths under the same prefix, how would that work in practice? For example, how will /ipns/ be managed (or am I lagging behind on old concepts)?

mildred · 2016-01-09T13:29:46Z

{
  "cat.jpg": {"link": "Qmcatjpg..."},
  "foo": {"link": "Qmfoo..."},
  "@attrs": {
    "cat.jpg": {
      "mode": 777,
      "owner": "jbenet"
    }
  }
}

here, you won't be able to access both @attrs and cat.jpg using the same path mechanism, unless you start telling people they can't name a file @attrs.

I don't think you can force people to use the same path for everything. It's not practical. Some applications might need some things not provided by default.
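
The collision can be shown with a small sketch that builds a directory object in the layout quoted above. The `build_directory` helper and its tuple format are hypothetical, introduced only for this illustration.

```python
def build_directory(entries):
    """Build a directory object from (name, link_hash, attrs) tuples.

    Uses the layout quoted above: one key per file plus a reserved
    "@attrs" key that maps file names to their attributes.
    """
    directory = {"@attrs": {}}
    for name, link_hash, attrs in entries:
        if name == "@attrs":
            # A file with this name would overwrite the attribute table.
            raise ValueError("file name collides with the reserved '@attrs' key")
        directory[name] = {"link": link_hash}
        if attrs:
            directory["@attrs"][name] = attrs
    return directory
```

Any layout that mixes user-chosen names and reserved names in the same map has to reject (or escape) some file names, which is the restriction being objected to here.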

jbenet · 2016-01-22T06:02:32Z

@mildred I've made https:/ipfs/ipld-examples to try and resolve this question. So far i've only made two examples: unixfs and post. take a look, i changed things up somewhat after giving more thought. I'm still not decided, but i'm seeing how horrible some of those options are.

could you please double check my work? im not sure i got everything right-- I may have messed up some of them, given that there are things you mentioned were problems but i didnt run head into them. (escaping . when using it as a delimiter, for example. may be that this could be a non issue, but anyway, its separate).

I think so far, my favorites are (8), (4), and (5). -- the others feel odd or are very, very confusing.

Also, see the unixfs pathing-- i made it a separate thing there too, and i am more convinced this is the right thing to do.

drvirgilio · 2016-01-28T23:31:01Z

I think . should not be used to do traversal. I think \. should. This way there is no need to escape for key names such as notes.txt which I think would occur more often.
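
A rough sketch of the difference, under one reading of this suggestion: either `.` separates and a literal dot is written `\.`, or the two-character sequence `\.` separates and a plain `.` is always literal. Both function names are made up for the comparison, and neither is taken from the spec.

```python
def split_on_dot(segment: str):
    """'.' is the separator; a literal dot in a key must be written as '\\.'."""
    parts, current, i = [], "", 0
    while i < len(segment):
        if segment.startswith("\\.", i):
            current += "."  # escaped dot: keep it as part of the key
            i += 2
        elif segment[i] == ".":
            parts.append(current)  # bare dot: end of the current key
            current = ""
            i += 1
        else:
            current += segment[i]
            i += 1
    parts.append(current)
    return parts


def split_on_backslash_dot(segment: str):
    """'\\.' is the separator; plain dots in keys need no escaping."""
    return segment.split("\\.")
```

Under the first scheme the key `notes.txt` followed by `mode` is written `notes\.txt.mode`; under the second it is written `notes.txt\.mode`, leaving the common file name untouched.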

mildred · 2016-02-11T21:58:18Z

I updated this branch to ipld-spec and added my understanding of the path mechanism (8)

jbenet · 2016-02-12T07:41:03Z

merkledag/ipld.md

-A _merkle-path_ is a unix-style path (e.g. `/a/b/c/d`) which initially dereferences through a _merkle-link_ and then follows _named merkle-links_ in the intermediate objects. Following a name means looking into the object, finding the _name_ and resolving the associated _merkle-link_.
+A merkle-path is a unix-style path (e.g. `/a/b/c/d`) which initially dereferences through a _merkle-link_ and allows access of elements of the referenced node and other nodes transitively.
+
+_Merkle-paths_ aren't suited for using them in a general purpose filesystem because it introduces many restrictions on file names. However, it can be used to work on special purpose filesystems. It can be compared to the `/proc` filesystem on unix computers or HTTP Web APIs where the allowed paths is restricted.


I think we should ditch this paragraph -- it's not totally accurate, FSes can be implemented, just the hoops get ugly (dirA/@link/dirB/@link/fileC/data), and we want something cleaner. I think the unixfs spec on top of IPLD can discuss the choices there.

Ok, as long as I can store a file named @link on my unixfs filesystem (which at some point in the future would become my root filesystem, I'd imagine).

jbenet · 2016-02-12T07:41:22Z

merkledag/ipld.md

+
+_Merkle-paths_ aren't suited for using them in a general purpose filesystem because it introduces many restrictions on file names. However, it can be used to work on special purpose filesystems. It can be compared to the `/proc` filesystem on unix computers or HTTP Web APIs where the allowed paths is restricted.
+
+General purpose filesystems are encouraged to design an object model on top of IPLD that would be specialized for file manipulation and have specific path algorithms to query this model.


i think we can keep this, and it expresses enough.

jbenet · 2016-02-12T09:09:26Z

@mildred things look good. I'm good to leave the cases there for now, and decide on the lenient vs strict down the road. we can implement strict for now and see how it goes?

jbenet · 2016-02-12T09:17:07Z

I'm good to merge as is, and continue from there.

jbenet · 2016-02-12T09:17:30Z

@mildred lmk if you are ready too, and i'll merge. else what else to do?

mildred · 2016-02-12T09:23:05Z

I'm ok with this as well.


jbenet added the backlog label Jan 8, 2016

mildred mentioned this pull request Jan 8, 2016

WIP: IPLD spec #37

Merged

5 tasks

jbenet reviewed Jan 8, 2016

jbenet reviewed Jan 9, 2016

This was referenced Jan 9, 2016

IPLD CBOR tagging #61

Merged

Separate filesystem merkle-path from IPLD merkle-path #60

Closed

Relationship with Protocol Buffers legacy IPFS node format #59

Merged

mildred mentioned this pull request Jan 24, 2016

Implement the IPLD spec ipld/go-ipld-deprecated#20

Open

8 tasks

Shanti Bouchez-Mongardé and others added 8 commits February 11, 2016 22:56

Separate filesystem merkle-path from IPLD merkle-path

a15797a

Add note about uses of filesystem merkle-paths

24bd624

Talk about escaping keys in merkle-paths

ec26862

Consideration about key escaping

201a0d4

Go back to a single merkle-link with two separators

e1a1071

IPLD merkle paths: typo fixes

5062f8c

ipld merkle paths: clarify the usage scope

bb5ad86

New description of merkle-paths

213c7ed

mildred force-pushed the ipld-spec-linkprop2 branch from 9e50c44 to 213c7ed Compare February 11, 2016 21:57

jbenet reviewed Feb 12, 2016

Improve spec on paths

2ee6909

jbenet added a commit that referenced this pull request Feb 12, 2016

Merge pull request #62 from mildred/ipld-spec-linkprop2

b1d4bd7

IPLD merkle-path improvements

jbenet merged commit b1d4bd7 into ipfs:ipld-spec Feb 12, 2016

jbenet removed the backlog label Feb 12, 2016

mildred mentioned this pull request Feb 12, 2016

Encoding for IPLD paths #77

Closed

daviddias added the IPLD label Mar 14, 2016
