go-ipfs does not store filesize on symlinks #195

Open
Gozala opened this issue Feb 4, 2022 · 3 comments
Labels
help wanted Seeking public contribution on this issue kind/bug A bug in existing code (including security flaws) P2 Medium: Good to have, but can wait until someone steps up

Comments

@Gozala
Contributor

Gozala commented Feb 4, 2022

It looks like go-ipfs omits filesize from the UnixFS protobuf when you add a symlink with ipfs add mysymlink (e.g. see QmPZ1CTc5fYErTH2XXDGrfsPsHicYXtkZeVojGycwAfm3v), but UnixFS.prototype.marshal includes it, which results in different hashes.
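
To make the mismatch concrete, here is a minimal sketch that hand-encodes the UnixFS Data message for a symlink with and without the filesize field. The field numbers (Type = 1, Data = 2, filesize = 3) and the Symlink enum value (4) come from unixfs.proto. The helper names and the 'a.txt' target are hypothetical. The choice of filesize value, the byte length of the target path, is an assumption about what the JS side writes. Real CIDs are computed over the enclosing dag-pb node, which this sketch does not build.

```javascript
// Protobuf varint encoding for small non-negative integers (< 2^31 in this sketch).
function varint (n) {
  const out = []
  while (n > 0x7f) {
    out.push((n & 0x7f) | 0x80)
    n >>>= 7
  }
  out.push(n)
  return out
}

const SYMLINK = 4 // DataType.Symlink in unixfs.proto

// Hypothetical helper: serialize a symlink's UnixFS Data message.
// Assumption: when filesize is written, it is the byte length of the target path.
function encodeSymlink (target, withFilesize) {
  const data = Array.from(new TextEncoder().encode(target))
  const bytes = [
    0x08, ...varint(SYMLINK), // field 1 (Type), wire type 0
    0x12, ...varint(data.length), ...data // field 2 (Data), wire type 2
  ]
  if (withFilesize) {
    bytes.push(0x18, ...varint(data.length)) // field 3 (filesize), wire type 0
  }
  return Uint8Array.from(bytes)
}

const toHex = (bytes) =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('')

const goStyle = toHex(encodeSymlink('a.txt', false))
const jsStyle = toHex(encodeSymlink('a.txt', true))

console.log(goStyle) // 08041205612e747874
console.log(jsStyle) // 08041205612e7478741805
```

The two encodings differ only by the trailing two bytes, and any difference in the serialized bytes is enough to produce a different hash.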

@Gozala Gozala added the need/triage Needs initial labeling and prioritization label Feb 4, 2022
@lidel
Member

lidel commented Apr 8, 2022

This looks like a bug; we probably don't need to store size:

  • size of symlink itself is meaningless
  • size of destination could lead to bugs, because destination could change

Unless there is a rationale for keeping it, my vote is to fix js-ipfs to do what go-ipfs does: omit filesize.
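
A rough sketch of what that change could look like, not the actual js-ipfs-unixfs code: the node shape, the `filesizeToStore` name, and the `TYPES_WITHOUT_FILESIZE` set are all hypothetical, and the size formula (sum of block sizes plus inline data length) is an assumption about how the size would otherwise be derived.

```javascript
// Hypothetical node shape: { type: string, data?: Uint8Array, blockSizes?: number[] }

// Node types for which no filesize should be serialized (proposal: symlinks).
const TYPES_WITHOUT_FILESIZE = new Set(['symlink'])

// Returns the filesize to write, or undefined to leave the field out entirely.
function filesizeToStore (node) {
  if (TYPES_WITHOUT_FILESIZE.has(node.type)) return undefined
  const blocks = (node.blockSizes ?? []).reduce((sum, n) => sum + n, 0)
  return blocks + (node.data?.length ?? 0)
}

const target = new TextEncoder().encode('a.txt')
console.log(filesizeToStore({ type: 'symlink', data: target })) // undefined
console.log(filesizeToStore({ type: 'file', data: target })) // 5
```

Returning undefined rather than 0 matters here: an explicit zero would still be a value the serializer might write, and the goal is for the field to be absent from the encoded bytes.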

@lidel lidel added kind/bug A bug in existing code (including security flaws) help wanted Seeking public contribution on this issue P2 Medium: Good to have, but can wait until someone steps up and removed need/triage Needs initial labeling and prioritization labels Apr 8, 2022
@lidel
Member

lidel commented Apr 8, 2022

@Gozala mind opening PR to fix this?

@john-heinnickel

I've not been able to figure out how to communicate the presence of symbolic links to the js-ipfs-unixfs importer, because there do not seem to be any examples of this in the README documentation. It sounds like there is an implementation to be found if I go looking through the source tree, but that is time-consuming and error-prone.

I am having a little trouble predicting how symbolic links can be formed in a way that keeps their reference semantics symmetric with a host system in light of changing content. In the native host filesystem that a UnixFS view is patterned after, it is possible to change a file's contents without breaking links to that file, and it is possible to rename files such that symbolic links will break. Neither of these effects requires changing anything about the symlink itself.

The options for collecting the bits that differentiate one symlink from another with regard to hash computation would seem to require either using the original source filesystem's name path, or a name path in terms of CID traversal taken from the UnixFS analogs of such nodes. Here we have some apparent problems with either scenario:

  • UnixFS does not seem to include name information for files and folders at the root of a filesystem. If I import the same file multiple times, I get the same CID regardless of what I've named it. The UI client does seem to keep the source names of these nodes somewhere, but apparently not in the model. The names of a directory's children are apparently of semantic value to the directory node through their use in Link nodes: renaming a child anywhere from the root directories downward changes the CID of its parent directory, but, as with files in the root, it does not affect the content of the individual files and therefore not their CIDs.
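
The behaviour described in the point above can be illustrated with a toy content-addressed model. This is only a sketch: `toyHash` is a 32-bit FNV-1a hash standing in for a real multihash, the `fileId` and `dirId` helpers are hypothetical, and the string-based "serialization" is nothing like real dag-pb encoding. It only shows where names do and do not enter the hash.

```javascript
// Toy 32-bit FNV-1a hash, standing in for a real multihash such as sha2-256.
function toyHash (text) {
  let h = 0x811c9dc5
  for (const byte of new TextEncoder().encode(text)) {
    h ^= byte
    h = Math.imul(h, 0x01000193) >>> 0
  }
  return h.toString(16).padStart(8, '0')
}

// A file node is identified by its content alone; no name is involved.
const fileId = (content) => toyHash('file:' + content)

// A directory node is identified by its sorted (name, child id) links,
// so the names of children are part of the directory's identity.
function dirId (entries) {
  const links = Object.entries(entries)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([name, id]) => `${name}=${id}`)
  return toyHash('dir:' + links.join(','))
}

const readme = fileId('hello')
const before = dirId({ 'README.md': readme })
const after = dirId({ 'README.txt': readme }) // same child, renamed

console.log(fileId('hello') === readme) // true: same content, same id, whatever the name
console.log(before === after) // the directory ids differ after the rename
```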

Is it possible to break symlinks to files in the root by renaming the linked files?

The alternative to storing symlinks with their "native" filesystem tokens would involve translating those tokens to CIDs. However, not every node in the linked path is necessarily imported, and, as just discussed, moving, renaming, adding, or removing children of a directory will change its CID, breaking links that those operations would not affect on the host unless they involved the direct target of such a link. Likewise, with respect to the target of a symbolic link, changing the content of a linked file would effectively change its CID even if it was modified in place on the native host.
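
For contrast, here is a toy sketch of the path-based option, where a symlink node holds nothing but its target path as data. Everything here is illustrative: `toyHash` is a 32-bit FNV-1a stand-in for a real multihash, and `symlinkId`, `fileId`, `dirId`, and `build` are hypothetical helpers, not part of any IPFS library.

```javascript
// Toy 32-bit FNV-1a hash, standing in for a real multihash such as sha2-256.
function toyHash (text) {
  let h = 0x811c9dc5
  for (const byte of new TextEncoder().encode(text)) {
    h ^= byte
    h = Math.imul(h, 0x01000193) >>> 0
  }
  return h.toString(16).padStart(8, '0')
}

const fileId = (content) => toyHash('file:' + content)

// A symlink node is identified only by the path string it stores.
const symlinkId = (targetPath) => toyHash('symlink:' + targetPath)

// A directory is identified by its sorted (name, child id) links.
function dirId (entries) {
  const links = Object.entries(entries)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([name, id]) => `${name}=${id}`)
  return toyHash('dir:' + links.join(','))
}

// Build a directory holding one file and one symlink pointing at it by path.
function build (content) {
  const link = symlinkId('./data.txt')
  const root = dirId({ 'data.txt': fileId(content), latest: link })
  return { link, root }
}

const v1 = build('version 1')
const v2 = build('version 2')

console.log(v1.link === v2.link) // true: the symlink node itself is unchanged
console.log(v1.root === v2.root) // the root ids differ because the file's id changed
```

In this model, editing the target leaves the symlink node untouched, which matches host behaviour, but the enclosing directory still gets a new identity. A symlink that stored the target's CID instead would itself change on every edit of the target.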

There seems to be an impedance mismatch with symlinks here. Links by reference in a source filesystem work precisely because filesystem names are a labeling technique that is orthogonal to file content, which is the antithesis of IPLD's semantic model for naming. Can these realistically co-exist?

3 participants