ability to fetch dependencies via git+http and git+https protocol #14298

Closed
Tracked by #14265
andrewrk opened this issue Jan 13, 2023 · 10 comments · Fixed by #17277
Labels
  • contributor friendly: This issue is limited in scope and/or knowledge of Zig internals.
  • enhancement: Solving this issue will likely involve adding new logic or components to the codebase.
  • zig build system: std.Build, the build runner, `zig build` subcommand, package management

Comments

andrewrk (Member) commented Jan 13, 2023

Extracted from #14265.

zig build should support fetching via a URL like this:

    .url = "git+https://github.com/ziglang/zig.git#8b8090d7fad3e444784bc52db6a80188a9dbd3c0",

Note that the fragment is used to fetch a particular commit. I suppose the fragment could be omitted, meaning to fetch the latest HEAD of the default branch; however, this would not be advised, since the hash would be wrong as soon as another commit is pushed to that branch. Ideally, if the fragment is omitted, an error would be emitted telling the user to add the fragment, giving them a copy+pasteable snippet, or perhaps even editing the manifest file on the user's behalf.

Open question: should it be built-in? or is this issue a request for a third party contributor to make a fetch plugin (#14294)?

I think the first step would be to implement this as a third-party fetch plugin, and then we can evaluate whether it can be upstreamed and become a builtin.

andrewrk added the enhancement, contributor friendly, and zig build system labels Jan 13, 2023
andrewrk added this to the 0.12.0 milestone Jan 13, 2023

daurnimator (Contributor) commented

> Note that the fragment is used to fetch a particular commit. I suppose the fragment could be omitted, meaning to fetch the latest HEAD of the default branch; however, this would not be advised, since the hash would be wrong as soon as another commit is pushed to that branch. Ideally, if the fragment is omitted, an error would be emitted telling the user to add the fragment, giving them a copy+pasteable snippet, or perhaps even editing the manifest file on the user's behalf.

Note that there are existing Git URL conventions for such schemes. In particular, HashiCorp has a widely adopted, documented one.
For reference, it looks like this: github.com/kubernetes-sigs/kustomize/examples/multibases?ref=v1.0.6
Here the ref is a commit, tag, or branch name.

uranusjr commented Jan 13, 2023

As a reference, Python’s syntax for this is

git+https://github.com/ziglang/zig.git@8b8090d7fad3e444784bc52db6a80188a9dbd3c0

A couple of things to note:

  1. No matter what the syntax is, the ref name can contain URL-unsafe characters, so remember to specify percent-escaping!
  2. If you use Python’s @ syntax, you must require a ref name, otherwise the parsing rules can be ambiguous since @ is a valid character in the URL path. (e.g. https://mygitservice.com/@uranusjr/repo.git)
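
Both points can be illustrated with a small Python sketch using only the standard library's `urllib.parse`. The helper names (`with_ref_fragment`, `split_ref_fragment`, `naive_at_split`) and the example ref are hypothetical, chosen purely for illustration; this is not how pip or Zig actually parses such URLs.

```python
from urllib.parse import quote, unquote, urlsplit


def with_ref_fragment(repo_url, ref):
    """Append a Git ref to repo_url as a percent-escaped URL fragment."""
    # safe="/" keeps slashes readable (e.g. "release/1.x"); characters such
    # as spaces, "#", and parentheses are percent-escaped.
    return f"{repo_url}#{quote(ref, safe='/')}"


def split_ref_fragment(url):
    """Split url into (URL without fragment, decoded ref)."""
    parts = urlsplit(url)
    return parts._replace(fragment="").geturl(), unquote(parts.fragment)


def naive_at_split(url):
    """Split on the last "@", as an "@ref" syntax without a required ref might."""
    base, _, ref = url.rpartition("@")
    return base, ref


if __name__ == "__main__":
    # Point 1: a ref with URL-unsafe characters survives a round trip.
    url = with_ref_fragment("git+https://mygitservice.com/@uranusjr/repo.git",
                            "fix #12 (wip)")
    print(url)
    print(split_ref_fragment(url))

    # Point 2: with no ref present, the "@" in the path is mistaken for
    # the ref separator, so part of the path is reported as the ref.
    print(naive_at_split("https://mygitservice.com/@uranusjr/repo.git"))
```

With the fragment syntax the separator cannot appear unescaped in the path, which is why the second problem only affects the `@` convention.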

andrewrk (Member, Author) commented

Thank you, yes; let's follow an established convention where there is one.

andrewrk modified the milestones: 0.13.0, 0.12.0 Jan 16, 2023

nektro (Contributor) commented Feb 8, 2023

I would expect packages and dependencies to define only the git URL, and an artifact's lockfile to then record the commit and filesystem hashes. This would not only provide a deduplication mechanism when the same git repo is depended on by multiple packages, but also support both development and reproducible builds.

fuzhouch commented Aug 3, 2023

Regarding this question of Andrew's:

> Open question: should it be built-in? or is this issue a request for a third party contributor to make a fetch plugin (https://github.com/ziglang/zig/issues/14294)?

As a developer from the Go world, I would like to share my two cents: I vote for built-in.

As far as I know, there are three existing practices in the industry.

  1. Built as a tool separate from the compiler, but released in the same package. Example: Python/pip.
  2. Built as a separate, third-party tool, released by a different project. Example: gcc/CMake.
  3. Built as a plug-in or as part of the compiler. Example: the go mod and go get commands.

I suggest Zig consider approach 3. This is based on what I learned from the zigmod and zig build commands: zigmod is tightly coupled to zig build's API on the master branch, so mixing zigmod with Zig 0.10.x is effectively impossible. If dependency management lives outside the Zig release while zig build lives inside it, the dependency manager will struggle to keep up with development versions of Zig.

Meanwhile, I think many people may prefer 1 and 2, on the assumption that new protocols can easily be added in the future without affecting the cadence of compiler releases. This is true for Python, C/C++, and many other languages. However, it works because their compilers offer no build-management features and delegate that fully to third-party tools. Zig is not in this category; it is more like Go (after go mod was introduced).

uranusjr commented Aug 3, 2023

As a pip maintainer, I would suggest option 3. To maintain a feature like this requires specific knowledge that may not stay mainstream over time, but a tool supporting old fetching functionalities is stuck with maintaining them mostly forever. pip still supports fetching from Bazaar, for example. The feature is very much unmaintained and partially broken because no currently active maintainers know how to use Bazaar, but occasionally a user is misguided into using it since there’s no good way to communicate the feature is significantly less maintained than other parts of the tool. Splitting the functionality into a plug-in would help this a lot since installing the plug-in would be a conscious user decision, communicate the feature is maintained somewhat separately from the core, and encourage the user to take a look at how well the feature is maintained on its own.

fuzhouch commented Aug 4, 2023

Regarding branches: I wonder whether we could support specifying a branch name as well, something like:

    .url = "git+https://github.com/ziglang/zig.git#branch-1.1@commitID",

Coming from the Go world, I know go mod does not specify a branch name. IMHO, this is more of a Go-specific best practice, where most projects use main/master as their only release branch. This is fine for the Go community but probably not the best fit for Zig.

The reason is the problem domain. Many Go projects are cloud tools, which usually prefer a keep-moving-forward upgrade policy and do not actively maintain multiple parallel long-term support (LTS) releases. Zig, by contrast, is more like C/C++ for infrastructure-level projects, where maintaining multiple LTS releases is common practice. A good example is LLVM 16/17, both of which are under maintenance for a long time. It would be a good idea for Zig's dependency management to allow this flexibility.

daurnimator (Contributor) commented

I'd suggest not supporting git directly, and instead directing people to use git archive based artifacts.

  • They are exposed for download by all git-web implementations I know of (e.g. GitHub, GitLab, cgit)
  • They are the "correct" way to support downloads of a git repository without history. You can see this in e.g. .gitattributes support for how to pack archives and substitutions (which I recently learned about via LuaJIT/LuaJIT@33e2a49)
  • At least since ~2012, git archive output is deterministic
  • It means we don't have to add support for transports/protocols/custom reference mechanisms.

ianprime0509 added a commit to ianprime0509/zig that referenced this issue Sep 23, 2023
Closes ziglang#14298

TODO: add better commit notes here
ianprime0509 added a commit to ianprime0509/zig that referenced this issue Sep 23, 2023
Closes ziglang#14298

This commit adds support for fetching dependencies over git+http(s)
using a minimal implementation of the Git protocols and formats relevant
to fetching repository data. It currently supports only version 2 of Git's
wire protocol (documented in
[protocol-v2](https://git-scm.com/docs/protocol-v2)), which was
first introduced in Git 2.19 (2018) and made the default in 2.26 (2020).

The wire protocol behaves similarly when used over other transports,
such as SSH and the "Git protocol" (git:// URLs), so it should be
reasonably straightforward to support fetching dependencies from such
URLs if the necessary transports are implemented (e.g. ziglang#14295).
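
For readers unfamiliar with the wire protocol mentioned in this commit message, the following Python sketch shows the pkt-line framing that protocol version 2 uses, and how an `ls-refs` command request could be assembled from it. It is an illustration based on Git's protocol documentation, not the code from the commit; the names `pkt_line` and `ls_refs_request` are hypothetical, and no network transport is included.

```python
FLUSH_PKT = b"0000"  # special packet: end of a message
DELIM_PKT = b"0001"  # special packet (v2): separates sections of a request

MAX_PKT_LEN = 65520  # maximum total pkt-line length, including the prefix


def pkt_line(payload):
    """Frame payload as a pkt-line: 4 hex digits of total length, then data."""
    length = len(payload) + 4  # the prefix counts its own 4 bytes
    if length > MAX_PKT_LEN:
        raise ValueError("payload too long for a single pkt-line")
    return b"%04x" % length + payload


def ls_refs_request(ref_prefix):
    """Build a minimal ls-refs request limited to refs under ref_prefix.

    Assumption: no optional capabilities (such as an agent string) are
    sent, so the capability list before the delimiter is empty.
    """
    return b"".join([
        pkt_line(b"command=ls-refs\n"),
        DELIM_PKT,
        pkt_line(b"ref-prefix " + ref_prefix.encode("ascii") + b"\n"),
        FLUSH_PKT,
    ])


if __name__ == "__main__":
    print(ls_refs_request("refs/heads/"))
```

The same framing applies regardless of transport, which is the property the commit message relies on when it says other transports should be straightforward to add.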
ianprime0509 added a commit to ianprime0509/zig that referenced this issue Sep 25, 2023
Closes ziglang#14298

This commit adds support for fetching dependencies over git+http(s)
using a minimal implementation of the Git protocols and formats relevant
to fetching repository data.

Git URLs can be specified in `build.zig.zon` as follows:

```zig
.xml = .{
    .url = "git+https://github.com/ianprime0509/zig-xml#7380d59d50f1cd8460fd748b5f6f179306679e2f",
    .hash = "122085c1e4045fa9cb69632ff771c56acdb6760f34ca5177e80f70b0b92cd80da3e9",
},
```

The fragment part of the URL may specify a commit ID (SHA1 hash), branch
name, or tag. It is an error to omit the fragment: if this happens, the
compiler will prompt the user to add it, using the commit ID of the HEAD
commit of the repository (that is, the latest commit of the default
branch):

```
Fetch Packages... xml... /var/home/ian/src/zig-gobject/build.zig.zon:6:20: error: url field is missing an explicit ref
            .url = "git+https://github.com/ianprime0509/zig-xml",
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: try .url = "git+https://github.com/ianprime0509/zig-xml#dfdc044f3271641c7d428dc8ec8cd46423d8b8b6",
```
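
The missing-fragment rule described above can be sketched in Python as follows. This is only an illustration of the behavior, not the Zig implementation: `require_ref` and `MissingRefError` are hypothetical names, the URL in the usage example is a placeholder, and the HEAD commit ID is passed in by the caller, whereas a real fetcher would have to ask the remote for it.

```python
from urllib.parse import urlsplit


class MissingRefError(ValueError):
    """Raised when a git+http(s) URL has no fragment naming a ref."""


def require_ref(url, head_commit):
    """Return the ref named by url's fragment, or raise with a suggestion.

    head_commit is the commit ID to suggest when the fragment is missing;
    it is an input here because this sketch performs no network access.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("git+http", "git+https"):
        raise ValueError(f"not a git+http(s) URL: {url}")
    if not parts.fragment:
        # Drop any dangling "#" before appending the suggested ref.
        base = parts._replace(fragment="").geturl()
        raise MissingRefError(
            "url field is missing an explicit ref\n"
            f'note: try .url = "{base}#{head_commit}",'
        )
    return parts.fragment


if __name__ == "__main__":
    try:
        require_ref("git+https://example.com/user/repo", "0123abcd")
    except MissingRefError as err:
        print(err)
```

Pinning the suggestion to a commit ID rather than a branch name keeps the recorded `.hash` valid after new commits land on the default branch.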

This implementation currently supports only version 2 of Git's wire
protocol (documented in
[protocol-v2](https://git-scm.com/docs/protocol-v2)), which was first
introduced in Git 2.19 (2018) and made the default in 2.26 (2020).

The wire protocol behaves similarly when used over other transports,
such as SSH and the "Git protocol" (git:// URLs), so it should be
reasonably straightforward to support fetching dependencies from such
URLs if the necessary transports are implemented (e.g. ziglang#14295).

ianprime0509 (Contributor) commented

I do think there's something to be said for the discoverability of git+http(s) URLs vs archive URLs, where the latter are (at least judging from my personal experience) not as well known, and other package managers (NPM, Go, Cargo, etc.) broadly have support for Git URLs.

Fortunately, with shallow clones, the download size of cloning a Git repository for a single commit is very close to that of the equivalent tar.gz.

daurnimator (Contributor) commented

> Fortunately, with shallow clones, the download size of cloning a Git repository for a single commit is very close to that of the equivalent tar.gz.

This doesn't fill in fields from .gitattributes.

> I do think there's something to be said for the discoverability of git+http(s) URLs vs archive URLs, where the latter are (at least judging from my personal experience) not as well known, and other package managers (NPM, Go, Cargo, etc.) broadly have support for Git URLs.

IMO adding git support will result in a never-ending list of subsequent feature requests, e.g.

  • control over cloning submodules or not
  • using git credential helpers
  • git lfs support
  • support for various git config minutiae
  • unknown future git features
  • fallback to exec-ing the git executable

Other package managers have been subject to these sorts of requests to varying degrees of implementation and success. But also the wider ecosystem (which includes e.g. tools like dependabot/renovate that manage lock files for users) has had to find all sorts of package manager specific git configuration. e.g. cargo has CARGO_NET_GIT_FETCH_WITH_CLI to fall back to the git executable.

andrewrk pushed a commit that referenced this issue Oct 1, 2023
Closes #14298

andrewrk modified the milestones: 0.14.0, 0.12.0 Oct 1, 2023