Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect/Extra Data included in sbom. #20

Closed
dlorenc opened this issue May 22, 2021 · 13 comments
Closed

Incorrect/Extra Data included in sbom. #20

dlorenc opened this issue May 22, 2021 · 13 comments
Assignees
Labels
Milestone

Comments

@dlorenc
Copy link

dlorenc commented May 22, 2021

I tried out this tool on an open source repo I maintain called cosign. The repo is here: https:/sigstore/cosign

I ran the tool at cosign commit 749c7e3e5d80f3fa976f31084317a556718c3e54.
The cyclonedx-gomod version I used was 35b214b43eabfdf6b47499427b93c23ae1f903aa.

I ran: $ cyclonedx-gomod -json > gomod.json

The resulting sbom is available here: https://gist.github.com/dlorenc/17c0582acb0560807e561e436cf76a1e

The tool worked, but a quick look at the data shows some packages that are not included in my application. For example:

"components": [
    {
      "bom-ref": "pkg:golang/bazil.org/[email protected]",
      "type": "library",
      "name": "bazil.org/fuse",
      "version": "v0.0.0-20180421153158-65cc252bf669",
      "scope": "required",
      "hashes": [
        {
          "alg": "SHA-256",
          "content": "14d091a578aab86d5aa32a9c2165459a94d2295731d9b403dfcb9965e1ad765c"
        }
      ],
      "purl": "pkg:golang/bazil.org/[email protected]"
    },

bazil.org/fuse does not appear in my application at all, including it's dependency tree. I do not vendor my code normally (I saw the warnings about this tool only partially supporting vendoring). But if I run go mod vendor, I do not see bazil.org/fuze anywhere in the vendor tree. I can build and run my code without that package, so I think it's a bug that it appears in the SBOM.

You can see a test branch with the vendor directory here: https:/dlorenc/cosign/tree/vendor/vendor/github.com

Note that this is just the first component in the list and the first that I checked. There may be other things missing or incorrect.

Am I misunderstanding the fields? Is there some reason this is included?

Note: This does appear in my go.sum file, but not in the vendor directory and it never makes it into my compiled application, as can be shown by the vendor directory. This means it is used somehow in the dependency constraint calculation by "go mod".

Note that there are

@nscuro
Copy link
Member

nscuro commented May 22, 2021

Hi @dlorenc, thank you for the detailed report.

What you're seeing is the result of cyclonedx-gomod (or Go's own module tooling rather) not knowing which modules end up in the final product.

We use go list -m and go mod graph. Both operate on the module graph, which is decoupled from the build. Go's module graph considers all modules, including transitive test dependencies (ref golang/go#26955).

Regarding bazil.org/fuse specifically, one possible dependency chain towards bazil.org/fuse appears to be:

github.com/sigstore/[email protected]
|-+ github.com/sigstore/[email protected]
  |-+ [email protected]
    |-+ github.com/GoogleCloudPlatform/[email protected]
      |-+ bazil.org/[email protected]

Searching github.com/sigstore/rekor for gocloud.dev showed that it's only ever used in cmd/rekor-server, which in turn is not used by github.com/sigstore/cosign.

Based on that, gocloud.dev and all its dependencies won't ever end up in your binary. But because it's a dependency of the github.com/sigstore/rekor module, it ends up in the module graph and consequently in your SBOM.

go mod vendor won't copy gocloud.dev to vendor/, because in contrast to go list -m and go mod graph, it "knows" what's required to build your module. But even then, it'll consider both of your commands (cmd/cosgin, cmd/sget), which probably don't depend on the same modules either.

I think we could use something along the lines of go list -deps -f '{{with .Module}}{{.Path}}@{{.Version}}{{end}}' ./cmd/cosign | sort -u to figure out which modules are required to build cmd/cosign. With that knowledge, we should be able to tidy up the SBOM quite a bit. I will experiment with this.

Edit: In essence, what cyclonedx-gomod is currently missing is analysis of package imports.

This is what I get for the "go list" command above. Looks somewhat more reasonable.
cloud.google.com/[email protected]
github.com/asaskevich/[email protected]
github.com/blang/[email protected]+incompatible
github.com/cavaliercoder/[email protected]
github.com/containerd/stargz-snapshotter/[email protected]
github.com/coreos/go-oidc/[email protected]
github.com/cyberphone/[email protected]
github.com/danieljoos/[email protected]
github.com/docker/[email protected]
github.com/docker/[email protected]+incompatible
github.com/docker/[email protected]
github.com/docker/[email protected]
github.com/fsnotify/[email protected]
github.com/ghodss/[email protected]
github.com/go-chi/[email protected]+incompatible
github.com/golang/[email protected]
github.com/golang/[email protected]
github.com/golang/[email protected]
github.com/google/[email protected]
github.com/google/[email protected]
github.com/google/[email protected]
github.com/google/[email protected]
github.com/googleapis/gax-go/[email protected]
github.com/go-openapi/[email protected]
github.com/go-openapi/[email protected]
github.com/go-openapi/[email protected]
github.com/go-openapi/[email protected]
github.com/go-openapi/[email protected]
github.com/go-openapi/[email protected]
github.com/go-openapi/[email protected]
github.com/go-openapi/[email protected]
github.com/go-openapi/[email protected]
github.com/go-openapi/[email protected]
github.com/go-playground/[email protected]
github.com/go-playground/[email protected]
github.com/go-playground/[email protected]+incompatible
github.com/go-stack/[email protected]
github.com/hashicorp/[email protected]
github.com/hashicorp/[email protected]
github.com/hashicorp/[email protected]
github.com/hashicorp/[email protected]
github.com/hashicorp/[email protected]
github.com/hashicorp/[email protected]
github.com/hashicorp/[email protected]
github.com/hashicorp/vault/[email protected]
github.com/hashicorp/vault/[email protected]
github.com/howeyc/[email protected]
github.com/inconshreveable/[email protected]
github.com/jedisct1/[email protected]
github.com/josharian/[email protected]
github.com/leodido/[email protected]
github.com/magiconair/[email protected]
github.com/mailru/[email protected]
github.com/mitchellh/[email protected]
github.com/mitchellh/[email protected]
github.com/oklog/[email protected]
github.com/opencontainers/[email protected]
github.com/opencontainers/[email protected]
github.com/opentracing/[email protected]
github.com/pelletier/[email protected]
github.com/peterbourgon/ff/[email protected]
github.com/pierrec/[email protected]+incompatible
github.com/pkg/[email protected]
github.com/PuerkitoBio/[email protected]
github.com/PuerkitoBio/[email protected]
github.com/ryanuber/[email protected]
github.com/sassoftware/[email protected]+incompatible
github.com/segmentio/[email protected]
github.com/sigstore/cosign@
github.com/sigstore/[email protected]
github.com/sigstore/[email protected]
github.com/sigstore/[email protected]
github.com/skratchdot/[email protected]
github.com/spf13/[email protected]
github.com/spf13/[email protected]
github.com/spf13/[email protected]
github.com/spf13/[email protected]
github.com/spf13/[email protected]
github.com/spf13/[email protected]
github.com/subosito/[email protected]
github.com/theupdateframework/[email protected]
github.com/zalando/[email protected]
go.mongodb.org/[email protected]
[email protected]
go.uber.org/[email protected]
go.uber.org/[email protected]
go.uber.org/[email protected]
golang.org/x/[email protected]
golang.org/x/[email protected]
golang.org/x/[email protected]
golang.org/x/[email protected]
golang.org/x/[email protected]
golang.org/x/[email protected]
golang.org/x/[email protected]
golang.org/x/[email protected]
google.golang.org/[email protected]
google.golang.org/[email protected]
google.golang.org/[email protected]
google.golang.org/[email protected]
gopkg.in/[email protected]
gopkg.in/square/[email protected]
gopkg.in/[email protected]

@nscuro nscuro self-assigned this May 22, 2021
@nscuro
Copy link
Member

nscuro commented May 22, 2021

Note to self: Use go list -deps -json ./... instead of go list -json -m all. This only reports modules (even more granular, packages) that end up in the compiled binary, thus excluding test-only dependencies as well. The inaccurate nature of go list -m all for use cases like this came up here as well.

go list -deps is, in contrast to go list -m, subject to build constraints (tags, GOOS, GOARCH, see golang/go#27900 (comment)). So for multiple build targets, multiple SBOMs may have to be generated. This has to be documented.

Should also remove the necessity of running go mod download and go mod tidy, because go list -deps downloads all modules/packages it visits.

@dlorenc
Copy link
Author

dlorenc commented May 22, 2021

So for multiple build targets, multiple SBOMs may have to be generated. This has to be documented.

I agree - this is unintuitive at first but a go module is not 1:1 with a compiled go "application". You can have many applications under one module, and the module deps are the set of all the applications.

@nscuro nscuro added this to the v0.8.0 milestone May 23, 2021
@nscuro nscuro added the bug Something isn't working label May 23, 2021
@nscuro
Copy link
Member

nscuro commented May 24, 2021

Following up on a discussion in the CDX Slack, we should support the generation of BOMs that either do or don't consider build constraints. Which variant is chosen will depend on what the BOMs are supposed to be used for:

  • Variant 1: distributors will want a BOM that only contains modules that are actually present in the specific binary
    • Each binary built will have its own BOM
  • Variant 2: vendors / maintainers will want a BOM that contains all modules that could potentially be present in any binary
    • Bascially variant 1 w/o considering build constraints
    • These types of BOMs will mostly be used internally for inventory / risk analysis
    • In this case, tracking multiple BOMs for the same project is cumbersome

Variant 1 also isn't really feasable for libraries. Maintainers simply cannot know the set of build constraints their users work with. Distributing a BOM as generated by variant 2 for libraries makes sense however.

Overall, this means that go list -deps only solves this partly (namely, variant 1), and we need another way to support variant 2.

Supporting variant 2 will cover most use cases and will be prioritized.

@nscuro
Copy link
Member

nscuro commented May 24, 2021

go mod why -m does what we need for variant 2, actually:

Why shows a shortest path in the import graph from the main module to
each of the listed packages. If the -m flag is given, why treats the
arguments as a list of modules and finds a path to any package in each
of the modules.

It also doesn't consider build flags, which is exactly what we want.

A few examples with cosign:

$ go mod why -m bazil.org/fuse
# bazil.org/fuse
(main module does not need module bazil.org/fuse)
$ go mod why -m github.com/spf13/viper
# github.com/spf13/viper
github.com/sigstore/cosign/cmd/cosign/cli
github.com/sigstore/rekor/cmd/rekor-cli/app
github.com/spf13/viper

For test-only dependencies, the package list contains a *.test node:

go mod why -m github.com/stretchr/testify
# github.com/stretchr/testify
github.com/sigstore/cosign/cmd/cosign/cli
github.com/sigstore/cosign/cmd/cosign/cli.test
github.com/stretchr/testify/require

The only downside is that we can't differentiate between multiple commands, like we can with go list -deps.
This is because only the shortest path is printed, not all possible paths (see golang/go#41305).

I'm thinking of implementing a short-term solution, where we query go mod why -m with all modules in our module graph to filter out unused dependencies. That would already make a huge difference for larger projects, and reduce the pain until we have an overall satisfying solution ready.

@nscuro
Copy link
Member

nscuro commented May 31, 2021

#20 (comment) has been shipped with v0.8.0 (via #25).

I'm leaving this issue open until we also implemented a solution for variant 1 from #20 (comment).

@nscuro
Copy link
Member

nscuro commented Jun 7, 2021

Following up on #20 (comment): I implemented a PoC that utilizes go list -deps -json ./... and generated a few SBOMs with it to compare with the go mod why -m -vendor approach introduced in v0.8.0. Just to verify that we indeed get very close in terms of accuracy.

An example with [email protected], including a diff, can be found here: https://gist.github.com/nscuro/76a4056963e56d6c3d6b81b0fcd54d89

The SBOMs have been generated with the following commands:

$ go run main.go -module ../proton-bridge -reproducible -output proton-bridge-v1.8.0.bom.xml
$ go run main.go -module ../proton-bridge -reproducible -output proton-bridge-v1.8.0.dist.bom.xml -distribution

Where -distribution is for the approach that uses go list -deps. Commands were run on a Windows x64 machine.

As expected, the SBOM generated with go mod why contains a few (4) more modules, all of them due to build constraints (// +build build_qt, // +build darwin, ...). This also demonstrates that v0.8.1 already reliably filters out test-only dependencies.

@prabhu
Copy link

prabhu commented Jun 12, 2021

There seems to be different interpretations of what an SBoM should or shouldn't include. From my limited experience working with several large and small customers below are some of the things I learnt.

  • Seting scope: optional is better than filtering out test and other optional dependencies completely. Even the information about how the application is getting tested is useful information to have and share
  • Setting aggregate type correctly as per CycloneDX 1.3 specification is often sufficient. Not every tool can generate a complete SBoM so by transparently setting the aggregate type the consumers could make an informed choice to use multiple or alternative tools based on their desired aggregation level

@nscuro
Copy link
Member

nscuro commented Jun 13, 2021

Seting scope: optional is better than filtering out test and other optional dependencies completely. Even the information about how the application is getting tested is useful information to have and share

Agreed, although in many cases today, the tools that consume these SBOMs do not differentiate between required, optional and excluded, which oftentimes just produces a lot of unwanted noise. Which is why most (all?) CycloneDX tools make including optional components... optional.

@prabhu

This comment has been minimized.

This was referenced Jul 31, 2021
@nscuro nscuro added this to the v1.0.0 milestone Jul 31, 2021
@nscuro nscuro removed the bug Something isn't working label Jul 31, 2021
@nscuro
Copy link
Member

nscuro commented Jul 31, 2021

Removed the bug label, as "incorrect / extra data" isn't really true for the use-case that cyclonedx-gomod covers right now, which is module-level SBOMs. The enhancement of supporting application-level SBOMs has been moved to #44.

@nscuro nscuro added the meta label Aug 20, 2021
@nscuro
Copy link
Member

nscuro commented Sep 24, 2021

This has been addressed with the new app command and will be released with v1.0.0 very soon. It has been available since v1.0.0-alpha.2 for a while.

@nscuro nscuro closed this as completed Sep 24, 2021
@dlorenc
Copy link
Author

dlorenc commented Sep 24, 2021

Thanks!

nscuro added a commit that referenced this issue Sep 24, 2021
nscuro added a commit that referenced this issue Sep 27, 2021
* add instructions for goreleaser integration

Signed-off-by: nscuro <[email protected]>

* don't be too verbose on hashes explanation

Signed-off-by: nscuro <[email protected]>

* fix and improve code documentation; update changelog

Signed-off-by: nscuro <[email protected]>

* cli documentation tweaks

Signed-off-by: nscuro <[email protected]>

* reference #20 in changelog

Signed-off-by: nscuro <[email protected]>

* readme tweaks

Signed-off-by: nscuro <[email protected]>

* update examples docker image to include sbom validation

Signed-off-by: nscuro <[email protected]>

* correction

Signed-off-by: nscuro <[email protected]>

* update changelog

Signed-off-by: nscuro <[email protected]>

* improve cli documentation

Signed-off-by: nscuro <[email protected]>

* update cli help in readme

Signed-off-by: nscuro <[email protected]>

* update changelog

Signed-off-by: nscuro <[email protected]>

* update cli docs and readme

Signed-off-by: nscuro <[email protected]>

* regenerate example sboms

Signed-off-by: nscuro <[email protected]>

* fix InvalidSerialNumber test

Signed-off-by: nscuro <[email protected]>

Closes #31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants