feat: Runfile and java artifact support for Bazel Builder (#2362)
cc: @mihaimaruseac 

This adds support for artifacts that also need runfiles. Users can set
a flag, `needs-runfiles`, to package each artifact together with the
runfiles Bazel generates for it when they deem it necessary.

Support for Java artifacts is also added by packaging each JAR as a
standalone Bazel `_deploy.jar`. Alongside the `_deploy.jar` there is a
modified run-script that accepts a flag called `local_javabin`. Users
who download the artifact set this flag to the path of their own Java
bin so that the run-script uses it. Bazel generates the run-script from
a template, and `build.sh` in the internal part of the builder then
modifies it to add this flag. More information is available in the
README.

With this change, Java targets are automatically converted to their
`_deploy.jar` targets. Three flags are available to users who have Java
targets:

`includes-java`: if true, adds an additional flag to the build command
and a rule for a local Java repository to WORKSPACE. This lets the
run-script use its `--singlejar` capability with the `_deploy.jar`, so
the remote JDK does not need to be included in the runfiles, which
prevents massive bloat when attesting.
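
The WORKSPACE lines that `build.sh` appends when `includes-java` is true can be previewed with a small helper. This is a sketch: the function name `emit_local_jdk_rule` and the `/opt/jdk` path are illustrative assumptions, while the `load` statement, the `local_java_repository` rule, and the `myjdk` name mirror what `build.sh` writes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the WORKSPACE lines that register the local JDK
# under the name "myjdk". $1 is the JDK home (build.sh uses $JAVA_HOME).
emit_local_jdk_rule() {
  local java_home="$1"
  printf 'load("@bazel_tools//tools/jdk:local_java_repository.bzl", "local_java_repository")\n'
  printf 'local_java_repository(\n    name = "myjdk",\n    java_home = "%s",\n)\n' "$java_home"
}

# Example: preview the rule without touching WORKSPACE.
emit_local_jdk_rule "/opt/jdk"
```

The build then selects this JDK with `--java_runtime_version=myjdk`, which is the flag `build.sh` adds to the build command.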

`user-java-distribution` and `user-java-version`: let the user specify
the exact Java distribution and version to build with.

When users run the run-script, they pass an additional flag,
`local_javabin`, set to the path of their local Java bin, which the
run-script uses to run the `_deploy.jar`.

A combination of `bazel query` and `bazel cquery` was used to resolve
edge cases with the implementation.
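
One of those edge cases is deciding which label to build for a Java target. A minimal sketch of that decision follows; the helper name `to_build_label` is hypothetical, and it takes the output of the `kind(java_binary, ...)` query as an argument so that the logic can be read without a Bazel workspace.

```shell
#!/usr/bin/env bash
# Hypothetical helper. $1 is a target label; $2 is the output of
#   bazel query "kind(java_binary, $1)"
# which is non-empty only when the target is a java_binary.
to_build_label() {
  local target="$1"
  local kind_output="$2"
  if [[ -n "$kind_output" ]]; then
    # Java target: build the standalone deploy JAR instead.
    printf '%s_deploy.jar\n' "$target"
  else
    # Any other target (including one already ending in _deploy.jar).
    printf '%s\n' "$target"
  fi
}

# Usage inside a Bazel workspace (not executed here):
#   kind_output=$(bazel query "kind(java_binary, $target)" 2>/dev/null)
#   bazel build "$(to_build_label "$target" "$kind_output")"
```

`bazel cquery` is then used on the resulting label to find the paths of the files the build produced.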

---------

Signed-off-by: Noah Elzner <[email protected]>
Signed-off-by: Noah Elzner <[email protected]>
Signed-off-by: Ian Lewis <[email protected]>
Signed-off-by: laurentsimon <[email protected]>
Signed-off-by: laurentsimon <[email protected]>
Co-authored-by: Ian Lewis <[email protected]>
Co-authored-by: laurentsimon <[email protected]>
3 people authored Jul 13, 2023
1 parent 205e9bc commit ba8a119
Showing 4 changed files with 260 additions and 20 deletions.
31 changes: 31 additions & 0 deletions .github/workflows/builder_bazel_slsa3.yml
@@ -35,6 +35,37 @@ on:
required: false
type: string
default: ""
needs-runfiles:
description: >
A boolean input that if true will package the artifact's runfiles along with the artifact.
If this flag is not set, then the artifact itself will be uploaded without any corresponding runfiles
that Bazel generates for it during the build process.
required: false
type: boolean
default: false
includes-java:
description: >
A boolean input that if true will add the build flag "--java_runtime_version=myjdk".
If this flag is not set, then the build process will not be able to generate a JAR that can run
standalone through adding the flag --singlejar on the run script for the JAR that is also generated.
required: false
type: boolean
default: false
user-java-distribution:
description: The Java distribution to setup to build artifacts with.
default: "oracle"
type: string
required: false
user-java-version:
description: >
The Java version to setup to build artifacts with.
Supports major versions 8, 11, 16, 17, and other versions also.
default: "17"
type: string
required: false

outputs:
provenance-download-name:
66 changes: 64 additions & 2 deletions internal/builders/bazel/README.md
@@ -18,6 +18,8 @@ workflow the "Bazel builder" from now on.
- [Development status](#development-status)
- [Generating Provenance](#generating-provenance)
- [Getting Started](#getting-started)
- [Runfile Support](#runfile-support)
- [Java Artifact Support (and Caveats)](#java-artifact-support-and-caveats)
- [Referencing the Bazel builder](#referencing-the-bazel-builder)
- [Private Repositories](#private-repositories)
- [Supported Triggers](#supported-triggers)
@@ -93,12 +95,72 @@ jobs:
The `targets` are a set of space separated build targets to be built. Each target must include the `//` workspace root identifier and package target identifier (`:your_target`). Because of this each target should be of the form `//path/from/root/to/target:your_target`.

Targets can also be referred to with general glob patterns such as `//src/...` or `//src/internal:all`. Note however, that support for artifacts that
require runfiles is still currently in development and not available at this time. Progress for runfile support is currently being tracked [here](https:/slsa-framework/slsa-github-generator/issues/2332).
Targets can also be referred to with general glob patterns such as `//src/...` or `//src/internal:all`. Generic glob patterns that have an intersection are allowed as well.

Once the targets are built, the Bazel builder creates a folder for the artifacts
and another for the provenance attestations which are uploaded as artifacts to the workflow run.

### Runfile Support

If the artifact(s) built need their generated runfiles to function properly, the runfiles can be packaged with the artifact in the attestation. In the following reusable workflow call, the flag `needs-runfiles` is set to `true`
in order to package the artifacts with their runfiles.

```yaml
jobs:
build:
permissions:
id-token: write # For signing
contents: read # For repo checkout.
actions: read # For getting workflow run info.
if: startsWith(github.ref, 'refs/tags/')
uses: slsa-framework/slsa-github-generator/.github/workflows/[email protected]
with:
targets: "//src:fib //src:hello"
flags: "--strip=always"
needs-runfiles: true
```

With `needs-runfiles` set to true, the artifact folder that gets uploaded to GitHub contains a folder for each artifact, which holds the artifact and the folder of its runfiles.
Each target specified in the workflow call is packaged with its respective runfiles.

### Java Artifact Support (and Caveats)

If the targets being built include Java targets, then the flag `includes-java` must be set to true. Additionally, if a specific distribution and version of Java is needed,
that can be designated through the `user-java-distribution` and `user-java-version` flags. Note that the default Java distribution is Oracle and the default Java version is 17.
For more info on configuring the Java distribution and version go [here](https:/actions/setup-java). This flag usage can be seen in the following reusable workflow call:

```yaml
jobs:
build:
permissions:
id-token: write # For signing
contents: read # For repo checkout.
actions: read # For getting workflow run info.
if: startsWith(github.ref, 'refs/tags/')
uses: slsa-framework/slsa-github-generator/.github/workflows/[email protected]
with:
targets: "//src:fib //src:hello"
flags: "--strip=always"
includes-java: true
user-java-distribution: "oracle"
user-java-version: "17"
```

Each Java target will be output in its own directory inside the artifact folder that gets uploaded. Inside each respective artifact directory will be a JAR that can be run on its own using the run-script that is
packaged with it. For instance, if there is a Java target named Main, it would be uploaded as its own directory with a tree like the following:

├── Main <br />
│   ├── Main # This is the run-script <br />
│   └── Main_deploy.jar <br />

Each Java target, whether or not it is specified in the targets input as a `_deploy.jar`, will be built as a [_deploy.jar](https://bazel.build/reference/be/java), which contains all classes found by the classloader and native libraries for dependencies.
Since the artifact is built on a GitHub Runner, the run-script has the VM's Java bin path hardcoded in. However, the run-script has been modified to include an additional flag, `--local_javabin`, to change the Java bin path to the user's. To run the JAR using
the run-script, the `--singlejar` flag must be specified to signal to the run-script that the JAR is a `_deploy.jar`. Additionally, `--local_javabin` must be set to the path of the user's Java bin to run it. Therefore, running the JAR would look like the following:

`./Main --singlejar --local_javabin="path/to/user/bin/java"`

Note that Java targets do not need the `needs-runfiles` flag to be true in order to create the `_deploy.jar` and its run-script.

### Referencing the Bazel builder

At present, the builder **MUST** be referenced by a tag of the form `@vX.Y.Z`,
9 changes: 9 additions & 0 deletions internal/builders/bazel/action.yml
@@ -51,10 +51,19 @@ runs:
id: bazelisk
uses: bazelbuild/setup-bazelisk@95c9bf48d0c570bb3e28e57108f3450cd67c1a44 # v2.0.0

- name: Setup Java
id: java
uses: actions/setup-java@5ffc13f4174014e2d4d4572b3d74c3fa61aeb2c2 # v3.11.0
with:
distribution: "${{ fromJson(inputs.slsa-workflow-inputs).user-java-distribution }}"
java-version: "${{ fromJson(inputs.slsa-workflow-inputs).user-java-version }}"

- id: build
env:
TARGETS: ${{ fromJson(inputs.slsa-workflow-inputs).targets }}
FLAGS: ${{ fromJson(inputs.slsa-workflow-inputs).flags }}
NEEDS_RUNFILES: ${{ fromJson(inputs.slsa-workflow-inputs).needs-runfiles }}
INCLUDES_JAVA: ${{ fromJson(inputs.slsa-workflow-inputs).includes-java }}
shell: bash
run: ./../__TOOL_ACTION_DIR__/build.sh

174 changes: 156 additions & 18 deletions internal/builders/bazel/build.sh
@@ -16,33 +16,171 @@

set -euo pipefail

# TODO(Issue #2331): switch copy to binaries to a temp dir
mkdir binaries

# Transfer flags and targets to their respective arrays
IFS=' ' read -r -a build_flags <<< "${FLAGS}"
IFS=' ' read -r -a build_targets <<< "${TARGETS}"

# Build with respect to entire arrays of flags and targets
bazel build "${build_flags[@]}" "${build_targets[@]}"
# If the targets include Java targets, add the Java build flag
# and add the GitHub Runner Java rule to WORKSPACE
if [[ "${INCLUDES_JAVA}" == "true" ]]
then
build_flags+=("--java_runtime_version=myjdk")

java_rule="local_java_repository(
name = \"myjdk\",
java_home = \"$JAVA_HOME\",
)"

# Echo the configuration for the local Github Runner java into the root WORKSPACE file.
echo "load(\"@bazel_tools//tools/jdk:local_java_repository.bzl\", \"local_java_repository\")" >> ./WORKSPACE
echo "$java_rule" >> ./WORKSPACE
fi

# Use an associative array as a set to efficiently avoid copying the same target twice.
# Useful for generic glob patterns that overlap.
declare -A targets_set

# Use an associative array as a set to efficiently avoid copying the same target twice
declare -A files_set
################################################
# #
# Build Target Set #
# #
################################################

# Using target string, copy artifact(s) to binaries dir
for curr_target in "${build_targets[@]}"; do
# Get file(s) generated from build with respect to the target
bazel_generated=$(bazel cquery --output=starlark --starlark:expr="'\n'.join([f.path for f in target.files.to_list()])" "$curr_target" 2>/dev/null)
for input in "${build_targets[@]}"; do

# Uses a Starlark expression to pass a newline-separated list of file(s) into the set of files
while read -r file; do
# Key value is target path, value we do not care about and is set to constant "1"
files_set["${file}"]="1"
done <<< "$bazel_generated"
# Using bazel query extracts all targets from a glob pattern.
# Thus we can change Java targets to their _deploy.jar target.
for target in $(bazel query "$input"); do

# Check whether the target is a Java target. If it is, the query outputs the target label; otherwise the output is empty.
# Note: targets that already have the _deploy.jar suffix will have no output from the query
output=$(bazel query "kind(java_binary, $target)" 2>/dev/null)

# If there is a Java target without _deploy.jar suffix, add suffix, build and add to target set.
if [[ -n "$output" ]]
then
bazel build "${build_flags[@]}" "${target}_deploy.jar"
targets_set["${target}_deploy.jar"]="1"
else
# Build target regularly.
bazel build "${build_flags[@]}" "$target"
targets_set["$target"]="1" # A set of unique targets
fi
done
done

# Copy set of unique targets to binaries. Without !, it would give values not keys
# TODO(Issue #2331): switch copy to binaries to a temp dir
for file in "${!files_set[@]}"; do
# Remove the symbolic link and copy
cp -L "$file" ./binaries
################################################
# #
# Copy Needed Artifacts To Binaries Dir #
# #
################################################

for curr_target in "${!targets_set[@]}"; do

# Removes everything up to and including the first colon
# "//src/internal:fib" --> "fib"
binary_name=${curr_target#*:}

################################################
# #
# Logic for Java Targets #
# #
################################################

# If the target name includes _deploy.jar it is a Java target.
if [[ "$binary_name" == *"_deploy.jar"* ]]
then
# Uses _deploy.jar as a field separator and grabs the field before it.
run_script_name=$(echo "$binary_name" | awk -F'_deploy.jar' '{print $1}')

# Create dir for artifact and its runfiles
mkdir "./binaries/$run_script_name"

# Get the absolute path to output of Java JAR artifact.
bazel_generated=$(bazel cquery --output=starlark --starlark:expr="'\n'.join([f.path for f in target.files.to_list()])" "$curr_target" 2>/dev/null)

# Copy JAR to artifact-specific dir in ./binaries and remove symbolic links.
file="$bazel_generated"
cp -Lr "$file" "./binaries/$run_script_name"

# Get the path to the run-script associated with the _deploy.jar target.
# The run-script is modified below so that the user can pass in the path to their local JAVABIN.
# A local JAVABIN path is needed, or else the run-script will not work, as it points to the GitHub Runner JAVABIN
run_script_path=$(echo "$file" | awk -F'_deploy.jar' '{print $1}')

# Add an additional flag to the run-script for the Java target which sets the Java bin
# to the user's input. This allows users who download the binaries from the GitHub workflow
# to run the run-script themselves, which would otherwise not be possible because the Java bin
# path is hardcoded to that of the GitHub Runner VM.
awk -v n=66 -v s=' --local_javabin=*) USER_JAVA_BIN=( "${1#--local_javabin=}" ) ;;' 'NR == n {print s} {print}' "$run_script_path" > temp_file && mv -f temp_file "$run_script_path"

# Updates the Java bin in the run-script after the flags get processed
awk -v n=127 -v s='' 'NR == n {print s} {print}' "$run_script_path" > temp_file && mv -f temp_file "$run_script_path"
awk -v n=128 -v s='if [[ -n $USER_JAVA_BIN ]]; then JAVABIN=$USER_JAVA_BIN; fi' 'NR == n {print s} {print}' "$run_script_path" > temp_file && mv -f temp_file "$run_script_path"

cp -L "$run_script_path" "./binaries/$run_script_name"

################################################
# #
# Logic for Non-Java Targets #
# #
################################################

else

################################################
# #
# Logic for Runfile-Needing Targets #
# #
################################################

if [[ "${NEEDS_RUNFILES}" == "true" ]]
then
# Get file(s) generated from build with respect to the target
bazel_generated=$(bazel cquery --output=starlark --starlark:expr="'\n'.join([f.path for f in target.files.to_list()])" "$curr_target" 2>/dev/null)

# Create dir for artifact and its runfiles
mkdir "./binaries/$binary_name"

# Read the newline-separated list of file path(s) produced by the Starlark expression
while read -r path_to_artifact; do

# Copy generated artifact from absolute path from bazel cquery
cp -L "$path_to_artifact" "./binaries/$binary_name"

# if runfiles dir exists, copy runfiles into artifact's dir
if [[ -d "${path_to_artifact}.runfiles" ]]
then
path_to_target_runfiles="${path_to_artifact}.runfiles"
cp -Lr "$path_to_target_runfiles" "./binaries/$binary_name"
cd "./binaries/$binary_name/$binary_name.runfiles/"

# Unneeded and can contain unwanted symbolic links
rm -rf _main/external
rm -rf MANIFEST
rm -rf _repo_mapping

# Go back to the old dir
cd -
fi
done <<< "$bazel_generated"

################################################
# #
# Logic for NON-Runfile-Needing Targets #
# #
################################################
else
# Get file(s) generated from build with respect to the target
bazel_generated=$(bazel cquery --output=starlark --starlark:expr="'\n'.join([f.path for f in target.files.to_list()])" "$curr_target" 2>/dev/null)

# Read the newline-separated list of file path(s) produced by the Starlark expression
while read -r file; do
cp -L "$file" ./binaries
done <<< "$bazel_generated"
fi
fi
done
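
The way `build.sh` derives a per-artifact directory name from a target label can be sketched on its own. This is an illustrative sketch, not the script's code: the function name `artifact_dir_name` is hypothetical, and it uses bash suffix removal (`%`) where `build.sh` uses `awk` with `_deploy.jar` as the field separator.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps a Bazel label to the directory name used under ./binaries.
artifact_dir_name() {
  local label="$1"
  # Remove everything up to and including the first colon:
  # "//src/internal:fib" becomes "fib".
  local name="${label#*:}"
  # For Java targets, drop the _deploy.jar suffix so the directory is
  # named after the run-script: "Main_deploy.jar" becomes "Main".
  name="${name%_deploy.jar}"
  printf '%s\n' "$name"
}
```

For example, `artifact_dir_name "//src:Main_deploy.jar"` yields the name of the directory that holds both the run-script and the JAR for a `Main` target.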
