[release/6.0] CompilerServer: fatal error - server reports different hash version than build task on linux-musl when using mono-backed runtime #76543

Closed
ayakael opened this issue Oct 3, 2022 · 8 comments

Comments

@ayakael
Contributor

ayakael commented Oct 3, 2022

Description

Shared compilation with Roslyn fails on the Mono-backed runtime on linux-musl platforms. This only happens on .NET 6, which suggests that a fix simply needs backporting.

Reproduction Steps

Within Alpine Linux linux-musl-x64 environment:

./bootstrap/dotnet build Src/Newtonsoft.Json/Newtonsoft.Json.csproj /v:diag
  • let it rip

Note that the bootstrap tarball was built using this aports script and #68424, which allows building with /p:PrimaryRuntimeFlavor=Mono.

Expected behavior

Shared compilation should pass without issue

Actual behavior

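Death: the build aborts with the CompilerServer fatal error quoted in the title (the server reports a different hash version than the build task).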

Regression?

Not a regression, per se. .NET 7 with Mono has issues on linux-musl, but this is not one of them.

Known Workarounds

Building with /p:UseSharedCompilation=false

Configuration

  • .NET 6.0.109
  • Alpine Linux Edge
  • Confirmed broken on x64 and s390x
  • Reproducible across multiple linux-musl platforms

Other information

No response

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Oct 3, 2022

@uweigand
Contributor

uweigand commented Oct 5, 2022

I've been seeing problems with the Roslyn compile server on .NET 6 because cross-process named mutexes are not implemented in the Mono runtime, leading to various race conditions with the compile server. See this issue: dotnet/roslyn#57002

The PR attached to that issue fixes the problem (by using a different synchronization mechanism), and has been added to Roslyn for .NET 7, but the .NET 6 version still needs this as a separate patch.

Not sure if that is exactly the same problem (the symptoms were a bit different), but might be worthwhile to try. (Note the patch needs to go into the Roslyn used by the host SDK as well as in the Roslyn that is being built via source-build.)

@ayakael
Contributor Author

ayakael commented Oct 5, 2022

> I've been seeing problems with the Roslyn compile server on .NET 6 because cross-process named mutexes are not implemented in the Mono runtime, leading to various race conditions with the compile server. See this issue: dotnet/roslyn#57002
>
> The PR attached to that issue fixes the problem (by using a different synchronization mechanism), and has been added to Roslyn for .NET 7, but the .NET 6 version still needs this as a separate patch.
>
> Not sure if that is exactly the same problem (the symptoms were a bit different), but might be worthwhile to try. (Note the patch needs to go into the Roslyn used by the host SDK as well as in the Roslyn that is being built via source-build.)

The patch is added when crosscompiling using the prebuilt SDK from x64 to s390x. Is this enough, or does the SDK used for crosscompiling need it as well?

@uweigand
Contributor

uweigand commented Oct 5, 2022

> The patch is added when crosscompiling using the prebuilt SDK from x64 to s390x.

Ah, I see. That's probably not it, then.

> Is this enough, or does the SDK used for crosscompiling need it as well?

That's an Intel-hosted SDK? If so, it is presumably using CoreCLR, not Mono, and therefore does not need the patch.

@ayakael
Contributor Author

ayakael commented Oct 5, 2022

> > The patch is added when crosscompiling using the prebuilt SDK from x64 to s390x.
>
> Ah, I see. That's probably not it, then.
>
> > Is this enough, or does the SDK used for crosscompiling need it as well?
>
> That's an Intel-hosted SDK? If so, it is presumably using CoreCLR, not Mono, and therefore does not need the patch.

Right! I wonder whether the installer isn't picking up the built Roslyn packages. Is there a way I can confirm that the built SDK has the fixes?

@ayakael
Contributor Author

ayakael commented Oct 5, 2022

The following is used to build the installer:

_installer() {
	msg "[$(date)] Building installer version $_installerver"
	cd "$builddir"/src/installer

	local args="
		-c Release
		/p:OSName=linux-musl
		/p:HostOSName=linux-musl
		/p:Architecture=$_dotnet_target
		/p:CoreSetupBlobRootUrl=file://$_downloaddir/
		/p:DotnetToolsetBlobRootUrl=file://$_downloaddir/
		/p:GitCommitHash=$(cat ./.git/HEAD)
		/p:GitCommitCount=$(grep GitCommitCount "$builddir"/git-info/installer.props | sed -E 's|</?GitCommitCount>||g' | tr -d ' ')
		/p:PublicBaseURL=file://$_downloaddir/
		/p:UseSharedCompilation=false
		"
	if [ "$_installerver" != "${_installerver##*-}" ]; then local args="$args /p:VersionSuffix=${_installerver##*-}"; fi
	if [ "$_dotnet_target" = "x86" ]; then local args="$args /p:DISABLE_CROSSGEN=True"; fi
	./build.sh $args
	mkdir  -p "$_downloaddir"/installer/$_installerver
	cp artifacts/packages/*/*/dotnet-sdk-$_pkgver_macro*.tar.gz "$_downloaddir"/installer/$_installerver
}

@uweigand
Contributor

uweigand commented Oct 5, 2022

The installer picks up the roslyn DLLs as part of the SDK tarball via the DotnetToolsetBlobRootUrl. That's created by the sdk package build, which in turn picks up the roslyn DLLs from NuGet - which is why you need to override NuGet.config in sdk to refer to the local directory where you've placed the packages generated from the roslyn build.

If any of that goes wrong, the build process can silently miss picking up the modified roslyn DLLs and just fall back to the versions on nuget.org ... To verify that the correct version was picked up, I'd simply compare the DLLs in your roslyn artifacts directory with the ones in the final SDK tarball -- they should be identical.
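
The comparison described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `compare_roslyn_dlls` is hypothetical, both directory arguments are placeholders for wherever the Roslyn build output and the unpacked SDK tarball live, and the `Microsoft.CodeAnalysis*.dll` pattern is an assumed way of selecting the compiler assemblies. It checks every copy of each DLL found under the SDK directory, since the SDK may ship more than one.

```shell
#!/bin/sh
# Hypothetical helper (not part of any .NET tooling): byte-compare freshly
# built Roslyn DLLs against every same-named copy inside an unpacked SDK.
#   $1 = directory holding the DLLs produced by the Roslyn build (assumed)
#   $2 = directory where the final SDK tarball was extracted (assumed)
compare_roslyn_dlls() {
	built_dir=$1
	sdk_dir=$2
	status=0
	for built in "$built_dir"/Microsoft.CodeAnalysis*.dll; do
		# Skip the literal pattern when the glob matched nothing
		[ -e "$built" ] || continue
		name=$(basename "$built")
		# All copies of this DLL shipped anywhere in the SDK tree
		copies=$(find "$sdk_dir" -type f -name "$name")
		# Only those copies whose bytes differ from the built DLL
		differing=$(find "$sdk_dir" -type f -name "$name" ! -exec cmp -s "$built" {} \; -print)
		if [ -z "$copies" ]; then
			echo "MISSING  $name"
			status=1
		elif [ -n "$differing" ]; then
			echo "DIFFERS  $name"
			echo "$differing"
			status=1
		else
			echo "SAME     $name"
		fi
	done
	# Non-zero when any DLL is missing from the SDK or differs
	return $status
}
```

A call might look like `compare_roslyn_dlls ./roslyn-out ./sdk-unpacked`, with both paths substituted for the real locations. Any `DIFFERS` or `MISSING` line would hint that the SDK build fell back to packages from somewhere other than the local Roslyn build.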

@ayakael
Contributor Author

ayakael commented Oct 6, 2022

> The installer picks up the roslyn DLLs as part of the SDK tarball via the DotnetToolsetBlobRootUrl. That's created by the sdk package build, which in turn picks up the roslyn DLLs from NuGet - which is why you need to override NuGet.config in sdk to refer to the local directory where you've placed the packages generated from the roslyn build.
>
> If any of that goes wrong, the build process can silently miss picking up the modified roslyn DLLs and just fall back to the versions on nuget.org ... To verify that the correct version was picked up, I'd simply compare the DLLs in your roslyn artifacts directory with the ones in the final SDK tarball -- they should be identical.

Awesome, this was it! Indeed, the SDK wasn't picking up the Roslyn packages because /p:ArcadeBuildFromSource=true was set. That setting added a nupkg cache source line to NuGet.config above the local one, thus overriding it. Closing.
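
For anyone hitting the same thing, a quick way to see the order in which sources are listed might look like the sketch below. The helper name `list_package_sources` is hypothetical, and the parsing is deliberately naive: it assumes each `<add>` entry sits on one line with `key` before `value`, which matches the usual NuGet.config layout but is not guaranteed.

```shell
#!/bin/sh
# Hypothetical helper: print the <add> entries of the <packageSources>
# section of a NuGet.config in file order, numbered from 1.
#   $1 = path to the NuGet.config to inspect
list_package_sources() {
	config=$1
	# Restrict to the <packageSources> section, keep only <add .../> lines,
	# reduce each to "key -> value", then number the result
	sed -n '/<packageSources>/,/<\/packageSources>/p' "$config" |
		grep '<add ' |
		sed -E 's/.*key="([^"]*)".*value="([^"]*)".*/\1 -> \2/' |
		awk '{ print NR ": " $0 }'
}
```

Running it against the NuGet.config used by the sdk build (for example `list_package_sources NuGet.config`) would show whether a cache source appears ahead of the local package directory.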

@ayakael ayakael closed this as completed Oct 6, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Oct 6, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Nov 5, 2022
3 participants