"llvm-profdata merge" doesn't work in 13.0 #50966
Comments
Can we get a bisect on this?
mentioned in issue #51489
I have the same problem even without running multiple processes. This only happens on my CI server, not locally.
The deadline for requesting fixes for the release has passed. This bug is being removed from the LLVM 13.0.1 release milestone. If you have a fix or think this bug is important enough to block the release, please explain why in a comment and add the bug back to the LLVM 13.0.1 release milestone.
I have identified the core cause of the problem. The malformed data is generated for programs running with shared libraries (both for shared libs linked directly or loaded with I'm not sure whether this was intended, but this scenario worked with clang 12. The workaround (or proper fix?) is to add
I also have the same problem. I run my program multiple times with different arguments, and I have written a hash function to hash the arguments passed to the function. In my case, the generated It gives no error on Windows, but on Linux it gives an error about an out-of-bounds count (I can provide the exact error if you want).
Yes, I have a shared library that is not instrumented. Adding %m doesn't work for me, as I mentioned above.
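For readers unfamiliar with the `%m` specifier mentioned above: in Clang's source-based coverage runtime, `LLVM_PROFILE_FILE` accepts pattern specifiers, where `%p` expands to the process ID and `%m` expands to a signature of the instrumented binary and enables online merging into a small pool of files. The following is a minimal Python sketch of setting such a pattern before launching an instrumented program. The helper name `profile_env` and the file name pattern are hypothetical, and whether `%m` actually avoids the malformed-profile error is disputed in this thread.

```python
import os


def profile_env(pattern="xyz-%m-%p.profraw", base_env=None):
    """Return a copy of the environment with LLVM_PROFILE_FILE set.

    `pattern` is an illustrative file name: %m is the binary-signature
    specifier and %p the process-ID specifier of the profile runtime.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LLVM_PROFILE_FILE"] = pattern
    return env
```

A run could then look like `subprocess.run(["./main"], env=profile_env())`, where `./main` stands in for any binary built with `-fprofile-instr-generate`.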
As far as I can tell, this remains broken in every version since 13 (my own project gets this error on every version from 16.0.5 down, so we're stuck using LLVM 12.0.1 for now).
I did a bisect, the first bad commit is e50a388.
Thanks for the report. Is there a small reproducer that you can provide?
@KevinHake How did you test your bisect? That could be a good reproduction.
I haven't had time to try to create a minimal project that has the same issue. Nor have I made a hello world to see if the most basic executable actually works (though I imagine the automated tests would've caught that long ago if it were broken). We're loading a few other DLLs for graphics that were not compiled with clang; not sure if that's related. The default.profraw output between good and bad versions didn't look obviously wrong (nor was it parseable by eye), but I might try llvm-profdata merge in the debugger to get a better idea of what triggers it to give up.
Is it possible for you to provide the profiles (*.profraw files), so I can take a look?
It's my client's code, so unfortunately I don't have the rights to share it. I'll look into paring it down into something minimal; it probably won't take long.
Here is the small reproducer:
int main() { return 0; }
My environment:
Operating System: Microsoft Windows 10 Enterprise, v10.0.19045 N/A Build 19045
Ninja v1.11.1 (https:/ninja-build/ninja/releases)
@echo off
path C:\Program Files\Ninja;%PATH%
path C:\Program Files\CMake\bin;%PATH%
call "C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Auxiliary\Build\vcvars64.bat"
REM last working commit
set LLVMVER=LLVM-2021-07-23-120b187
REM first bad commit:
REM set LLVMVER=LLVM-2021-07-23-e50a388
echo Configuring LLVM build:
cmake --version
cmake -G Ninja -DLLVM_ENABLE_PROJECTS=clang;lld;libc -DLLVM_ENABLE_RUNTIMES= -DCMAKE_INSTALL_PREFIX=c:\src\llvm\%LLVMVER% -DCMAKE_BUILD_TYPE=Release ../llvm
rem Looks like some kind of concurrency bug makes the pool fail when getting a new piece to compile... this loop worked without giving up on threads
echo Building %LLVMVER%
:build
cmake --build .
if %ERRORLEVEL% neq 0 (
echo Build failed! Let's try again...
goto build
)
echo Build finished successfully!
echo Installing...
cmake --install .
endlocal
pause
It took ~1hr to build each version (my install dir c:\src\llvm\ has several LLVM versions, including the last good and first bad commits).
I build the simple reproducer this way:
%LLVM_BIN_DIR%\clang-cl.exe -fprofile-instr-generate -fcoverage-mapping /c /Fomain.obj main.c
%LLVM_BIN_DIR%\lld-link.exe main.obj , where I then reproduce the issue by running main.exe, which produces
In the RFC for e50a388 (https://lists.llvm.org/pipermail/llvm-dev/2021-June/151154.html), it sounds like the new build ID was targeting ELF binaries. Given that it seems to break coverage for even the simplest PE/COFF binary on Windows, it's worth checking whether the same occurs for Mach-O on a Mac (I don't have a Mac). For that matter, I haven't tested on Linux with an ELF file either. If I'm not the only one that repros this with
Extended Description
Doing something along the lines of
export LLVM_PROFILE_FILE=xyz-%p.profile.d
export CFLAGS="-O2 -fprofile-instr-generate"
export CXXFLAGS="-O2 -fprofile-instr-generate"
./configure
make
Run the generated binaries in some expected ways, e.g. "make check"
llvm-profdata merge --output=xyz.profile xyz-*.profile.d
consistently (obviously with PIDs varying) results in
warning: xyz-670250.profile.d: malformed instrumentation profile data
warning: xyz-670257.profile.d: malformed instrumentation profile data
error: no profile can be merged
The *.profile.d files look OK at first glance, and "file" recognizes them as "LLVM raw profile data, version 7".
It looks like only "llvm-profdata merge" is broken; using -fprofile-instr-use= seems to be OK.
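The check that "file" performs can be reproduced by reading the first 16 bytes of a raw profile, which helps when comparing files written by a good and a bad toolchain. Below is a minimal Python sketch. The magic values follow the definition in LLVM's InstrProfData.inc (the bytes 0xff 'l' 'p' 'r' 'o' 'f' 'r' 0x81 packed into a 64-bit integer, with 'R' instead of the second 'r' for 32-bit targets). Treating only the low 32 bits of the second field as the version number is an assumption made here to skip the variant flag bits, and the function name `read_raw_header` is hypothetical.

```python
import struct

# Raw profile magic numbers, built from their byte spelling.
RAW_MAGIC_64 = int.from_bytes(b"\xfflprofr\x81", "big")
RAW_MAGIC_32 = int.from_bytes(b"\xfflprofR\x81", "big")


def read_raw_header(data):
    """Decode magic and version from the first 16 bytes of a raw profile."""
    if len(data) < 16:
        raise ValueError("too short to hold a raw profile header")
    # The file is written in the byte order of the profiled machine,
    # so try little-endian first and then big-endian.
    for prefix, name in (("<", "little"), (">", "big")):
        magic, version = struct.unpack(prefix + "QQ", data[:16])
        if magic in (RAW_MAGIC_64, RAW_MAGIC_32):
            return {
                "bits": 64 if magic == RAW_MAGIC_64 else 32,
                "byte_order": name,
                # Assumption: the upper bits carry variant flags.
                "version": version & 0xFFFFFFFF,
            }
    raise ValueError("not an LLVM raw profile")
```

For example, `read_raw_header(open("xyz-670250.profile.d", "rb").read(16))` would report the version that "file" prints. A valid header does not imply that the rest of the file is well formed, which matches the behavior described in this report.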
This is a regression from 12.x.
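To narrow down which raw profiles are rejected, each file can be merged on its own instead of all at once. The following Python sketch does that with the same `llvm-profdata merge --output=...` invocation used above. The helper names `merge_command` and `find_rejected` are hypothetical, and the sketch assumes that `llvm-profdata` exits with a non-zero status when its only input is malformed, as the "no profile can be merged" error suggests.

```python
import os
import subprocess
import tempfile


def merge_command(inputs, output, tool="llvm-profdata"):
    """Build the argument list for an llvm-profdata merge run."""
    return [tool, "merge", f"--output={output}", *inputs]


def find_rejected(profiles, tool="llvm-profdata", run=subprocess.run):
    """Merge each profile individually and collect the ones that fail.

    Returns a list of (path, stderr text) pairs. `run` is injectable
    so the logic can be exercised without the real tool installed.
    """
    rejected = []
    with tempfile.TemporaryDirectory() as scratch:
        out = os.path.join(scratch, "single.profdata")
        for path in profiles:
            result = run(merge_command([path], out, tool),
                         capture_output=True, text=True)
            if result.returncode != 0:
                rejected.append((path, result.stderr.strip()))
    return rejected
```

Calling `find_rejected(glob.glob("xyz-*.profile.d"))` would list the files that fail to merge together with the diagnostic text, which is easier to attach to a report than the combined output.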