System.Numerics.Vectors.Tests: Assertion failed 'intrinsicId == NI_Vector128_GetElement' #64918

Closed
sbomer opened this issue Feb 7, 2022 · 15 comments · Fixed by #64960

@sbomer
Member

sbomer commented Feb 7, 2022

The System.Numerics.Vectors.Tests are failing in our rolling build, on:

  • net7.0-Linux-Release-x64-CoreCLR_checked-Ubuntu.1804.Amd64.Open
  • net7.0-Linux-Release-x64-CoreCLR_checked-(Alpine.314.Amd64.Open)Ubuntu.1804.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:alpine-3.14-helix-amd64-20210910135833-1848e19
  • net7.0-windows-Release-x86-CoreCLR_checked-Windows.10.Amd64.Open
  • net7.0-windows-Release-x64-CoreCLR_checked-Windows.10.Amd64.Open

build, log

Starting:    System.Numerics.Vectors.Tests (parallel test collections = on, max threads = 2)

Assert failure(PID 9962 [0x000026ea], Thread: 9979 [0x26fb]): Assertion failed 'intrinsicId == NI_Vector128_GetElement' in '<>c__DisplayClass159_0`1[UInt64][System.UInt64]:<TestIndexerOutOfRange>b__0():this' during 'Generate code' (IL size 18)

    File: /__w/1/s/src/coreclr/jit/hwintrinsiccodegenxarch.cpp Line: 1285
    Image: /datadisks/disk1/work/AF440984/p/dotnet

It appears the failure was introduced within this range of commits: dbd4cbb...d936a66
(see the builds before and after). @AndyAyersMS would you be able to take a look?

Runfo Tracking Issue: system.numerics.vectors.tests work item

| Build | Definition | Kind | Run Name | Console | Core Dump | Test Results | Run Client |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1599103 | runtime | Rolling | net7.0-Linux-Release-x64-CoreCLR_checked-Ubuntu.1804.Amd64.Open | console.log | core dump | | runclient.py |
| 1599103 | runtime | Rolling | net7.0-Linux-Release-x64-CoreCLR_checked-(Alpine.314.Amd64.Open)Ubuntu.1804.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:alpine-3.14-helix-amd64-20210910135833-1848e19 | console.log | core dump | | runclient.py |
| 1599103 | runtime | Rolling | net7.0-windows-Release-x86-CoreCLR_checked-Windows.10.Amd64.Open | console.log | core dump | | runclient.py |
| 1599103 | runtime | Rolling | net7.0-windows-Release-x64-CoreCLR_checked-Windows.10.Amd64.Open | console.log | core dump | | runclient.py |
| 1598707 | runtime | PR 64861 | net7.0-windows-Release-x86-CoreCLR_release-Windows.10.Amd64.Server19H1.ES.Open | console.log | core dump | | runclient.py |
| 1598707 | runtime | PR 64861 | net7.0-windows-Release-x86-CoreCLR_release-Windows.7.Amd64.Open | console.log | core dump | | runclient.py |
| 1598566 | runtime | Rolling | net7.0-Linux-Release-x64-CoreCLR_checked-(Alpine.314.Amd64.Open)Ubuntu.1804.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:alpine-3.14-helix-amd64-20210910135833-1848e19 | console.log | core dump | | runclient.py |
| 1598566 | runtime | Rolling | net7.0-windows-Release-x86-CoreCLR_checked-Windows.10.Amd64.Open | console.log | core dump | | runclient.py |
| 1598566 | runtime | Rolling | net7.0-Linux-Release-x64-CoreCLR_checked-Ubuntu.1804.Amd64.Open | console.log | core dump | | runclient.py |
| 1598566 | runtime | Rolling | net7.0-Linux-Release-x64-CoreCLR_checked-Ubuntu.1804.Amd64.Open | console.log | core dump | | runclient.py |
| 1598566 | runtime | Rolling | net7.0-windows-Release-x64-CoreCLR_checked-Windows.10.Amd64.Open | console.log | core dump | | runclient.py |
| 1597732 | runtime | Rolling | net7.0-Linux-Release-x64-CoreCLR_checked-Ubuntu.1804.Amd64.Open | console.log | core dump | | runclient.py |
| 1597732 | runtime | Rolling | net7.0-Linux-Release-x64-CoreCLR_checked-(Alpine.314.Amd64.Open)Ubuntu.1804.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:alpine-3.14-helix-amd64-20210910135833-1848e19 | console.log | core dump | | runclient.py |
| 1597732 | runtime | Rolling | net7.0-windows-Release-x64-CoreCLR_checked-Windows.10.Amd64.Open | console.log | core dump | | runclient.py |
| 1597732 | runtime | Rolling | net7.0-windows-Release-x86-CoreCLR_checked-Windows.10.Amd64.Open | console.log | core dump | | runclient.py |
| 1596919 | runtime | PR 64851 | net7.0-OSX-Debug-x64-Mono_release-OSX.1200.Amd64.Open | console.log | | | |
| 1596919 | runtime | PR 64851 | net7.0-OSX-Debug-x64-CoreCLR_checked-OSX.1200.Amd64.Open | console.log | | | |
| 1596919 | runtime | PR 64851 | net7.0-OSX-Debug-x64-CoreCLR_release-OSX.1200.Amd64.Open | console.log | | | |
| 1596889 | runtime | PR 63958 | net7.0-OSX-Debug-x64-Mono_release-OSX.1200.Amd64.Open | console.log | | | |
| 1596889 | runtime | PR 63958 | net7.0-OSX-Debug-x64-CoreCLR_checked-OSX.1200.Amd64.Open | console.log | | | |
| 1596889 | runtime | PR 63958 | net7.0-OSX-Debug-x64-CoreCLR_release-OSX.1200.Amd64.Open | console.log | | | |
| 1596826 | runtime | PR 64567 | net7.0-OSX-Debug-x64-Mono_release-OSX.1200.Amd64.Open | console.log | | | |
| 1596826 | runtime | PR 64567 | net7.0-OSX-Debug-x64-CoreCLR_release-OSX.1200.Amd64.Open | console.log | | | |
| 1596826 | runtime | PR 64567 | net7.0-OSX-Debug-x64-CoreCLR_checked-OSX.1200.Amd64.Open | console.log | | | |
| 1596805 | runtime | PR 64330 | net7.0-OSX-Debug-x64-CoreCLR_checked-OSX.1200.Amd64.Open | console.log | | | |
| 1596805 | runtime | PR 64330 | net7.0-OSX-Debug-x64-CoreCLR_release-OSX.1200.Amd64.Open | console.log | | | |
| 1596805 | runtime | PR 64330 | net7.0-OSX-Debug-x64-Mono_release-OSX.1200.Amd64.Open | console.log | | | |
| 1596759 | runtime | PR 64806 | net7.0-OSX-Debug-x64-Mono_release-OSX.1200.Amd64.Open | console.log | | | |
| 1596759 | runtime | PR 64806 | net7.0-OSX-Debug-x64-CoreCLR_release-OSX.1200.Amd64.Open | console.log | | | |
| 1596759 | runtime | PR 64806 | net7.0-OSX-Debug-x64-CoreCLR_checked-OSX.1200.Amd64.Open | console.log | | | |
| 1596708 | runtime | PR 64748 | net7.0-OSX-Debug-x64-CoreCLR_checked-OSX.1200.Amd64.Open | console.log | | | |

Build Result Summary

| Day Hit Count | Week Hit Count | Month Hit Count |
| --- | --- | --- |
| 4 | 10 | 10 |
@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Feb 7, 2022
@ghost

ghost commented Feb 7, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.


@AndyAyersMS
Member

> @AndyAyersMS would you be able to take a look?

Yes.

@AndyAyersMS AndyAyersMS self-assigned this Feb 7, 2022
@AndyAyersMS AndyAyersMS removed the untriaged New issue has not been triaged by the area owner label Feb 7, 2022
@AndyAyersMS
Member

cc @tannergooding who may be more familiar with that assert (likely introduced in a2b7648)

@tannergooding
Member

Has this been failing sporadically and is only recently being surfaced?

a2b7648 is nearly a year old at this point.

@AndyAyersMS
Member

Would guess it is from the changes I made in #64843 either having a bug or exposing one...

@tannergooding
Member

The assert here is interesting.

We are presumably getting a Vector256.GetElement with a constant op2.

LowerHWIntrinsicGetElement should have broken this apart into either GetLower().GetElement(cns) or ExtractVector128().GetElement(cns - (Count / 2))

Perhaps this is something to do with the specialization handling when the vector is already in memory and forward substitution allowing a new pattern?
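
As a rough illustration of the split described above, the following C++ sketch models reading one element of a 256-bit vector through its two 128-bit halves. It is not JIT code: `Vec128`, `Vec256`, `getLower`, `getUpper`, and `getElement256` are hypothetical stand-ins, and the element type is fixed to 64-bit unsigned integers (so `Count` is 4).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy stand-ins for 128-bit and 256-bit vectors of ulong; not JIT types.
using Vec128 = std::array<std::uint64_t, 2>;
using Vec256 = std::array<std::uint64_t, 4>;

// Models GetLower(): the low 128 bits of the 256-bit vector.
Vec128 getLower(const Vec256& v) { return Vec128{v[0], v[1]}; }

// Models ExtractVector128() of the upper half: the high 128 bits.
Vec128 getUpper(const Vec256& v) { return Vec128{v[2], v[3]}; }

// Reads element `index` using only 128-bit element reads, following the
// GetLower().GetElement(cns) / ExtractVector128().GetElement(cns - Count / 2)
// split for an in-range constant index.
std::uint64_t getElement256(const Vec256& v, std::size_t index) {
    constexpr std::size_t count = 4; // elements in the 256-bit vector
    assert(index < count);           // out-of-range indices are not modeled here
    if (index < count / 2) {
        return getLower(v)[index];
    }
    return getUpper(v)[index - count / 2];
}
```

After this split, every element read targets a 128-bit vector, which is the precondition the failing assert (`intrinsicId == NI_Vector128_GetElement`) checks for.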

@AndyAyersMS
Member

Thanks. Keep me posted if you look any deeper. I probably won't be able to get to this until later today sometime....

@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone Feb 8, 2022
@AndyAyersMS
Member

Ok, taking a look now.

@AndyAyersMS
Member

AndyAyersMS commented Feb 8, 2022

@AndyAyersMS
Member

> LowerHWIntrinsicGetElement should have broken this apart

op1 is containable memory, so we defer handling to codegen, per LowerHWIntrinsicGetElement

N010 ( 14, 16) [000008] ---XG-------              *  HWINTRINSIC long   ulong GetElement
N004 (  6,  5) [000001] ---XG--N----              +--*  IND       simd32
N003 (  4,  3) [000013] -c----------              |  \--*  LEA(b+8)  byref
N001 (  3,  2) [000000] ------------              |     \--*  LCL_VAR   ref    V00 this
N008 (  1,  1) [000003] ------------              \--*  CNS_INT   int    4 vector element count

@AndyAyersMS
Member

Later in codegen we have op1 in a register, so evidently it wasn't contained after all. Wonder if this is because we're not doing the safety check in LowerHWIntrinsicGetElement -- we're not actually containing there, just thinking that we will?

N018 ( 14, 16) [000008] ---XG-------              *  HWINTRINSIC long   ulong GetElement REG rax
N008 (  6,  5) [000001] D--XG--N----              +--*  IND       simd32 REG mm0
N006 (  4,  3) [000013] ------------              |  \--*  LEA(b+8)  byref  REG rax
N004 (  3,  2) [000000] ------------              |     \--*  LCL_VAR   ref    V00 this          rax REG rax
N016 (  1,  1) [000003] -c----------              \--*  CNS_INT   int    0 vector element count REG NA
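
The mismatch suspected above can be sketched with a toy model. This is only an illustration under stated assumptions: `Operand`, `Plan`, and all four functions are hypothetical and do not correspond to real JIT types or signatures. The only name taken from the thread is the idea of a safety check (`IsSafeToContainMem`), modeled here as a plain boolean.

```cpp
// Hypothetical toy model of an operand as seen by lowering; not a JIT type.
struct Operand {
    bool isMemoryOp;    // operand is a memory load (for example, an IND)
    bool safeToContain; // stands in for the result of a safety check
};

enum class Plan { DeferToCodegen, SplitInLowering };

// Decision without the safety check: assumes any memory operand will be contained.
Plan planWithoutSafetyCheck(const Operand& op1) {
    return op1.isMemoryOp ? Plan::DeferToCodegen : Plan::SplitInLowering;
}

// Decision with the safety check: defers only when containment is also safe.
Plan planWithSafetyCheck(const Operand& op1) {
    return (op1.isMemoryOp && op1.safeToContain) ? Plan::DeferToCodegen
                                                 : Plan::SplitInLowering;
}

// Whether the operand actually ends up contained later in this model.
bool endsUpContained(const Operand& op1) {
    return op1.isMemoryOp && op1.safeToContain;
}

// Codegen's expectation in this model: a deferred node has a contained operand.
bool codegenInvariantHolds(Plan plan, const Operand& op1) {
    return plan != Plan::DeferToCodegen || endsUpContained(op1);
}
```

In this model, a memory operand that is not safe to contain breaks the invariant only under `planWithoutSafetyCheck`, which mirrors the dump above where op1 ends up in a register (`REG mm0`) even though lowering expected it to be contained.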

@AndyAyersMS
Member

> Wonder if this is because we're not doing the safety check in LowerHWIntrinsicGetElement.

Yep. PR up shortly.

AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this issue Feb 8, 2022
Missing call to IsSafeToContainMem was causing us to mistakenly think
an operand was going to be contained when it wasn't.

Fixes dotnet#64918
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Feb 8, 2022
AndyAyersMS added a commit that referenced this issue Feb 8, 2022
Missing call to IsSafeToContainMem was causing us to mistakenly think
an operand was going to be contained when it wasn't.

Fixes #64918
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 8, 2022
@elinor-fung
Member

@AndyAyersMS do you know why PR didn't catch this - is it because we run libraries tests in Debug for PR and libraries tests in Release for rolling? Is there some test hole that needs filling here?

@AndyAyersMS
Member

No, I'm not sure how it slipped through.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 12, 2022