AVX2: no such instruction #1

Closed
txje opened this issue May 22, 2015 · 3 comments

Comments

@txje

txje commented May 22, 2015

Hey Jeff,
This looks like a phenomenal piece of work and I'd love to try it out. I am running into problems during the build. For reference, I'm trying to run this on a RHEL cluster with potentially heterogeneous architectures. I get the following error:

make
...
CCLD libparasail_sse41_table.la
CC src/libparasail_avx2_memory_la-memory_avx2.lo
/tmp/cc4hi1vp.s: Assembler messages:
/tmp/cc4hi1vp.s:46: Error: no such instruction: `vmovdqa %ymm0,(%rdi,%rax)'
/tmp/cc4hi1vp.s:56: Error: no such instruction: `vzeroupper'

I have little to no experience with vector programming, so I'm afraid I don't know where to start troubleshooting. Any ideas you might have will be helpful.

Jeremy

@jeffdaily
Owner

I'm looking at this right now.

I was trying to be flexible with the compilation and let some of the advanced compilers go ahead and generate code for AVX2 even if the build system did not support running such code. I was trying to consider an RPM maintainer wanting to produce a single RPM that supported all architectures.

If you could send your config.log to my [email protected] email (I don't think GitHub issues allow attachments?), I can see how the configure tests performed for the AVX2 portion. It is interesting that configure determined your compiler supported AVX2. I probably did not test thoroughly enough.

As a short-term work-around, you can disable the compilation of AVX2 code (or any unsupported instruction set) using the following configure parameters. They should be pretty much self-documenting. I use the parameter "choke", though any nonsense flag should work so long as a compiler will barf when trying to process it.

configure SSE2_CFLAGS=choke SSE41_CFLAGS=choke AVX2_CFLAGS=choke

The above parameters correspond to the following portion of the configure.ac file:

AC_MSG_CHECKING([for SSE2 flags])
AS_IF([test "x$SSE2_CFLAGS" = x],
      [AS_CASE(["$vendor"],
               [clang],    [SSE2_CFLAGS="-msse2"],
               [gnu],      [SSE2_CFLAGS="-march=core2"],
               [intel],    [SSE2_CFLAGS="-march=core2"])])
AC_MSG_RESULT([$SSE2_CFLAGS])
AC_SUBST([SSE2_CFLAGS])

AC_MSG_CHECKING([for SSE4.1 flags])
AS_IF([test "x$SSE41_CFLAGS" = x],
      [AS_CASE(["$vendor"],
               [clang],    [SSE41_CFLAGS="-msse4"],
               [gnu],      [SSE41_CFLAGS="-march=corei7"],
               [intel],    [SSE41_CFLAGS="-march=corei7"])])
AC_MSG_RESULT([$SSE41_CFLAGS])
AC_SUBST([SSE41_CFLAGS])

AC_MSG_CHECKING([for AVX2 flags])
AS_IF([test "x$AVX2_CFLAGS" = x],
      [AS_CASE(["$vendor"],
               [clang],    [AVX2_CFLAGS="-mavx2"],
               [gnu],      [AVX2_CFLAGS="-march=core-avx2"],
               [intel],    [AVX2_CFLAGS="-march=core-avx2"])])
AC_MSG_RESULT([$AVX2_CFLAGS])
AC_SUBST([AVX2_CFLAGS])
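
The override behaviour of the block above can be sketched in plain shell. This is only an illustration of the selection logic, not the generated configure script: the function name default_avx2_cflags and its two positional arguments are hypothetical, and the vendor strings are taken from the AS_CASE branches shown above.

```shell
#!/bin/sh
# Sketch only: mirrors the AS_IF/AS_CASE logic for AVX2_CFLAGS.
# default_avx2_cflags is a hypothetical helper, not part of parasail.
#   $1 = compiler vendor (clang, gnu, intel)
#   $2 = user-supplied AVX2_CFLAGS (may be empty)
default_avx2_cflags() {
    if test "x$2" = x; then
        # Nothing supplied by the user: fall back to the vendor default.
        case "$1" in
            clang)      echo "-mavx2" ;;
            gnu|intel)  echo "-march=core-avx2" ;;
            *)          echo "" ;;   # unknown vendor: leave the flags empty
        esac
    else
        # A user-supplied value always wins, even a nonsense one like "choke".
        echo "$2"
    fi
}

default_avx2_cflags gnu ""      # prints -march=core-avx2
default_avx2_cflags gnu choke   # prints choke
```

Because a value passed on the configure command line bypasses the defaults, a flag the compiler rejects presumably makes the later AVX2 compile test fail, which is what switches that code path off. For the error in this report, overriding only AVX2_CFLAGS should be enough.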

@jeffdaily
Owner

I just pushed 45e2b22 which adds the offending code you came across to the configure test for AVX2 support. Without using my work-around above, please try this commit and rerun configure.

I checked it on my MacBook using gcc 4.8 from MacPorts, and the configure test now fails there with the same error message you reported. So I think it works now, correctly detecting AVX2 as not available.
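
For readers who want to reproduce this kind of check by hand, here is a minimal shell sketch. It is not the test added in 45e2b22; the helper name avx2_usable, the conftest.c contents, and the -mavx2 flag in the usage line are assumptions for illustration. The idea is to compile a small file with -c, so that the assembler runs as well as the compiler, since the assembler is the step that rejected vmovdqa on %ymm registers and vzeroupper in the original report.

```shell
#!/bin/sh
# Sketch only: avx2_usable is a hypothetical helper, not part of parasail.
#   $1 = compiler command, $2 = flags to try
# Prints "yes" if the compiler produces an object file from AVX2 code,
# "no" otherwise.
avx2_usable() {
    cc_cmd=$1
    flags=$2
    tmpdir=$(mktemp -d) || return 1
    cat > "$tmpdir/conftest.c" <<'EOF'
#include <immintrin.h>
void add8(int *dst, const int *a, const int *b) {
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    /* _mm256_add_epi32 is an AVX2 intrinsic operating on 256-bit registers */
    _mm256_storeu_si256((__m256i *)dst, _mm256_add_epi32(va, vb));
}
EOF
    # -c stops after assembling, so an assembler that does not know the
    # AVX2 mnemonics makes this command fail.
    if $cc_cmd $flags -c "$tmpdir/conftest.c" -o "$tmpdir/conftest.o" >/dev/null 2>&1; then
        result=yes
    else
        result=no
    fi
    rm -rf "$tmpdir"
    echo "$result"
}

# Usage: try the default compiler with an assumed AVX2 flag.
avx2_usable "${CC:-cc}" "-mavx2"
```

A "no" here only says that this compiler, flag, and assembler combination could not build the object; it says nothing about whether the CPU itself supports AVX2.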

@txje
Author

txje commented May 22, 2015

Perfect. My build completed successfully with 45e2b22. I'll keep you posted as I try to run some sequences through.

txje closed this as completed May 22, 2015