RISCV32 QEMU illegal instruction exception / floating point support #34026

mehrdadh · 2021-04-06T03:30:04Z

Describe the bug
I'm running a machine learning model on qemu_riscv32 and I ran into illegal instruction exception for some reason. I wasn't sure if this is a bug and if so whether it is related to zephyr or qemu, however I'll try to provide as much as information to get a better understanding.

Here is the assembly code that I'm running:

 0x8000bd74 <+0>:	lw	a4,0(a0)
   0x8000bd76 <+2>:	lw	a5,8(a0)
   0x8000bd78 <+4>:	lw	a0,0(a4)
   0x8000bd7a <+6>:	lw	a1,0(a5)
   0x8000bd7c <+8>:	li	a3,0
   0x8000bd7e <+10>:	j	0x8000bda4 <fused_nn_pad_layout_transform+48>
   0x8000bd80 <+12>:	addi	a5,a3,-2
   0x8000bd84 <+16>:	li	a2,27
   0x8000bd86 <+18>:	bgeu	a2,a5,0x8000bdae <fused_nn_pad_layout_transform+58>
=> 0x8000bd8a <+22>:	fmv.w.x	fa5,zero
   0x8000bd8e <+26>:	slli	a5,a3,0x5
   0x8000bd92 <+30>:	add	a5,a5,a4
   0x8000bd94 <+32>:	slli	a5,a5,0x2
   0x8000bd96 <+34>:	add	a5,a5,a1
   0x8000bd98 <+36>:	fsw	fa5,0(a5)
   0x8000bd9a <+38>:	addi	a4,a4,1
   0x8000bd9c <+40>:	li	a5,31
   0x8000bd9e <+42>:	bge	a5,a4,0x8000bd80 <fused_nn_pad_layout_transform+12>
   0x8000bda2 <+46>:	addi	a3,a3,1
   0x8000bda4 <+48>:	li	a5,31
   0x8000bda6 <+50>:	blt	a5,a3,0x8000bde0 <fused_nn_pad_layout_transform+108>
   0x8000bdaa <+54>:	li	a4,0
   0x8000bdac <+56>:	j	0x8000bd9c <fused_nn_pad_layout_transform+40>
   0x8000bdae <+58>:	li	a5,1
   0x8000bdb0 <+60>:	bge	a5,a4,0x8000bdd4 <fused_nn_pad_layout_transform+96>
   0x8000bdb4 <+64>:	li	a5,29
   0x8000bdb6 <+66>:	blt	a5,a4,0x8000bdda <fused_nn_pad_layout_transform+102>
   0x8000bdba <+70>:	li	a5,28
   0x8000bdbc <+72>:	mul	a5,a3,a5
   0x8000bdc0 <+76>:	add	a5,a5,a4
   0x8000bdc2 <+78>:	lui	a2,0x40000
   0x8000bdc6 <+82>:	addi	a2,a2,-58 # 0x3fffffc6
   0x8000bdca <+86>:	add	a5,a5,a2
   0x8000bdcc <+88>:	slli	a5,a5,0x2
   0x8000bdce <+90>:	add	a5,a5,a0
   0x8000bdd0 <+92>:	flw	fa5,0(a5)
   0x8000bdd2 <+94>:	j	0x8000bd8e <fused_nn_pad_layout_transform+26>
   0x8000bdd4 <+96>:	fmv.w.x	fa5,zero
   0x8000bdd8 <+100>:	j	0x8000bd8e <fused_nn_pad_layout_transform+26>
   0x8000bdda <+102>:	fmv.w.x	fa5,zero
   0x8000bdde <+106>:	j	0x8000bd8e <fused_nn_pad_layout_transform+26>
   0x8000bde0 <+108>:	li	a0,0
   0x8000bde2 <+110>:	ret

My code breaks on line 0x8000bd8a and then the mcause register is loaded with value 0x02 which translates to illegal instruction. Please let me know if you need more information about this.

Impact
We are working on adding riscv32 support to apache TVM and this issue currently blocks us for supporting certain operations.

Environment

OS: Linux
Zephyr 2.5.0
Zephyr SDK 0.12.3
QEMU 5.1.0

The text was updated successfully, but these errors were encountered:

mehrdadh · 2021-04-06T03:41:34Z

The same function on qemu_riscv64 doesn't generate the fmv.w.x assembly code:

=> 0x000000008000b446 <+0>:	ld	a4,0(a0)
   0x000000008000b448 <+2>:	ld	a5,8(a0)
   0x000000008000b44a <+4>:	ld	a0,0(a4)
   0x000000008000b44c <+6>:	ld	a1,0(a5)
   0x000000008000b44e <+8>:	li	a3,0
   0x000000008000b450 <+10>:	j	0x8000b476 <fused_nn_pad_layout_transform+48>
   0x000000008000b452 <+12>:	addiw	a5,a3,-2
   0x000000008000b456 <+16>:	li	a2,27
   0x000000008000b458 <+18>:	bgeu	a2,a5,0x8000b480 <fused_nn_pad_layout_transform+58>
   0x000000008000b45c <+22>:	li	a2,0
   0x000000008000b460 <+26>:	slliw	a5,a3,0x5
   0x000000008000b464 <+30>:	addw	a5,a5,a4
   0x000000008000b466 <+32>:	slli	a5,a5,0x2
   0x000000008000b468 <+34>:	add	a5,a5,a1
   0x000000008000b46a <+36>:	sw	a2,0(a5)
   0x000000008000b46c <+38>:	addiw	a4,a4,1
   0x000000008000b46e <+40>:	li	a5,31
   0x000000008000b470 <+42>:	bge	a5,a4,0x8000b452 <fused_nn_pad_layout_transform+12>
   0x000000008000b474 <+46>:	addiw	a3,a3,1
   0x000000008000b476 <+48>:	li	a5,31
   0x000000008000b478 <+50>:	blt	a5,a3,0x8000b4ac <fused_nn_pad_layout_transform+102>
   0x000000008000b47c <+54>:	li	a4,0
   0x000000008000b47e <+56>:	j	0x8000b46e <fused_nn_pad_layout_transform+40>
   0x000000008000b480 <+58>:	li	a5,1
   0x000000008000b482 <+60>:	bge	a5,a4,0x8000b4a0 <fused_nn_pad_layout_transform+90>
   0x000000008000b486 <+64>:	li	a5,29
   0x000000008000b488 <+66>:	blt	a5,a4,0x8000b4a6 <fused_nn_pad_layout_transform+96>
   0x000000008000b48c <+70>:	li	a5,28
   0x000000008000b48e <+72>:	mulw	a5,a5,a3
   0x000000008000b492 <+76>:	addw	a5,a5,a4
   0x000000008000b494 <+78>:	addiw	a5,a5,-58
   0x000000008000b498 <+82>:	slli	a5,a5,0x2
   0x000000008000b49a <+84>:	add	a5,a5,a0
   0x000000008000b49c <+86>:	lw	a2,0(a5)
   0x000000008000b49e <+88>:	j	0x8000b460 <fused_nn_pad_layout_transform+26>
   0x000000008000b4a0 <+90>:	li	a2,0
   0x000000008000b4a4 <+94>:	j	0x8000b460 <fused_nn_pad_layout_transform+26>
   0x000000008000b4a6 <+96>:	li	a2,0
   0x000000008000b4aa <+100>:	j	0x8000b460 <fused_nn_pad_layout_transform+26>
   0x000000008000b4ac <+102>:	li	a0,0
   0x000000008000b4ae <+104>:	ret

galak · 2021-04-06T22:31:43Z

You mean need to add k_float_enable(_current, 0);

@katsuster was making some changes to the floating point support recently and guessing he'll be able to know better how this is suppose to work.

mehrdadh · 2021-04-06T22:55:08Z

Thanks to @galak suggestion, I added k_float_enable(_current, 0) and CONFIG_FPU_SHARING config flag and updated my zephyr to master branch and which solved some other issues that I had with double division operation.

I still have this issue though. Also, it would be great if @katsuster can suggest a solution with zephyr 2.5, if possible, since we are using that release version in our pipeline.
Thanks!

mehrdadh · 2021-04-06T23:57:02Z

Update:
Adding @galak suggestions fixes this issue as well.
@katsuster please let me know if you know any workaround with zephyr 2.5.
Thanks!

katsuster · 2021-04-07T16:32:17Z

@mehrdadh
Thanks for report. I think most easiest way is calling k_float_enable(_current, 0) before using FPU (that is suggested by @galak).
But unfortunately Zephyr 2.5-branch has not yet had k_float_enable() because this is very recently changing. So we need to take a little complex way.

Do you want to run floating point instruction on main thread?
If you don't have strong reason to run on main thread, a child thread can use FPU with K_FP_REGS option like as:

K_THREAD_DEFINE(thread_main, STACKSIZE, thread_func, NULL, NULL, NULL,
		PRIORITY, K_FP_REGS, 0);

I hope this helps.

mehrdadh · 2021-04-08T18:30:56Z

@katsuster Thanks for the information.
This approach would be kinda complicated for our setup. We will update ours once 2.6 is out. Thanks for the great work on supporting riscv boards!

mehrdadh added the bug The issue is a bug, or the PR is fixing a bug label Apr 6, 2021

carlescufi assigned katsuster Apr 6, 2021

carlescufi added area: QEMU QEMU Emulation area: RISCV RISCV Architecture (32-bit & 64-bit) labels Apr 6, 2021

galak added the priority: medium Medium impact/importance bug label Apr 6, 2021

galak changed the title ~~RISCV32 QEMU illegal instruction exception~~ RISCV32 QEMU illegal instruction exception / floating point support Apr 6, 2021

mehrdadh mentioned this issue Apr 7, 2021

[microTVM] Zephyr: RISCV support for Zephyr QEMU RISCV-32/64 apache/tvm#7804

Merged

mehrdadh closed this as completed Apr 8, 2021

mehrdadh mentioned this issue Oct 22, 2021

[microTVM][Zephyr] Enable RISCV Tests on QEMU CI apache/tvm#9325

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RISCV32 QEMU illegal instruction exception / floating point support #34026

RISCV32 QEMU illegal instruction exception / floating point support #34026

mehrdadh commented Apr 6, 2021 •

edited

Loading

mehrdadh commented Apr 6, 2021

galak commented Apr 6, 2021

mehrdadh commented Apr 6, 2021

mehrdadh commented Apr 6, 2021

katsuster commented Apr 7, 2021

mehrdadh commented Apr 8, 2021

RISCV32 QEMU illegal instruction exception / floating point support #34026

RISCV32 QEMU illegal instruction exception / floating point support #34026

Comments

mehrdadh commented Apr 6, 2021 • edited Loading

mehrdadh commented Apr 6, 2021

galak commented Apr 6, 2021

mehrdadh commented Apr 6, 2021

mehrdadh commented Apr 6, 2021

katsuster commented Apr 7, 2021

mehrdadh commented Apr 8, 2021

mehrdadh commented Apr 6, 2021 •

edited

Loading