Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests/kernel/timer/timer_api/test_timeout_abs fails on stm32 boards #32572

Closed
FRASTM opened this issue Feb 23, 2021 · 7 comments · Fixed by #32683
Closed

tests/kernel/timer/timer_api/test_timeout_abs fails on stm32 boards #32572

FRASTM opened this issue Feb 23, 2021 · 7 comments · Fixed by #32683
Assignees
Labels
area: Tests Issues related to a particular existing or missing test area: Timer Timer bug The issue is a bug, or the PR is fixing a bug platform: NXP NXP platform: STM32 ST Micro STM32 priority: medium Medium impact/importance bug

Comments

@FRASTM
Copy link
Collaborator

FRASTM commented Feb 23, 2021

Describe the bug
after The PR : kernel/timeout: Fix off-by-one in absolute timeouts #32505
the test fails on many stm32 boards

START - test_timeout_abs                                                                
                                                                                        
    Assertion failed at WEST_TOPDIR/zephyr/tests/kernel/timer/timer_api/src/main.c:725:e
                                                                                        
 FAIL - test_timeout_abs  

To Reproduce
Steps to reproduce the behavior:

  1. west build -p auto -b nucleo_l152re tests/kernel/timer/timer_api
  2. west flash
  3. See error

Expected behavior
test passed

Environment (please complete the following information):

Additional context
reimplement tests/kernel/timer/timer_api #32339

@FRASTM FRASTM added bug The issue is a bug, or the PR is fixing a bug platform: STM32 ST Micro STM32 area: Timer Timer area: Tests Issues related to a particular existing or missing test labels Feb 23, 2021
@nashif
Copy link
Member

nashif commented Feb 23, 2021

The failure in details:


    Assertion failed at WEST_TOPDIR/zephyr/tests/kernel/timer/timer_api/src/main.c:725: test_timeout_abs: (cap_ticks + rem_ticks + 1 == exp_ticks) || (rem_ticks + cap2_ticks + 1 == exp_ticks) || (INEXACT_MS_CONVERT && (cap_ticks + rem_ticks == exp_ticks)) || (INEXACT_MS_CONVERT && (rem_ticks + cap2_ticks == exp_ticks)) is false

@nashif
Copy link
Member

nashif commented Feb 23, 2021

This is failing on other boards:

INFO    - Adding tasks to the queue...
INFO    - Added initial list of jobs to queue
INFO    -  1/16 sam_e70_xplained          tests/kernel/timer/timer_api/kernel.timer.tickless FAILED Failed (device 7.809s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/sam_e70_xplained/tests/kernel/timer/timer_api/kernel.timer.tickless/handler.log
INFO    -  2/16 nrf52840dk_nrf52840       tests/kernel/timer/timer_api/kernel.timer.tickless FAILED Failed (device 11.026s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/nrf52840dk_nrf52840/tests/kernel/timer/timer_api/kernel.timer.tickless/handler.log
INFO    -  3/16 nucleo_f746zg             tests/kernel/timer/timer_api/kernel.timer.tickless FAILED Failed (device 12.469s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/nucleo_f746zg/tests/kernel/timer/timer_api/kernel.timer.tickless/handler.log
INFO    -  4/16 nucleo_f429zi             tests/kernel/timer/timer_api/kernel.timer.tickless FAILED Failed (device 14.157s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/nucleo_f429zi/tests/kernel/timer/timer_api/kernel.timer.tickless/handler.log
INFO    -  5/16 reel_board                tests/kernel/timer/timer_api/kernel.timer.tickless FAILED Failed (device 15.312s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/reel_board/tests/kernel/timer/timer_api/kernel.timer.tickless/handler.log
INFO    -  6/16 nucleo_f207zg             tests/kernel/timer/timer_api/kernel.timer.tickless FAILED Failed (device 16.009s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/nucleo_f207zg/tests/kernel/timer/timer_api/kernel.timer.tickless/handler.log
INFO    -  7/16 disco_l475_iot1           tests/kernel/timer/timer_api/kernel.timer.tickless PASSED (device 15.541s)
INFO    -  8/16 nrf52840dk_nrf52840       tests/kernel/timer/timer_api/kernel.timer          FAILED Failed (device 24.401s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/nrf52840dk_nrf52840/tests/kernel/timer/timer_api/kernel.timer/handler.log
INFO    -  9/16 sam_e70_xplained          tests/kernel/timer/timer_api/kernel.timer          FAILED Failed (device 7.576s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/sam_e70_xplained/tests/kernel/timer/timer_api/kernel.timer/handler.log
INFO    - 10/16 reel_board                tests/kernel/timer/timer_api/kernel.timer          FAILED Failed (device 8.604s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/reel_board/tests/kernel/timer/timer_api/kernel.timer/handler.log
INFO    - 11/16 nucleo_f746zg             tests/kernel/timer/timer_api/kernel.timer          FAILED Failed (device 11.689s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/nucleo_f746zg/tests/kernel/timer/timer_api/kernel.timer/handler.log
INFO    - 12/16 nucleo_f429zi             tests/kernel/timer/timer_api/kernel.timer          FAILED Failed (device 10.495s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/nucleo_f429zi/tests/kernel/timer/timer_api/kernel.timer/handler.log
INFO    - 13/16 nucleo_f207zg             tests/kernel/timer/timer_api/kernel.timer          FAILED Failed (device 11.761s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/nucleo_f207zg/tests/kernel/timer/timer_api/kernel.timer/handler.log
INFO    - 14/16 mimxrt1050_evk            tests/kernel/timer/timer_api/kernel.timer.tickless FAILED Timeout (device 63.781s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/mimxrt1050_evk/tests/kernel/timer/timer_api/kernel.timer.tickless/build.log
INFO    - 15/16 disco_l475_iot1           tests/kernel/timer/timer_api/kernel.timer          PASSED (device 14.803s)
 INFO    - 16/16 mimxrt1050_evk            tests/kernel/timer/timer_api/kernel.timer          FAILED Timeout (device 63.587s)
ERROR   - see: /home/nashif/zephyrproject/zephyr/twister-out/mimxrt1050_evk/tests/kernel/timer/timer_api/kernel.timer/build.log

INFO    - 2 of 16 test configurations passed (12.50%), 14 failed, 0 skipped with 0 warnings in 169.30 seconds
INFO    - In total 224 test cases were executed, 0 skipped on 8 out of total 327 platforms (2.45%)
INFO    - 16 test configurations executed on platforms, 0 test configurations were only built.

@nashif nashif added the priority: medium Medium impact/importance bug label Feb 23, 2021
@hakehuang
Copy link
Collaborator

hakehuang commented Feb 24, 2021

also fails on nxp platforms RT1xxx and RT600, but Ok in Kinetis(FRDM_K64f). @MaureenHelm

andyross pushed a commit to andyross/zephyr that referenced this issue Feb 25, 2021
The test_timeout_abs case had baked in similar mistakes to the
off-by-one in the absolute timer implementation.  FOR THE RECORD:

If you have an absolute timeout expiration set for a tick value "N",
and the current time returned by k_uptime_ticks() is "T", then the
time returned (at the same moment) by any of the *_remaining_ticks()
APIs must ALWAYS AND FOREVER BE EXACTLY "N - T" (also: "N - T > 0"
always, until the moment the kernel ISR hands off control to the first
timeout handler expiring at that tick).

The tick math is exact.  No slop is needed on any systems, no matter
whether their clocks divide by milliseconds or not.

The only gotcha is that we need to be sure that the calls don't
interleave with a real time tick advance, which we do here with a
simple retry loop.

But, about slop... This patch also includes a related fix for the
test_sleep_abs().  On an intel_adsp (which has 50 kHz ticks, a
comparatively slow idle resume and interrupt entry, and even has two
CPUs to mess with latency measurements) I would occasionally see the
k_sleep() take more than a tick to wake up from the interrupt handler
until the return to application code.  Add some real time slop there
(just 100us) to handle systems like this.

Fixes zephyrproject-rtos#32572

Signed-off-by: Andy Ross <[email protected]>
@andyross
Copy link
Contributor

Potential fix up at #32683 . There was indeed a bug in the test exposed by the recent absolute timeout fix on high tick rate systems (i.e. hardware, not CI). I caught it happening on the adsp I was working on, so I'm fairly confident this is the issue. If those with affected systems could test and report results, I'd be appreciative.

Though I will note that the fact that all those systems are systick-based makes me a tiny bit worried there may be a driver bug in play too.

nashif pushed a commit that referenced this issue Feb 26, 2021
The test_timeout_abs case had baked in similar mistakes to the
off-by-one in the absolute timer implementation.  FOR THE RECORD:

If you have an absolute timeout expiration set for a tick value "N",
and the current time returned by k_uptime_ticks() is "T", then the
time returned (at the same moment) by any of the *_remaining_ticks()
APIs must ALWAYS AND FOREVER BE EXACTLY "N - T" (also: "N - T > 0"
always, until the moment the kernel ISR hands off control to the first
timeout handler expiring at that tick).

The tick math is exact.  No slop is needed on any systems, no matter
whether their clocks divide by milliseconds or not.

The only gotcha is that we need to be sure that the calls don't
interleave with a real time tick advance, which we do here with a
simple retry loop.

But, about slop... This patch also includes a related fix for the
test_sleep_abs().  On an intel_adsp (which has 50 kHz ticks, a
comparatively slow idle resume and interrupt entry, and even has two
CPUs to mess with latency measurements) I would occasionally see the
k_sleep() take more than a tick to wake up from the interrupt handler
until the return to application code.  Add some real time slop there
(just 100us) to handle systems like this.

Fixes #32572

Signed-off-by: Andy Ross <[email protected]>
@MaureenHelm
Copy link
Member

Potential fix up at #32683 . There was indeed a bug in the test exposed by the recent absolute timeout fix on high tick rate systems (i.e. hardware, not CI). I caught it happening on the adsp I was working on, so I'm fairly confident this is the issue. If those with affected systems could test and report results, I'd be appreciative.

@hakehuang can you verify the fix on RT1xxx and RT600?

@hakehuang
Copy link
Collaborator

hakehuang commented Mar 1, 2021

Potential fix up at #32683 . There was indeed a bug in the test exposed by the recent absolute timeout fix on high tick rate systems (i.e. hardware, not CI). I caught it happening on the adsp I was working on, so I'm fairly confident this is the issue. If those with affected systems could test and report results, I'd be appreciative.

@hakehuang can you verify the fix on RT1xxx and RT600?
this fix works for rt1xxx and rt6xx. @MaureenHelm

'
*** Booting Zephyr OS build zephyr-v2.5.0-579-g3b8ea186fd1a ***
Running test suite timer_api
===================================================================
START - test_time_conversions
PASS - test_time_conversions
===================================================================
START - test_timer_duration_period
PASS - test_timer_duration_period
===================================================================
START - test_timer_restart
PASS - test_timer_restart
===================================================================
START - test_timer_period_0
PASS - test_timer_period_0
===================================================================
START - test_timer_expirefn_null
PASS - test_timer_expirefn_null
===================================================================
START - test_timer_periodicity
PASS - test_timer_periodicity
===================================================================
START - test_timer_status_get
PASS - test_timer_status_get
===================================================================
START - test_timer_status_get_anytime
PASS - test_timer_status_get_anytime
===================================================================
START - test_timer_status_sync
PASS - test_timer_status_sync
===================================================================
START - test_timer_k_define
PASS - test_timer_k_define
===================================================================
START - test_timer_user_data
PASS - test_timer_user_data
===================================================================
START - test_timer_remaining
PASS - test_timer_remaining
===================================================================
START - test_timeout_abs
PASS - test_timeout_abs
===================================================================
START - test_sleep_abs
PASS - test_sleep_abs
===================================================================
Test suite timer_api succeeded
===================================================================
PROJECT EXECUTION SUCCESSFUL

@MaureenHelm
Copy link
Member

this fix works for rt1xxx and rt6xx.

thanks @hakehuang !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Tests Issues related to a particular existing or missing test area: Timer Timer bug The issue is a bug, or the PR is fixing a bug platform: NXP NXP platform: STM32 ST Micro STM32 priority: medium Medium impact/importance bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants