USB MSC has some sort of race condition #2322

byteit101 · 2023-11-18T19:25:56Z

Operating System

Linux

Board

Raspberry Pi PICO (RP2040)

Firmware

https:/byteit101/test-tinyusb-pico

What happened ?

panic/assert/exit calls sometimes, but not always, on writes (I think?).

I have confirmed it on Debian 10, Linux 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux

I haven't seen it on a "blank" flash (no filesystem)

I have confirmed it with G++ 8.3 and G++ 12.2 (the latter requires uncommenting target_link_options(${PROJECT_NAME} PRIVATE "LINKER:--no-warn-rwx-segments") in the CMake file)

I have reported this downstream as raspberrypi/pico-sdk#1523

How to reproduce ?

Ensure you have a 1.5.1 PICO SDK setup in your environment.

git clone https:/byteit101/test-tinyusb-pico
cd test-tinyusb-pico
mkdir build
cd build
cmake ..
make
picotool load msc_pico_failure.elf -f
picotool reboot -f
lsblk # find the drive that just appeared and is 1MB
sudo mkfs.vfat -F12 -n "TinyUSB Tst" -S 4096 /dev/sdYOURDRIVE

In one terminal, tio /dev/ttyACM0 (or some other auto-connecting serial console)
Now, watch dmesg -w or such, and repeat picotool reboot -f. Sometimes it will be ok and you will see the Test Flash load, and remain loaded, and tio serial connection remain connected.

[18441889.657799] scsi 9:0:0:0: Direct-Access     TUSB     Test Flash       1.0  PQ: 0 ANSI: 2
[18441889.658432] sd 9:0:0:0: Attached scsi generic sg8 type 0
[18441889.658676] sd 9:0:0:0: [sdh] 256 4096-byte logical blocks: (1.05 MB/1.00 MiB)

Othertimes, it will first do that, then after a few seconds, it will disconnect serial, and reload into the RP2 bootloader drive. This is the failure condition:

[18441891.961605] scsi 10:0:0:0: Direct-Access     RPI      RP2              3    PQ: 0 ANSI: 2
[18441891.962053] sd 10:0:0:0: Attached scsi generic sg8 type 0
[18441891.962397] sd 10:0:0:0: [sdh] 262144 512-byte logical blocks: (134 MB/128 MiB)

Debug Log as txt file (LOG/CFG_TUSB_DEBUG=2)

Using tinyusb 0.12, I got this log sequence, culminating in an assertion failure:

...snip...
USBD Setup Received 80 06 00 01 00 00 40 00 
  Get Descriptor Device
  Queue EP 80 with 18 bytes ...
USBD Xfer Complete on EP 80 with 18 bytes
  Queue EP 00 with 0 bytes ...
  Short packet on buffer 0 with 0 bytes
Completed transfer of 0 bytes on ep 0 out
USBD Xfer Complete on EP 00 with 0 bytes
BUS RESET
USBD Bus Reset : Full Speed
  Short packet on buffer 0 with 18 bytes
Completed transfer of 18 bytes on ep 0 in

USBD Setup Received 00 05 1E 00 00 00 00 00 
  Set Address
USBD Xfer Complete on EP 80 with 18 bytes
usbd_control_xfer_cb 171: ASSERT FAILED

In contrast, successful Set Address looks like

...snip

USBD Setup Received 00 05 10 00 00 00 00 00 
  Set Address
  Short packet on buffer 0 with 0 bytes
Completed transfer of 0 bytes on ep 0 in
USBD Xfer Complete on EP 80 with 0 bytes

...snip

The reordering, combined with the fact that it's highly irregular, sometimes failing for 5 reboots in a row, then succeeding for 50%, then succeeding for 10 reboots in a row, makes me think this is some kind of race condition in the MSC TUD driver.

The stack trace was usbd_control.c:171, called from usbd.c:562

I updated TinyUSB to 0.15.0, and now get:

*** PANIC ***

ep 0 in was already available

Screenshots

No response

I have checked existing issues, dicussion and documentation

I confirm I have checked existing issues, dicussion and documentation.

The text was updated successfully, but these errors were encountered:

jncronin · 2024-03-19T14:22:20Z

I think I am also seeing this issue on a custom STM32H7 board.

For a variety of reasons the host resets the bus (in my case simply sending 0 bytes to a read10 request triggers this with a windows host after a while).

The DCD_EVENT_BUS_RESET message is immediately followed by a DCD_EVENT_XFER_COMPLETE on an IN endpoint, and the subsequent TU_ASSERT(driver,) fails.

Olstyle · 2024-10-10T14:32:00Z

I can also reproduce what @jncronin is describing. If I have high load on my STM2U585 MCU, probably some response is handled too late, which results in a bus reset from the host. Since this DCD_EVENT_BUS_RESET resets the device info in _usbd_dev , but the event queue is not cleared, the already present DCD_EVENT_XFER_COMPLETE is read in the next round and tries to find drivers in an empty _usbd_dev , which fails into the Assert.

EDT: the following kind of works with FreeRTOS (since osal_semaphore_reset expands to xQueueReset)

--- a/dwc2/device/usbd.c
+++ b/dwc2/device/usbd.c
@@ -585,6 +585,7 @@ void tud_task_ext(uint32_t timeout_ms, bool in_isr) {
       case DCD_EVENT_BUS_RESET:
         TU_LOG_USBD(": %s Speed\r\n", tu_str_speed[event.bus_reset.speed]);
         usbd_reset(event.rhport);
+        osal_semaphore_reset(_usbd_q);
         _usbd_dev.speed = event.bus_reset.speed;
         break;

Of course there will still be a reenumeration etc. but the device keeps on living and Windows seems to find it again as well.

Fixes hathach#2322

byteit101 added the Bug 🐞 label Nov 18, 2023

byteit101 mentioned this issue Nov 18, 2023

USB MSC has some sort of race condition raspberrypi/pico-sdk#1523

Closed

Olstyle added a commit to Olstyle/tinyusb that referenced this issue Oct 14, 2024

make sure transfers after reset/before setup are ignored

98174d9

Fixes hathach#2322

Olstyle linked a pull request Oct 14, 2024 that will close this issue

make sure transfers to device after reset/before setup are ignored #2840

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

USB MSC has some sort of race condition #2322

USB MSC has some sort of race condition #2322

byteit101 commented Nov 18, 2023 •

edited

Loading

jncronin commented Mar 19, 2024

Olstyle commented Oct 10, 2024 •

edited

Loading

USB MSC has some sort of race condition #2322

USB MSC has some sort of race condition #2322

Comments

byteit101 commented Nov 18, 2023 • edited Loading

Operating System

Board

Firmware

What happened ?

How to reproduce ?

Debug Log as txt file (LOG/CFG_TUSB_DEBUG=2)

Screenshots

I have checked existing issues, dicussion and documentation

jncronin commented Mar 19, 2024

Olstyle commented Oct 10, 2024 • edited Loading

byteit101 commented Nov 18, 2023 •

edited

Loading

Olstyle commented Oct 10, 2024 •

edited

Loading