Using hci_usb with Bluez 5.55 or 5.58 #34593

Closed
1am opened this issue Apr 27, 2021 · 23 comments
Labels: area: Bluetooth, area: USB (Universal Serial Bus)

@1am

1am commented Apr 27, 2021

Describe the bug

I'm trying to use hci_usb on an nRF52840 DK to connect to peripheral_dis running on an nRF52833 DK, using bluetoothctl (or even the old hcitool + gatttool). The boards behave correctly: I can see the peripheral advertise, the HCI device enumerates, and I can even connect the two. What happens afterwards is the problem, and it is consistent across other devices and my own firmware. The example here is narrowed down to Zephyr-only samples and reference development kits.

After the HCI device connects to the peripheral, nothing more happens, and after 10-20 seconds the connection is terminated by the remote user.

I'm attaching a video with a demonstration of how it looks in action.

When doing the same with a plain old CSR-based dongle, everything works just fine. Initially I blamed my own hardware and firmware, but eventually I went to the simplest case of two DKs with two firmware samples. I am able to reproduce it on the Zephyr master and v2.5.0 branches. I haven't gone further back in history yet.

To Reproduce
Steps to reproduce the behavior:

  1. Connect first board and run: cd .../zephyr/samples/bluetooth/hci_usb && west build -b nrf52840dk_nrf52832 && west flash
  2. Disconnect first, connect second and run: cd .../zephyr/samples/bluetooth/peripheral_dis && west build -b nrf52833dk_nrf52833 && west flash
  3. Open btmon in one terminal window
  4. Open bluetoothctl in second terminal window
  5. Connect and try to discover the device services and characteristics

Expected behavior

I expected the services and characteristics to be discovered, and to be able to read and write them the same way as I do when using a CSR BLE dongle.

Impact

Blocker: I can't move on with implementing my own software that relies on Linux BLE support. Right now it works fine with the CSR dongle, but that has a limit of 5 concurrent connections. I plan to use the nRF52840 to handle more, embedded on the target platform that will later run the software. For now I'm testing with the standard BlueZ tools to get any potential issues with custom software out of the way.

Logs and console output

btmon output from single connection-to-connection-terminated cycle:

< HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7                                #19 44.041732
        Type: Passive (0x00)
        Interval: 60.000 msec (0x0060)
        Window: 30.000 msec (0x0030)
        Own address type: Random (0x01)
        Filter policy: Ignore not in white list (0x01)
> HCI Event: Command Complete (0x0e) plen 4                                               #20 44.042650
      LE Set Scan Parameters (0x08|0x000b) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2                                    #21 44.042724
        Scanning: Enabled (0x01)
        Filter duplicates: Enabled (0x01)
> HCI Event: Command Complete (0x0e) plen 4                                               #22 44.043651
      LE Set Scan Enable (0x08|0x000c) ncmd 1
        Status: Success (0x00)
> HCI Event: LE Meta Event (0x3e) plen 19                                                 #23 44.113694
      LE Advertising Report (0x02)
        Num reports: 1
        Event type: Connectable undirected - ADV_IND (0x00)
        Address type: Random (0x01)
        Address: F4:84:A0:B0:37:41 (Static)
        Data length: 7
        Flags: 0x06
          LE General Discoverable Mode
          BR/EDR Not Supported
        16-bit Service UUIDs (complete): 1 entry
          Device Information (0x180a)
        RSSI: -31 dBm (0xe1)
< HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2                                    #24 44.113795
        Scanning: Disabled (0x00)
        Filter duplicates: Disabled (0x00)
> HCI Event: Command Complete (0x0e) plen 4                                               #25 44.114660
      LE Set Scan Enable (0x08|0x000c) ncmd 1
        Status: Success (0x00)
< HCI Command: LE Create Connection (0x08|0x000d) plen 25                                 #26 44.114747
        Scan interval: 60.000 msec (0x0060)
        Scan window: 60.000 msec (0x0060)
        Filter policy: White list is not used (0x00)
        Peer address type: Random (0x01)
        Peer address: F4:84:A0:B0:37:41 (Static)
        Own address type: Random (0x01)
        Min connection interval: 30.00 msec (0x0018)
        Max connection interval: 50.00 msec (0x0028)
        Connection latency: 0 (0x0000)
        Supervision timeout: 420 msec (0x002a)
        Min connection length: 0.000 msec (0x0000)
        Max connection length: 0.000 msec (0x0000)
> HCI Event: Command Status (0x0f) plen 4                                                 #27 44.116535
      LE Create Connection (0x08|0x000d) ncmd 1
        Status: Success (0x00)
> HCI Event: LE Meta Event (0x3e) plen 31                                                 #28 44.221681
      LE Enhanced Connection Complete (0x0a)
        Status: Success (0x00)
        Handle: 0
        Role: Master (0x00)
        Peer address type: Random (0x01)
        Peer address: F4:84:A0:B0:37:41 (Static)
        Local resolvable private address: 00:00:00:00:00:00 (Non-Resolvable)
        Peer resolvable private address: 00:00:00:00:00:00 (Non-Resolvable)
        Connection interval: 50.00 msec (0x0028)
        Connection latency: 0 (0x0000)
        Supervision timeout: 420 msec (0x002a)
        Master clock accuracy: 0x07
@ MGMT Event: Device Connected (0x000b) plen 20                                      {0x0001} 44.221748
        LE Address: F4:84:A0:B0:37:41 (Static)
        Flags: 0x00000000
        Data length: 7
        Flags: 0x06
          LE General Discoverable Mode
          BR/EDR Not Supported
        16-bit Service UUIDs (complete): 1 entry
          Device Information (0x180a)
< HCI Command: LE Read Remote Used Features (0x08|0x0016) plen 2                          #29 44.221837
        Handle: 0
> HCI Event: LE Meta Event (0x3e) plen 4                                                  #30 44.222552
      LE Channel Selection Algorithm (0x14)
        Handle: 0
        Algorithm: #2 (0x01)
> HCI Event: Command Status (0x0f) plen 4                                                 #31 44.223538
      LE Read Remote Used Features (0x08|0x0016) ncmd 1
        Status: Success (0x00)
> HCI Event: LE Meta Event (0x3e) plen 12                                                 #32 44.323540
      LE Read Remote Used Features (0x04)
        Status: Success (0x00)
        Handle: 0
        Features: 0x7f 0x41 0x01 0x00 0x00 0x00 0x00 0x00
          LE Encryption
          Connection Parameter Request Procedure
          Extended Reject Indication
          Slave-initiated Features Exchange
          LE Ping
          LE Data Packet Length Extension
          LL Privacy
          LE 2M PHY
          Channel Selection Algorithm #2
          Minimum Number of Used Channels Procedure
< ACL Data TX: Handle 0 flags 0x00 dlen 7                                                 #33 44.323962
      ATT: Exchange MTU Request (0x02) len 2
        Client RX MTU: 517
< HCI Command: Disconnect (0x01|0x0006) plen 3                                            #34 76.375238
        Handle: 0
        Reason: Remote User Terminated Connection (0x13)
> HCI Event: Command Status (0x0f) plen 4                                                 #35 76.376780
      Disconnect (0x01|0x0006) ncmd 1
        Status: Success (0x00)
> HCI Event: Disconnect Complete (0x05) plen 4                                            #36 76.473653
        Status: Success (0x00)
        Handle: 0
        Reason: Connection Terminated By Local Host (0x16)
@ MGMT Event: Device Disconnected (0x000c) plen 8                                    {0x0001} 76.473712
        LE Address: F4:84:A0:B0:37:41 (Static)
        Reason: Connection terminated by local host (0x02)
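
For readers cross-checking the trace above: btmon prints each connection parameter both decoded and as the raw hex field. The following is a minimal sketch of those conversions, assuming the standard HCI units (0.625 ms for scan interval/window, 1.25 ms for connection interval, 10 ms for supervision timeout). The function name and dictionary keys are made up for illustration and do not come from BlueZ.

```python
# Units of the HCI LE Create Connection fields, in milliseconds per step.
SCAN_UNIT_MS = 0.625           # scan interval / scan window
CONN_INTERVAL_UNIT_MS = 1.25   # min / max connection interval
SUPERVISION_UNIT_MS = 10.0     # supervision timeout


def decode_fields(raw):
    """Convert raw HCI field values (hypothetical dict layout) to milliseconds."""
    return {
        "scan_interval_ms": raw["scan_interval"] * SCAN_UNIT_MS,
        "scan_window_ms": raw["scan_window"] * SCAN_UNIT_MS,
        "min_conn_interval_ms": raw["min_conn_interval"] * CONN_INTERVAL_UNIT_MS,
        "max_conn_interval_ms": raw["max_conn_interval"] * CONN_INTERVAL_UNIT_MS,
        "supervision_timeout_ms": raw["supervision_timeout"] * SUPERVISION_UNIT_MS,
    }


# Raw values copied from the LE Create Connection command in the trace.
raw_from_trace = {
    "scan_interval": 0x0060,
    "scan_window": 0x0060,
    "min_conn_interval": 0x0018,
    "max_conn_interval": 0x0028,
    "supervision_timeout": 0x002A,
}

decoded = decode_fields(raw_from_trace)
for name, value in decoded.items():
    print(f"{name}: {value:g} ms")
```

The decoded values agree with what btmon printed (60 ms, 60 ms, 30 ms, 50 ms, 420 ms), so the connection parameters themselves look as expected.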

Below is a video showing what happens when these logs get generated:

zephyr-hci_usb-to_zephyr_dis.mp4

Environment (please complete the following information):

  • OS: Ubuntu 20 5.8.0-50-generic, bluez 5.55 or 5.58 (both exhibit the same behaviour)
  • Toolchain: Zephyr SDK: Zephyr from master or v2.5.0 branch, zephyr-sdk-0.12.3

Additional context

None

@1am 1am added the bug The issue is a bug, or the PR is fixing a bug label Apr 27, 2021
@sjanc
Collaborator

sjanc commented Apr 27, 2021

Your device is not responding to the ATT Exchange MTU Request, so after 30 seconds BlueZ times out and terminates the link.

I'd suggest checking whether the ATT request was received on the slave device. If it was, fix the slave. If not, there might be an IOP issue on the LL, and that would have to be investigated further...
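
The timeout described above can be read off the btmon timestamps in the original report. This is a small sketch, assuming the trailing `#<frame> <seconds>` format shown in that trace. The regex and function name are illustrative and are not part of any BlueZ tool.

```python
import re

# Two header lines copied from the btmon trace in the report (padding trimmed).
TRACE = """\
< ACL Data TX: Handle 0 flags 0x00 dlen 7   #33 44.323962
< HCI Command: Disconnect (0x01|0x0006) plen 3   #34 76.375238
"""

# Matches the trailing "#<frame number> <timestamp in seconds>" of a header line.
FRAME_RE = re.compile(r"#(\d+)\s+(\d+\.\d+)\s*$")


def frame_times(trace):
    """Return {frame_number: timestamp_seconds} for every header line found."""
    times = {}
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            times[int(match.group(1))] = float(match.group(2))
    return times


times = frame_times(TRACE)
gap = times[34] - times[33]
print(f"Host waited {gap:.3f} s between the MTU request and the disconnect")
```

The gap comes out at about 32 s, in line with the roughly 30-second timeout, and no ATT response appears in the trace between the two frames.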

@1am
Author

1am commented Apr 27, 2021

What could be the reason for peripheral_dis not accepting the ATT Exchange MTU Request? There aren't many ways to fix or break this sample code, and it's not as if the packet is being lost to heavy RF interference or anything.
A similar thing happens with other firmware I've tried to connect to (based on the nRF5 SDK). Initially I blamed that firmware, though there were no issues when connecting from phones. On top of that, the CSR dongle connects and works just fine, while hci_usb always fails for me, no matter what is on the peripheral side.

@carlescufi
Member

@1am peripheral_dis should definitely respond to that Request, so my guess is that the Request is never going over the air or the response is not reaching the dongle controller.
If you have a spare dongle or any other Nordic DK, could you please sniff the exchange with the Nordic Sniffer and attach it?

@1am
Author

1am commented Apr 27, 2021

Of course. Attaching a zip with a pcapng capture (GitHub doesn't accept raw pcapng, but zip worked) of the exchange between F4:02:11:4D:7C:FF running hci_usb and F4:84:A0:B0:37:41 running peripheral_dis.

hci-f402114d7cff-to-dis-f484a0b03741.zip

The one below will be better I guess:

hci-f402114d7cff-to-dis-f484a0b03741-20210427-1444.zip

@ioannisg ioannisg added the priority: medium Medium impact/importance bug label Apr 27, 2021
@carlescufi
Member

Thanks @1am. This packet:

< ACL Data TX: Handle 0 flags 0x00 dlen 7                                                 #33 44.323962
      ATT: Exchange MTU Request (0x02) len 2
        Client RX MTU: 517

never goes over the air. This might be an issue in the USB driver then.

nrf52840dk_nrf52832

This is a typo right?

@carlescufi
Member

Since you are using a DK, could you also try the UART transport instead of the USB one and see if that makes it work?

@1am
Author

1am commented Apr 27, 2021

This is a typo right?

Yes, a typo of course. I meant nRF52840dk_nrf52833

I've tried HCI UART just now and, as when I tested roughly a year ago, everything seemed to work fine on the DK. The only difference from the link you shared is that I had to attach my nRF52840dk_nrf52833 using btattach -B /dev/ttyACM0 -S 1000000 -P h4, otherwise it didn't work at all: the interface showed up but was always down, and hciconfig hci2 up resulted in a "Can't init device hci2: Operation not supported (95)" error.

I really wanted to move to the HCI USB implementation because I've had a very bizarre issue with HCI UART and an FT232, which remains an unsolved mystery to me, though it is clearly caused by the FT232 along the way. Unfortunately my only options are either the USB of the nRF52840 or an FT232 doing USB-UART.

@carlescufi
Member

I've tried HCI UART just now and, as when I tested roughly a year ago, everything seemed to work fine on the DK. The only difference from the link you shared is that I had to attach my nRF52840dk_nrf52833 using btattach -B /dev/ttyACM0 -S 1000000 -P h4, otherwise it didn't work at all: the interface showed up but was always down, and hciconfig hci2 up resulted in a "Can't init device hci2: Operation not supported (95)" error.

Could you please send a Pull Request updating the documentation?

I really wanted to move to the HCI USB implementation because I've had a very bizarre issue with HCI UART and an FT232, which remains an unsolved mystery to me, though it is clearly caused by the FT232 along the way. Unfortunately my only options are either the USB of the nRF52840 or an FT232 doing USB-UART.

HCI USB should work, so this is probably a bug in the USB layer somewhere. Since this seems simple to reproduce I will assign this to @jfischer-no so he can take a look.

@carlescufi carlescufi assigned jfischer-no and unassigned sjanc Apr 28, 2021
@carlescufi carlescufi added the area: USB Universal Serial Bus label Apr 28, 2021
@carlescufi
Member

@Vudentz have you seen something similar before, a packet transmitted by BlueZ over USB that doesn't go over the air when using a Zephyr-based controller?

@phantomblot-x

The ACL Tx (host to controller) bulk end-point of the USB driver is 1, not 2 as suggested by the BT HCI USB spec. I faced a similar issue with a custom BT host stack: I could not transmit ACL data. If I change my host to bind ACL Tx to end-point 1, it seems to work.
It would be nice though if the driver exposed the end-points suggested by the BT spec.

@1am
Author

1am commented May 3, 2021

@phantomblot-x Could you please share how to change ACL Tx to end-point 1 to try it out?

@jfischer-no
Collaborator

The ACL Tx (host to controller) bulk end-point of the USB driver is 1, not 2 as suggested by the BT HCI USB spec. I faced a similar issue with a custom BT host stack: I could not transmit ACL data. If I change my host to bind ACL Tx to end-point 1, it seems to work.
It would be nice though if the driver exposed the end-points suggested by the BT spec.

Then there is an issue in your "custom BT host stack" because it does not take into account endpoint descriptors.

@carlescufi
Member

The ACL Tx (host to controller) bulk end-point of the USB driver is 1, not 2 as suggested by the BT HCI USB spec. I faced a similar issue with a custom BT host stack: I could not transmit ACL data. If I change my host to bind ACL Tx to end-point 1, it seems to work.
It would be nice though if the driver exposed the end-points suggested by the BT spec.

Then there is an issue in your "custom BT host stack" because it does not take into account endpoint descriptors.

That might be the case, but @1am is using vanilla BlueZ and the Linux kernel. That should work out of the box, shouldn't it?

@jfischer-no
Collaborator

That might be the case, but @1am is using vanilla BlueZ and the Linux kernel. That should work out of the box, shouldn't it?

Yes, I will try to reproduce it and report back.

@phantomblot-x

I don't see how the host stack is supposed to guess that endpoint 1 is for ACL Tx - end-point 1 could be a proprietary interface. Anyway, for best interoperability it should use end-point 2 as suggested by the BT spec.

@jfischer-no
Collaborator

jfischer-no commented May 4, 2021

I cannot reproduce it, neither on master nor on the zephyr-v2.5.0 tag.
My setup:
hci_usb on nrf52840dk_nrf52840, peripheral_dis on nrf52833dk_nrf52833
Linux 5.10.0-4-amd64
BlueZ Version 5.55

It is unlikely to be related to the BlueZ version; the host hardware/controller is a more likely cause. @1am what board exactly are you using for hci_usb?

@1am
Author

1am commented May 4, 2021

Hello @jfischer-no I've repeated my tests on Zephyr master with the following setups:

@jfischer-no
Collaborator

jfischer-no commented May 4, 2021

  • hci_usb on nrf52840DK, peripheral_dis on nRF52833 - not working the same way as #34593 (comment)

It should not make any difference.
Your nrf52840DK, is it nRF52840-PDK (imprinted next to the PCB antenna)? What version is it (PCA10056 ....)?

@1am
Author

1am commented May 4, 2021

You're right it's nRF52840-Preview-DK, the version is:

PCA10056
V0.9.0 
2016.48

@jfischer-no jfischer-no added question and removed bug The issue is a bug, or the PR is fixing a bug priority: medium Medium impact/importance bug labels May 4, 2021
@jfischer-no
Collaborator

nRF52840-PDK has known USB device controller issues and is not supported by Zephyr OS. You can use hci_usb on nrf52833dk_nrf52833 for development.

@carlescufi
Member

I don't see how the host stack is supposed to guess that endpoint 1 is for ACL Tx - end-point 1 could be a proprietary interface. Anyway, for best interoperability it should use end-point 2 as suggested by the BT spec.

@phantomblot-x We have a separate issue covering this:
#29107

@jfischer-no
Collaborator

I don't see how the host stack is supposed to guess that endpoint 1 is for ACL Tx - end-point 1 could be a proprietary interface. Anyway, for best interoperability it should use end-point 2 as suggested by the BT spec.

Before someone reads this here and believes it: that is nonsense. With USB we have device, interface, endpoint and other descriptors, which describe a device completely. There is no reason at all to work with hardcoded endpoint addresses, not even because of "suggestion" or "interoperability" 😉. If the driver on the host side cannot read and interpret them, then it is a bug in the driver. The interface description for hci_usb is clear: an interrupt endpoint for HCI events, bulk IN/OUT for ACL data, and HCI commands over the control endpoint. There must be no "proprietary interface" of any kind in the same interface.
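
To illustrate the descriptor-driven approach described above, here is a minimal sketch of how a host might assign HCI roles by endpoint type and direction instead of by hardcoded address. It assumes the standard 7-byte USB endpoint descriptor layout. The sample bytes are hypothetical (they are not dumped from Zephyr's hci_usb), and the function and role names are invented for this example rather than taken from any real driver.

```python
USB_DT_ENDPOINT = 0x05   # bDescriptorType value for an endpoint descriptor
XFER_BULK = 0x02         # bmAttributes transfer type: bulk
XFER_INTERRUPT = 0x03    # bmAttributes transfer type: interrupt
DIR_IN = 0x80            # bEndpointAddress bit 7 set means device-to-host


def classify_endpoints(descriptors):
    """Map HCI roles to the endpoint addresses found in raw descriptors."""
    roles = {}
    offset = 0
    while offset + 2 <= len(descriptors):
        length, dtype = descriptors[offset], descriptors[offset + 1]
        if length < 2:
            break  # malformed descriptor; stop instead of looping forever
        if dtype == USB_DT_ENDPOINT and length >= 7:
            address = descriptors[offset + 2]
            transfer = descriptors[offset + 3] & 0x03
            is_in = bool(address & DIR_IN)
            if transfer == XFER_INTERRUPT and is_in:
                roles["events"] = address      # HCI events
            elif transfer == XFER_BULK and is_in:
                roles["acl_rx"] = address      # ACL data, controller to host
            elif transfer == XFER_BULK and not is_in:
                roles["acl_tx"] = address      # ACL data, host to controller
        offset += length
    return roles


# Hypothetical interface: interrupt IN 0x81, bulk OUT 0x01, bulk IN 0x82.
SAMPLE = bytes([
    7, 0x05, 0x81, 0x03, 0x10, 0x00, 0x01,
    7, 0x05, 0x01, 0x02, 0x40, 0x00, 0x00,
    7, 0x05, 0x82, 0x02, 0x40, 0x00, 0x00,
])

roles = classify_endpoints(SAMPLE)
print({role: hex(address) for role, address in roles.items()})
```

With this approach the bulk OUT endpoint is picked up as the ACL Tx pipe whether its number is 1 or 2, which is why a host that reads the descriptors does not depend on the suggested addresses.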

@phantomblot-x

If the driver on the host side can not read and interpret it then it is a bug in the driver.

In Vol 4, Part B, section 2 of the BT spec it says:
"The endpoint numbers (labeled ‘Suggested Endpoint Address’ below) may be dynamically recognized upon driver initialization – this depends on the implementation."

A good host/driver would implement this (agreed), but some hosts may not - those hosts will not work with Zephyr controllers. Too bad!
If changing a number from 1 to 2 can make Zephyr controllers work out of the box with practically any host I'd say it is worth the change, it can't be that hard to change a number from 1 to 2.
