
DNS resolver misbehaves if receiving response too late #33101

Closed
hubertmis opened this issue Mar 7, 2021 · 3 comments
Labels: bug, priority: low


@hubertmis
Member

Describe the bug
I tried to enable the date_time library from the nRF Connect SDK (https://github.com/nrfconnect/sdk-nrf/blob/master/lib/date_time/date_time.c) on an OpenThread network. This library tries to resolve the addresses of time servers during system initialization. However, OpenThread is not ready to transmit packets such as DNS queries during initialization, so the packets are queued and only transmitted a few seconds later, when the node attaches to the network.

The library behaves incorrectly in these circumstances. It looks like some buffers leak, and the DNS resolver is unusable until I reset the device.

I looked for the cause of the problem, and my hypothesis is that the root cause is in https://github.com/zephyrproject-rtos/zephyr/blob/master/subsys/net/lib/dns/resolve.c. I think the failing program flow is as follows:

  1. The system initializes
  2. A library requests resolution of server addresses (a DNS request is queued for TX)
  3. OpenThread starts joining the network
  4. System initialization ends
  5. The resolver timeout occurs and the library requests resolution of other addresses (another DNS request is queued for TX)
  6. OpenThread joins the network and sends the DNS requests from the queue
  7. The node receives DNS responses to requests that have already timed out
  8. The resolver parses the responses to timed-out requests as if they were still valid

I think the problem is caused by the resolver module not discarding responses to requests that have already timed out (or been cancelled).
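To illustrate the behavior I would expect, here is a minimal sketch in plain C. It is not the code in resolve.c: `struct pending_query`, `query_start`, `query_expire`, `response_match` and `MAX_QUERIES` are hypothetical names made up for this example. The idea is that a timeout or cancel releases the slot, and a response is accepted only if its DNS transaction ID matches a slot that is still waiting.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_QUERIES 4 /* hypothetical table size */

/* Hypothetical pending-query slot; not the layout used in resolve.c. */
struct pending_query {
    uint16_t id;  /* DNS transaction ID sent in the query header */
    bool in_use;  /* true while the query is still waiting for a response */
};

static struct pending_query queries[MAX_QUERIES];

/* Claim a free slot for a new query. Returns the slot index, or -1 if full. */
int query_start(uint16_t id)
{
    for (int i = 0; i < MAX_QUERIES; i++) {
        if (!queries[i].in_use) {
            queries[i].id = id;
            queries[i].in_use = true;
            return i;
        }
    }
    return -1;
}

/* Called on timeout or cancel: release the slot so that a late response
 * no longer matches it. */
void query_expire(int slot)
{
    if (slot >= 0 && slot < MAX_QUERIES) {
        queries[slot].in_use = false;
    }
}

/* Called for every received response. Returns the matching slot and
 * releases it, or returns -1 when the response must be dropped because
 * no query with this ID is still waiting. */
int response_match(uint16_t id)
{
    for (int i = 0; i < MAX_QUERIES; i++) {
        if (queries[i].in_use && queries[i].id == id) {
            queries[i].in_use = false;
            return i;
        }
    }
    return -1;
}
```

With this scheme a response that arrives after `query_expire()` is simply dropped. The sketch assumes transaction IDs are not reused right away; otherwise a late response could match a newer query that happens to carry the same ID.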

To Reproduce
I can easily reproduce the problem using my project and custom PCBs. I have not tried to reproduce it with any dev kits or other projects.

Expected behavior
The DNS resolver remains usable even if responses to queries arrive after the query timeout.

Impact
I created a workaround in my application that delays resolution requests for several seconds.
However, I'm afraid an attacker could exploit this issue to make a device unusable by delaying DNS responses.

@hubertmis hubertmis added the bug The issue is a bug, or the PR is fixing a bug label Mar 7, 2021
@pabigot
Collaborator

pabigot commented Mar 7, 2021

The DNS subsystem is one of the ones that doesn't use the work API correctly, which could result in this behavior; see #33104.

From when I looked at it a couple of months ago, I'm not sure it safely manages allocation of query slots either, at least not on multiprocessors or with preemptive threads. There are no locks involved with allocating or releasing them.
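As a rough illustration of guarded slot allocation, here is a minimal sketch. It uses a POSIX mutex as a stand-in for whatever kernel lock the resolver would actually use, and `slot_used`, `slot_alloc`, `slot_free` and `SLOT_COUNT` are hypothetical names, not identifiers from the DNS library.

```c
#include <pthread.h>
#include <stdbool.h>

#define SLOT_COUNT 2 /* hypothetical number of query slots */

static bool slot_used[SLOT_COUNT];
static pthread_mutex_t slot_lock = PTHREAD_MUTEX_INITIALIZER;

/* Find and claim a free slot while holding the lock, so two threads
 * cannot both observe the same slot as free. Returns -1 if none is free. */
int slot_alloc(void)
{
    int found = -1;

    pthread_mutex_lock(&slot_lock);
    for (int i = 0; i < SLOT_COUNT; i++) {
        if (!slot_used[i]) {
            slot_used[i] = true;
            found = i;
            break;
        }
    }
    pthread_mutex_unlock(&slot_lock);

    return found;
}

/* Release a slot under the same lock; out-of-range indexes are ignored. */
void slot_free(int slot)
{
    if (slot < 0 || slot >= SLOT_COUNT) {
        return;
    }

    pthread_mutex_lock(&slot_lock);
    slot_used[slot] = false;
    pthread_mutex_unlock(&slot_lock);
}
```

The point is only that the search for a free slot and the marking of it as used happen inside one critical section. Without that, a preempting thread could claim the same slot between the check and the update.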

@jukkar
Member

jukkar commented Mar 8, 2021

> There are no locks involved with allocating or releasing them.

👍 yes, locks are indeed missing from the library and should be added. Any volunteers for implementing this?

@nashif nashif added the priority: low Low impact/importance bug label Mar 8, 2021
@galak galak added this to the v2.6.0 milestone May 11, 2021
@jukkar
Member

jukkar commented Jun 1, 2021

This is already fixed by #33217

@jukkar jukkar closed this as completed Jun 1, 2021