net: Use k_fifo instead of k_work in RX and TX processing #34703
Using dedicated threads instead of a work queue thread that's only ever going to be asked to run one handler function makes a lot more sense, especially when you don't need delays or cancellation. If you need flush you can probably handle it better with the dedicated solution anyway.
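As a rough illustration of the pattern described in this comment, the sketch below models a dedicated handler that drains a FIFO and always calls the same function. It is portable, single-threaded C, not Zephyr code: `pkt_fifo`, `fifo_put`, `fifo_get`, `process_rx` and `rx_drain` are hypothetical names standing in for a `struct k_fifo` used with `k_fifo_put()` and `k_fifo_get()`, and the loop in `rx_drain` stands in for the body of a dedicated thread, which would block on the queue instead of returning when it is empty.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet. The first member is the link used by the queue,
 * similar to how items put on a Zephyr k_fifo reserve their first word.
 */
struct pkt {
	struct pkt *next;
	int len;
};

/* Hypothetical stand-in for struct k_fifo (no locking, single-threaded). */
struct pkt_fifo {
	struct pkt *head;
	struct pkt *tail;
};

/* Append a packet at the tail of the queue. */
static void fifo_put(struct pkt_fifo *q, struct pkt *p)
{
	assert(q != NULL && p != NULL);

	p->next = NULL;
	if (q->tail != NULL) {
		q->tail->next = p;
	} else {
		q->head = p;
	}
	q->tail = p;
}

/* Remove and return the oldest packet, or NULL if the queue is empty. */
static struct pkt *fifo_get(struct pkt_fifo *q)
{
	struct pkt *p = q->head;

	if (p != NULL) {
		q->head = p->next;
		if (q->head == NULL) {
			q->tail = NULL;
		}
		p->next = NULL;
	}
	return p;
}

/* The one handler this queue ever runs; returns the bytes processed. */
static int process_rx(struct pkt *p)
{
	return p->len;
}

/* Body of the dedicated handler: take packets in arrival order until
 * the queue is empty. A real thread would block here instead.
 */
static int rx_drain(struct pkt_fifo *q)
{
	int total = 0;
	struct pkt *p;

	while ((p = fifo_get(q)) != NULL) {
		total += process_rx(p);
	}
	return total;
}
```

Because the queue only ever feeds one handler, no per-item work descriptor is needed; the packet itself is the queue item.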
f6fd624
to
4c184b7
Compare
@alexanderwachter I converted 6locan to not use k_work in net_pkt. Unfortunately I have no way to test the changes, so would you be able to review them and, if possible, check that they work as expected?
@mniestroj @tsvehagen You seem to have modified the offloading esp wifi driver last time. The driver uses net_pkt for its internal use but we want to get rid of the k_work in net_pkt. Unfortunately the driver looked somewhat convoluted so I was not able to modify it. Any suggestions on how to solve this issue?
I think the use of k_work in net_pkt was quite a recent change. Not sure if it's easy to revert or if we should solve it some other way. What do you think @mniestroj?
Does anyone have time to review this one? It would be nice to get it merged by Friday.
The way the commits are split makes it easy to review and understand the changes. However, if someone runs git bisect while investigating their wifi esp code and happens to check out
- net: Use k_fifo instead of k_work in RX and TX processing
they will get a compile-time error. Maybe it would be better to squash the two commits
- drivers: wifi: esp: Temporary fix to allow compilation
- net: Use k_fifo instead of k_work in RX and TX processing
together?
Perhaps, or we could move the esp change before the main commit
I did not want to make big changes to the naming. The thread is basically doing "work" here, just that it is not happening via k_work. Function naming always leads to interesting discussion, so if there are suggestions for what these changed functions should be called, I am all ears.
4c184b7
to
400bc50
Compare
/** TX queue */
struct k_fifo tx_queue;
/** RX error queue */
struct k_fifo rx_err_queue;
/** Queue handler thread */
struct k_thread queue_handler;
/** Queue handler thread stack */
K_KERNEL_STACK_MEMBER(queue_stack, 512);
It looks like "600 bytes * number of CAN contexts" more RAM is used after this change. I am fine with this change as is, but in the future we might want to use a single thread for CAN, same as for PPP.
Yeah, the 6locan driver almost supports multiple contexts / interfaces. It looks like the purpose was to support multiple contexts, but there are some bits and pieces missing to fully support this. I did not want to start tweaking this driver for now, but it can be done later if really needed. I will leave this to @alexanderwachter to decide.
include/net/net_pkt.h
#if defined(CONFIG_WIFI_ESP)
/* TODO: The work is to be removed after the esp offloading
 * wifi driver is modified to use its own private work struct.
 */
struct k_work esp_work;
#endif
I need to find some time to fix/change the ESP driver; it won't happen this week. I wonder if there might be other downstream drivers that use pkt->work. The net_pkt_work() API was kind of public, so I wonder if we should follow the deprecation policy here for other drivers. So, for example, instead of introducing esp_work and removing work, add something like:
struct net_pkt {
	union {
#ifdef CONFIG_NET_PKT_WORK
		/** Internal variable that is used when packet is sent
		 * or received.
		 */
		struct k_work work;
#endif
		...
with hidden Kconfig option:
config NET_PKT_WORK
	bool
which could be selected by ESP driver and any other downstream drivers or L2 layers.
The k_work is strictly for internal use so we are free to remove it if needed. Any downstream user using it will need to change its code.
Then what makes struct k_work work; different from struct net_pkt_cursor cursor;? They are both described as "Internal variable..." or "Internal buffer...", which means that neither should be used by downstream code, right? But if someone wants to implement a driver like ESP-AT (but for a different chip), then net_pkt_cursor_init() still needs to be used to properly initialize a packet before pushing it to other networking layers. This, however, makes pkt->cursor no longer internal. Well, it all depends on what "internal" really means.
Maybe we should better describe which structure members are reserved by the native networking stack? And prevent networking drivers from using them, like ESP-AT does right now with pkt->work?
Yes, that would be a good thing to do. It is a bit difficult to prevent the code from using the "internal" variables as we need to have net_pkt in the "open". The access functions for the internal variables are not documented so any downstream code should not use these vars anyway. For your cursor example, the cursor manipulation API is there and documented so I would not consider it as internal. The apps should never access the variables directly anyway.
@jukkar I will look into the ESP driver, but not this week. Unfortunately reverting to k_fifo (as was used before) is not an option, due to bugs that using
Yes, I understand. I looked at the driver but it seemed quite a lot of work so did not try to do it myself. I think we should merge this PR like this and ESP changes can come when they are ready.
400bc50
to
ec308c3
Compare
ec308c3
to
966e987
Compare
I agree that this patchset leads to mixed feelings. On one hand, it definitely cleans up structure and processing: if we need to decouple IRQ-based and normal processing, it is quite natural to put packets in a queue. On the other hand, that leads to a proliferation of different queues and, most importantly, handler threads which are just duplicates of each other and differ only in which actual handler function they call. It immediately leads to the idea of adding a handler callback to net_pkt, and a common handler thread which would just call the handler. But that's apparently the reason why a workqueue was used in the first place! It offers just that functionality, except it has the stipulation which net_pkt handling doesn't adhere to. All in all, I guess it's a good change, but worth optimizing further to minimize the number of processing threads (and their stacks) somehow. (Not giving +1 right away, as I didn't test these changes yet.)
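The idea floated in this comment (a handler callback carried by each packet, plus one common dispatcher) could look roughly like the sketch below. It is a portable, single-threaded C model and not something this PR implements: `cb_pkt`, `cb_queue`, `cb_queue_put`, `cb_dispatch_all`, `handle_tx` and `handle_rx_err` are all hypothetical names, and `cb_dispatch_all` stands in for the loop of a single shared thread.

```c
#include <stddef.h>

struct cb_pkt;

/* Hypothetical per-packet handler type. */
typedef void (*cb_pkt_handler_t)(struct cb_pkt *p);

struct cb_pkt {
	struct cb_pkt *next;      /* queue link */
	cb_pkt_handler_t handler; /* what to do with this packet */
	int *counter;             /* where the handler records its work */
};

struct cb_queue {
	struct cb_pkt *head;
	struct cb_pkt *tail;
};

/* Append a packet at the tail of the shared queue. */
static void cb_queue_put(struct cb_queue *q, struct cb_pkt *p)
{
	p->next = NULL;
	if (q->tail != NULL) {
		q->tail->next = p;
	} else {
		q->head = p;
	}
	q->tail = p;
}

/* Common dispatcher: pop every queued packet in order and call the
 * handler stored in it. Returns the number of packets handled.
 */
static int cb_dispatch_all(struct cb_queue *q)
{
	int handled = 0;

	while (q->head != NULL) {
		struct cb_pkt *p = q->head;

		q->head = p->next;
		if (q->head == NULL) {
			q->tail = NULL;
		}
		p->next = NULL;
		p->handler(p);
		handled++;
	}
	return handled;
}

/* Two example handlers that would otherwise each need their own thread. */
static void handle_tx(struct cb_pkt *p)
{
	*p->counter += 1;
}

static void handle_rx_err(struct cb_pkt *p)
{
	*p->counter += 100;
}
```

One shared dispatcher needs only one stack, at the cost of serializing all handlers behind a single queue, which is essentially the trade-off a work queue makes.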
seems wrong. IMO just I made some measurements with
when taking average from multiple measurements in TX path. In case of RX, I see maybe 1us difference from total of 25us - too little to make any conclusions. |
I do like I don't like zephyr/subsys/net/l2/ppp/fsm.c Lines 537 to 545 in 126da28
is simply wrong. Calling either k_work_submit or k_fifo_put can reschedule threads, so the original problem is still there, but possibly is not reproduced that often.
The number of packets is shown there, for k_work it was 21922 and for fifo 15079.
You do not have any TC threads in your tests (note that the default number of TX threads is now 0), so you will not see any difference in your test between k_work and k_fifo. |
Okay. I used around 700 (and sometimes 7000) which is equivalent of 100 (1000) runs of
I do have TX TC thread. I forgot to mention, that I specifically added |
The generic net changes LGTM.
966e987
to
ac009f8
Compare
Fixed merge conflict
ac009f8
to
0fc857c
Compare
ESP changes LGTM
@jukkar please rebase
Following commits will remove k_work from net_pkt, so convert 6locan L2 to use k_fifo between application and TX thread, and driver and RX error handler. Signed-off-by: Jukka Rissanen <[email protected]>
Following commits will remove k_work from net_pkt, so convert PPP L2 to use k_fifo when sending PPP data. Signed-off-by: Jukka Rissanen <[email protected]>
The k_work handler cannot manipulate the used k_work. This means
that it is not easy to cleanup the net_pkt because it contains
k_work in it. Because of this, use k_fifo instead between
RX thread and network driver, and between application and TX
thread.

A echo-server/client run with IPv4 and UDP gave following
results:

Using k_work
------------
TX traffic class statistics:
TC  Priority  Sent pkts  bytes    time
[0] BK (1)    21922      5543071  103 us [0->41->26->34=101 us]
[1] BE (0)    0          0        -
RX traffic class statistics:
TC  Priority  Recv pkts  bytes    time
[0] BK (0)    0          0        -
[1] BE (0)    21925      6039151  97 us [0->21->16->37->20=94 us]

Using k_fifo
------------
TX traffic class statistics:
TC  Priority  Sent pkts  bytes    time
[0] BK (1)    15079      3811118  94 us [0->36->23->32=91 us]
[1] BE (0)    0          0        -
RX traffic class statistics:
TC  Priority  Recv pkts  bytes    time
[0] BK (1)    0          0        -
[1] BE (0)    15073      4150947  79 us [0->17->12->32->14=75 us]

So using k_fifo gives about 10% better performance with same
workload.

Fixes zephyrproject-rtos#34690

Signed-off-by: Jukka Rissanen <[email protected]>
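For reference, the relative change in the average per-packet times quoted in this commit message (the first time figure on each non-empty row) can be worked out as follows. This is plain arithmetic on the reported numbers, not an additional measurement, and it assumes the reduction is taken relative to the k_work figure.

```latex
\begin{align*}
\text{TX: } \frac{103 - 94}{103} &= \frac{9}{103} \approx 8.7\% \\
\text{RX: } \frac{97 - 79}{97} &= \frac{18}{97} \approx 18.6\%
\end{align*}
```

The TX figure is close to the "about 10%" stated in the commit message, while the RX figure is larger.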
0fc857c
to
2f435fc
Compare
For reference, I tested this (with master of f05ea67) on frdm_k64f with dumb_http_server/big_http_download samples, both work well. In my testing, big_http_download has higher throughput without this patch, but that's not scientific, as I downloaded different amounts of data each time, and it's affected by Internet connection anyway. Trying to test on qemu_x86, I hit #34964 (not related to this patch, present in master too), so didn't look into it further. |