-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
drivers: eth_enc424j600: explicitly disable INTIE after reset #35439
Conversation
drivers/ethernet/eth_enc424j600.c
Outdated
* Terminate interrupt handling thread because | ||
* the integrity of the driver is no longer given. | ||
*/ | ||
k_oops(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not accept this, the linux kernel driver ignores any unknown interrupts, no other zephyr ethernet driver does this. This k_oops and the continue needs to go entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a big fan of k_oops()
, the device stop running after that which might not be desirable in embedded device.
Is it so that getting into this branch should not be possible after the changes in second commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it so that getting into this branch should not be possible after the changes in second commit?
It should never be the case. The second commit fixes a bug that was found thanks to this brunch. Faulty driver should not just keep running. AFAIK we do not have any possibilities to tell the stack or application about an error from driver thread. I might be exaggerating, I could probably live with hiding it behind LOG_LEVEL_DBG, and it would also fit the style of the driver. But logging is for developer, in the field such an error would remains undetected.
diff --git a/drivers/ethernet/eth_enc424j600.c b/drivers/ethernet/eth_enc424j600.c
index 417b66ae07..f86718f319 100644
--- a/drivers/ethernet/eth_enc424j600.c
+++ b/drivers/ethernet/eth_enc424j600.c
@@ -484,7 +484,9 @@ static void enc424j600_rx_thread(struct enc424j600_runtime *context)
}
} else {
LOG_ERR("Unknown Interrupt, EIR: 0x%04x", eir);
- continue;
+ if (CONFIG_ETHERNET_LOG_LEVEL == LOG_LEVEL_DBG) {
+ k_oops();
+ }
}
enc424j600_set_sfru(context->dev, ENC424J600_SFR3_EIEL,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue that this bug fixed wouldn't have been an issue at all had the continue/k_oops not been there, the driver would have ignore the first spurious interrupt and then continued to process the second valid interrupt and functioned properly. Having just a log output with no continue/k_oops is fine with me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could probably live with hiding it behind LOG_LEVEL_DBG
Your proposal looks like a good tradeoff 👍
About the error happening in the field, that is certainly possible, but in that case perhaps it is better to ignore/skip the issue than to halt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue that this bug fixed wouldn't have been an issue at all had the continue/k_oops not been there, the driver would have ignore the first spurious interrupt and then continued to process the second valid interrupt and functioned properly. Having just a log output with no continue/k_oops is fine with me.
So you mean that there is no other bug except the one suspected in line 487? So is it okay for your products that there is an interrupt in the Ethernet driver firing for no reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drivers and applications are designed to cope with simplistic errors, e.g. a false interrupt, which could be caused by many things including a error in the silicon (of which Microchip note there is a bug in the enc424j600 silicon which causes a spurious interrupt), an error in the MCU silicon or firmware or even from an external source e.g. by a motor. You show me where in the linux kernel (which is used in thousands of embedded devices) whereby a network or modem driver will kill the system if it receives an unexpected interrupt or command:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just wondering about the use of k_oops
drivers/ethernet/eth_enc424j600.c
Outdated
* Terminate interrupt handling thread because | ||
* the integrity of the driver is no longer given. | ||
*/ | ||
k_oops(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a big fan of k_oops()
, the device stop running after that which might not be desirable in embedded device.
Is it so that getting into this branch should not be possible after the changes in second commit?
The controller has several interrupt sources which are signaled via a single INT pin. Only the interrupt sources that are explicitly switched on during controller initialization may generate an interrupt signal. Currently there are only PHY Link Status Change Interrupt and RX Packet Pending Interrupt enabled. So there is no other reason why an interrupt can be triggered. Terminate interrupt handling thread on unknown interrupt only when debugging, as there are concerns that stopping thread in the field is going too far. Signed-off-by: Johann Fischer <[email protected]>
After system reset (SETETHRST) interrupt enable register (EIE) has the default value 0x8010 and global interrupt enable flag (INTIE) is set. This is not desired and the INTIE flag should be set only at the end of the initialization. Disable INTIE flag and set desired interrupts sources in a single write command just right after system reset. Resolves: zephyrproject-rtos#35091 Reported-by: Jamie McCrae <[email protected]> Signed-off-by: Johann Fischer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Johann 👍 LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not happy with the k_oops, no way should that be this code at all, but it's excluded by default so I'm done trying to explain driver development
After system reset (SETETHRST) interrupt enable register (EIE)
has the default value 0x8010 and global interrupt enable flag (INTIE)
is set. This is not desired and the INTIE flag should be set only at
the end of the initialization.
Disable INTIE flag and set desired interrupts sources in
a single write command just right after system reset.
Also clarify how unknown interrupt source should be handled.
Alternative fix for #35091
Fixes #35091