Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enc424j600 does not work #35091

Closed
thedjnK opened this issue May 11, 2021 · 8 comments · Fixed by #35439
Closed

enc424j600 does not work #35091

thedjnK opened this issue May 11, 2021 · 8 comments · Fixed by #35439
Assignees
Labels
area: Ethernet area: Networking bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug
Milestone

Comments

@thedjnK
Copy link
Collaborator

thedjnK commented May 11, 2021

Describe the bug
We are using a enc424j600 device with our BL5340 board (nRF5340-based) and running the sample apps for network (e.g. dumb_http_server) results in a failed networking start up. There is a strange way to get it working however, if the module is booted with the enc424j600's interrupt line disconnect and then connected after a bootup

To Reproduce
Build dumb_http_server for enc424j600 and run it

Expected behavior
With the changes made:

*** Booting Zephyr OS build zephyr-v2.4.0-3426-g70ec734b87c3  ***
[00:00:00.368,804] <dbg> ethdrv.enc424j600_init_filters: ERXFCON: 0x005b
[00:00:00.369,140] <dbg> ethdrv.enc424j600_init_phy: PHANA: 0x05e1
[00:00:00.369,354] <dbg> ethdrv.enc424j600_init_phy: PHCON1: 0x1200
[00:00:00.369,537] <dbg> ethdrv.enc424j600_init: EIE: 0x8850
[00:00:00.369,598] <dbg> ethdrv.enc424j600_init: ECON1: 0x0001
[00:00:00.369,659] <inf> ethdrv: ENC424J600 Initialized
[00:00:00.375,549] <inf> net_config: Initializing network
[00:00:00.375,549] <inf> net_config: Waiting interface 1 (0x20000f0c) to be up...
Single-threaded dumb HTTP server waits for a connection on port 8080...
[00:00:05.609,771] <err> ethdrv: handling int
[00:00:05.609,893] <dbg> ethdrv.enc424j600_rx_thread: ESTAT: 0xdf00
[00:00:05.609,924] <inf> ethdrv: Link up
[00:00:05.610,137] <dbg> ethdrv.enc424j600_setup_mac: PHANLPA: 0x85e1
[00:00:05.610,321] <inf> ethdrv: 100Mbps
[00:00:05.610,321] <inf> ethdrv: full duplex
[00:00:05.610,473] <dbg> ethdrv.enc424j600_setup_mac: MACON2: 0x40b3
[00:00:05.610,504] <dbg> ethdrv.enc424j600_setup_mac: MAMXFL (maximum frame length): 1518
[00:00:05.610,534] <inf> ethdrv: Not suspended
[00:00:05.610,565] <inf> net_config: Interface 1 (0x20000f0c) coming up
[00:00:05.610,656] <inf> net_config: IPv4 address: 192.168.1.55

Impact
Driver is useless, network is useless, showstopper

Logs and console output
Output with interrupt line connected as normal:

*** Booting Zephyr OS build zephyr-v2.4.0-3426-g70ec734b87c3  ***
[00:00:00.629,699] <dbg> ethdrv.enc424j600_init_filters: ERXFCON: 0x005b
[00:00:00.630,035] <dbg> ethdrv.enc424j600_init_phy: PHANA: 0x05e1
[00:00:00.630,218] <dbg> ethdrv.enc424j600_init_phy: PHCON1: 0x1200
[00:00:00.630,432] <dbg> ethdrv.enc424j600_init: EIE: 0x8850
[00:00:00.630,493] <dbg> ethdrv.enc424j600_init: ECON1: 0x0001
[00:00:00.630,523] <err> ethdrv: handling int
[00:00:00.630,615] <inf> ethdrv: ENC424J600 Initialized
[00:00:00.630,676] <dbg> ethdrv.enc424j600_rx_thread: ESTAT: 0xda00
[00:00:00.630,706] <inf> ethdrv: Link down
[00:00:00.630,737] <err> ethdrv: handling int
[00:00:00.630,859] <dbg> ethdrv.enc424j600_rx_thread: ESTAT: 0x5a00
[00:00:00.630,859] <err> ethdrv: Unknown Interrupt, EIR: 0x0700
[00:00:00.636,718] <inf> net_config: Initializing network
[00:00:00.636,749] <inf> net_config: Waiting interface 1 (0x20000f0c) to be up...
Single-threaded dumb HTTP server waits for a connection on port 8080...
[00:00:24.637,725] <err> net_config: Timeout while waiting network interface
[00:00:24.637,725] <err> net_config: Network initialization failed (-115)

Environment (please complete the following information):

  • OS: Windows 10 64-bit
  • Toolchain: GNU ARM GCC
  • Commit SHA: From nRF connect SDK 1.5.99
@thedjnK thedjnK added the bug The issue is a bug, or the PR is fixing a bug label May 11, 2021
@jfischer-no jfischer-no self-assigned this May 11, 2021
@galak galak added this to the v2.6.0 milestone May 11, 2021
@jfischer-no jfischer-no added priority: low Low impact/importance bug and removed priority: medium Medium impact/importance bug labels May 11, 2021
@jfischer-no
Copy link
Collaborator

I can not reproduce it with link_board_eth shield connected to nrf5340dk_nrf5340. Please check interrupt (INT) line, level shifter, pullups on your board.

@nordicjm
Copy link
Collaborator

@jfischer-no There are 3 bits for interrupts that are not implemented as per the datasheet http://ww1.microchip.com/downloads/en/devicedoc/39935b.pdf page 120 in the high byte (2 bits in the lower byte too), the lower 3 bits are reserved, that is "Ignore on read, don’t care on write", in the chip I have here, they are being returned as set hence the following error "[00:00:00.259,124] ethdrv: Unknown Interrupt, EIR: 0x0700"

The code in the driver works by waiting for an interrupt, then in a thread it waits for a semaphore and once received, disables interrupts on the enc424j600, it reads and then handles the interrupt and re-enables interrupts except in the case of it being an invalid interrupt, in which case it leaves interrupts disabled. This is not valid operation.

	while (true) {
		k_sem_take(&context->int_sem, K_FOREVER);

		enc424j600_clear_sfru(context->dev, ENC424J600_SFR3_EIEL,
				      ENC424J600_EIE_INTIE); <-- disable interrupts
		enc424j600_read_sfru(context->dev, ENC424J600_SFRX_EIRL, &eir);
		enc424j600_read_sfru(context->dev,
				     ENC424J600_SFRX_ESTATL, &estat);
		LOG_DBG("ESTAT: 0x%04x", estat);

		if (eir & ENC424J600_EIR_PKTIF) {
...
		} else if (eir & ENC424J600_EIR_LINKIF) {
...
		} else {
			LOG_ERR("Unknown Interrupt, EIR: 0x%04x", eir);
			continue; <-- skips the below line and leaves interrupts disabled, this is the bug
		}

		enc424j600_set_sfru(context->dev, ENC424J600_SFR3_EIEL,
				    ENC424J600_EIE_INTIE); <-- enable interrupts

So the issue may be that you have a different silicon revision of the enc424j600 chip or for any other number of reasons, however, the bug is with the driver code.

@nordicjm
Copy link
Collaborator

It could also be related to errata number 4 as per https://ww1.microchip.com/downloads/en/DeviceDoc/80477A.pdf

@nordicjm
Copy link
Collaborator

A logic trace of the interrupt line:
image
The first is the expected one and the second one what triggers the 'unknown interrupt', indeed the ESTAT register does show that an interrupt is not pending, so this could also be handled by only checking the interrupt source if the bit in ESTAT is set and if not, it continues on and re-enables the interrupt, or reads ESTAT first and doesn't disable the interrupt at all if that bit is not set since there is no interrupt to handle.

@nordicjm
Copy link
Collaborator

In fact that might even work better, moving the ESTAT read first and then only running the rest of the code if the MSB is set gets rid of the second blip altogether:
image
I can change the fix to do that instead if preferred, thoughts?

@jfischer-no
Copy link
Collaborator

Only two interrupt source are enabled PHYLNK and PKTCNT. "the INT pin is driven low and the INT flag (ESTAT<15>) becomes set" only for these interrupt sources, not for any others. continue at L:487 is intentional to detect errors like spurious interrupts.

There are 3 bits for interrupts that are not implemented as per the datasheet http://ww1.microchip.com/downloads/en/devicedoc/39935b.pdf page 120 in the high byte (2 bits in the lower byte too), the lower 3 bits are reserved, that is "Ignore on read, don’t care on write", in the chip I have here, they are being returned as set hence the following error "[00:00:00.259,124] ethdrv: Unknown Interrupt, EIR: 0x0700"

It is not relevant because these interrupt sources are not enabled by the driver, and thus will not generate interrupt.

The code in the driver works by waiting for an interrupt, then in a thread it waits for a semaphore and once received, disables interrupts on the enc424j600, it reads and then handles the interrupt and re-enables interrupts except in the case of it being an invalid interrupt, in which case it leaves interrupts disabled. This is not valid operation.

It is valid, I intentionally designed it that way, see above.

So the issue may be that you have a different silicon revision of the enc424j600 chip or for any other number of reasons, however, the bug is with the driver code.

Well that can be, surely you have checked your hardware and signal integrity, then it must be the driver.

The first is the expected one and the second one what triggers the 'unknown interrupt', indeed the ESTAT register does show that an interrupt is not pending,....

Maybe it is not the driver?

@nordicjm
Copy link
Collaborator

I enclose a zip with 2 traces and 2 outputs, one from before the changes and one from after. Logic traces can be viewed with saleae's viewer available from https://support.saleae.com/logic-software channel 0 is SPI clock, channel 1 is SPI MOSI, channel 2 is SPI MISO and channel 3 is the interrupt. You can use the saleae logic SPI decoder to view the SPI traffic from a glance.
eth_checks.zip

@jfischer-no
Copy link
Collaborator

For the record: I was able to reproduce it on nRF5340DK by changing used SPI node to spi2 (the same used on BL5340 board). I can also confirm that clearing INTIE flag in EIE register right after reset (SETETHRST) fixes the issue.
Since I do not agree with the proposed changes in #35092, I have opened alternative #35439.

jfischer-no added a commit to jfischer-no/zephyr that referenced this issue May 20, 2021
After system reset (SETETHRST) interrupt enable register (EIE)
has the default value 0x8010 and global interrupt enable flag (INTIE)
is set. This is not desired and the INTIE flag should be set only at
the end of the initialization.

Disable INTIE flag and set desired interrupts sources in
a single write command just right after system reset.

Resolves: zephyrproject-rtos#35091

Reported-by: Jamie McCrae <[email protected]>
Signed-off-by: Johann Fischer <[email protected]>
nashif pushed a commit that referenced this issue May 21, 2021
After system reset (SETETHRST) interrupt enable register (EIE)
has the default value 0x8010 and global interrupt enable flag (INTIE)
is set. This is not desired and the INTIE flag should be set only at
the end of the initialization.

Disable INTIE flag and set desired interrupts sources in
a single write command just right after system reset.

Resolves: #35091

Reported-by: Jamie McCrae <[email protected]>
Signed-off-by: Johann Fischer <[email protected]>
andvib pushed a commit to andvib/zephyr that referenced this issue May 31, 2021
…r reset

After system reset (SETETHRST) interrupt enable register (EIE)
has the default value 0x8010 and global interrupt enable flag (INTIE)
is set. This is not desired and the INTIE flag should be set only at
the end of the initialization.

Disable INTIE flag and set desired interrupts sources in
a single write command just right after system reset.

Resolves: zephyrproject-rtos#35091

Reported-by: Jamie McCrae <[email protected]>
Signed-off-by: Johann Fischer <[email protected]>
(cherry picked from commit de974ef)
Signed-off-by: Andrzej Puzdrowski <[email protected]>
jori-nordic pushed a commit to jori-nordic/zephyr that referenced this issue Jun 15, 2021
…r reset

After system reset (SETETHRST) interrupt enable register (EIE)
has the default value 0x8010 and global interrupt enable flag (INTIE)
is set. This is not desired and the INTIE flag should be set only at
the end of the initialization.

Disable INTIE flag and set desired interrupts sources in
a single write command just right after system reset.

Resolves: zephyrproject-rtos#35091

Reported-by: Jamie McCrae <[email protected]>
Signed-off-by: Johann Fischer <[email protected]>
(cherry picked from commit de974ef)
Signed-off-by: Andrzej Puzdrowski <[email protected]>
(cherry picked from commit 1c6eaa2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Ethernet area: Networking bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug
Projects
None yet
5 participants