[csrng] Coverage of stalling behaviour at interfaces #15744

johngt · 2022-10-26T13:29:44Z

Coverage of stalling behaviour at interfaces? We should probably include this, as we’ve seen EDN issues around it.

original estimate 4

estimate 8
remaining 2022-11-15 4
remaining 2022-11-16 2
remaining 2022-11-21 0

johngt · 2022-11-08T10:30:40Z

CSRNG pushes back on EDN, EDN then pushes back on CSRNG, and so on. Both would then keep hanging.

marnovandermaas · 2022-11-15T10:43:57Z

We need coverage of the handshake between EDN and CSRNG as well as CSRNG and entropy source.

vogelpi · 2022-11-15T21:24:05Z

We discussed today that this also has high priority. @ctopal has been looking into that. It would be great to get an update tomorrow. :-)

ctopal · 2022-11-16T01:06:14Z

Overall findings:

In the DV environment we have a transaction-level model FIFO for the EDN side of the communication named csrng_cmd_fifo, which has NUM_HW_APPS numbered FIFOs. Each of those FIFOs runs a forever loop in a parallel fork, waiting for an item generated by m_edn_agent (which is basically driven manually using the add_h_user_data method for push_pull_agent).

As we are not configuring this push_pull_agent to have a FIFO and we use wait_cmd_ack, CSRNG will not store more than one request per "App" at the transaction level.

> We need coverage of the handshake between EDN and CSRNG as well as CSRNG and entropy source.

As I explained above, we don't have complicated behaviour in the communication between EDN and CSRNG. The connection between entropy source and CSRNG is even simpler, since it is a basic req/ack handshake rather than a ready/valid one. The common push_pull_agent DV block comes with its own covergroups, and as far as I can tell from the daily regression reports they are all getting hit.

Even after looking into all of this, I'm still struggling to understand what needs to be done here. I checked the discussion about EDN hanging when CSRNG ready drops (in #15561). It looks like, from the CSRNG block-level perspective, a coverpoint for this specific case might not be needed.

I'd really appreciate it if someone could help me identify what specifically is needed for this one to be considered covered.

andreaskurth · 2022-11-16T11:31:52Z

Thanks for summarizing your findings, @ctopal. The functional coverage that comes with push_pull_agent is already valuable as a first-order measure of how well stalling behavior at interfaces is covered.

However, as those interfaces are connected to FIFOs and arbiters controlled by state machines within CSRNG, I think it's worth adding functional coverage on at least the FIFOs in csrng_cmd_stage. Adding coverpoints on the depth_o signals of the two FIFOs would tell us how much "pressure" our tests exercise on the paths involving those FIFOs.

In a second step, I would add a cross between the full signal of the FIFO and the req/valid (for inputs) or the ack/ready (for outputs) of the port that fills or drains the FIFO, respectively. This would tell us if our tests cover the case when the FIFO is already full but the test wants to fill it further (for inputs) or not drain it (for outputs) in at least one cycle.
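To make the two steps above concrete, here is a minimal Python sketch of the bins such covergroups would collect. It is a behavioural toy model, not the CSRNG testbench code: the `FifoCoverage` class, the `run_trace` helper, and the rule that a push is blocked whenever the FIFO is full are all assumptions made for illustration. The real implementation would be SystemVerilog covergroups sampling depth_o and the full signal.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FifoCoverage:
    """Toy coverage collector for one FIFO (hypothetical, for illustration)."""
    max_depth: int
    depth_hits: Counter = field(default_factory=Counter)   # coverpoint on depth
    full_x_push: Counter = field(default_factory=Counter)  # cross: full x input valid
    full_x_pop: Counter = field(default_factory=Counter)   # cross: full x output ready

    def sample(self, depth, push_valid, pop_ready):
        """Record one cycle's worth of observations."""
        full = depth == self.max_depth
        self.depth_hits[depth] += 1
        self.full_x_push[(full, push_valid)] += 1
        self.full_x_pop[(full, pop_ready)] += 1

    def depth_holes(self):
        """Depth values that were never observed."""
        return [d for d in range(self.max_depth + 1) if self.depth_hits[d] == 0]

    @staticmethod
    def cross_holes(cross):
        """(full, signal) combinations that were never observed."""
        return [(f, s) for f in (False, True) for s in (False, True)
                if cross[(f, s)] == 0]


def run_trace(max_depth, trace):
    """Drive a simple FIFO model with (push_valid, pop_ready) pairs per cycle."""
    cov = FifoCoverage(max_depth)
    depth = 0
    for push_valid, pop_ready in trace:
        cov.sample(depth, push_valid, pop_ready)
        # Assumption: a push is dropped while full, a pop is ignored while empty.
        push = push_valid and depth < max_depth
        pop = pop_ready and depth > 0
        depth += int(push) - int(pop)
    return cov


if __name__ == "__main__":
    # Keep pushing without ever draining: the FIFO fills up and then stalls.
    cov = run_trace(max_depth=2, trace=[(True, False)] * 3)
    print("depth holes:", cov.depth_holes())
    print("full x push holes:", cov.cross_holes(cov.full_x_push))
    print("full x pop holes:", cov.cross_holes(cov.full_x_pop))
```

The bin `(True, True)` in `full_x_push` is the "FIFO already full but the test wants to fill it further" case, and `(True, False)` in `full_x_pop` is the "full and not being drained" case. A test that never hits them exerts no back-pressure on that path.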

ctopal · 2022-11-18T11:21:12Z

With #16390 we now have an overall view of the CSRNG internal stage FIFOs which is helpful to see if we are fully utilising them for each App. But we also need to have a checker for catching possible problems like #15469.

vogelpi · 2022-11-21T17:03:36Z

I've now reviewed the covergroups added by @ctopal as well as the resulting coverage based on the last nightly regression.

PR #16494 adds some ignore bins for cross points we're not expecting to hit at all (e.g. those covered by SVAs that we disable when doing FI testing) or not to hit for every possible command stage FIFO (where we have combined alerts for the FIFO errors). In addition, a new cover point is added for the read side of the command FIFOs to also cover the case when the FIFOs are not immediately read, as proposed by @andreaskurth. From a quick coverage run it seems that all the points are being hit, except for some points that we expect to be hit by the disable/re-enable testing. I've added comments for clarification for these.

So, once #16494 is merged, we can close this issue.
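As a rough illustration of what ignore bins do to a coverage report, the following Python sketch computes the holes of a cross with and without an ignore list. The function `coverage_report`, the bin names, and the example hit counts are hypothetical and not taken from PR #16494; in SystemVerilog the same effect comes from `ignore_bins` inside the cross.

```python
from itertools import product


def coverage_report(hits, bins_per_point, ignore=frozenset()):
    """Return (covered, total, holes) for a cross, skipping ignored bins.

    hits           -- dict mapping a cross tuple to its hit count
    bins_per_point -- one iterable of bin values per crossed coverpoint
    ignore         -- cross tuples that are not expected to be hit
    """
    all_bins = set(product(*bins_per_point)) - set(ignore)
    covered = {b for b in all_bins if hits.get(b, 0) > 0}
    return len(covered), len(all_bins), sorted(all_bins - covered)


if __name__ == "__main__":
    # Hypothetical cross: FIFO index x observed event.
    hits = {(0, "err"): 1, (0, "ok"): 5, (1, "ok"): 3}
    points = [range(2), ("ok", "err")]

    # Without ignore bins, (1, "err") shows up as a hole forever.
    print(coverage_report(hits, points))
    # With the bin ignored, the cross reports as fully covered.
    print(coverage_report(hits, points, ignore={(1, "err")}))
```

The point of the ignore list is that a hole which can never be closed no longer hides the holes that can.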

johngt added this to the Project: M2 milestone Oct 26, 2022

GregAC assigned andreaskurth and ctopal Nov 8, 2022

ctopal mentioned this issue Nov 16, 2022

[csrng,dv] Add CG to check FIFOs in cmd_stage #16390

Merged

vogelpi closed this as completed Nov 21, 2022
