Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[csrng] Coverage of stalling behaviour at interfaces #15744

Closed
johngt opened this issue Oct 26, 2022 · 7 comments
Closed

[csrng] Coverage of stalling behaviour at interfaces #15744

johngt opened this issue Oct 26, 2022 · 7 comments
Assignees
Labels
Component:DV DV issue: testbench, test case, etc. IP:csrng Milestone:V2 Priority:P1 Priority: high
Milestone

Comments

@johngt
Copy link

johngt commented Oct 26, 2022

Coverage of stalling behaviour at interfaces? Probably should include this as we’ve seen EDN issues around this

original estimate 4

estimate 8
remaining 2022-11-15 4
remaining 2022-11-16 2
remaining 2022-11-21 0

@johngt johngt added this to the Project: M2 milestone Oct 26, 2022
@johngt
Copy link
Author

johngt commented Nov 8, 2022

CSRNG pushback on EDN. EDN then pushback on CSRNG. Etc. Both would then keep hanging

@marnovandermaas
Copy link
Contributor

We need coverage of the handshake between EDN and CSRNG as well as CSRNG and entropy source.

@vogelpi
Copy link
Contributor

vogelpi commented Nov 15, 2022

We discussed today that this has also high priority. @ctopal has been looking into that. It would be great to get some update tomorrow. :-)

@ctopal
Copy link
Contributor

ctopal commented Nov 16, 2022

Overall findings:

In the DV environment we have a transaction level model FIFO for EDN side of the communication named csrng_cmd_fifo which is has NUM_HW_APPS numbered FIFOs. Each of those FIFOs run a forever loop in parallel forks, waiting for an item generated from m_edn_agent (that is basically driven manually using add_h_user_data method for push_pull_agent).

As we are not configuring this push_pull_agent to have a FIFO and we use wait_cmd_ack, CSRNG will not store more than one request per "App" in transaction level.

We need coverage of the handshake between EDN and CSRNG as well as CSRNG and entropy source.

As I explained above, we don't have complicated behaviour in communication between EDN and CSRNG. Entropy source and CSRNG connection is even simpler considering it is a basic req/ack handshake rather than ready/valid one. The common push_pull_agent DV block comes with its own covergroups and as far as I can tell from the daily regression reports they are all getting hit.

So with all this looking around and such, I'm still struggling to understand what is needed to be done here. I checked the discussion for EDN hanging when CSRNG ready drops (in #15561). Looks like from CSRNG block level perspective a coverpoint for this specific deal might not be needed.

I'd really appreciate if someone can help me identify what specifically would be needed for this one to be considered covered.

@andreaskurth
Copy link
Contributor

Thanks for summarizing your findings, @ctopal. The functional coverage that comes with push_pull_agent is already valuable as first-order measure of how well stalling behavior at interfaces is covered.

However, as those interfaces are connected to FIFOs and arbiters controlled by state machines within CSRNG, I think it's worth adding functional coverage on at least the FIFOs in csrng_cmd_stage. Adding coverpoints on the depth_o signals of the two FIFOs would tell us how much "pressure" our tests exercise on the paths involving those FIFOs.

In a second step, I would add a cross between the full signal of the FIFO and the req/valid (for inputs) or the ack/ready (for outputs) of the port that fills or drains the FIFO, respectively. This would tell us if our tests cover the case when the FIFO is already full but the test wants to fill it further (for inputs) or not drain it (for outputs) in at least one cycle.

@ctopal
Copy link
Contributor

ctopal commented Nov 18, 2022

With #16390 we now have an overall view of the CSRNG internal stage FIFOs which is helpful to see if we are fully utilising them for each App. But we also need to have a checker for catching possible problems like #15469.

@vogelpi
Copy link
Contributor

vogelpi commented Nov 21, 2022

I've now reviewed the covergroups added by @ctopal as well as the resulting coverage based on the last nightly regression.

PR #16494 adds some ignore bins for cross points we're not expecting to hit at all (e.g. those covered by SVAs that we disable when doing FI testing) or not to hit for every possible command stage FIFO (where we have combined alerts for the FIFO errors). In addition, a new cover point is added for the read side of the command FIFOs to cover also the case when the FIFOs are not immediately read as proposed by @andreaskurth . From a quick coverage run it seems that all the points are being hit, except for some points that we expect to be hit by the disable/re-enable testing. I've added comments for clarification for these.

So, once #16494 is being merged, we can close this issue.

@vogelpi vogelpi closed this as completed Nov 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component:DV DV issue: testbench, test case, etc. IP:csrng Milestone:V2 Priority:P1 Priority: high
Projects
None yet
Development

No branches or pull requests

5 participants