Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ospf6d: Stop crash in ospf6_write #13897

Merged
merged 1 commit into from
Jul 2, 2023

Conversation

donaldsharp
Copy link
Member

I'm seeing crashes in ospf6_write on the assert(node). The only sequence of events that I see that could possibly cause this to happen is this:

a) Someone has scheduled a outgoing write to the ospf6->t_write and placed item(s) on the ospf6->oi_write_q
b) A decision is made in ospf6_send_lsupdate() to send an immediate packet via a event_execute(..., ospf6_write,....). c) ospf6_write is called and the oi_write_q is cleaned out. d) the t_write event is now popped and the oi_write_q is empty and FRR asserts on the assert(node)

When event_execute is called for ospf6_write, just cancel the t_write event. If ospf6_write has more data to send at the end of the function it will reschedule itself. I've only seen this crash one time and am unable to reliably reproduce this at all. But this is the only mechanism that I can see that could make this happen, given how little the oi_write_q is actually touched in code.

I'm seeing crashes in ospf6_write on the `assert(node)`.  The only
sequence of events that I see that could possibly cause this to happen
is this:

a) Someone has scheduled a outgoing write to the ospf6->t_write and
placed item(s) on the ospf6->oi_write_q
b) A decision is made in ospf6_send_lsupdate() to send an immediate
packet via a event_execute(..., ospf6_write,....).
c) ospf6_write is called and the oi_write_q is cleaned out.
d) the t_write event is now popped and the oi_write_q is empty
and FRR asserts on the `assert(node)` <crash>

When event_execute is called for ospf6_write, just cancel the t_write
event.  If ospf6_write has more data to send at the end of the function
it will reschedule itself.  I've only seen this crash one time and am
unable to reliably reproduce this at all.  But this is the only mechanism
that I can see that could make this happen, given how little the oi_write_q
is actually touched in code.

Signed-off-by: Donald Sharp <[email protected]>
@NetDEF-CI
Copy link
Collaborator

NetDEF-CI commented Jun 30, 2023

Continuous Integration Result: FAILED

Continuous Integration Result: FAILED

Test incomplete. See below for issues.
CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-12687/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

Get source / Pull Request: Successful

Building Stage: Successful

Basic Tests: Incomplete

Addresssanitizer topotests part 4: Incomplete (check logs for details)
Successful on other platforms/tests
  • Debian 9 deb pkg check
  • Topotests Ubuntu 18.04 amd64 part 1
  • Ubuntu 18.04 deb pkg check
  • Addresssanitizer topotests part 1
  • Ubuntu 20.04 deb pkg check
  • Topotests Ubuntu 18.04 i386 part 4
  • Topotests debian 10 amd64 part 7
  • Topotests Ubuntu 18.04 amd64 part 9
  • Addresssanitizer topotests part 8
  • Topotests Ubuntu 18.04 i386 part 3
  • Topotests Ubuntu 18.04 i386 part 8
  • Debian 10 deb pkg check
  • Topotests Ubuntu 18.04 amd64 part 6
  • Topotests Ubuntu 18.04 arm8 part 3
  • Topotests debian 10 amd64 part 1
  • Addresssanitizer topotests part 5
  • Topotests debian 10 amd64 part 4
  • Topotests Ubuntu 18.04 arm8 part 0
  • Topotests Ubuntu 18.04 i386 part 9
  • Topotests debian 10 amd64 part 2
  • Topotests Ubuntu 18.04 amd64 part 8
  • Addresssanitizer topotests part 0
  • Static analyzer (clang)
  • Topotests debian 10 amd64 part 3
  • Topotests Ubuntu 18.04 amd64 part 5
  • Topotests debian 10 amd64 part 5
  • Topotests Ubuntu 18.04 arm8 part 2
  • Topotests Ubuntu 18.04 i386 part 1
  • Topotests Ubuntu 18.04 i386 part 6
  • Topotests Ubuntu 18.04 amd64 part 4
  • Addresssanitizer topotests part 9
  • Topotests Ubuntu 18.04 arm8 part 6
  • Topotests Ubuntu 18.04 arm8 part 7
  • Addresssanitizer topotests part 3
  • Topotests debian 10 amd64 part 6
  • Topotests Ubuntu 18.04 i386 part 7
  • Topotests Ubuntu 18.04 arm8 part 1
  • Addresssanitizer topotests part 7
  • Topotests Ubuntu 18.04 i386 part 2
  • Topotests Ubuntu 18.04 arm8 part 8
  • Topotests debian 10 amd64 part 0
  • Topotests Ubuntu 18.04 amd64 part 7
  • Addresssanitizer topotests part 6
  • Topotests Ubuntu 18.04 i386 part 5
  • Topotests Ubuntu 18.04 i386 part 0
  • Topotests debian 10 amd64 part 9
  • CentOS 7 rpm pkg check
  • Topotests Ubuntu 18.04 amd64 part 0
  • Topotests Ubuntu 18.04 amd64 part 3
  • Topotests Ubuntu 18.04 amd64 part 2
  • Topotests debian 10 amd64 part 8
  • Topotests Ubuntu 18.04 arm8 part 9
  • Addresssanitizer topotests part 2

@donaldsharp
Copy link
Member Author

ci:rerun ci system looks like it lost it's mind

@NetDEF-CI
Copy link
Collaborator

NetDEF-CI commented Jul 1, 2023

Continuous Integration Result: FAILED

Continuous Integration Result: FAILED

See below for issues.
CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-12695/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

Get source / Pull Request: Successful

Building Stage: Successful

Basic Tests: Failed

Topotests Ubuntu 18.04 i386 part 3: Failed (click for details) Topotests Ubuntu 18.04 i386 part 3: Unknown Log URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-12695/artifact/TOPO3U18I386/TopotestDetails/

Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO3U18I386-12695/test

Topology Tests failed for Topotests Ubuntu 18.04 i386 part 3
see full log at https://ci1.netdef.org/browse/FRR-PULLREQ2-12695/artifact/TOPO3U18I386/TopotestLogs/log_topotests.txt

Successful on other platforms/tests
  • Topotests Ubuntu 18.04 i386 part 4
  • Static analyzer (clang)
  • Topotests debian 10 amd64 part 7
  • Topotests Ubuntu 18.04 arm8 part 0
  • Debian 9 deb pkg check
  • Topotests Ubuntu 18.04 amd64 part 9
  • Addresssanitizer topotests part 8
  • Topotests debian 10 amd64 part 2
  • Topotests Ubuntu 18.04 amd64 part 8
  • Debian 10 deb pkg check
  • Topotests Ubuntu 18.04 amd64 part 6
  • Topotests Ubuntu 18.04 amd64 part 1
  • Addresssanitizer topotests part 6
  • Ubuntu 18.04 deb pkg check
  • Ubuntu 20.04 deb pkg check
  • Topotests debian 10 amd64 part 1
  • Topotests debian 10 amd64 part 6
  • Addresssanitizer topotests part 5
  • Topotests Ubuntu 18.04 arm8 part 1
  • Addresssanitizer topotests part 4
  • Addresssanitizer topotests part 0
  • Topotests debian 10 amd64 part 4
  • Topotests debian 10 amd64 part 3
  • Topotests Ubuntu 18.04 arm8 part 3
  • Topotests Ubuntu 18.04 i386 part 9
  • Addresssanitizer topotests part 1
  • Addresssanitizer topotests part 9
  • Topotests debian 10 amd64 part 0
  • Topotests debian 10 amd64 part 5
  • Topotests Ubuntu 18.04 arm8 part 7
  • Topotests Ubuntu 18.04 arm8 part 2
  • Topotests Ubuntu 18.04 i386 part 1
  • Topotests Ubuntu 18.04 i386 part 7
  • Topotests Ubuntu 18.04 i386 part 6
  • Topotests Ubuntu 18.04 i386 part 2
  • Topotests Ubuntu 18.04 arm8 part 8
  • Addresssanitizer topotests part 7
  • Topotests Ubuntu 18.04 amd64 part 5
  • Topotests Ubuntu 18.04 arm8 part 6
  • Addresssanitizer topotests part 3
  • Topotests Ubuntu 18.04 amd64 part 7
  • Topotests Ubuntu 18.04 i386 part 0
  • Topotests Ubuntu 18.04 i386 part 5
  • Topotests Ubuntu 18.04 amd64 part 2
  • Topotests Ubuntu 18.04 i386 part 8
  • Topotests debian 10 amd64 part 8
  • Topotests Ubuntu 18.04 amd64 part 4
  • CentOS 7 rpm pkg check
  • Addresssanitizer topotests part 2
  • Topotests Ubuntu 18.04 arm8 part 9
  • Topotests Ubuntu 18.04 amd64 part 0
  • Topotests Ubuntu 18.04 amd64 part 3
  • Topotests debian 10 amd64 part 9

@donaldsharp
Copy link
Member Author

ci:rerun

@donaldsharp
Copy link
Member Author

@Mergifyio backport dev/9.0 stable/8.5 stable/8.4

@mergify
Copy link

mergify bot commented Jul 1, 2023

backport dev/9.0 stable/8.5 stable/8.4

✅ Backports have been created

@NetDEF-CI
Copy link
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-12701/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

@ton31337 ton31337 added this to the 9.0 milestone Jul 2, 2023
@ton31337 ton31337 merged commit b893852 into FRRouting:master Jul 2, 2023
ton31337 added a commit that referenced this pull request Jul 3, 2023
ospf6d: Stop crash in ospf6_write (backport #13897)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants