Fix zebra shutdown crash #2730

Closed
qlyoung wants to merge 2 commits

qlyoung
Member

@qlyoung qlyoung commented Jul 26, 2018

Hopefully fixes #2656 and friends.

The client list's deletion function pointer was bound to a function that expects the owning pthread to be dead before it is called. Since the pthread was never killed, that function freed the client's resources out from under the still-running pthread.
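
To illustrate the ordering problem described above, here is a minimal sketch. It is not FRR's actual code: `struct client`, `client_new`, `client_free`, and the atomic stop flag are all hypothetical stand-ins. The only point is that the owning pthread is asked to stop and is joined before its resources are freed.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-client object; not FRR's real struct. */
struct client {
	pthread_t thread;      /* pthread that owns this client's I/O */
	atomic_bool run;       /* cleared to ask the pthread to exit */
	atomic_int iterations; /* state the pthread touches while running */
};

/* Body of the owning pthread: keeps using the client until told to stop. */
static void *client_io_loop(void *arg)
{
	struct client *c = arg;

	while (atomic_load(&c->run)) {
		atomic_fetch_add(&c->iterations, 1);
		sched_yield();
	}
	return NULL;
}

/* Allocate a client and start its owning pthread. */
struct client *client_new(void)
{
	struct client *c = calloc(1, sizeof(*c));

	if (c == NULL)
		return NULL;
	atomic_init(&c->run, true);
	atomic_init(&c->iterations, 0);
	if (pthread_create(&c->thread, NULL, client_io_loop, c) != 0) {
		free(c);
		return NULL;
	}
	return c;
}

/*
 * Deletion callback that does not assume the pthread is already dead:
 * it stops and joins the owner first, and only then frees the memory.
 * Calling free() without the join is the use-after-free pattern the
 * description refers to.
 */
int client_free(struct client *c)
{
	atomic_store(&c->run, false);
	if (pthread_join(c->thread, NULL) != 0)
		return -1;
	free(c);
	return 0;
}
```

Calling `client_free(client_new())` from `main` exercises the whole lifecycle; build with something like `cc -std=c11 -pthread`.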

Edit:
see #2736

Otherwise known as "what is an async-safe signal handler?"
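
As general background to that question, and not a reproduction of FRR's `lib/sigevent.c`: a handler is async-signal-safe when it restricts itself to operations that are safe to run at any interruption point, such as storing to a `volatile sig_atomic_t` flag. The sketch below assumes hypothetical names (`install_shutdown_handler`, `shutdown_pending`) and leaves all real teardown to normal program context.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Only object the handler touches; sig_atomic_t stores are signal-safe. */
static volatile sig_atomic_t shutdown_requested;

/* Handler: record the request and return. No locks, malloc, free, or stdio. */
static void on_sigterm(int signo)
{
	(void)signo;
	shutdown_requested = 1;
}

/* Install the handler for SIGTERM; returns 0 on success, -1 on error. */
int install_shutdown_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigterm;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGTERM, &sa, NULL);
}

/* Polled from the main loop, where stopping threads and freeing is safe. */
int shutdown_pending(void)
{
	return shutdown_requested != 0;
}
```

A main loop would poll `shutdown_pending()` and, once it returns nonzero, stop its worker pthreads and release resources outside the handler.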

Signed-off-by: Quentin Young <[email protected]>
@FRRouting FRRouting deleted a comment from LabN-CI Jul 26, 2018
@LabN-CI
Collaborator

LabN-CI commented Jul 26, 2018

💚 Basic BGPD CI results: SUCCESS, 0 tests failed

Results table

Result:          SUCCESS git merge/2730 73e6b97
Date:            07/25/2018
Start:           20:50:13
Finish:          21:13:10
Run-Time:        22:57
Total:           1813
Pass:            1813
Fail:            0
Valgrind-Errors: 0
Valgrind-Loss:   0
Details:         vncregress-2018-07-25-20:50:13.txt
Log:             autoscript-2018-07-25-20:50:55.log.bz2

For details, please contact louberger

@NetDEF-CI
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4623/

This is a comment from an EXPERIMENTAL automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.


CLANG Static Analyzer Summary

  • Github Pull Request 2730, comparing to Git base SHA c11b11d

No Changes in Static Analysis warnings compared to base

5 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4623/artifact/shared/static_analysis/index.html

@FRRouting FRRouting deleted a comment from NetDEF-CI Jul 26, 2018
@FRRouting FRRouting deleted a comment from NetDEF-CI Jul 26, 2018
@NetDEF-CI
Collaborator

Continuous Integration Result: FAILED

See below for issues.
CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4639/

Get source and apply patch from patchwork: Successful

Building Stage: Successful

Basic Tests: Failed

CentOS 6 rpm pkg check: Successful
Static analyzer (clang): Successful
Ubuntu 12.04 deb pkg check: Successful
IPv6 protocols on Ubuntu 14.04: Successful
Ubuntu 16.04 deb pkg check: Successful
IPv4 ldp protocol on Ubuntu 16.04: Successful
Ubuntu 14.04 deb pkg check: Successful
CentOS 7 rpm pkg check: Successful
Addresssanitizer topotest: Successful
Debian 8 deb pkg check: Successful
IPv4 protocols on Ubuntu 14.04: Successful
Fedora 24 rpm pkg check: Successful
Debian 9 deb pkg check: Successful

Topotest tests on Ubuntu 16.04 i386: Failed

Topology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPOI386-4639/test

Topology Tests failed for Topotest tests on Ubuntu 16.04 i386:

RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument

r1: zebra crashed. Core file found - Backtrace follows:
[New LWP 20690]
[New LWP 20669]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0xb7e7aa32 in hash_get (hash=0x5fd040, data=0xb63feec8, alloc_func=0xb7ea5dd0 <cpu_record_hash_alloc>) at lib/hash.c:144
[Current thread is 1 (Thread 0xb63ffb40 (LWP 20690))]
#0  0xb7e7aa32 in hash_get (hash=0x5fd040, data=0xb63feec8, alloc_func=0xb7ea5dd0 <cpu_record_hash_alloc>) at lib/hash.c:144
#1  0xb7ea5f6a in thread_get (m=m@entry=0x5dcf70, type=type@entry=3 '\003', func=func@entry=0x521020 <zserv_handle_client_close>, arg=0x80a1f8, funcname=0x54417b "zserv_handle_client_close", schedfrom=0x543ef4 "zebra/zserv.c", fromln=819) at lib/thread.c:719
#2  0xb7ea7863 in funcname_thread_add_event (m=0x5dcf70, func=0x521020 <zserv_handle_client_close>, arg=0x80a1f8, val=0, t_ptr=0x0, funcname=0x54417b "zserv_handle_client_close", schedfrom=0x543ef4 "zebra/zserv.c", fromln=819) at lib/thread.c:952
#3  0x005220f9 in zserv_event (event=ZSERV_HANDLE_CLOSE, client=0x80a1f8) at zebra/zserv.c:818
#4  zserv_client_close (client=client@entry=0x80a1f8) at zebra/zserv.c:174
#5  0x0052269d in zserv_read (thread=0xb63ff230) at zebra/zserv.c:441
#6  0xb7ea8539 in thread_call (thread=0xb63ff230) at lib/thread.c:1576
#7  0xb7e7825e in fpt_run (arg=0x80d888) at lib/frr_pthread.c:320
#8  0xb7e25295 in start_thread (arg=0xb63ffb40) at pthread_create.c:333
#9  0xb7d500ae in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:114
2018-07-26 23:18:22,606 ERROR: assert failed at "test_ldp_vpls_topo1/test_memory_leak": 

see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4639/artifact/TOPOI386/ErrorLog/log_topotests.txt

Topology tests on Ubuntu 16.04 amd64: Failed

Topology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPOU1604-4639/test

Topology Tests failed for Topology tests on Ubuntu 16.04 amd64:

r1: zebra crashed. Core file found - Backtrace follows:
[New LWP 16821]
[New LWP 16822]
[New LWP 16783]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f65ab78c428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
[Current thread is 1 (Thread 0x7f65a980c700 (LWP 16821))]
#0  0x00007f65ab78c428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007f65ab78e02a in __GI_abort () at abort.c:89
#2  0x00007f65ac1a3f1a in core_handler (signo=11, siginfo=0x7f65a980b3b0, context=0x7f65a980b280) at lib/sigevent.c:247
#3  <signal handler called>
#4  0x00007f65ac184895 in hash_get (hash=0x0, data=data@entry=0x7f65a980b8c0, alloc_func=alloc_func@entry=0x7f65ac1ae4b0 <cpu_record_hash_alloc>) at lib/hash.c:141
#5  0x00007f65ac1ae617 in thread_get (m=m@entry=0x556caf0125d0, type=type@entry=3 '\003', func=func@entry=0x556cad038fe0 <zserv_process_messages>, arg=arg@entry=0x556caf1e58c0, funcname=0x556cad05c17b "zserv_process_messages", schedfrom=schedfrom@entry=0x556cad05bebc "zebra/zserv.c", fromln=815) at lib/thread.c:719
#6  0x00007f65ac1afe6e in funcname_thread_add_event (m=0x556caf0125d0, func=func@entry=0x556cad038fe0 <zserv_process_messages>, arg=arg@entry=0x556caf1e58c0, val=val@entry=0, t_ptr=t_ptr@entry=0x0, funcname=funcname@entry=0x556cad05c17b "zserv_process_messages", schedfrom=0x556cad05bebc "zebra/zserv.c", fromln=815) at lib/thread.c:952
#7  0x0000556cad03a2a2 in zserv_event (event=ZSERV_PROCESS_MESSAGES, client=0x556caf1e58c0) at zebra/zserv.c:814
#8  zserv_read (thread=<optimized out>) at zebra/zserv.c:425
#9  0x00007f65ac1b0b20 in thread_call (thread=thread@entry=0x7f65a980bcf0) at lib/thread.c:1576
#10 0x00007f65ac1821b0 in fpt_run (arg=0x556caf1e7a00) at lib/frr_pthread.c:320
#11 0x00007f65abb286ba in start_thread (arg=0x7f65a980c700) at pthread_create.c:333
#12 0x00007f65ab85e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
2018-07-26 14:10:18,774 ERROR: assert failed at "test_bgp_ecmp_topo1/test_bgp_ecmp": 

see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4639/artifact/TOPOU1604/ErrorLog/log_topotests.txt

Topology Tests memory analysis: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4639/artifact/TOPOI386/MemoryLeaks/
Topology Tests memory analysis: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4639/artifact/TOPOU1604/MemoryLeaks/

CLANG Static Analyzer Summary

  • Github Pull Request 2730, comparing to Git base SHA c11b11d

No Changes in Static Analysis warnings compared to base

5 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4639/artifact/shared/static_analysis/index.html

@qlyoung
Member Author

qlyoung commented Jul 26, 2018

guess not

@qlyoung qlyoung closed this Jul 26, 2018
@NetDEF-CI
Collaborator

Continuous Integration Result: FAILED

See below for issues.
CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4646/

Get source and apply patch from patchwork: Successful

Building Stage: Successful

Basic Tests: Failed

Addresssanitizer topotest: Successful
Debian 8 deb pkg check: Successful
Topology tests on Ubuntu 16.04 amd64: Successful
IPv4 protocols on Ubuntu 14.04: Successful
IPv4 ldp protocol on Ubuntu 16.04: Successful
Debian 9 deb pkg check: Successful
Ubuntu 14.04 deb pkg check: Successful
Static analyzer (clang): Successful
CentOS 7 rpm pkg check: Successful
IPv6 protocols on Ubuntu 14.04: Successful
Ubuntu 16.04 deb pkg check: Successful
Ubuntu 12.04 deb pkg check: Successful
Fedora 24 rpm pkg check: Successful
CentOS 6 rpm pkg check: Successful

Topotest tests on Ubuntu 16.04 i386: Failed

Topology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPOI386-4646/test

Topology Tests failed for Topotest tests on Ubuntu 16.04 i386:

ce2: zebra crashed. Core file found - Backtrace follows:
[New LWP 6011]
[New LWP 6010]
[New LWP 5998]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra'.
Program terminated with signal SIGABRT, Aborted.
#0  0xb7fcad05 in ?? ()
[Current thread is 1 (Thread 0xb71a0b40 (LWP 6011))]
#0  0xb7fcad05 in ?? ()
#1  <signal handler called>
#2  0xb7f44a1f in hash_get (hash=0x0, data=0xb719fec8, alloc_func=0xb7f6fdd0 <cpu_record_hash_alloc>) at lib/hash.c:141
#3  0xb7f6ff6a in thread_get (m=m@entry=0x999f70, type=type@entry=3 '\003', func=func@entry=0x519020 <zserv_handle_client_close>, arg=0xa9b130, funcname=0x53c17b "zserv_handle_client_close", schedfrom=0x53bef4 "zebra/zserv.c", fromln=819) at lib/thread.c:719
#4  0xb7f71863 in funcname_thread_add_event (m=0x999f70, func=0x519020 <zserv_handle_client_close>, arg=0xa9b130, val=0, t_ptr=0x0, funcname=0x53c17b "zserv_handle_client_close", schedfrom=0x53bef4 "zebra/zserv.c", fromln=819) at lib/thread.c:952
#5  0x0051a0f9 in zserv_event (event=ZSERV_HANDLE_CLOSE, client=0xa9b130) at zebra/zserv.c:818
#6  zserv_client_close (client=client@entry=0xa9b130) at zebra/zserv.c:174
#7  0x0051a69d in zserv_read (thread=0xb71a0230) at zebra/zserv.c:441
#8  0xb7f72539 in thread_call (thread=0xb71a0230) at lib/thread.c:1576
#9  0xb7f4225e in fpt_run (arg=0xa9b7a0) at lib/frr_pthread.c:320
#10 0xb7eef295 in start_thread (arg=0xb71a0b40) at pthread_create.c:333
#11 0xb7e1a0ae in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:114
2018-07-27 02:28:33,325 ERROR: assert failed at "bgp_l3vpn_to_bgp_direct.test_bgp_l3vpn_to_bgp_direct/test_memory_leak": 

RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument

see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4646/artifact/TOPOI386/ErrorLog/log_topotests.txt

Topology Tests memory analysis: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4646/artifact/TOPOI386/MemoryLeaks/

CLANG Static Analyzer Summary

  • Github Pull Request 2730, comparing to Git base SHA c11b11d

No Changes in Static Analysis warnings compared to base

5 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4646/artifact/shared/static_analysis/index.html

Successfully merging this pull request may close these issues.

zebra (stream_set_getp) assert seen in topotest
3 participants