New Pthread in zebra crashes on shutdown #2427

Closed
donaldsharp opened this issue Jun 13, 2018 · 2 comments

@donaldsharp (Member) commented Jun 13, 2018:

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f6a4772e801 in __GI_abort () at abort.c:79
#2  0x00007f6a48bbf624 in _zlog_assert_failed (assertion=0x49fcf4 "!client->t_write", 
    file=0x49fa39 "zebra/zserv.c", line=625, function=0x49fc94 "zserv_handle_client_close")
    at lib/log.c:710
#3  0x000000000047c6c6 in zserv_handle_client_close (thread=0x7ffd79580808) at zebra/zserv.c:625
#4  0x00007f6a48bf38bc in thread_call (thread=0x7ffd79580808) at lib/thread.c:1534
#5  0x00007f6a48bbb41f in frr_run (master=0x195b2b0) at lib/libfrr.c:879
#6  0x0000000000423481 in main (argc=1, argv=0x7ffd79580af8) at zebra/main.c:394
(gdb) p client->pthread
$2 = (struct frr_pthread *) 0x1dab0b0
(gdb) p *client->pthread
$3 = {mtx = pthread_mutex_t = {Type = Normal, Status = Not acquired, Robust = No, Shared = No, 
    Protocol = None}, thread = 140094393415424, master = 0x1dab130, attr = {id = 3, 
    start = 0x7f6a48baae70 <fpt_run>, stop = 0x7f6a48baafc0 <fpt_halt>}, 
  running_cond = 0x1dab4f0, running_cond_mtx = 0x1dab4c0, running = false, data = 0x0, 
  name = 0x7f6a40000dd0 "Zebra API client thread"}
(gdb) p client->sock
$4 = 32
(gdb) p *client
$5 = {pthread = 0x1dab0b0, sock = 32, ibuf_mtx = pthread_mutex_t = {Type = Normal, 
    Status = Not acquired, Robust = No, Shared = No, Protocol = None}, ibuf_fifo = 0x1da80b0, 
  obuf_mtx = pthread_mutex_t = {Type = Normal, Status = Not acquired, Robust = No, Shared = No, 
    Protocol = None}, obuf_fifo = 0x1da82e0, ibuf_work = 0x1da8330, obuf_work = 0x1da9030, 
  wb = 0x1da9060, t_read = 0x0, t_write = 0x1dab590, rtm_table = 0, mi_redist = {{{
        enabled = 0 '\000', instances = 0x0} <repeats 26 times>}, {{enabled = 0 '\000', 
        instances = 0x0} <repeats 26 times>}, {{enabled = 0 '\000', 
        instances = 0x0} <repeats 26 times>}, {{enabled = 0 '\000', 
        instances = 0x0} <repeats 26 times>}}, redist = {{0x0 <repeats 26 times>}, {0x1da9080, 
      0x1ed7a10, 0x1edbb90, 0x2087ef0, 0x2089f00, 0x208bf10, 0x1c705e0, 0x1c725f0, 0x1c74600, 
      0x1fe3e50, 0x1fe5e60, 0x1fe7e70, 0x1fe9e80, 0x1febe90, 0x1fedea0, 0x1fefeb0, 0x1ff1ec0, 
      0x1ff3ed0, 0x1ff5ee0, 0x1ff7ef0, 0x1ff9f00, 0x1ffbf10, 0x1ffdf20, 0x1ffff30, 0x2001f40, 
      0x2003f50}, {0x220ff40, 0x2211f50, 0x2213f60, 0x2215f70, 0x2217f80, 0x2219f90, 0x221bfa0, 
      0x221dfb0, 0x221ffc0, 0x2221fd0, 0x2223fe0, 0x2225ff0, 0x2228000, 0x222a010, 0x222c020, 
      0x222e030, 0x2230040, 0x22b3fe0, 0x22b5ff0, 0x22b8000, 0x22ba010, 0x22bc020, 0x22be030, 
      0x22c0040, 0x22c2050, 0x22c4060}, {0x22c6070, 0x22c8080, 0x22ca090, 0x22cc0a0, 0x22ce0b0, 
      0x22d00c0, 0x22d20d0, 0x22d40e0, 0x22d60f0, 0x22d8100, 0x22da110, 0x22dc120, 0x22de130, 
      0x22e0140, 0x22e2150, 0x22e4160, 0x22e6170, 0x22e8180, 0x22ea190, 0x22ec1a0, 0x22ee1b0, 
      0x22f01c0, 0x22f21d0, 0x22f41e0, 0x22f61f0, 0x22f8200}}, redist_default = 0x22fa210, 
  ifinfo = 0x22fc220, ridinfo = 0x22fe230, notify_owner = false, proto = 9 '\t', instance = 0, 
  is_synchronous = 0 '\000', redist_v4_add_cnt = 0, redist_v4_del_cnt = 0, 
  redist_v6_add_cnt = 0, redist_v6_del_cnt = 0, v4_route_add_cnt = 4, v4_route_upd8_cnt = 0, 
  v4_route_del_cnt = 0, v6_route_add_cnt = 0, v6_route_del_cnt = 0, v6_route_upd8_cnt = 0, 
  connected_rt_add_cnt = 14, connected_rt_del_cnt = 0, ifup_cnt = 0, ifdown_cnt = 0, 
  ifadd_cnt = 8, ifdel_cnt = 0, if_bfd_cnt = 0, bfd_peer_add_cnt = 0, bfd_peer_upd8_cnt = 0, 
  bfd_peer_del_cnt = 0, bfd_peer_replay_cnt = 0, vrfadd_cnt = 2, vrfdel_cnt = 0, 
  if_vrfchg_cnt = 0, bfd_client_reg_cnt = 1, vniadd_cnt = 0, vnidel_cnt = 0, l3vniadd_cnt = 0, 
  l3vnidel_cnt = 0, macipadd_cnt = 0, macipdel_cnt = 0, prefixadd_cnt = 0, prefixdel_cnt = 0, 
  nh_reg_time = 34514, nh_dereg_time = 0, nh_last_upd_time = 34519, connect_time = 34460, 
  last_read_time = 34519, last_write_time = 34515, last_read_cmd = 1, last_write_cmd = 25}
(gdb) 

This happens every time I run the topotests/bgp_l3vpn_to_bgp_vrf test. It looks like zebra always crashes on one of the rX routers (which router crashes varies from run to run), but the stack trace is the same each time.
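
For reference, the dump above shows the client's pthread with running = false and t_read = 0x0 while t_write is still set (0x1dab590), which is exactly the state the "!client->t_write" assertion rejects. Below is a minimal C sketch of that invariant. It is not the zebra code: struct task, struct client, io_halt, io_cancel_pending and client_close are hypothetical stand-ins, and the idea that the I/O side was halted without clearing the pending write task is only one possible reading of the dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT the FRR types. */
struct task {
    int id;
};

struct client {
    bool io_running;      /* plays the role of pthread->running in the dump */
    struct task *t_read;  /* pending read task, NULL when none is scheduled */
    struct task *t_write; /* pending write task, NULL when none is scheduled */
    bool closed;
};

/* Stop the I/O side but leave any scheduled task pointers untouched. */
void io_halt(struct client *c)
{
    c->io_running = false;
}

/* Forget pending tasks so that both pointers read as NULL afterwards. */
void io_cancel_pending(struct client *c)
{
    c->t_read = NULL;
    c->t_write = NULL;
}

/* Close handler: mirrors the failed check, it requires no pending write. */
void client_close(struct client *c)
{
    assert(!c->t_write);
    c->closed = true;
}
```

In this sketch, calling client_close() right after io_halt() aborts when a write task is still recorded, while calling io_cancel_pending() first lets the close go through. Whether the actual fix belongs in the halt path or in the close handler is not something the trace alone settles.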

@donaldsharp (Member, Author) commented:

I think this is a variant of the same thing:
r1: zebra crashed. Core file found - Backtrace follows:
[New LWP 23705]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f257f6fa428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#0 0x00007f257f6fa428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007f257f6fc02a in __GI_abort () at abort.c:89
#2 0x00007f25800fca74 in _zlog_assert_failed (assertion=assertion@entry=0x556acb147492 "!client->t_write", file=file@entry=0x556acb14745c "zebra/zserv.c", line=line@entry=627, function=function@entry=0x556acb147890 <func.15934> "zserv_handle_client_close") at lib/log.c:710
#3 0x0000556acb125028 in zserv_handle_client_close (thread=) at zebra/zserv.c:627
#4 0x00007f258011ea30 in thread_call (thread=thread@entry=0x7ffd69433910) at lib/thread.c:1576
#5 0x00007f25800f9f58 in frr_run (master=0x556acd3615d0) at lib/libfrr.c:879
#6 0x0000556acb0df1c4 in main (argc=1, argv=0x7ffd69433bf8) at zebra/main.c:394
r1: Daemon bgpd not running
2018-06-21 04:06:25,258 ERROR: assert failed at "bgp_vrf_netns.test_bgp_vrf_netns_topo/test_bgp_vrf_learn": r1: Daemon bgpd not running

@rzalamena (Member) commented:

Is this still a problem? This was opened a while ago; if it is still happening, it would be best to open a new issue with a more recent trace and logs.

@polychaeta autoclose in 1 week
