Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core dump when using logger-log-level in additional_start_args #7878

Closed
brianzer0 opened this issue Dec 20, 2022 · 3 comments · Fixed by #12161
Closed

Core dump when using logger-log-level in additional_start_args #7878

brianzer0 opened this issue Dec 20, 2022 · 3 comments · Fixed by #12161
Assignees
Labels
area/rpk kind/bug Something isn't working sev/medium Bugs that do not meet criteria for high or critical, but are more severe than low.

Comments

@brianzer0
Copy link

Version & Environment

Redpanda version: (use rpk version): v22.3.4 (rev 5be3e8e)
Cluster Info: 10 node bare-metal
Host:

  • CPU: AMD EPYC 7702P 64-Core Processor
  • Memory: 512gb DDR5-3200 ECC
  • Disk: 12 14tb NVMe SSD in raid 0 (mdcdm software raid)

Operating System (e.g. from /etc/os-release):

NAME="Rocky Linux"
VERSION="9.0 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.0"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Rocky Linux 9.0 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:9::baseos"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-9"
ROCKY_SUPPORT_PRODUCT_VERSION="9.0"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.0"

What went wrong?

core dump on start when trying to set logger-log-level in redpanda.yaml via additional_start_flags. Also you cannot specify --default-log-level in additional_start_flags and logger-log-level in /etc/sysconfig/redpanda START_ARGS (I get debug on all for some reason...)

    additional_start_flags:
        - --default-log-level=warn
        - --log-to-syslog=1
        - --log-to-stdout=0
        - --logger-log-level="rpc=debug"

What should have happened instead?

Should be able to set logging levels in redpanda.yaml

How to reproduce the issue?

  1. Try to specify --logger-log-level="rpc=debug" as additional_start_args in redpanda.yaml

Additional information

Here is log:

Dec 20 18:49:31 NODE9 systemd[1]: Stopped Redpanda, the fastest queue in the West.. 
Dec 20 18:49:31 NODE9 systemd[1]: Starting Redpanda, the fastest queue in the West....
Dec 20 18:49:33 NODE9 systemd[1]: Started Process Core Dump (PID 2640015/UID 0).
Dec 20 18:49:33 NODE9 systemd-coredump[2640016]: Process 2639980 (redpanda) of user 987 dumped core.#012#012Module /opt/redpanda/lib/ld.so with build-id 997994d4535f8c791738540c3e6ee8e46a752603#012Module linux-vdso.so.1 with build-id 0ed98418059b05b9e6f734d9ab4e2a5be68e600a#012Module libpciaccess.so.0 with build-id a855b56e392a8e38bd8861ef5bc99b82c5bcc7e2#012Module libgmp.so.10 with build-id 156a488dd4cfdaf68cdabe421beac302c99a8657#012Module libhogweed.so.6 with build-id 69c948b81e79f4c6f55a8808a4c69d3a8cbc7867#012Module libnettle.so.8 with build-id 092ea904c6c961e89decea597a1afe7c5d002fb4#012Module libc.so.6 with build-id 6e7b96dfb83f0bdcb6a410469b82f86415e5ada3#012Module libgcc_s.so.1 with build-id d14e4e492003556b25e1a4572b3c14bbecc0f48e#012Module libm.so.6 with build-id 2c5afe11116cc85e21a8d87e6f52a481efefa1d1#012Module libc++abi.so.1 without build-id.#012Module libc++.so.1 without build-id.#012Module libboost_filesystem.so.1.75.0 with build-id 03fbef95faa14a75968e16de23ca116760bd7402#012Module libnuma.so.1 with build-id d3ddb0e9aceff9e2ca04d7a21391b56abd891049#012Module liburing.so.2 without build-id.#012Module libhwloc.so.15 with build-id 281d008737f7cc9939b90e542842bdbf13b4c370#012Module libsctp.so.1 with build-id fa9525b3cfea98dbcb0be3be2f71e8ff12699b2e#012Module libatomic.so.1 with build-id ebb6b21cad677866c57b411106d3378352084327#012Module libgnutls.so.30 with build-id 1f5c2e8ec01f1d8b951de657165c9627586b1d11#012Module libcares.so.2 with build-id 4f4bbc715a2f0d7bb148ecdb1cc54bc2b3bf4987#012Module libboost_atomic.so.1.75.0 with build-id 0bd8633777760a40f39655086798218e126b0b4c#012Module libboost_date_time.so.1.75.0 with build-id faf034c1fae42d5825309a7f932f29e66dd4b9b2#012Module libboost_chrono.so.1.75.0 with build-id 42dadca13a00b4c19be7a471d57a47c889024b34#012Module libboost_thread.so.1.75.0 with build-id a16b7213f761351472ec926bed4f2e9479d86a59#012Module libboost_program_options.so.1.75.0 with build-id 8c42404dee9435ef4d8bf5ad33c471eb83ff06b7#012Module libcryptopp.so.8.3 with build-id d9528865dd86b571a64a10c83d7199590e0a9b0c#012Module liblz4.so.1 with build-id 4db50bdababa9636c5a64e2a96d21edc2320e70b#012Module libz.so.1 with build-id 62c2402eea0c7b943ebcafdad30959c58ffe6778#012Module libboost_regex.so.1.75.0 with build-id 00488e75d2321e5e1c92a74490607ff6a01abd71#012Module libboost_iostreams.so.1.75.0 with build-id a487bdad20a07ef8af1ce827511351698f116896#012Module redpanda with build-id bd7b618c6a483183e130c219d330e4de45b86410#012Stack trace of thread 2639980:#012#0  0x00007f9857e2de3c __pthread_kill_implementation (libc.so.6 + 0x92e3c)#012#1  0x00007f9857ddda16 raise (libc.so.6 + 0x42a16)#012#2  0x00007f9857dc782f abort (libc.so.6 + 0x2c82f)#012#3  0x00007f98580c5e66 abort_message (libc++abi.so.1 + 0x2ae66)#012#4  0x00007f98580aa29f _ZL28demangling_terminate_handlerv (libc++abi.so.1 + 0xf29f)#012#5  0x00007f98580c50f3 _ZSt11__terminatePFvvE (libc++abi.so.1 + 0x2a0f3)#012#6  0x00007f98580c5098 _ZSt9terminatev (libc++abi.so.1 + 0x2a098)#012#7  0x000055ab3b07860b __clang_call_terminate (redpanda + 0x541f60b)#012#8  0x000055ab3ac8715e _ZN7seastar12app_template14run_deprecatedEiPPcONSt3__18functionIFvvEEE (redpanda + 0x502e15e)#012#9  0x000055ab3ac83fc0 _ZN7seastar12app_template3runEiPPcONSt3__18functionIFNS_6futureIiEEvEEE (redpanda + 0x502afc0)#012#10 0x000055ab376c29bf _ZN11application3runEiPPc (redpanda + 0x1a699bf)#012#11 0x000055ab3b079d3a main (redpanda + 0x5420d3a)#012#12 0x00007f9857dc8590 __libc_start_call_main (libc.so.6 + 0x2d590)#012#13 0x00007f9857dc8649 __libc_start_main (libc.so.6 + 0x2d649)#012#14 0x000055ab376bd7a5 _start (redpanda + 0x1a647a5)#012ELF object binary architecture: AMD x86-64 
Dec 20 18:49:33 NODE9 systemd[1]: redpanda.service: Main process exited, code=dumped, status=6/ABRT
Dec 20 18:49:33 NODE9 systemd[1]: redpanda.service: Failed with result 'core-dump'.
Dec 20 18:49:33 NODE9 systemd[1]: Failed to start Redpanda, the fastest queue in the West.. 
Dec 20 18:49:33 NODE9 systemd[1]: [email protected]: Deactivated successfully.
Dec 20 18:49:34 NODE9 systemd[1]: redpanda.service: Scheduled restart job, restart counter is at 2.
@brianzer0 brianzer0 added the kind/bug Something isn't working label Dec 20, 2022
@jcsp jcsp added area/rpk sev/medium Bugs that do not meet criteria for high or critical, but are more severe than low. labels Dec 21, 2022
@jcsp
Copy link
Contributor

jcsp commented Dec 21, 2022

I just checked that passing a --logger-log-level with double quotes directly to a redpanda binary doesn't make it barf, so my suspicion is that something is up with how this is getting passed through systemd->rpk->redpanda

@dotnwat
Copy link
Member

dotnwat commented Dec 21, 2022

@brianzer0 right after this core dumps if coredumptctl -1 list redpanda shows an entry than you should be able to then run sudo coredumpctl -1 debug redpanda and then something like thread apply all bt to get a backtrace. should help

@jcsp jcsp changed the title Code dump when using logger-log-level in additional_start_args Core dump when using logger-log-level in additional_start_args Dec 22, 2022
@r-vasquez
Copy link
Contributor

Just tested with rpk tip of dev and redpanda version v23.1.13:

I couldn't reproduce the exact symptom as OP (unable to get the core dump) but it still fails to start:

rpk logs:

# This is what's being passed to Unix.Execute:
/opt/redpanda/bin/redpanda redpanda --redpanda-cfg /etc/redpanda/redpanda.yaml --lock-memory=false --overprovisioned --unsafe-bypass-fsync=true --logger-log-level="rpc=debug" --reserve-memory=0M

# Failure is
Jul 17 09:13:43 fedora rpk[18851]: libc++abi: terminating with uncaught exception of type std::runtime_error: Unknown log level 'debug"'

So I ran strace to peek on execve (which rpk uses under the hood to execute redpanda) and we can see that the quotes are being escaped:

1689611782.019939 execve("/usr/bin/bash", ["bash", "/usr/bin/redpanda", "--redpanda-cfg", "/etc/redpanda/redpanda.yaml", "--logger-log-level=\"rpc=debug\"", "--overprovisioned", "--unsafe-bypass-fsync=true", "--lock-memory=false", "--reserve-memory=0M"], 0x7ffcadf43e20 /* 21 vars */) = 0 <0.000108>

If I ran the same command directly in the terminal without escaping the quotes, the shell will strip the quotes (for example echo --logger-log-level="rpc=debug" will return --logger-log-level=rpc=debug) and redpanda starts. But if I escape the quotes I get the same error as if it were run with rpk.

So, in rpk we can:

  1. Send a patch to strip the quotes in the flag value before it's passed to execve.
    or
  2. Send an error message if we encounter a quoted flag value.

r-vasquez added a commit to r-vasquez/redpanda that referenced this issue Jul 17, 2023
These are meant to be passed to the shell as we
were typing them into the terminal. Normally,
shells strip quotes from the string.

Fixes redpanda-data#7878
r-vasquez added a commit to r-vasquez/redpanda that referenced this issue Jul 17, 2023
These are meant to be passed to the shell as we
were typing them into the terminal. Normally,
shells strip quotes from the string.

Fixes redpanda-data#7878
r-vasquez added a commit to r-vasquez/redpanda that referenced this issue Jul 17, 2023
These are meant to be passed to the shell as we
were typing them into the terminal. Normally,
shells strip quotes from the string.

Fixes redpanda-data#7878
vbotbuildovich pushed a commit to vbotbuildovich/redpanda that referenced this issue Jul 18, 2023
These are meant to be passed to the shell as we
were typing them into the terminal. Normally,
shells strip quotes from the string.

Fixes redpanda-data#7878

(cherry picked from commit 0ede235)
graphcareful pushed a commit to graphcareful/redpanda that referenced this issue Aug 2, 2023
These are meant to be passed to the shell as we
were typing them into the terminal. Normally,
shells strip quotes from the string.

Fixes redpanda-data#7878
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/rpk kind/bug Something isn't working sev/medium Bugs that do not meet criteria for high or critical, but are more severe than low.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants