Update doc that db might be corrupted when power failure during initialization #567

Open
ahrtr opened this issue Sep 15, 2023 · 2 comments

ahrtr commented Sep 15, 2023

See etcd-io/etcd#16596 (comment)

@tjungblu
Contributor

@mj-ramos I've been playing around with LazyFS and torn writes today. One thing I wanted to try is to write all four init pages and immediately fsync them:
#570

IMHO, this should fix the issues described in etcd-io/etcd#16596

However, with this LazyFS config the crash is triggered on the very first page write (apparently halfway through the 4k page):

[2023-09-20 16:03:09.121] [console] [info] [lazyfs.config]: loading config...
[2023-09-20 16:03:09.121] [console] [info] [lazyfs.args]: config path is 'etcd_16596.toml'
[2023-09-20 16:03:09.121] [console] [info] [lazyfs]: trying to create fifo '/tmp/faults.fifo'
[2023-09-20 16:03:09.121] [console] [info] [lazyfs.fifo]: faults fifo exists!
[2023-09-20 16:03:09.121] [console] [warning] [lazyfs.engine] pre-allocating 1073741824 bytes...
[2023-09-20 16:03:09.537] [console] [warning] [engine] Pre-allocation finished
[2023-09-20 16:03:09.537] [console] [info] [config] using a custom config
[2023-09-20 16:03:09.537] [console] [info] [config] log all operations = false, logfile = 'false'
[2023-09-20 16:03:09.537] [console] [info] [config] no. of pages   = 262144
[2023-09-20 16:03:09.537] [console] [info] [config] page size      = 4096
[2023-09-20 16:03:09.537] [console] [info] [config] block size     = 4096
[2023-09-20 16:03:09.537] [console] [info] [config] blocks / page  = 1
[2023-09-20 16:03:09.537] [console] [info] [config] apply eviction = false
[2023-09-20 16:03:09.537] [console] [info] [config] total          = 1048576 KiB, 1024 MiB, 1 GiB
[2023-09-20 16:03:09.537] [console] [info] [lazyfs.fifo]: running LazyFS...
[2023-09-20 16:03:09.584] [console] [info] [lazyfs.faults.worker]: waiting for fault commands...
[2023-09-20 16:10:22.990] [console] [warning] [lazyfs.faults]: Write to path /home/data-r/data.etcd/member/snap/db: will persist 2048 bytes from offset 2048
[2023-09-20 16:10:22.991] [console] [critical] [lazyfs.faults]: Added crash fault 
[2023-09-20 16:10:22.991] [console] [critical] Triggered fault condition (op=write,timing=after)
[2023-09-20 16:10:22.991] [console] [warning] [lazyfs.cmds]: report request submitted...
[2023-09-20 16:10:22.991] [console] [warning] [lazyfs.cmds]: report generated.
[2023-09-20 16:10:22.991] [console] [critical] Killing LazyFS pid 145628!

Mounting the file and reading it again with hexdump yields an empty / zeroed file:

>$ hexdump /home/data/data.etcd/member/snap/db
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0001000

Can I somehow reconfigure LazyFS to respect the 4k page size?

mj-ramos commented Nov 3, 2023

Hi!
I configured LazyFS to split the first write to the snap/db file into 4 chunks of 4096 bytes using the following configuration:

[[injection]]
type="split_write"
file="/home/data-r/data.etcd/member/snap/db"
persist=[1]
parts=4
occurrence=1

The only change needed is the parts parameter. The problematic write is 16384 bytes long, so splitting it into 4 equally sized parts results in 4 writes of 4096 bytes. I'm persisting only the first 4096 bytes since the persist vector has the value 1. I'm conducting this test to see what happens if a power failure occurs after the first write is persisted.
When I run the same test described in etcd-io/etcd#16596 with this new configuration, I encounter an error:

mvcc/backend: cannot open database at data/data.etcd/member/snap/db (file size too small)
panic: cannot open database at data/data.etcd/member/snap/db (file size too small)

This means that if a crash occurs after persisting the first 4096 bytes, etcd still fails to start. However, I might have found a possible fix for this problem.
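
To make the arithmetic of the split concrete, the following Go sketch models which byte ranges would reach disk for a given `parts` and `persist` setting. It is only a model of the behavior described above, not LazyFS code; the `part` type and `splitWrite` function are hypothetical, and it assumes the write size is evenly divisible by the number of parts and that `persist` uses 1-based part numbers.

```go
package main

import "fmt"

// part is one chunk of a split write: its byte range and whether it is persisted.
type part struct {
	Offset, Length int
	Persisted      bool
}

// splitWrite (hypothetical model) divides a write of size bytes into n equal
// parts and marks the 1-based part numbers listed in persist as persisted.
// Assumption: size is evenly divisible by n.
func splitWrite(size, n int, persist []int) []part {
	keep := make(map[int]bool, len(persist))
	for _, p := range persist {
		keep[p] = true
	}
	chunk := size / n
	parts := make([]part, n)
	for i := range parts {
		parts[i] = part{Offset: i * chunk, Length: chunk, Persisted: keep[i+1]}
	}
	return parts
}

func main() {
	// The values from the configuration above: a 16384-byte write,
	// parts=4, persist=[1].
	for i, p := range splitWrite(16384, 4, []int{1}) {
		fmt.Printf("part %d: offset=%d length=%d persisted=%t\n",
			i+1, p.Offset, p.Length, p.Persisted)
	}
}
```

With these values only bytes 0 to 4095 are marked as persisted, which matches the single 4096-byte chunk described above.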

With LazyFS, I injected a crash fault before the write to the snap/db file by following these steps (assuming that the fifo_path parameter is set to /tmp/faults.fifo):

  1. Start LazyFS
  2. In a terminal, execute echo "lazyfs::crash::timing=before::op=write::from_rgx=snap/db/file" > /tmp/faults.fifo
  3. Start etcd
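
If the fault needs to be injected from a test harness rather than a terminal, step 2 could be scripted roughly as follows. This is a sketch only: `crashCommand` and `sendFault` are hypothetical helpers, the command string simply reproduces the format shown in step 2, and the example prints to stdout instead of opening the FIFO so that it runs without LazyFS.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// crashCommand (hypothetical helper) builds a crash fault command in the
// same format as the echo command in step 2.
func crashCommand(timing, op, fromRgx string) string {
	return fmt.Sprintf("lazyfs::crash::timing=%s::op=%s::from_rgx=%s", timing, op, fromRgx)
}

// sendFault writes the command followed by a newline, as echo would.
func sendFault(w io.Writer, cmd string) error {
	_, err := fmt.Fprintln(w, cmd)
	return err
}

func main() {
	cmd := crashCommand("before", "write", "snap/db/file")
	// Printed to stdout here. To reach LazyFS, w would instead be the FIFO,
	// e.g. opened with os.OpenFile("/tmp/faults.fifo", os.O_WRONLY, 0).
	if err := sendFault(os.Stdout, cmd); err != nil {
		fmt.Fprintln(os.Stderr, "send fault:", err)
		os.Exit(1)
	}
}
```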

This fault simulates a power failure before the first write to the snap/db file, in a scenario where none of the file's content was on disk yet. etcd restarts successfully after this power failure, which indicates that etcd is able to start when the snapshot file is empty. Since this problem only occurs when etcd does not have any data yet, there is no issue with this file being empty or non-existent.

So, this problem is easily solved by adding some extra system calls and using the write-to-a-temporary-file-then-rename strategy:

  • Create snap/db.tmp file.
  • Write everything to snap/db.tmp file.
  • Fsync snap/db.tmp.
  • Rename snap/db.tmp to snap/db.
