SlaveDB enters the Error state and never retries if the bgsave operation on the master takes more than 10 seconds at the beginning of the full sync stage #2664

Closed
cheniujh opened this issue May 21, 2024 · 0 comments
Labels
☢️ Bug Something isn't working

Comments

Collaborator

cheniujh commented May 21, 2024

Is this a regression?

No

Description

SlaveDB enters the Error state and does not retry if the bgsave operation on the master takes more than 10 seconds at the start of the full sync stage. As shown in the screenshot below, databases in the Error state never attempt to rebuild the sync connection, which makes it difficult to establish replication between master and slave in multi-DB deployments, or in single-DB deployments with a large amount of data.
image

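At the start of the full sync stage, if the bgsave operation on the master takes more than 10 seconds, SlaveDB enters the Error state and does not retry the pull or reconnect. As shown in the screenshot below, a DB in the Error state never tries to re-establish the master-slave connection and stays in the Error state indefinitely, which makes it hard to establish replication in multi-DB scenarios, or in single-DB scenarios with a large amount of data.
This happens easily in multi-DB scenarios, and it can also happen in a single-DB scenario if the master holds a large amount of data.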
image
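
To make the failure mode concrete, here is a minimal toy model of the behavior described above. It is a sketch, not Pika's actual code: the class `ToySlaveDB`, the state names, the `retry_on_timeout` flag, and the helper `run_full_sync` are all hypothetical names chosen for illustration. Only the 10-second wait and the "Error state is never left" behavior come from this report. The retrying variant shows one possible remedy, not necessarily the fix that was adopted.

```python
from enum import Enum


class ReplState(Enum):
    # Hypothetical state names; they do not mirror Pika's internal enum.
    TRY_SYNC = "try_sync"
    WAIT_DB_SYNC = "wait_db_sync"
    CONNECTED = "connected"
    ERROR = "error"


class ToySlaveDB:
    """Toy model of one slave DB during full sync (illustration only)."""

    def __init__(self, wait_timeout_s=10.0, retry_on_timeout=False):
        self.wait_timeout_s = wait_timeout_s      # 10 s, as stated in the report
        self.retry_on_timeout = retry_on_timeout  # assumed remedy, not confirmed
        self.state = ReplState.TRY_SYNC
        self.attempts = 0

    def step(self, bgsave_remaining_s):
        """Run one sync attempt and return the bgsave time still remaining."""
        if self.state in (ReplState.ERROR, ReplState.CONNECTED):
            # Terminal in this model: nothing ever moves a DB out of ERROR.
            return bgsave_remaining_s
        self.attempts += 1
        self.state = ReplState.WAIT_DB_SYNC
        if bgsave_remaining_s <= self.wait_timeout_s:
            # The master finished its dump within the wait window.
            self.state = ReplState.CONNECTED
            return 0.0
        # The wait timed out while the master was still running bgsave.
        if self.retry_on_timeout:
            self.state = ReplState.TRY_SYNC
        else:
            self.state = ReplState.ERROR
        return bgsave_remaining_s - self.wait_timeout_s


def run_full_sync(db, bgsave_duration_s, max_steps=10):
    """Drive the toy state machine for a fixed number of steps."""
    remaining = bgsave_duration_s
    for _ in range(max_steps):
        remaining = db.step(remaining)
    return db.state


if __name__ == "__main__":
    stuck = ToySlaveDB()
    print(run_full_sync(stuck, 25.0), stuck.attempts)

    retrying = ToySlaveDB(retry_on_timeout=True)
    print(run_full_sync(retrying, 25.0), retrying.attempts)
```

In this model, a 25-second bgsave leaves the non-retrying DB in `ERROR` after a single attempt, while the retrying variant reaches `CONNECTED` on its third attempt.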

Please provide a link to a minimal reproduction of the bug

No response

Screenshots or videos

No response

Please provide the version you discovered this bug in (check about page for version information)

No response

Anything else?

No response
