feat: overload the PosixFileSystemAdaptor interface to generate real snapshot when follower installs snapshot #245
…snapshot when follower installs snapshot
};
node_options_.checkpoint_callback = checkpoint_callback;
snapshot_adaptor_ = new PPosixFileSystemAdaptor();
node_options_.snapshot_file_system_adaptor = &snapshot_adaptor_;
node_options has a parameter for periodic snapshots, snapshot_interval_s; consider setting it to zero.
It is already set to 0.
src/praft/praft.cc
Outdated
  if (!node_) {
    return ERROR_LOG_AND_STATUS("Node is not initialized");
  }
  braft::SynchronizedClosure done;
- node_->snapshot(&done);
+ node_->snapshot(&done);  // @todo self_snapshot_index
  done.wait();
Since we want RocksDB to trigger an event listener that calls this function, a synchronous wait is not necessarily required here.
So consider adding a parameter that controls whether to wait.
OK.
src/praft/psnapshot.cc
Outdated
braft::FileAdaptor* PPosixFileSystemAdaptor::open(const std::string& path, int oflag,
                                                  const ::google::protobuf::Message* file_meta, butil::File::Error* e) {
  // checkpoint callback
  PRAFT.GenerateRealSnapshot();
This drives on_snapshot_save to generate a snapshot. If the system keeps moving forward, won't on_snapshot_save write into a new directory rather than the directory that the current PPosixFileSystemAdaptor::open call is looking for?
Or, by customizing the snapshot index that is passed in, can this interface still overwrite the directory generated last time?
done
…ta file is modified
std::string prefix = "local://" + g_config.dbpath + "_praft";
node_options_.log_uri = prefix + "/log";
node_options_.raft_meta_uri = prefix + "/raft_meta";
node_options_.snapshot_uri = prefix + "/snapshot";
// node_options_.disable_cli = FLAGS_disable_cli;
snapshot_adaptor_ = new PPosixFileSystemAdaptor();
This is allocated with new. Although the allocation happens only once, it is best to have a matching delete.
scoped_refptr<braft::FileSystemAdaptor> snapshot_adaptor_ = nullptr;
It is declared like this, so there should be no need to call delete explicitly, right?
src/praft/praft.cc
Outdated
void PRaft::recursive_copy(const std::filesystem::path& source, const std::filesystem::path& destination) {
  if (std::filesystem::is_regular_file(source)) {
    if (source.filename() == "__raft_snapshot_meta") {
The corresponding macro is BRAFT_SNAPSHOT_META_FILE. Ideally the macro should be used here, but it is defined in a .cpp file.
I see that you define your own macro later on, so use it here.
Right, this one was missed.
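A minimal sketch of the change agreed on in this thread, assuming the project-local macro PBRAFT_SNAPSHOT_META_FILE (used elsewhere in this PR) expands to the same "__raft_snapshot_meta" literal that the hunk above hard-codes. IsSnapshotMetaFile is a hypothetical helper name introduced only for illustration.

```cpp
#include <filesystem>

// Assumed to mirror braft's BRAFT_SNAPSHOT_META_FILE, which is private to a
// braft .cpp file; the literal is the one hard-coded in the reviewed hunk.
#ifndef PBRAFT_SNAPSHOT_META_FILE
#define PBRAFT_SNAPSHOT_META_FILE "__raft_snapshot_meta"
#endif

// Hypothetical helper: true when the last path component is the snapshot
// meta file. Only the file name is compared; the file system is not touched.
inline bool IsSnapshotMetaFile(const std::filesystem::path& source) {
  return source.filename() == PBRAFT_SNAPSHOT_META_FILE;
}
```

With a helper like this, recursive_copy could test IsSnapshotMetaFile(source) instead of repeating the string literal.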
src/praft/psnapshot.cc
Outdated
braft::FileAdaptor* PPosixFileSystemAdaptor::open(const std::string& path, int oflag,
                                                  const ::google::protobuf::Message* file_meta, butil::File::Error* e) {
  if ((oflag & 0x01) == 0) {  // This is a read operation
0x01 should be replaced with a macro. Looking at the code, when the meta file is opened, shouldn't oflag & O_RDONLY | O_CLOEXEC be == 1?
OK. It is defined as "#define O_RDONLY 00", so the result after & should be 0, right?
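The question in this thread can be made concrete with a small sketch. On Linux, O_RDONLY is 0, O_WRONLY is 1 and O_RDWR is 2, so `oflag & O_RDONLY` is always 0, and `(oflag & 0x01) == 0` is also true for O_RDWR. One common alternative is to mask with O_ACCMODE. This is an assumption for illustration, not necessarily what the PR ended up using, and IsReadOnlyOpen is a hypothetical helper name.

```cpp
#include <fcntl.h>

// Hypothetical helper: true only when oflag requests read-only access.
// O_ACCMODE masks out everything except the access-mode bits, so extra flags
// such as O_CLOEXEC or O_CREAT do not affect the result.
inline bool IsReadOnlyOpen(int oflag) {
  return (oflag & O_ACCMODE) == O_RDONLY;
}
```

Unlike the 0x01 test, this rejects O_RDWR, which matters if the snapshot should only be generated for pure read opens.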
src/praft/psnapshot.cc
Outdated
std::string snapshot_path;

// parse snapshot path
if (found_pos != std::string::npos) {
brpc/src/butil/files/file_path.h has a path utility class that can be used directly to handle path information, and it is cross-platform.
OK.
INFO("start generate snapshot");
braft::LocalSnapshotMetaTable snapshot_meta_memtable;
std::string meta_path = snapshot_path + "/" PBRAFT_SNAPSHOT_META_FILE;
braft::FileSystemAdaptor* fs = braft::default_file_system();
It looks like casting this would be enough here?
I tried using this, but it ran into some problems at runtime.
src/praft/psnapshot.cc
Outdated
}

// check whether snapshots have been created
std::lock_guard guard(mutex_);
Code invoked by braft runs in bthreads, so consider using butex.
Sure.
@@ -36,6 +36,8 @@ void RaftNodeCmd::DoCmd(PClient* client) {
    DoCmdAdd(client);
  } else if (!strcasecmp(cmd.c_str(), "REMOVE")) {
    DoCmdRemove(client);
  } else if (!strcasecmp(cmd.c_str(), "DSS")) {
This DSS could be renamed too. It was just an abbreviation picked for convenience at the time, to make generating snapshots easy.
ok
@@ -132,7 +124,7 @@ Status Storage::CreateCheckpoint(const std::string& dump_path, int i) {

  // 3) Create a checkpoint
  std::unique_ptr<rocksdb::Checkpoint> checkpoint_guard(checkpoint);
- s = checkpoint->CreateCheckpoint(tmp_dir, kNoFlush, nullptr);
+ s = checkpoint->CreateCheckpoint(tmp_dir, kFlush, nullptr);
Do we still need to flush on every checkpoint?
Hmm, as things stand it should no longer be needed. This part can actually be changed.
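(Note: the main code changes are in the last commit, feat: overload the PosixFileSystemAdaptor interface to generate real …)
To let routine snapshots do nothing more than advance log truncation, so that the raft log does not grow too large and take up too much disk space, the Node::Snapshot interface in the braft code was changed so that the truncation point of a snapshot can be set explicitly (see PR: https:/pikiwidb/braft/pull/2).
At the same time, when a follower needs to install a snapshot, the leader must be able to send its snapshot to that follower. There are two possible solutions:
(1) Modify the braft code directly (see PR: https:/pikiwidb/braft/pull/3). This approach is not very elegant.
(2) Override the PosixFileSystemAdaptor::open interface. This function is called from FileServiceImpl::get_file (file_service.cpp), shown in the figure below, so we only need to generate the required snapshot data synchronously before the follower actually reads it. In addition, the snapshot truncation point can be set to the minimum, over all Column Families, of the log index that corresponds to the largest SequenceNum already persisted to disk. Only the pikiwidb layer needs to change and braft does not have to be modified at all, which is the more elegant option.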
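The truncation rule in option (2) can be sketched as follows. This is only an illustration under stated assumptions: ColumnFamilyProgress and SafeTruncateIndex are hypothetical names, the mapping from a flushed SequenceNum to a raft log index is assumed to be tracked elsewhere, and 0 is used to mean "nothing can be truncated yet" on the assumption that real log indexes start at 1.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical per-Column-Family record: the raft log index that corresponds
// to the largest SequenceNum already persisted to disk for that family.
struct ColumnFamilyProgress {
  int64_t flushed_log_index;
};

// Returns the minimum flushed log index across all Column Families. Every
// family has persisted at least this much, so the log can be truncated up to
// this index. Returns 0 when the list is empty.
inline int64_t SafeTruncateIndex(const std::vector<ColumnFamilyProgress>& cfs) {
  if (cfs.empty()) {
    return 0;
  }
  auto it = std::min_element(cfs.begin(), cfs.end(),
                             [](const ColumnFamilyProgress& a, const ColumnFamilyProgress& b) {
                               return a.flushed_log_index < b.flushed_log_index;
                             });
  return it->flushed_log_index;
}
```

Taking the minimum rather than the maximum is the conservative choice: a Column Family that lags behind still needs the later log entries to recover its unflushed data.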
leader:
follower
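Test:
Run the save_load.sh script. In the script, the leader performs two rounds of inserts (10000 entries each) and two snapshot truncations. After the second round completes, the snapshot's truncation point is at 20001 but the snapshot data is empty. After the follower joins the cluster, the snapshot with truncation point 20001 is filled with the real snapshot data.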