[#2688] Deterministic Database Shards #3363

Closed. Wants to merge 2 commits.
1 change: 1 addition & 0 deletions Builds/CMake/RippledCore.cmake
@@ -508,6 +508,7 @@ target_sources (rippled PRIVATE
src/ripple/nodestore/impl/DatabaseNodeImp.cpp
src/ripple/nodestore/impl/DatabaseRotatingImp.cpp
src/ripple/nodestore/impl/DatabaseShardImp.cpp
src/ripple/nodestore/impl/DeterministicShard.cpp
src/ripple/nodestore/impl/DecodedBlob.cpp
src/ripple/nodestore/impl/DummyScheduler.cpp
src/ripple/nodestore/impl/EncodedBlob.cpp
4 changes: 2 additions & 2 deletions Builds/CMake/deps/Nudb.cmake
@@ -12,7 +12,7 @@ if (is_root_project) # NuDB not needed in the case of xrpl_core inclusion build
FetchContent_Declare(
nudb_src
GIT_REPOSITORY https:/CPPAlliance/NuDB.git
GIT_TAG 2.0.1
GIT_TAG 2.0.3
)
FetchContent_GetProperties(nudb_src)
if(NOT nudb_src_POPULATED)
@@ -23,7 +23,7 @@ if (is_root_project) # NuDB not needed in the case of xrpl_core inclusion build
ExternalProject_Add (nudb_src
PREFIX ${nih_cache_path}
GIT_REPOSITORY https:/CPPAlliance/NuDB.git
GIT_TAG 2.0.1
GIT_TAG 2.0.3
CONFIGURE_COMMAND ""
BUILD_COMMAND ""
TEST_COMMAND ""
19 changes: 19 additions & 0 deletions src/ripple/nodestore/Backend.h
@@ -58,6 +58,25 @@ class Backend
virtual void
open(bool createIfMissing = true) = 0;

/** Open the backend.
@param createIfMissing Create the database files if necessary.
@param appType Deterministic appType used to create a backend.
@param uid Deterministic uid used to create a backend.
@param salt Deterministic salt used to create a backend.
This allows the caller to catch exceptions.
*/
virtual void
open(
bool createIfMissing,
boost::optional<uint64_t> appType,
boost::optional<uint64_t> uid,
boost::optional<uint64_t> salt)
{
Throw<std::runtime_error>(std::string(
"Deterministic appType/uid/salt not supported by backend " +
getName()));
}

/** Close the backend.
This allows the caller to catch exceptions.
*/
103 changes: 103 additions & 0 deletions src/ripple/nodestore/DeterministicShard.md
@@ -0,0 +1,103 @@
# Deterministic Database Shards

This document describes the standard way to assemble a database shard. A shard assembled this way is deterministic: if two independent parties assemble a shard consisting of the same ledgers, accounts and transactions, they obtain identical shard files `nudb.dat` and `nudb.key`. The approach applies to the `NuDB` database format only; refer to `https:/vinniefalco/NuDB`.


## Headers

As defined by the NuDB file format, the database files use the following headers:

nudb.key:
```
char[8] Type The characters "nudb.key"
uint16 Version Holds the version number
uint64 UID Unique ID generated on creation
uint64 Appnum Application defined constant
uint16 KeySize Key size in bytes
uint64 Salt A random seed
uint64 Pepper The salt hashed
uint16 BlockSize Size of a file block in bytes
uint16 LoadFactor Target fraction in 65536ths
uint8[56] Reserved Zeroes
uint8[] Reserved Zero-pad to block size
```

nudb.dat:
```
char[8] Type The characters "nudb.dat"
uint16 Version Holds the version number
uint64 UID Unique ID generated on creation
uint64 Appnum Application defined constant
uint16 KeySize Key size in bytes
uint8[64] (reserved) Zeroes
```
All fields are stored in network byte order (most significant byte first).
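
As an illustration of the layout above, the following minimal sketch encodes a `nudb.dat` header into a byte buffer in network byte order. The helper names `writeBigEndian` and `encodeDatHeader` are hypothetical and are not part of NuDB or rippled; the field widths are taken directly from the table.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Field widths from the nudb.dat table:
// 8 (Type) + 2 (Version) + 8 (UID) + 8 (Appnum) + 2 (KeySize) + 64 (reserved).
constexpr std::size_t datHeaderSize = 8 + 2 + 8 + 8 + 2 + 64;

// Write the low `bytes` bytes of `value` at `out`, most significant byte first.
// Returns the position just past the written bytes.
inline std::uint8_t*
writeBigEndian(std::uint8_t* out, std::uint64_t value, std::size_t bytes)
{
    for (std::size_t i = 0; i < bytes; ++i)
        *out++ = static_cast<std::uint8_t>(value >> (8 * (bytes - 1 - i)));
    return out;
}

// Hypothetical helper (not a NuDB API): lay out a nudb.dat header in memory.
inline std::array<std::uint8_t, datHeaderSize>
encodeDatHeader(
    std::uint16_t version,
    std::uint64_t uid,
    std::uint64_t appnum,
    std::uint16_t keySize)
{
    std::array<std::uint8_t, datHeaderSize> buf{};  // reserved bytes stay zero
    static constexpr char type[] = "nudb.dat";
    std::uint8_t* p = buf.data();
    for (std::size_t i = 0; i < 8; ++i)
        *p++ = static_cast<std::uint8_t>(type[i]);
    p = writeBigEndian(p, version, 2);
    p = writeBigEndian(p, uid, 8);
    p = writeBigEndian(p, appnum, 8);
    writeBigEndian(p, keySize, 2);
    return buf;
}
```

Because every field has a fixed width and a fixed byte order, two parties that agree on the field values produce byte-identical headers.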

To make the shard deterministic, the header fields of both `nudb.key` and `nudb.dat` are set to the following values:
```
Version 2
UID digest(0)
Appnum digest(2) | 0x5348524400000000 /* 'SHRD' */
KeySize 32
Salt digest(1)
Pepper XXH64(Salt)
BlockSize 0x1000 (4096 bytes)
LoadFactor 0.5 (numeric 0x8000)
```
Note: XXH64() is the 64-bit variant of the xxHash hash algorithm.
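
The two derived constants in this table can be checked with a small sketch. The function names `tagUpper32`, `encodeLoadFactor` and `makeAppnum` are hypothetical, and `digest(2)` is treated here as an opaque 32-bit input.

```cpp
#include <cstdint>

// Pack four ASCII characters into the upper 32 bits of a 64-bit value.
constexpr std::uint64_t
tagUpper32(char a, char b, char c, char d)
{
    return (static_cast<std::uint64_t>(static_cast<unsigned char>(a)) << 56) |
        (static_cast<std::uint64_t>(static_cast<unsigned char>(b)) << 48) |
        (static_cast<std::uint64_t>(static_cast<unsigned char>(c)) << 40) |
        (static_cast<std::uint64_t>(static_cast<unsigned char>(d)) << 32);
}

// 'S' 'H' 'R' 'D' yields the fixed part of Appnum.
constexpr std::uint64_t shardTag = tagUpper32('S', 'H', 'R', 'D');
static_assert(shardTag == 0x5348524400000000ull, "unexpected SHRD tag");

// LoadFactor is a fraction in 65536ths, so 0.5 becomes 0x8000.
// This sketch is only valid for fractions below 1.0.
constexpr std::uint16_t
encodeLoadFactor(double fraction)
{
    return static_cast<std::uint16_t>(fraction * 65536.0);
}
static_assert(encodeLoadFactor(0.5) == 0x8000, "unexpected load factor");

// Appnum: fixed tag in the upper 32 bits, digest(2) in the lower 32 bits.
constexpr std::uint64_t
makeAppnum(std::uint32_t digest2)
{
    return shardTag | digest2;
}
```

Keeping the tag in the upper half leaves the full lower 32 bits free for the shard-dependent part.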

The `digest(i)` mentioned above is defined as follows.

First, the RIPEMD160 hash `H` is calculated over the following structure:
Contributor:

Please comment that all 32-bit integers are hashed in network byte order.

Author:

Done

```
uint256 lastHash Hash of last ledger in shard
uint32 index Index of the shard
uint32 firstSeq Sequence number of first ledger in the shard
uint32 lastSeq Sequence number of last ledger in the shard
uint32 version Version of shard, 2 at the present
```
All 32-bit integers are hashed in network byte order.
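
A minimal sketch of how the input to the hash could be serialized is shown below. It assumes that `lastHash` is fed in as its 32 raw bytes and that the fields are concatenated without padding; the function name `serializeDigestInput` is hypothetical. The RIPEMD160 computation itself is left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <initializer_list>

using Hash256 = std::array<std::uint8_t, 32>;

// Hypothetical helper: build the 48-byte buffer that is fed to RIPEMD160.
// Assumption: fields are concatenated in table order with no padding.
inline std::array<std::uint8_t, 48>
serializeDigestInput(
    Hash256 const& lastHash,
    std::uint32_t index,
    std::uint32_t firstSeq,
    std::uint32_t lastSeq,
    std::uint32_t version)
{
    std::array<std::uint8_t, 48> out{};
    std::size_t pos = 0;
    for (std::uint8_t b : lastHash)
        out[pos++] = b;
    // Each 32-bit integer is written most significant byte first.
    for (std::uint32_t v : {index, firstSeq, lastSeq, version})
    {
        out[pos++] = static_cast<std::uint8_t>(v >> 24);
        out[pos++] = static_cast<std::uint8_t>(v >> 16);
        out[pos++] = static_cast<std::uint8_t>(v >> 8);
        out[pos++] = static_cast<std::uint8_t>(v);
    }
    return out;
}
```

Writing the bytes explicitly, rather than copying a struct, keeps the result independent of the host's endianness and struct padding.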

Then, `digest(i)` is defined as the following portion of the above hash `H`:
```
digest(0) = H[0] << 56 | H[2] << 48 | ... | H[14] << 0,
digest(1) = H[1] << 56 | H[3] << 48 | ... | H[15] << 0,
digest(2) = H[19] << 24 | H[18] << 16 | ... | H[16] << 0,
```
where `H[i]` denotes `i`-th byte of hash `H`.
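
The three digests can be sketched in code as follows, taking the 20-byte hash `H` as given. The type alias `Hash160` and the function names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Hash160 = std::array<std::uint8_t, 20>;

// digest(0): even-indexed bytes H[0], H[2], ..., H[14]; H[0] is most significant.
inline std::uint64_t
digest0(Hash160 const& h)
{
    std::uint64_t r = 0;
    for (std::size_t i = 0; i < 16; i += 2)
        r = (r << 8) | h[i];
    return r;
}

// digest(1): odd-indexed bytes H[1], H[3], ..., H[15]; H[1] is most significant.
inline std::uint64_t
digest1(Hash160 const& h)
{
    std::uint64_t r = 0;
    for (std::size_t i = 1; i < 16; i += 2)
        r = (r << 8) | h[i];
    return r;
}

// digest(2): H[19] << 24 | H[18] << 16 | H[17] << 8 | H[16].
inline std::uint32_t
digest2(Hash160 const& h)
{
    return (static_cast<std::uint32_t>(h[19]) << 24) |
        (static_cast<std::uint32_t>(h[18]) << 16) |
        (static_cast<std::uint32_t>(h[17]) << 8) |
        static_cast<std::uint32_t>(h[16]);
}
```

Together the three values use all 20 bytes of `H`: 8 bytes for UID, 8 for Salt and 4 for the lower half of Appnum.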


@nbougalis (Contributor), Apr 18, 2020:

First question, and this is more theoretical: should we modify NuDB to increase the size of the headers of the key and data files so that both are precisely 512 bytes, increase the size of some fields (specifically the UID field, which should be 256 bits), and allocate a portion of the header as an area for the database owner to store their own metadata?

Now on to more practical matters: I really don't like the way that we're taking a cryptographic hash function, chopping its output up and manipulating it.

If we make the above changes to the NuDB header structure, we can keep using SHA256 and just set UID to the hash, while storing some metadata in the reserved area.

If we can't do that, then rather than chopping up and then recombining the hash, I recommend the following: replace SHA256 with RIPEMD-160 and use it to calculate H over the following data:

uint256         lastHash        Hash of last ledger in shard
uint32          index           Index of the shard
uint32          firstSeq        Sequence number of first ledger in the shard
uint32          lastSeq         Sequence number of last ledger in the shard
uint32          version         Version of shard, 2 at the present

Crack the 160 bit hash H apart into 2 64-bit values A and B and 1 32-bit value C and set:

  1. UID := A;
  2. SALT := B; and
  3. APPNUM := C | 0x5348524400000000. (0x53485244 == 'S' 'H' 'R' 'D').

This keeps the full 160-bit hash.

Author:

Saving the hash of the shard as UID/salt was not required by this ticket; it is my proposal. The ticket recommends setting UID/salt to a known constant + shard_index. Your proposal of using RIPEMD-160 instead of SHA256 is better. I think that changing the NuDB standard is much more difficult because the current standard is used on the main net, if I am not mistaken.

Author:

Done

## Contents

After a deterministic shard is created with the headers described above, it is filled with objects. First, all objects of the shard are collected and sorted by their hashes. The objects are: ledgers, SHAMap tree nodes (including accounts and transactions), and the final key object with hash 0. Objects are sorted in increasing order of their hashes, more precisely in lexicographic order of the hex representations of the hashes.

For example, the following hashes, shown in hex representation, are in sorted order:
```
0000000000000000000000000000000000000000000000000000000000000000
154F29A919B30F50443A241C466691B046677C923EE7905AB97A4DBE8A5C2423
2231553FC01D37A66C61BBEEACBB8C460994493E5659D118E19A8DDBB1444273
272DCBFD8E4D5D786CF11A5444B30FB35435933B5DE6C660AA46E68CF0F5C447
3C062FD9F0BCDCA31ACEBCD8E530D0BDAD1F1D1257B89C435616506A3EE6CB9E
58A0E5AE427CDDC1C7C06448E8C3E4BF718DE036D827881624B20465C3E1334F
...
```

Finally, the objects are added to the shard one by one in sorted order, from the lowest hash to the highest.
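
The ordering rule can be sketched as below. It assumes hashes are held as 32 raw bytes and that the hex representation is fixed-width and upper case, as in the example above; under those assumptions, comparing the raw bytes lexicographically gives the same order as comparing the hex strings. The names `Hash256`, `toHex` and `sortByHash` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <string>
#include <vector>

using Hash256 = std::array<std::uint8_t, 32>;

// Fixed-width upper-case hex representation of a hash.
inline std::string
toHex(Hash256 const& h)
{
    static constexpr char digits[] = "0123456789ABCDEF";
    std::string s;
    s.reserve(2 * h.size());
    for (std::uint8_t b : h)
    {
        s.push_back(digits[b >> 4]);
        s.push_back(digits[b & 0x0F]);
    }
    return s;
}

// std::array compares element by element, i.e. lexicographically by byte,
// which matches the lexicographic order of the hex strings.
inline void
sortByHash(std::vector<Hash256>& hashes)
{
    std::sort(hashes.begin(), hashes.end());
}
```

Sorting before insertion removes any dependence on the order in which a node happened to acquire the objects.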


## Tests

To test the deterministic shards implementation, run the following command:
```
rippled --unittest ripple.NodeStore.DatabaseShard
```

The expected output of the deterministic shards test is:
```
ripple.NodeStore.DatabaseShard DatabaseShard deterministic_shard with backend nudb
Iteration 0: RIPEMD160[nudb.key] = 4CFA8985836B549EC99D2E9705707F488DC91E4E
Iteration 0: RIPEMD160[nudb.dat] = 8CC61F503C36339803F8C2FC652C1102DDB889F1
Iteration 1: RIPEMD160[nudb.key] = 4CFA8985836B549EC99D2E9705707F488DC91E4E
Iteration 1: RIPEMD160[nudb.dat] = 8CC61F503C36339803F8C2FC652C1102DDB889F1
```

35 changes: 30 additions & 5 deletions src/ripple/nodestore/backend/NuDBFactory.cpp
@@ -38,7 +38,10 @@ namespace NodeStore {
class NuDBBackend : public Backend
{
public:
static constexpr std::size_t currentType = 1;
static constexpr std::uint64_t currentType = 1;
static constexpr std::uint64_t deterministicType = 0x5348524400000000ull;
/* "SHRD" in ASCII */
static constexpr std::uint64_t deterministicMask = 0xFFFFFFFF00000000ull;

beast::Journal const j_;
size_t const keyBytes_;
@@ -93,7 +96,11 @@ class NuDBBackend : public Backend
}

void
open(bool createIfMissing) override
open(
bool createIfMissing,
boost::optional<uint64_t> appType,
boost::optional<uint64_t> uid,
boost::optional<uint64_t> salt) override
{
using namespace boost::filesystem;
if (db_.is_open())
@@ -114,8 +121,9 @@ class NuDBBackend : public Backend
dp,
kp,
lp,
currentType,
nudb::make_salt(),
appType.value_or(currentType),
uid.value_or(nudb::make_uid()),
salt.value_or(nudb::make_salt()),
keyBytes_,
nudb::block_size(kp),
0.50,
@@ -128,10 +136,27 @@ class NuDBBackend : public Backend
db_.open(dp, kp, lp, ec);
if (ec)
Throw<nudb::system_error>(ec);
if (db_.appnum() != currentType)

/** Old value currentType is accepted for appnum in traditional
* databases, new value is used for deterministic shard databases.
* New 64-bit value is constructed from fixed and random parts.
* Fixed part is bounded by bitmask deterministicMask,
* and the value of fixed part is deterministicType.
* Random part depends on the contents of the shard and may be any.
* The contents of appnum field should match either old or new rule.
*/
if (db_.appnum() != appType.value_or(currentType) &&
(appType ||
(db_.appnum() & deterministicMask) != deterministicType))
Throw<std::runtime_error>("nodestore: unknown appnum");
}

void
open(bool createIfMissing) override
{
open(createIfMissing, boost::none, boost::none, boost::none);
}

void
close() override
{
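
The appnum acceptance rule in the `open` hunk above can be restated as a standalone sketch. The constants are copied from the diff; the overloaded function `appnumAccepted` is hypothetical and replaces the `boost::optional` parameter with two overloads so that the example needs no external library.

```cpp
#include <cstdint>

constexpr std::uint64_t currentType = 1;
constexpr std::uint64_t deterministicType = 0x5348524400000000ull;
constexpr std::uint64_t deterministicMask = 0xFFFFFFFF00000000ull;

// The caller requested a specific appnum: only an exact match is accepted.
constexpr bool
appnumAccepted(std::uint64_t stored, std::uint64_t requested)
{
    return stored == requested;
}

// No appnum was requested: accept the traditional value, or any value
// whose upper 32 bits carry the 'SHRD' tag.
constexpr bool
appnumAccepted(std::uint64_t stored)
{
    return stored == currentType ||
        (stored & deterministicMask) == deterministicType;
}
```

The backend throws "nodestore: unknown appnum" exactly when the corresponding check here would return `false`.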