Commit

improve doc and cleanup
yihuang committed Jan 30, 2023
1 parent 37d899f commit f75c310
Showing 3 changed files with 71 additions and 23 deletions.
61 changes: 51 additions & 10 deletions versiondb/README.md
@@ -19,7 +19,7 @@ streamers = ["versiondb"]

On startup, the node will create a `StreamingService` to subscribe to the latest state changes in real time and save them to versiondb. The db instance is placed in the `$NODE_HOME/data/versiondb` directory; there's currently no way to customize the db path. It'll also switch the grpc query service's backing store from the iavl tree to versiondb, so you should migrate the legacy states in advance to make the transition smooth; otherwise, the grpc queries can't see the legacy versions.

If the versiondb is not empty and it's latest version doesn't match the multistore's last committed version, the startup will fail with error message `"versiondb lastest version %d doesn't match iavl latest version %d"`, that's to avoid creating gaps in versiondb accidentally. When this error happens, you just need to update versiondb to the latest version in iavl tree manually.
If versiondb is not empty and its latest version doesn't match the multistore's last committed version, startup will fail with the error message `"versiondb lastest version %d doesn't match iavl latest version %d"`; this avoids accidentally creating gaps in versiondb. When this error happens, you just need to manually update versiondb to the latest version in the iavl tree (see [Catch Up With IAVL Tree](#catch-up-with-iavl-tree)).

## Migration

@@ -32,8 +32,9 @@ The legacy state migration process is done in two main steps:

### Extract Change Sets

```
$ cronosd changeset dump --home /chain/.cronosd/ --output data distribution acc authz bank capability cronos evidence evm feegrant feeibc feemarket gov ibc mint params slashing staking transfer upgrade
```bash
$ export STORES="distribution acc authz bank capability cronos evidence evm feegrant feeibc feemarket gov ibc mint params slashing staking transfer upgrade"
$ cronosd changeset dump --home /chain/.cronosd/ --output data $STORES
```

The `dump` command extracts the change sets from the IAVL tree and writes each store to a separate directory. The change set files are segmented into chunks and compressed with zlib level 6 by default; the chunk size defaults to 1m blocks. The resulting `data` directory will look like this:
Expand All @@ -51,9 +52,11 @@ data/authz/block-2000000.zz

Extraction is the slowest step; the test run on a testnet archive node takes around 11 hours on an 8-core SSD machine. Fortunately, the change set files can be verified pretty fast, so they can be shared on a CDN in a trustless manner; normal users can just download them from the CDN and verify their correctness locally, which should be much faster than extracting them yourself.

For the rocksdb backend, the `dump` command opens the db in read-only mode, so it can run on a live node's db; the goleveldb backend doesn't support this.

#### Verify Change Sets

```
```bash
$ cronosd changeset verify data/acc/*.zz
7130689 8DF52D6F7A7690916894AF67B07D64B678FB686626B2B3109813BBE172E74F08
```
@@ -64,7 +67,7 @@
The `verify` command takes several minutes and several gigabytes of RAM to run. If RAM usage is a problem, it can also run incrementally: export a snapshot at an intermediate version, then verify the remaining versions starting from that snapshot:

```
```bash
$ cronosd changeset verify --save-snapshot /tmp/snapshot data/acc/block-0.zz data/acc/block-1000000.zz data/acc/block-2000000.zz
$ cronosd changeset verify --load-snapshot /tmp/snapshot data/acc/block-3000000.zz data/acc/block-4000000.zz data/acc/block-5000000.zz
```
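The batching above can be generated mechanically. A hedged Python sketch (the batch size and snapshot path are illustrative, not defaults of the tool) that builds the incremental `verify` invocations:

```python
# Generate incremental `cronosd changeset verify` invocations: each batch
# saves a snapshot for the next batch to resume from. Batch size and
# snapshot path are illustrative assumptions.

def verify_batches(files, batch_size=3, snapshot_dir="/tmp/snapshot"):
    cmds = []
    for i in range(0, len(files), batch_size):
        batch = files[i:i + batch_size]
        flags = []
        if i > 0:
            # resume from the snapshot saved by the previous batch
            flags.append(f"--load-snapshot {snapshot_dir}")
        if i + batch_size < len(files):
            # more batches follow, so save a snapshot for them
            flags.append(f"--save-snapshot {snapshot_dir}")
        cmds.append("cronosd changeset verify " + " ".join(flags + batch))
    return cmds
```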
@@ -75,13 +78,20 @@ The format of change set files is documented [here](memiavl/README.md#change-se

#### SST File Writing

To maximize the speed of initial data ingestion into rocksdb, we take advantage of the sst file writer in rocksdb, with that we can write out sst files directly without causing contention on a shared database, the sst files for each store can be written out in parallel. We also developed an external sorting algorithm to sort the data before writing the sst files, so the sst files don't cause extra compaction in the final db.
To maximize the speed of initial data ingestion into rocksdb, we take advantage of rocksdb's sst file writer, which lets us write out sst files directly without contention on a shared database; the sst files for each store can be written out in parallel. We also developed an external sorting algorithm to sort the data before writing the sst files, so the sst files don't overlap and can be ingested into the bottom-most level of the db.
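The no-overlap guarantee comes from globally sorting the keys first. A minimal external merge sort sketch in Python (illustrative only, not the actual versiondb implementation): sort bounded runs in memory, spill each run to disk, then k-way merge the runs into one globally sorted stream:

```python
import heapq
import itertools
import os
import pickle
import tempfile

def external_sort(items, run_size=4):
    """Sort an iterable using bounded memory: spill sorted runs to disk,
    then k-way merge them (the idea behind overlap-free sst generation)."""
    run_files = []
    it = iter(items)
    while True:
        # sort one bounded run in memory
        run = sorted(itertools.islice(it, run_size))
        if not run:
            break
        f = tempfile.NamedTemporaryFile(delete=False)
        pickle.dump(run, f)
        f.close()
        run_files.append(f.name)

    def read_run(path):
        with open(path, "rb") as f:
            yield from pickle.load(f)

    # k-way merge of the sorted runs yields a globally sorted stream
    merged = list(heapq.merge(*(read_run(p) for p in run_files)))
    for p in run_files:
        os.remove(p)
    return merged

print(external_sort([5, 1, 9, 3, 8, 2, 7]))  # [1, 2, 3, 5, 7, 8, 9]
```

Because the output is globally sorted, consecutive sst files written from it cover disjoint key ranges and never overlap.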

```
```bash
$ # convert a single store
$ cronosd changeset convert-to-sst --store distribution ./sst/distribution.sst data/distribution/*.zz

$ # convert all stores sequentially
$ for store in $STORES
> do
> cronosd changeset convert-to-sst --store $store ./sst/$store.sst data/$store/*.zz
> done
```

`convert-to-sst` will do the sst file writing for a single store, you can wrap it in a simple script to run multiple stores in parallel. Here's an example Python script:
You can also wrap it in a simple script to run multiple stores in parallel. Here's an example Python script:

```python
import os
@@ -132,11 +142,42 @@ The provided python script can finish in around 20 minutes for testnet archive node.

Finally, we can ingest the generated sst files into final versiondb:

```
```bash
$ cronosd changeset ingest-sst /chain/.cronosd/data/versiondb/ sst/*.sst --maximum-version 7130689 --move-files
```

This command takes around 1 second to finish. `--move-files` will move the sst files instead of copying them; `--maximum-version` specifies the maximum version in the change sets and will override the existing latest version if it's bigger. The sst files are placed at the bottom-most level possible, because the generation step makes sure there's no key overlap between them.

#### Catch Up With IAVL Tree

If a non-empty versiondb lags behind the current IAVL tree, the node will refuse to start up; in this case the user needs to sync them manually. The steps are quite similar to the migration process from genesis:

- Stop the node so it doesn't process new blocks.

- Dump change sets for the block range between the latest versions in versiondb and the iavl tree; just set the `--start-version` parameter to versiondb's latest version plus one:

```bash
$ cronosd changeset dump --home /chain/.cronosd/ --output /tmp/data --start-version 7241675 $STORES
```

- Feed the change sets to versiondb. Here we can skip the sst file writing, which is meant for parallel processing of a big amount of change sets; since we are dealing with a much smaller amount here, this should be fast enough:

```bash
$ for store in $STORES
> do
> cronosd changeset to-versiondb /chain/.cronosd/data/versiondb /tmp/data/$store/*.zz --store $store
> done
```

- Finally, use the `ingest-sst` command to update the latest version; just don't pass any sst files:

```bash
$ cronosd changeset ingest-sst /chain/.cronosd/data/versiondb/ --maximum-version 7300922
```

Of course, you can always follow the sst file generation and ingestion process if the data amount is big.
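Putting the catch-up steps together, a hedged Python sketch that only builds the command strings without running them (the home and output paths, version numbers, and the trimmed-down store list are illustrative assumptions):

```python
# Build (but do not run) the cronosd commands for the catch-up flow
# described above. Paths, versions, and the store list are illustrative.

HOME = "/chain/.cronosd"
OUT = "/tmp/data"
STORES = ["acc", "bank", "staking"]  # the real list has many more stores

def catchup_commands(versiondb_latest, iavl_latest, stores=STORES):
    start = versiondb_latest + 1  # resume right after versiondb's tip
    cmds = [
        f"cronosd changeset dump --home {HOME} --output {OUT} "
        f"--start-version {start} " + " ".join(stores)
    ]
    for store in stores:
        # feed each store's change sets directly, skipping sst generation
        cmds.append(
            f"cronosd changeset to-versiondb {HOME}/data/versiondb "
            f"{OUT}/{store}/*.zz --store {store}"
        )
    # no sst files passed: ingest-sst only bumps the latest version
    cmds.append(
        f"cronosd changeset ingest-sst {HOME}/data/versiondb/ "
        f"--maximum-version {iavl_latest}"
    )
    return cmds

for cmd in catchup_commands(7241674, 7300922):
    print(cmd)
```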
[^1]: https://github.com/facebook/rocksdb/wiki/User-defined-Timestamp-%28Experimental%29
6 changes: 3 additions & 3 deletions versiondb/client/ingest_sst.go
@@ -11,9 +11,9 @@ import (

func IngestSSTCmd() *cobra.Command {
cmd := &cobra.Command{
Use: "ingest-sst db-path file1.sst file2.sst ...",
Short: "Ingest sst files into versiondb",
Args: cobra.MinimumNArgs(2),
Use: "ingest-sst db-path [file1.sst file2.sst ...]",
Short: "Ingest sst files into versiondb and set latest version",
Args: cobra.MinimumNArgs(1),
RunE: func(cmd *cobra.Command, args []string) error {
dbPath := args[0]
moveFiles, err := cmd.Flags().GetBool(flagMoveFiles)
27 changes: 17 additions & 10 deletions versiondb/tsrocksdb/store_test.go
@@ -17,39 +17,46 @@ func TestTSVersionDB(t *testing.T) {
})
}

func TestDebug(t *testing.T) {
store, err := NewStore(t.TempDir())
func TestUserTimestamp(t *testing.T) {
db, cfHandle, err := OpenVersionDB(t.TempDir())
require.NoError(t, err)

var ts [8]byte
binary.LittleEndian.PutUint64(ts[:], 1000)

err = store.db.PutCFWithTS(grocksdb.NewDefaultWriteOptions(), store.cfHandle, []byte("hello"), ts[:], []byte{1})
err = db.PutCFWithTS(grocksdb.NewDefaultWriteOptions(), cfHandle, []byte("hello"), ts[:], []byte{1})
require.NoError(t, err)
err = db.PutCFWithTS(grocksdb.NewDefaultWriteOptions(), cfHandle, []byte("zempty"), ts[:], []byte{})
require.NoError(t, err)

v := int64(999)
bz, err := store.db.GetCF(newTSReadOptions(&v), store.cfHandle, []byte("hello"))
defer bz.Free()
bz, err := db.GetCF(newTSReadOptions(&v), cfHandle, []byte("hello"))
require.NoError(t, err)
require.False(t, bz.Exists())
bz.Free()

bz, err = store.db.GetCF(newTSReadOptions(nil), store.cfHandle, []byte("hello"))
defer bz.Free()
bz, err = db.GetCF(newTSReadOptions(nil), cfHandle, []byte("hello"))
require.NoError(t, err)
require.Equal(t, []byte{1}, bz.Data())
bz.Free()

v = int64(1000)
it := store.db.NewIteratorCF(newTSReadOptions(&v), store.cfHandle)
it := db.NewIteratorCF(newTSReadOptions(&v), cfHandle)
it.SeekToFirst()
require.True(t, it.Valid())
require.Equal(t, []byte("hello"), it.Key().Data())

bz, err = db.GetCF(newTSReadOptions(&v), cfHandle, []byte("zempty"))
require.NoError(t, err)
require.Equal(t, []byte{}, bz.Data())
bz.Free()

binary.LittleEndian.PutUint64(ts[:], 1002)
err = store.db.PutCFWithTS(grocksdb.NewDefaultWriteOptions(), store.cfHandle, []byte("hella"), ts[:], []byte{2})
err = db.PutCFWithTS(grocksdb.NewDefaultWriteOptions(), cfHandle, []byte("hella"), ts[:], []byte{2})
require.NoError(t, err)

v = int64(1002)
it = store.db.NewIteratorCF(newTSReadOptions(&v), store.cfHandle)
it = db.NewIteratorCF(newTSReadOptions(&v), cfHandle)
it.SeekToFirst()
require.True(t, it.Valid())
require.Equal(t, []byte("hella"), it.Key().Data())
