From 60588afc9eaae23d9e8da3996f2fe7e101b5721d Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Fri, 20 Sep 2024 15:40:45 +0200 Subject: [PATCH] docs(config): improve profile descriptions (#10517) * docs(config): improve profile descriptions * chore: move profiles and types to the end --- config/profile.go | 15 +- docs/config.md | 399 ++++++++++++++++++++++++--------------------- docs/datastores.md | 6 + 3 files changed, 224 insertions(+), 196 deletions(-) diff --git a/config/profile.go b/config/profile.go index c73b05af27c..63e76d8ec1b 100644 --- a/config/profile.go +++ b/config/profile.go @@ -124,7 +124,7 @@ This profile may only be applied when first initializing the node. "flatfs": { Description: `Configures the node to use the flatfs datastore. -This is the most battle-tested and reliable datastore. +This is the most battle-tested and reliable datastore. You should use this datastore if: * You need a very simple and very reliable datastore, and you trust your @@ -145,21 +145,20 @@ This profile may only be applied when first initializing the node. }, }, "badgerds": { - Description: `Configures the node to use the experimental badger datastore. + Description: `Configures the node to use the legacy badgerv1 datastore. -Use this datastore if some aspects of performance, -especially the speed of adding many gigabytes of files, are critical. -However, be aware that: +NOTE: this is badger 1.x, which has known bugs and is no longer supported by the upstream team. +It is provided here only for pre-existing users, allowing them to migrate away to more modern datastore. + +Other caveats: * This datastore will not properly reclaim space when your datastore is smaller than several gigabytes. If you run IPFS with --enable-gc, you plan on storing very little data in your IPFS node, and disk usage is more critical than performance, consider using flatfs. -* This datastore uses up to several gigabytes of memory. +* This datastore uses up to several gigabytes of memory. * Good for medium-size datastores, but may run into performance issues if your dataset is bigger than a terabyte. -* The current implementation is based on old badger 1.x - which is no longer supported by the upstream team. This profile may only be applied when first initializing the node.`, diff --git a/docs/config.md b/docs/config.md index 107050927b7..26f712d3a05 100644 --- a/docs/config.md +++ b/docs/config.md @@ -9,16 +9,6 @@ config file at runtime. - [The Kubo config file](#the-kubo-config-file) - [Table of Contents](#table-of-contents) - - [Profiles](#profiles) - - [Types](#types) - - [`flag`](#flag) - - [`priority`](#priority) - - [`strings`](#strings) - - [`duration`](#duration) - - [`optionalInteger`](#optionalinteger) - - [`optionalBytes`](#optionalbytes) - - [`optionalString`](#optionalstring) - - [`optionalDuration`](#optionalduration) - [`Addresses`](#addresses) - [`Addresses.API`](#addressesapi) - [`Addresses.Gateway`](#addressesgateway) @@ -184,184 +174,26 @@ config file at runtime. - [`Version.AgentSuffix`](#versionagentsuffix) - [`Version.SwarmCheckEnabled`](#versionswarmcheckenabled) - [`Version.SwarmCheckPercentThreshold`](#versionswarmcheckpercentthreshold) - -## Profiles - -Configuration profiles allow to tweak configuration quickly. Profiles can be -applied with the `--profile` flag to `ipfs init` or with the `ipfs config profile -apply` command. When a profile is applied a backup of the configuration file -will be created in `$IPFS_PATH`. - -The available configuration profiles are listed below. You can also find them -documented in `ipfs config profile --help`. - -- `server` - - Disables local host discovery, recommended when - running IPFS on machines with public IPv4 addresses. - -- `randomports` - - Use a random port number for the incoming swarm connections. - -- `default-datastore` - - Configures the node to use the default datastore (flatfs). - - Read the "flatfs" profile description for more information on this datastore. - - This profile may only be applied when first initializing the node. - -- `local-discovery` - - Enables local discovery (enabled by default). Useful to re-enable local discovery after it's - disabled by another profile (e.g., the server profile). - -- `test` - - Reduces external interference of IPFS daemon, this - is useful when using the daemon in test environments. - -- `default-networking` - - Restores default network settings. - Inverse profile of the test profile. - -- `flatfs` - - Configures the node to use the flatfs datastore. Flatfs is the default datastore. - - This is the most battle-tested and reliable datastore. - You should use this datastore if: - - - You need a very simple and very reliable datastore, and you trust your - filesystem. This datastore stores each block as a separate file in the - underlying filesystem so it's unlikely to lose data unless there's an issue - with the underlying file system. - - You need to run garbage collection in a way that reclaims free space as soon as possible. - - You want to minimize memory usage. - - You are ok with the default speed of data import, or prefer to use `--nocopy`. - - This profile may only be applied when first initializing the node. - - -- `badgerds` - - Configures the node to use the experimental badger datastore. Keep in mind that this **uses an outdated badger 1.x**. - - Use this datastore if some aspects of performance, - especially the speed of adding many gigabytes of files, are critical. However, be aware that: - - - This datastore will not properly reclaim space when your datastore is - smaller than several gigabytes. If you run IPFS with `--enable-gc`, you plan on storing very little data in - your IPFS node, and disk usage is more critical than performance, consider using - `flatfs`. - - This datastore uses up to several gigabytes of memory. - - Good for medium-size datastores, but may run into performance issues if your dataset is bigger than a terabyte. - - The current implementation is based on old badger 1.x which is no longer supported by the upstream team. - - This profile may only be applied when first initializing the node. - -- `lowpower` - - Reduces daemon overhead on the system. Affects node - functionality - performance of content discovery and data - fetching may be degraded. Local data won't be announced on routing systems like Amino DHT. - - - `Swarm.ConnMgr` set to maintain minimum number of p2p connections at a time. - - Disables [`Reprovider`](#reprovider) service → no CID will be announced on Amino DHT and other routing systems(!) - - Disables AutoNAT. - - Use this profile with caution. - -- `legacy-cid-v0` - - Makes UnixFS import (`ipfs add`) produce legacy CIDv0 with no raw leaves, sha2-256 and 256 KiB chunks. - - > [!WARNING] - > This profile is provided for legacy users and should not be used for new projects. - -- `test-cid-v1` - - Makes UnixFS import (`ipfs add`) produce modern CIDv1 with raw leaves, sha2-256 and 1 MiB chunks. - - > [!NOTE] - > This profile will become the new implicit default, provided for testing purposes. - > Follow [kubo#4143](https://github.com/ipfs/kubo/issues/4143) for more details. - -## Types - -This document refers to the standard JSON types (e.g., `null`, `string`, -`number`, etc.), as well as a few custom types, described below. - -### `flag` - -Flags allow enabling and disabling features. However, unlike simple booleans, -they can also be `null` (or omitted) to indicate that the default value should -be chosen. This makes it easier for Kubo to change the defaults in the -future unless the user _explicitly_ sets the flag to either `true` (enabled) or -`false` (disabled). Flags have three possible states: - -- `null` or missing (apply the default value). -- `true` (enabled) -- `false` (disabled) - -### `priority` - -Priorities allow specifying the priority of a feature/protocol and disabling the -feature/protocol. Priorities can take one of the following values: - -- `null`/missing (apply the default priority, same as with flags) -- `false` (disabled) -- `1 - 2^63` (priority, lower is preferred) - -### `strings` - -Strings is a special type for conveniently specifying a single string, an array -of strings, or null: - -- `null` -- `"a single string"` -- `["an", "array", "of", "strings"]` - -### `duration` - -Duration is a type for describing lengths of time, using the same format go -does (e.g, `"1d2h4m40.01s"`). - -### `optionalInteger` - -Optional integers allow specifying some numerical value which has -an implicit default when missing from the config file: - -- `null`/missing will apply the default value defined in Kubo sources (`.WithDefault(value)`) -- an integer between `-2^63` and `2^63-1` (i.e. `-9223372036854775808` to `9223372036854775807`) - -### `optionalBytes` - -Optional Bytes allow specifying some number of bytes which has -an implicit default when missing from the config file: - -- `null`/missing (apply the default value defined in Kubo sources) -- a string value indicating the number of bytes, including human readable representations: - - [SI sizes](https://en.wikipedia.org/wiki/Metric_prefix#List_of_SI_prefixes) (metric units, powers of 1000), e.g. `1B`, `2kB`, `3MB`, `4GB`, `5TB`, …) - - [IEC sizes](https://en.wikipedia.org/wiki/Binary_prefix#IEC_prefixes) (binary units, powers of 1024), e.g. `1B`, `2KiB`, `3MiB`, `4GiB`, `5TiB`, …) - -### `optionalString` - -Optional strings allow specifying some string value which has -an implicit default when missing from the config file: - -- `null`/missing will apply the default value defined in Kubo sources (`.WithDefault("value")`) -- a string - -### `optionalDuration` - -Optional durations allow specifying some duration value which has -an implicit default when missing from the config file: - -- `null`/missing will apply the default value defined in Kubo sources (`.WithDefault("1h2m3s")`) -- a string with a valid [go duration](#duration) (e.g, `"1d2h4m40.01s"`). + - [Profiles](#profiles) + - [`server` profile](#server-profile) + - [`randomports` profile](#randomports-profile) + - [`default-datastore` profile](#default-datastore-profile) + - [`local-discovery` profile](#local-discovery-profile) + - [`default-networking` profile](#default-networking-profile) + - [`flatfs` profile](#flatfs-profile) + - [`badgerds` profile](#badgerds-profile) + - [`lowpower` profile](#lowpower-profile) + - [`legacy-cid-v0` profile](#legacy-cid-v0-profile) + - [`test-cid-v1` profile](#test-cid-v1-profile) + - [Types](#types) + - [`flag`](#flag) + - [`priority`](#priority) + - [`strings`](#strings) + - [`duration`](#duration) + - [`optionalInteger`](#optionalinteger) + - [`optionalBytes`](#optionalbytes) + - [`optionalString`](#optionalstring) + - [`optionalDuration`](#optionalduration) ## `Addresses` @@ -2491,3 +2323,194 @@ trigger update warning. Default: `5` Type: `optionalInteger` (1-100) + +## Profiles + +Configuration profiles allow to tweak configuration quickly. Profiles can be +applied with the `--profile` flag to `ipfs init` or with the `ipfs config profile +apply` command. When a profile is applied a backup of the configuration file +will be created in `$IPFS_PATH`. + +Configuration profiles can be applied additively. For example, both the `test-cid-v1` and `lowpower` profiles can be applied one after the other. +The available configuration profiles are listed below. You can also find them +documented in `ipfs config profile --help`. + +### `server` profile + +Disables local [`Discovery.MDNS`](#discoverymdns) and blocks connections to +IPv4 and IPv6 prefixes that are [private, local only, or unrouteable](https://github.com/ipfs/kubo/blob/b71cf0d15904bdef21fe2eee5f1118a274309a4d/config/profile.go#L24-L43). + +Recommended when running IPFS on machines with public IPv4 addresses +at providers that interpret local IPFS discovery and traffic as netscan abuse ([example](https://github.com/ipfs/kubo/issues/10327)). + +### `randomports` profile + +Use a random port number for the incoming swarm connections. +Used for testing. + +### `default-datastore` profile + +Configures the node to use the default datastore (flatfs). + +Read the "flatfs" profile description for more information on this datastore. + +This profile may only be applied when first initializing the node. + +### `local-discovery` profile + +Enables local [`Discovery.MDNS`](#discoverymdns) (enabled by default). + +Useful to re-enable local discovery after it's disabled by another profile +(e.g., the server profile). + +`test` profile + +Reduces external interference of IPFS daemon, this +is useful when using the daemon in test environments. + +### `default-networking` profile + +Restores default network settings. +Inverse profile of the test profile. + +### `flatfs` profile + +Configures the node to use the flatfs datastore. Flatfs is the default datastore. + +This is the most battle-tested and reliable datastore. +You should use this datastore if: + +- You need a very simple and very reliable datastore, and you trust your + filesystem. This datastore stores each block as a separate file in the + underlying filesystem so it's unlikely to lose data unless there's an issue + with the underlying file system. +- You need to run garbage collection in a way that reclaims free space as soon as possible. +- You want to minimize memory usage. +- You are ok with the default speed of data import, or prefer to use `--nocopy`. + +This profile may only be applied when first initializing the node. + +### `badgerds` profile + +Configures the node to use the legacy badgerv1 datastore. + +> [!CAUTION] +> This is based on very old badger 1.x, which has known bugs and is no longer supported by the upstream team. +> It is provided here only for pre-existing users, allowing them to migrate away to more modern datastore. +> Do not use it for new deployments, unless you really, really know what you are doing. + +Also, be aware that: + +- This datastore will not properly reclaim space when your datastore is + smaller than several gigabytes. If you run IPFS with `--enable-gc`, you plan on storing very little data in + your IPFS node, and disk usage is more critical than performance, consider using + `flatfs`. +- This datastore uses up to several gigabytes of memory. +- Good for medium-size datastores, but may run into performance issues if your dataset is bigger than a terabyte. +- The current implementation is based on old badger 1.x which is no longer supported by the upstream team. + +This profile may only be applied when first initializing the node. + +### `lowpower` profile + +Reduces daemon overhead on the system. Affects node +functionality - performance of content discovery and data +fetching may be degraded. + +> [!CAUTION] +> Local data won't be announced on routing systems like Amino DHT. + +- `Swarm.ConnMgr` set to maintain minimum number of p2p connections at a time. +- Disables [`Reprovider`](#reprovider) service → no CID will be announced on Amino DHT and other routing systems(!) +- Disables [`AutoNAT`](#autonat). + +Use this profile with caution. + +### `legacy-cid-v0` profile + +Makes UnixFS import (`ipfs add`) produce legacy CIDv0 with no raw leaves, sha2-256 and 256 KiB chunks. + +> [!NOTE] +> This profile is provided for legacy users and should not be used for new projects. + +### `test-cid-v1` profile + +Makes UnixFS import (`ipfs add`) produce modern CIDv1 with raw leaves, sha2-256 and 1 MiB chunks. + +> [!NOTE] +> This profile will become the new implicit default, provided for testing purposes. +> Follow [kubo#4143](https://github.com/ipfs/kubo/issues/4143) for more details. + +## Types + +This document refers to the standard JSON types (e.g., `null`, `string`, +`number`, etc.), as well as a few custom types, described below. + +### `flag` + +Flags allow enabling and disabling features. However, unlike simple booleans, +they can also be `null` (or omitted) to indicate that the default value should +be chosen. This makes it easier for Kubo to change the defaults in the +future unless the user _explicitly_ sets the flag to either `true` (enabled) or +`false` (disabled). Flags have three possible states: + +- `null` or missing (apply the default value). +- `true` (enabled) +- `false` (disabled) + +### `priority` + +Priorities allow specifying the priority of a feature/protocol and disabling the +feature/protocol. Priorities can take one of the following values: + +- `null`/missing (apply the default priority, same as with flags) +- `false` (disabled) +- `1 - 2^63` (priority, lower is preferred) + +### `strings` + +Strings is a special type for conveniently specifying a single string, an array +of strings, or null: + +- `null` +- `"a single string"` +- `["an", "array", "of", "strings"]` + +### `duration` + +Duration is a type for describing lengths of time, using the same format go +does (e.g, `"1d2h4m40.01s"`). + +### `optionalInteger` + +Optional integers allow specifying some numerical value which has +an implicit default when missing from the config file: + +- `null`/missing will apply the default value defined in Kubo sources (`.WithDefault(value)`) +- an integer between `-2^63` and `2^63-1` (i.e. `-9223372036854775808` to `9223372036854775807`) + +### `optionalBytes` + +Optional Bytes allow specifying some number of bytes which has +an implicit default when missing from the config file: + +- `null`/missing (apply the default value defined in Kubo sources) +- a string value indicating the number of bytes, including human readable representations: + - [SI sizes](https://en.wikipedia.org/wiki/Metric_prefix#List_of_SI_prefixes) (metric units, powers of 1000), e.g. `1B`, `2kB`, `3MB`, `4GB`, `5TB`, …) + - [IEC sizes](https://en.wikipedia.org/wiki/Binary_prefix#IEC_prefixes) (binary units, powers of 1024), e.g. `1B`, `2KiB`, `3MiB`, `4GiB`, `5TiB`, …) + +### `optionalString` + +Optional strings allow specifying some string value which has +an implicit default when missing from the config file: + +- `null`/missing will apply the default value defined in Kubo sources (`.WithDefault("value")`) +- a string + +### `optionalDuration` + +Optional durations allow specifying some duration value which has +an implicit default when missing from the config file: + +- `null`/missing will apply the default value defined in Kubo sources (`.WithDefault("1h2m3s")`) +- a string with a valid [go duration](#duration) (e.g, `"1d2h4m40.01s"`). diff --git a/docs/datastores.md b/docs/datastores.md index ccedce4c715..c18ecb0c7c5 100644 --- a/docs/datastores.md +++ b/docs/datastores.md @@ -39,6 +39,12 @@ Uses a leveldb database to store key value pairs. Uses [badger](https://github.com/dgraph-io/badger) as a key value store. +> [!CAUTION] +> This is based on very old badger 1.x, which has known bugs and is no longer supported by the upstream team. +> It is provided here only for pre-existing users, allowing them to migrate away to more modern datastore. +> Do not use it for new deployments, unless you really, really know what you are doing. + + * `syncWrites`: Flush every write to disk before continuing. Setting this to false is safe as kubo will automatically flush writes to disk before and after performing critical operations like pinning. However, you can set this to true to be extra-safe (at the cost of a 2-3x slowdown when adding files). * `truncate`: Truncate the DB if a partially written sector is found (defaults to true). There is no good reason to set this to false unless you want to manually recover partially written (and unpinned) blocks if kubo crashes half-way through adding a file.