CodeCamp #147 [Doc] Add Chinese version of train & test tutorial #2355

Merged

MeowZheng merged 5 commits into open-mmlab:dev-1.x from BLUE-coconut:master

Dec 12, 2022

Contributor

BLUE-coconut commented Nov 28, 2022

Thanks for your contribution; we appreciate it a lot. Following the instructions below will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just make the pull request and seek help from the maintainers.

Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

Modification

Please briefly describe what modification is made in this PR.

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repos?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

Pre-commit or other linting tools are used to fix the potential lint issues.
The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
The documentation has been modified accordingly, like docstring or example tutorials.

mm-assistant bot commented Nov 28, 2022

We recommend using English or English & Chinese for pull requests so that we could have broader discussion.

mm-assistant bot assigned MengzhangLI

CLAassistant commented Nov 28, 2022 (edited)

All committers have signed the CLA.

xiexinch self-assigned this

xiexinch changed the title ~~English-to-Chinese translation: complete the Chinese tutorial document 4_train_test.md~~ [Doc] Add Chinese version of train & test tutorial

MeowZheng requested a review from xiexinch

November 29, 2022 14:34

doc

f2b8d2f

BLUE-coconut force-pushed the master branch from 4a3ce72 to f2b8d2f Compare

November 30, 2022 12:46

xiexinch changed the base branch from master to dev-1.x

November 30, 2022 12:47

xiexinch reviewed

docs/zh_cn/user_guides/4_train_test.md Outdated

		@@ -0,0 +1,223 @@
		# 教程4：使用现有模型进行训练和测试

		MMsegmentation 支持在多种设备上训练和测试模型。如下文，具体方式分别为单GPU、分布式、族群式的训练和测试。通过本教程，你将知晓如何用MMsegmentation提供的脚本进行训练和测试。

Collaborator

xiexinch Dec 2, 2022

Suggested change

MMsegmentation 支持在多种设备上训练和测试模型。如下文，具体方式分别为单GPU、分布式、族群式的训练和测试。通过本教程，你将知晓如何用MMsegmentation提供的脚本进行训练和测试。

MMsegmentation 支持在多种设备上训练和测试模型。如下文，具体方式分别为单GPU、分布式以及计算集群的训练和测试。通过本教程，你将知晓如何用 MMsegmentation 提供的脚本进行训练和测试。

docs/zh_cn/user_guides/4_train_test.md Outdated

+- `--work-dir ${工作路径}`: 重新指定工作路径
+- `--amp`: 使用自动混合精度计算
+- `--resume`: 从工作路径中调用保存的最新的模型权重文件（checkpoint）

Collaborator

xiexinch Dec 2, 2022

Suggested change

- `--resume`: 从工作路径中调用保存的最新的模型权重文件（checkpoint）

- `--resume`: 从工作路径中保存的最新检查点文件（checkpoint）恢复训练

docs/zh_cn/user_guides/4_train_test.md Outdated

+- `--work-dir ${工作路径}`: 重新指定工作路径
+- `--amp`: 使用自动混合精度计算
+- `--resume`: 从工作路径中调用保存的最新的模型权重文件（checkpoint）
+- `--cfg-options ${需更新的具体配置}`: 覆盖已载入的配置中的部分设置，并且 以 xxx=yyy 格式的键值对 将被合并到config配置文件中。

Collaborator

xiexinch Dec 2, 2022

Suggested change

- `--cfg-options ${需更新的具体配置}`: 覆盖已载入的配置中的部分设置，并且 以 xxx=yyy 格式的键值对 将被合并到config配置文件中。

- `--cfg-options ${需更覆盖的配置}`: 覆盖已载入的配置中的部分设置，并且 以 xxx=yyy 格式的键值对 将被合并到 config 配置文件中。
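For readers of this thread, the `xxx=yyy` merging described in the `--cfg-options` bullet can be illustrated with a minimal sketch. It is not MMEngine's actual implementation. The helper name `merge_dotted_option` is hypothetical, and it assumes a dotted key addresses nested dictionaries and that integer-looking values become `int`.

```python
from typing import Any, Dict


def merge_dotted_option(cfg: Dict[str, Any], option: str) -> Dict[str, Any]:
    """Merge one 'a.b.c=value' string into a nested dict, in place.

    Hypothetical helper for illustration only; the real option parsing
    in the training scripts may differ.
    """
    dotted_key, _, raw_value = option.partition("=")
    *parents, leaf = dotted_key.split(".")

    # Walk (and create, if missing) the intermediate dictionaries.
    node = cfg
    for name in parents:
        node = node.setdefault(name, {})

    # Assumption: integer-looking values are stored as int, others as str.
    try:
        value: Any = int(raw_value)
    except ValueError:
        value = raw_value

    node[leaf] = value
    return cfg


# A loaded config that already sets the backend but not the port.
cfg = {"env_cfg": {"dist_cfg": {"backend": "nccl"}}}
merge_dotted_option(cfg, "env_cfg.dist_cfg.port=29501")
```

Only the addressed leaf is overridden; sibling keys such as `backend` are left untouched, which is the behaviour the bullet describes as "overriding part of the loaded config".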

docs/zh_cn/user_guides/4_train_test.md Outdated


		下面是对于多GPU测试的可选参数:

		- `--launcher`: 用来分布式任务初始化运载器。允许选择的参数值有 `none`, `pytorch`, `slurm`, `mpi`。特别的，如果设置为none，测试将非分布式模式下进行。

Collaborator

xiexinch Dec 2, 2022

Suggested change

- `--launcher`: 用来分布式任务初始化运载器。允许选择的参数值有 `none`, `pytorch`, `slurm`, `mpi`。特别的，如果设置为none，测试将非分布式模式下进行。

- `--launcher`: 执行器的启动方式。允许选择的参数值有 `none`, `pytorch`, `slurm`, `mpi`。特别的，如果设置为none，测试将非分布式模式下进行。

docs/zh_cn/user_guides/4_train_test.md Outdated

+- `--launcher`: 用来分布式任务初始化运载器。允许选择的参数值有 `none`, `pytorch`, `slurm`, `mpi`。特别的，如果设置为none，测试将非分布式模式下进行。
+- `--local_rank`: 分布式中进程的序号。如果没有指定，默认设置为0。
+**注意：** 在config配置文件中 `--resume` 和 field `load_from` 的不同之处：

Collaborator

xiexinch Dec 2, 2022

Suggested change

**注意：** 在config配置文件中 `--resume` 和 field `load_from` 的不同之处：

**注意：** 命令行参数 `--resume` 和在配置文件中的参数 `load_from` 的不同之处：

docs/zh_cn/user_guides/4_train_test.md Outdated

+基础用法如下：
+```shell
+[GPUS=${GPUS}] sh tools/slurm_test.sh ${划分} ${进程名} ${配置文件} ${检查点文件} [可选参数]

Collaborator

xiexinch Dec 2, 2022

Suggested change

[GPUS=${GPUS}] sh tools/slurm_test.sh ${划分} ${进程名} ${配置文件} ${检查点文件} [可选参数]

[GPUS=${GPUS}] sh tools/slurm_test.sh ${分区} ${进程名} ${配置文件} ${检查点文件} [可选参数]

docs/zh_cn/user_guides/4_train_test.md Outdated

+[GPUS=${GPUS}] sh tools/slurm_test.sh ${划分} ${进程名} ${配置文件} ${检查点文件} [可选参数]
+```
+你可以检查 [the source code](../../../tools/slurm_test.sh) 来查看全部的参数和环境变量。

Collaborator

xiexinch Dec 2, 2022

Suggested change

你可以检查 [the source code](../../../tools/slurm_test.sh) 来查看全部的参数和环境变量。

你可以通过 [源码](../../../tools/slurm_test.sh) 来查看全部的参数和环境变量。

docs/zh_cn/user_guides/4_train_test.md Outdated

+ GPUS=4 GPUS_PER_NODE=4 sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${工作路径} --cfg-options env_cfg.dist_cfg.port=29501
+ ```
+. 通过修改config配置文件，设置不同的通讯端口：

Collaborator

xiexinch Dec 2, 2022

Suggested change

通过修改config配置文件，设置不同的通讯端口：

通过修改配置文件设置不同的通讯端口：

docs/zh_cn/user_guides/4_train_test.md Outdated

+ enf_cfg = dict(dist_cfg=dict(backend='nccl', port=29501))
+ ```
+ 然后你可以通过 config1.py 和 config2.py 同时进行两个任务：

Collaborator

xiexinch Dec 2, 2022

Suggested change

 然后你可以通过 config1.py 和 config2.py 同时进行两个任务：

 然后你可以通过 config1.py 和 config2.py 同时启动两个任务：
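As a reference for this hunk, here is a hedged sketch of the two config fragments it refers to. It assumes the key is spelled `env_cfg`, matching the `--cfg-options env_cfg.dist_cfg.port=29501` override quoted earlier in this review; the hunk itself writes `enf_cfg`, which looks like a typo. Port 29500 for the first job and the helper `ports_conflict` are assumptions for illustration and are not part of the PR.

```python
# config1.py (sketch): port 29500 is an assumed value for the first job.
config1_env_cfg = dict(dist_cfg=dict(backend='nccl', port=29500))

# config2.py (sketch): port 29501 is the value quoted in the hunk above.
config2_env_cfg = dict(dist_cfg=dict(backend='nccl', port=29501))


def ports_conflict(*env_cfgs):
    """Return True if any two configs share a communication port.

    Hypothetical helper: jobs launched on the same machine need distinct ports.
    """
    ports = [cfg['dist_cfg']['port'] for cfg in env_cfgs]
    return len(ports) != len(set(ports))


conflict = ports_conflict(config1_env_cfg, config2_env_cfg)
```

With distinct ports, the two jobs can be launched side by side as the hunk describes.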

docs/zh_cn/user_guides/4_train_test.md Outdated

+ CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 sh tools/slurm_train.sh ${划分} ${进程名} config2.py ${工作路径}
+ ```
+. 使用环境变量设置命令中的端口 'MASTER_PORT'：

Collaborator

xiexinch Dec 2, 2022

Suggested change

使用环境变量设置命令中的端口 'MASTER_PORT'：

在命令行中通过环境变量 `MASTER_PORT` 设置端口 ：


modify part of content

xiexinch changed the title ~~[Doc] Add Chinese version of train & test tutorial~~ CodeCamp #147 [Doc] Add Chinese version of train & test tutorial

BLUE-coconut added 2 commits

December 5, 2022 23:28


changed parts of content

104a24d


modified

3d1f8b7

xiexinch approved these changes

docs/zh_cn/user_guides/4_train_test.md Outdated


Update docs/zh_cn/user_guides/4_train_test.md

372a6ce

Co-authored-by: 谢昕辰 <[email protected]>

MeowZheng approved these changes

MeowZheng merged commit 7edb141 into open-mmlab:dev-1.x

MeowZheng pushed a commit to MeowZheng/mmsegmentation that referenced this pull request


CodeCamp open-mmlab#147 [Doc] Add Chinese version of train & test tutorial open-mmlab#2355

6eda566

* doc

* modify part of content

* changed parts of content

* modified

* Update docs/zh_cn/user_guides/4_train_test.md

Co-authored-by: 谢昕辰 <[email protected]>

aravind-h-v pushed a commit to aravind-h-v/mmsegmentation that referenced this pull request


Fix 3-way merging with the checkpoint_merger community pipeline (open-mmlab#2355)

e3ddbe2

correctly locate 3rd file; also correct misleading docs

wjkim81 pushed a commit to wjkim81/mmsegmentation that referenced this pull request


Merge pull request open-mmlab#2355 from open-mmlab/dev-1.x

2c4a60e

Dev 1.x
