From 7083906647e696be67c3b8f61656de1488ac79f4 Mon Sep 17 00:00:00 2001 From: Ian Castillo Date: Sun, 13 Nov 2022 15:20:40 +0100 Subject: [PATCH 1/6] Update _toctree and clone original content --- docs/source/es/_toctree.yml | 2 + docs/source/es/serialization.mdx | 687 +++++++++++++++++++++++++++++++ 2 files changed, 689 insertions(+) create mode 100644 docs/source/es/serialization.mdx diff --git a/docs/source/es/_toctree.yml b/docs/source/es/_toctree.yml index 60566b9e6f9b47..2acc14f907c296 100644 --- a/docs/source/es/_toctree.yml +++ b/docs/source/es/_toctree.yml @@ -43,6 +43,8 @@ title: Modelos multiling眉es para inferencia - local: converting_tensorflow_models title: Convertir checkpoints de TensorFlow + - local: serialization + title: Serializaci贸n title: Gu铆as pr谩cticas - sections: - local: philosophy diff --git a/docs/source/es/serialization.mdx b/docs/source/es/serialization.mdx new file mode 100644 index 00000000000000..0aacdf76f7ef0f --- /dev/null +++ b/docs/source/es/serialization.mdx @@ -0,0 +1,687 @@ + + +# Export 馃 Transformers Models + +If you need to deploy 馃 Transformers models in production environments, we +recommend exporting them to a serialized format that can be loaded and executed +on specialized runtimes and hardware. In this guide, we'll show you how to +export 馃 Transformers models in two widely used formats: ONNX and TorchScript. + +Once exported, a model can optimized for inference via techniques such as +quantization and pruning. If you are interested in optimizing your models to run +with maximum efficiency, check out the [馃 Optimum +library](https://github.com/huggingface/optimum). + +## ONNX + +The [ONNX (Open Neural Network eXchange)](http://onnx.ai) project is an open +standard that defines a common set of operators and a common file format to +represent deep learning models in a wide variety of frameworks, including +PyTorch and TensorFlow. When a model is exported to the ONNX format, these +operators are used to construct a computational graph (often called an +_intermediate representation_) which represents the flow of data through the +neural network. + +By exposing a graph with standardized operators and data types, ONNX makes it +easy to switch between frameworks. For example, a model trained in PyTorch can +be exported to ONNX format and then imported in TensorFlow (and vice versa). + +馃 Transformers provides a `transformers.onnx` package that enables you to +convert model checkpoints to an ONNX graph by leveraging configuration objects. +These configuration objects come ready made for a number of model architectures, +and are designed to be easily extendable to other architectures. + +Ready-made configurations include the following architectures: + + + +- ALBERT +- BART +- BEiT +- BERT +- BigBird +- BigBird-Pegasus +- Blenderbot +- BlenderbotSmall +- BLOOM +- CamemBERT +- CLIP +- CodeGen +- ConvBERT +- ConvNeXT +- Data2VecText +- Data2VecVision +- DeBERTa +- DeBERTa-v2 +- DeiT +- DETR +- DistilBERT +- ELECTRA +- FlauBERT +- GPT Neo +- GPT-J +- I-BERT +- LayoutLM +- LayoutLMv3 +- LeViT +- LongT5 +- M2M100 +- Marian +- mBART +- MobileBERT +- MobileViT +- MT5 +- OpenAI GPT-2 +- Perceiver +- PLBart +- ResNet +- RoBERTa +- RoFormer +- SqueezeBERT +- T5 +- ViT +- XLM +- XLM-RoBERTa +- XLM-RoBERTa-XL +- YOLOS + +In the next two sections, we'll show you how to: + +* Export a supported model using the `transformers.onnx` package. +* Export a custom model for an unsupported architecture. 
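
As a quick sanity check (a minimal sketch; the `FeaturesManager` helper used here is covered in more detail in the section on selecting features below), you can verify programmatically that a model type appears in the list of ready-made configurations:

```python
>>> from transformers.onnx.features import FeaturesManager

>>> # Model types use the lowercase identifiers matching the architectures listed above,
>>> # e.g. "distilbert"; an unsupported model type raises a KeyError here.
>>> supported_features = FeaturesManager.get_supported_features_for_model_type("distilbert")
>>> "default" in supported_features
True
```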
+ +### Exporting a model to ONNX + +To export a 馃 Transformers model to ONNX, you'll first need to install some +extra dependencies: + +```bash +pip install transformers[onnx] +``` + +The `transformers.onnx` package can then be used as a Python module: + +```bash +python -m transformers.onnx --help + +usage: Hugging Face Transformers ONNX exporter [-h] -m MODEL [--feature {causal-lm, ...}] [--opset OPSET] [--atol ATOL] output + +positional arguments: + output Path indicating where to store generated ONNX model. + +optional arguments: + -h, --help show this help message and exit + -m MODEL, --model MODEL + Model ID on huggingface.co or path on disk to load model from. + --feature {causal-lm, ...} + The type of features to export the model with. + --opset OPSET ONNX opset version to export the model with. + --atol ATOL Absolute difference tolerence when validating the model. +``` + +Exporting a checkpoint using a ready-made configuration can be done as follows: + +```bash +python -m transformers.onnx --model=distilbert-base-uncased onnx/ +``` + +which should show the following logs: + +```bash +Validating ONNX model... + -[鉁揮 ONNX model output names match reference model ({'last_hidden_state'}) + - Validating ONNX Model output "last_hidden_state": + -[鉁揮 (2, 8, 768) matches (2, 8, 768) + -[鉁揮 all values close (atol: 1e-05) +All good, model saved at: onnx/model.onnx +``` + +This exports an ONNX graph of the checkpoint defined by the `--model` argument. +In this example it is `distilbert-base-uncased`, but it can be any checkpoint on +the Hugging Face Hub or one that's stored locally. + +The resulting `model.onnx` file can then be run on one of the [many +accelerators](https://onnx.ai/supported-tools.html#deployModel) that support the +ONNX standard. For example, we can load and run the model with [ONNX +Runtime](https://onnxruntime.ai/) as follows: + +```python +>>> from transformers import AutoTokenizer +>>> from onnxruntime import InferenceSession + +>>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased") +>>> session = InferenceSession("onnx/model.onnx") +>>> # ONNX Runtime expects NumPy arrays as input +>>> inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np") +>>> outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs)) +``` + +The required output names (i.e. `["last_hidden_state"]`) can be obtained by +taking a look at the ONNX configuration of each model. For example, for +DistilBERT we have: + +```python +>>> from transformers.models.distilbert import DistilBertConfig, DistilBertOnnxConfig + +>>> config = DistilBertConfig() +>>> onnx_config = DistilBertOnnxConfig(config) +>>> print(list(onnx_config.outputs.keys())) +["last_hidden_state"] +``` + +The process is identical for TensorFlow checkpoints on the Hub. For example, we +can export a pure TensorFlow checkpoint from the [Keras +organization](https://huggingface.co/keras-io) as follows: + +```bash +python -m transformers.onnx --model=keras-io/transformers-qa onnx/ +``` + +To export a model that's stored locally, you'll need to have the model's weights +and tokenizer files stored in a directory. 
For example, we can load and save a +checkpoint as follows: + + + +```python +>>> from transformers import AutoTokenizer, AutoModelForSequenceClassification + +>>> # Load tokenizer and PyTorch weights form the Hub +>>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased") +>>> pt_model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased") +>>> # Save to disk +>>> tokenizer.save_pretrained("local-pt-checkpoint") +>>> pt_model.save_pretrained("local-pt-checkpoint") +``` + +Once the checkpoint is saved, we can export it to ONNX by pointing the `--model` +argument of the `transformers.onnx` package to the desired directory: + +```bash +python -m transformers.onnx --model=local-pt-checkpoint onnx/ +``` + + +```python +>>> from transformers import AutoTokenizer, TFAutoModelForSequenceClassification + +>>> # Load tokenizer and TensorFlow weights from the Hub +>>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased") +>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased") +>>> # Save to disk +>>> tokenizer.save_pretrained("local-tf-checkpoint") +>>> tf_model.save_pretrained("local-tf-checkpoint") +``` + +Once the checkpoint is saved, we can export it to ONNX by pointing the `--model` +argument of the `transformers.onnx` package to the desired directory: + +```bash +python -m transformers.onnx --model=local-tf-checkpoint onnx/ +``` + + + +### Selecting features for different model topologies + +Each ready-made configuration comes with a set of _features_ that enable you to +export models for different types of topologies or tasks. As shown in the table +below, each feature is associated with a different auto class: + +| Feature | Auto Class | +| ------------------------------------ | ------------------------------------ | +| `causal-lm`, `causal-lm-with-past` | `AutoModelForCausalLM` | +| `default`, `default-with-past` | `AutoModel` | +| `masked-lm` | `AutoModelForMaskedLM` | +| `question-answering` | `AutoModelForQuestionAnswering` | +| `seq2seq-lm`, `seq2seq-lm-with-past` | `AutoModelForSeq2SeqLM` | +| `sequence-classification` | `AutoModelForSequenceClassification` | +| `token-classification` | `AutoModelForTokenClassification` | + +For each configuration, you can find the list of supported features via the +`FeaturesManager`. For example, for DistilBERT we have: + +```python +>>> from transformers.onnx.features import FeaturesManager + +>>> distilbert_features = list(FeaturesManager.get_supported_features_for_model_type("distilbert").keys()) +>>> print(distilbert_features) +["default", "masked-lm", "causal-lm", "sequence-classification", "token-classification", "question-answering"] +``` + +You can then pass one of these features to the `--feature` argument in the +`transformers.onnx` package. For example, to export a text-classification model +we can pick a fine-tuned model from the Hub and run: + +```bash +python -m transformers.onnx --model=distilbert-base-uncased-finetuned-sst-2-english \ + --feature=sequence-classification onnx/ +``` + +which will display the following logs: + +```bash +Validating ONNX model... 
+ -[鉁揮 ONNX model output names match reference model ({'logits'}) + - Validating ONNX Model output "logits": + -[鉁揮 (2, 2) matches (2, 2) + -[鉁揮 all values close (atol: 1e-05) +All good, model saved at: onnx/model.onnx +``` + +Notice that in this case, the output names from the fine-tuned model are +`logits` instead of the `last_hidden_state` we saw with the +`distilbert-base-uncased` checkpoint earlier. This is expected since the +fine-tuned model has a sequence classification head. + + + +The features that have a `with-past` suffix (e.g. `causal-lm-with-past`) +correspond to model topologies with precomputed hidden states (key and values +in the attention blocks) that can be used for fast autoregressive decoding. + + + + +### Exporting a model for an unsupported architecture + +If you wish to export a model whose architecture is not natively supported by +the library, there are three main steps to follow: + +1. Implement a custom ONNX configuration. +2. Export the model to ONNX. +3. Validate the outputs of the PyTorch and exported models. + +In this section, we'll look at how DistilBERT was implemented to show what's +involved with each step. + +#### Implementing a custom ONNX configuration + +Let's start with the ONNX configuration object. We provide three abstract +classes that you should inherit from, depending on the type of model +architecture you wish to export: + +* Encoder-based models inherit from [`~onnx.config.OnnxConfig`] +* Decoder-based models inherit from [`~onnx.config.OnnxConfigWithPast`] +* Encoder-decoder models inherit from [`~onnx.config.OnnxSeq2SeqConfigWithPast`] + + + +A good way to implement a custom ONNX configuration is to look at the existing +implementation in the `configuration_.py` file of a similar architecture. + + + +Since DistilBERT is an encoder-based model, its configuration inherits from +`OnnxConfig`: + +```python +>>> from typing import Mapping, OrderedDict +>>> from transformers.onnx import OnnxConfig + + +>>> class DistilBertOnnxConfig(OnnxConfig): +... @property +... def inputs(self) -> Mapping[str, Mapping[int, str]]: +... return OrderedDict( +... [ +... ("input_ids", {0: "batch", 1: "sequence"}), +... ("attention_mask", {0: "batch", 1: "sequence"}), +... ] +... ) +``` + +Every configuration object must implement the `inputs` property and return a +mapping, where each key corresponds to an expected input, and each value +indicates the axis of that input. For DistilBERT, we can see that two inputs are +required: `input_ids` and `attention_mask`. These inputs have the same shape of +`(batch_size, sequence_length)` which is why we see the same axes used in the +configuration. + + + +Notice that `inputs` property for `DistilBertOnnxConfig` returns an +`OrderedDict`. This ensures that the inputs are matched with their relative +position within the `PreTrainedModel.forward()` method when tracing the graph. +We recommend using an `OrderedDict` for the `inputs` and `outputs` properties +when implementing custom ONNX configurations. + + + +Once you have implemented an ONNX configuration, you can instantiate it by +providing the base model's configuration as follows: + +```python +>>> from transformers import AutoConfig + +>>> config = AutoConfig.from_pretrained("distilbert-base-uncased") +>>> onnx_config = DistilBertOnnxConfig(config) +``` + +The resulting object has several useful properties. 
For example you can view the +ONNX operator set that will be used during the export: + +```python +>>> print(onnx_config.default_onnx_opset) +11 +``` + +You can also view the outputs associated with the model as follows: + +```python +>>> print(onnx_config.outputs) +OrderedDict([("last_hidden_state", {0: "batch", 1: "sequence"})]) +``` + +Notice that the outputs property follows the same structure as the inputs; it +returns an `OrderedDict` of named outputs and their shapes. The output structure +is linked to the choice of feature that the configuration is initialised with. +By default, the ONNX configuration is initialized with the `default` feature +that corresponds to exporting a model loaded with the `AutoModel` class. If you +want to export a different model topology, just provide a different feature to +the `task` argument when you initialize the ONNX configuration. For example, if +we wished to export DistilBERT with a sequence classification head, we could +use: + +```python +>>> from transformers import AutoConfig + +>>> config = AutoConfig.from_pretrained("distilbert-base-uncased") +>>> onnx_config_for_seq_clf = DistilBertOnnxConfig(config, task="sequence-classification") +>>> print(onnx_config_for_seq_clf.outputs) +OrderedDict([('logits', {0: 'batch'})]) +``` + + + +All of the base properties and methods associated with [`~onnx.config.OnnxConfig`] and the +other configuration classes can be overriden if needed. Check out +[`BartOnnxConfig`] for an advanced example. + + + +#### Exporting the model + +Once you have implemented the ONNX configuration, the next step is to export the +model. Here we can use the `export()` function provided by the +`transformers.onnx` package. This function expects the ONNX configuration, along +with the base model and tokenizer, and the path to save the exported file: + +```python +>>> from pathlib import Path +>>> from transformers.onnx import export +>>> from transformers import AutoTokenizer, AutoModel + +>>> onnx_path = Path("model.onnx") +>>> model_ckpt = "distilbert-base-uncased" +>>> base_model = AutoModel.from_pretrained(model_ckpt) +>>> tokenizer = AutoTokenizer.from_pretrained(model_ckpt) + +>>> onnx_inputs, onnx_outputs = export(tokenizer, base_model, onnx_config, onnx_config.default_onnx_opset, onnx_path) +``` + +The `onnx_inputs` and `onnx_outputs` returned by the `export()` function are +lists of the keys defined in the `inputs` and `outputs` properties of the +configuration. Once the model is exported, you can test that the model is well +formed as follows: + +```python +>>> import onnx + +>>> onnx_model = onnx.load("model.onnx") +>>> onnx.checker.check_model(onnx_model) +``` + + + +If your model is larger than 2GB, you will see that many additional files are +created during the export. This is _expected_ because ONNX uses [Protocol +Buffers](https://developers.google.com/protocol-buffers/) to store the model and +these have a size limit of 2GB. See the [ONNX +documentation](https://github.com/onnx/onnx/blob/master/docs/ExternalData.md) +for instructions on how to load models with external data. + + + +#### Validating the model outputs + +The final step is to validate that the outputs from the base and exported model +agree within some absolute tolerance. Here we can use the +`validate_model_outputs()` function provided by the `transformers.onnx` package +as follows: + +```python +>>> from transformers.onnx import validate_model_outputs + +>>> validate_model_outputs( +... 
onnx_config, tokenizer, base_model, onnx_path, onnx_outputs, onnx_config.atol_for_validation +... ) +``` + +This function uses the `OnnxConfig.generate_dummy_inputs()` method to generate +inputs for the base and exported model, and the absolute tolerance can be +defined in the configuration. We generally find numerical agreement in the 1e-6 +to 1e-4 range, although anything smaller than 1e-3 is likely to be OK. + +### Contributing a new configuration to 馃 Transformers + +We are looking to expand the set of ready-made configurations and welcome +contributions from the community! If you would like to contribute your addition +to the library, you will need to: + +* Implement the ONNX configuration in the corresponding `configuration_.py` +file +* Include the model architecture and corresponding features in [`~onnx.features.FeatureManager`] +* Add your model architecture to the tests in `test_onnx_v2.py` + +Check out how the configuration for [IBERT was +contributed](https://github.com/huggingface/transformers/pull/14868/files) to +get an idea of what's involved. + +## TorchScript + + + +This is the very beginning of our experiments with TorchScript and we are still exploring its capabilities with +variable-input-size models. It is a focus of interest to us and we will deepen our analysis in upcoming releases, +with more code examples, a more flexible implementation, and benchmarks comparing python-based codes with compiled +TorchScript. + + + +According to Pytorch's documentation: "TorchScript is a way to create serializable and optimizable models from PyTorch +code". Pytorch's two modules [JIT and TRACE](https://pytorch.org/docs/stable/jit.html) allow the developer to export +their model to be re-used in other programs, such as efficiency-oriented C++ programs. + +We have provided an interface that allows the export of 馃 Transformers models to TorchScript so that they can be reused +in a different environment than a Pytorch-based python program. Here we explain how to export and use our models using +TorchScript. + +Exporting a model requires two things: + +- a forward pass with dummy inputs. +- model instantiation with the `torchscript` flag. + +These necessities imply several things developers should be careful about. These are detailed below. + +### TorchScript flag and tied weights + +This flag is necessary because most of the language models in this repository have tied weights between their +`Embedding` layer and their `Decoding` layer. TorchScript does not allow the export of models that have tied +weights, therefore it is necessary to untie and clone the weights beforehand. + +This implies that models instantiated with the `torchscript` flag have their `Embedding` layer and `Decoding` +layer separate, which means that they should not be trained down the line. Training would de-synchronize the two +layers, leading to unexpected results. + +This is not the case for models that do not have a Language Model head, as those do not have tied weights. These models +can be safely exported without the `torchscript` flag. + +### Dummy inputs and standard lengths + +The dummy inputs are used to do a model forward pass. While the inputs' values are propagating through the layers, +Pytorch keeps track of the different operations executed on each tensor. These recorded operations are then used to +create the "trace" of the model. + +The trace is created relatively to the inputs' dimensions. 
It is therefore constrained by the dimensions of the dummy +input, and will not work for any other sequence length or batch size. When trying with a different size, an error such +as: + +`The expanded size of the tensor (3) must match the existing size (7) at non-singleton dimension 2` + +will be raised. It is therefore recommended to trace the model with a dummy input size at least as large as the largest +input that will be fed to the model during inference. Padding can be performed to fill the missing values. As the model +will have been traced with a large input size however, the dimensions of the different matrix will be large as well, +resulting in more calculations. + +It is recommended to be careful of the total number of operations done on each input and to follow performance closely +when exporting varying sequence-length models. + +### Using TorchScript in Python + +Below is an example, showing how to save, load models as well as how to use the trace for inference. + +#### Saving a model + +This snippet shows how to use TorchScript to export a `BertModel`. Here the `BertModel` is instantiated according +to a `BertConfig` class and then saved to disk under the filename `traced_bert.pt` + +```python +from transformers import BertModel, BertTokenizer, BertConfig +import torch + +enc = BertTokenizer.from_pretrained("bert-base-uncased") + +# Tokenizing input text +text = "[CLS] Who was Jim Henson ? [SEP] Jim Henson was a puppeteer [SEP]" +tokenized_text = enc.tokenize(text) + +# Masking one of the input tokens +masked_index = 8 +tokenized_text[masked_index] = "[MASK]" +indexed_tokens = enc.convert_tokens_to_ids(tokenized_text) +segments_ids = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1] + +# Creating a dummy input +tokens_tensor = torch.tensor([indexed_tokens]) +segments_tensors = torch.tensor([segments_ids]) +dummy_input = [tokens_tensor, segments_tensors] + +# Initializing the model with the torchscript flag +# Flag set to True even though it is not necessary as this model does not have an LM Head. +config = BertConfig( + vocab_size_or_config_json_file=32000, + hidden_size=768, + num_hidden_layers=12, + num_attention_heads=12, + intermediate_size=3072, + torchscript=True, +) + +# Instantiating the model +model = BertModel(config) + +# The model needs to be in evaluation mode +model.eval() + +# If you are instantiating the model with *from_pretrained* you can also easily set the TorchScript flag +model = BertModel.from_pretrained("bert-base-uncased", torchscript=True) + +# Creating the trace +traced_model = torch.jit.trace(model, [tokens_tensor, segments_tensors]) +torch.jit.save(traced_model, "traced_bert.pt") +``` + +#### Loading a model + +This snippet shows how to load the `BertModel` that was previously saved to disk under the name `traced_bert.pt`. +We are re-using the previously initialised `dummy_input`. + +```python +loaded_model = torch.jit.load("traced_bert.pt") +loaded_model.eval() + +all_encoder_layers, pooled_output = loaded_model(*dummy_input) +``` + +#### Using a traced model for inference + +Using the traced model for inference is as simple as using its `__call__` dunder method: + +```python +traced_model(tokens_tensor, segments_tensors) +``` + +### Deploying HuggingFace TorchScript models on AWS using the Neuron SDK + +AWS introduced the [Amazon EC2 Inf1](https://aws.amazon.com/ec2/instance-types/inf1/) +instance family for low cost, high performance machine learning inference in the cloud. 
+The Inf1 instances are powered by the AWS Inferentia chip, a custom-built hardware accelerator, +specializing in deep learning inferencing workloads. +[AWS Neuron](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/#) +is the SDK for Inferentia that supports tracing and optimizing transformers models for +deployment on Inf1. The Neuron SDK provides: + + +1. Easy-to-use API with one line of code change to trace and optimize a TorchScript model for inference in the cloud. +2. Out of the box performance optimizations for [improved cost-performance](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/benchmark/>) +3. Support for HuggingFace transformers models built with either [PyTorch](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/bert_tutorial/tutorial_pretrained_bert.html) + or [TensorFlow](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/tensorflow/huggingface_bert/huggingface_bert.html). + +#### Implications + +Transformers Models based on the [BERT (Bidirectional Encoder Representations from Transformers)](https://huggingface.co/docs/transformers/main/model_doc/bert) +architecture, or its variants such as [distilBERT](https://huggingface.co/docs/transformers/main/model_doc/distilbert) + and [roBERTa](https://huggingface.co/docs/transformers/main/model_doc/roberta) + will run best on Inf1 for non-generative tasks such as Extractive Question Answering, + Sequence Classification, Token Classification. Alternatively, text generation +tasks can be adapted to run on Inf1, according to this [AWS Neuron MarianMT tutorial](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/transformers-marianmt.html). +More information about models that can be converted out of the box on Inferentia can be +found in the [Model Architecture Fit section of the Neuron documentation](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/models/models-inferentia.html#models-inferentia). + +#### Dependencies + +Using AWS Neuron to convert models requires the following dependencies and environment: + +* A [Neuron SDK environment](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/pytorch-neuron/index.html#installation-guide), + which comes pre-configured on [AWS Deep Learning AMI](https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-inferentia-launching.html). + +#### Converting a Model for AWS Neuron + +Using the same script as in [Using TorchScript in Python](https://huggingface.co/docs/transformers/main/en/serialization#using-torchscript-in-python) +to trace a "BertModel", you import `torch.neuron` framework extension to access +the components of the Neuron SDK through a Python API. + +```python +from transformers import BertModel, BertTokenizer, BertConfig +import torch +import torch.neuron +``` +And only modify the tracing line of code + +from: + +```python +torch.jit.trace(model, [tokens_tensor, segments_tensors]) +``` + +to: + +```python +torch.neuron.trace(model, [token_tensor, segments_tensors]) +``` + +This change enables Neuron SDK to trace the model and optimize it to run in Inf1 instances. + +To learn more about AWS Neuron SDK features, tools, example tutorials and latest updates, +please see the [AWS NeuronSDK documentation](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/index.html). 
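
As a minimal sketch (assuming the `model`, `tokens_tensor` and `segments_tensors` objects from the earlier BERT tracing example, a Neuron-enabled environment, and that the compiled module behaves like a regular TorchScript module), the Neuron-compiled model can be saved and reloaded for serving on an Inf1 instance:

```python
import torch
import torch.neuron  # AWS Neuron SDK extension, available in Neuron-enabled environments

# `model`, `tokens_tensor` and `segments_tensors` are assumed to come from the
# BERT tracing example shown earlier in this guide.
neuron_model = torch.neuron.trace(model, [tokens_tensor, segments_tensors])

# Assumption: the compiled artifact is a TorchScript module, so it can be
# saved and reloaded with the usual torch.jit utilities.
torch.jit.save(neuron_model, "neuron_bert.pt")
reloaded_model = torch.jit.load("neuron_bert.pt")
outputs = reloaded_model(tokens_tensor, segments_tensors)
```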
From f85e07f2fdf6396e6a8ac5452cd531128a3dbf79 Mon Sep 17 00:00:00 2001 From: Ian Castillo Date: Sun, 13 Nov 2022 17:05:08 +0100 Subject: [PATCH 2/6] Translate first three sections --- docs/source/es/serialization.mdx | 153 +++++++++++++++++-------------- 1 file changed, 82 insertions(+), 71 deletions(-) diff --git a/docs/source/es/serialization.mdx b/docs/source/es/serialization.mdx index 0aacdf76f7ef0f..0c2db999687e6b 100644 --- a/docs/source/es/serialization.mdx +++ b/docs/source/es/serialization.mdx @@ -10,38 +10,38 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o specific language governing permissions and limitations under the License. --> -# Export 馃 Transformers Models +# Exportar modelos 馃 Transformers -If you need to deploy 馃 Transformers models in production environments, we -recommend exporting them to a serialized format that can be loaded and executed -on specialized runtimes and hardware. In this guide, we'll show you how to -export 馃 Transformers models in two widely used formats: ONNX and TorchScript. +Si necesitas implementar modelos 馃 Transformers en entornos de producci贸n, te +recomendamos exportarlos a un formato serializado que se pueda cargar y ejecutar +en tiempos de ejecuci贸n y hardware especializados. En esta gu铆a, te mostraremos c贸mo +exportar modelos 馃 Transformers en dos formatos ampliamente utilizados: ONNX y TorchScript. -Once exported, a model can optimized for inference via techniques such as -quantization and pruning. If you are interested in optimizing your models to run -with maximum efficiency, check out the [馃 Optimum -library](https://github.com/huggingface/optimum). +Una vez exportado, un modelo puede optimizarse para la inferencia a trav茅s de t茅cnicas +como la cuantificaci贸n y _pruning_. Si est谩s interesado en optimizar tus modelos para +que funcionen con la m谩xima eficiencia, consulta la +[biblioteca de 馃 Optimum](https://github.com/huggingface/optimum). ## ONNX -The [ONNX (Open Neural Network eXchange)](http://onnx.ai) project is an open -standard that defines a common set of operators and a common file format to -represent deep learning models in a wide variety of frameworks, including -PyTorch and TensorFlow. When a model is exported to the ONNX format, these -operators are used to construct a computational graph (often called an -_intermediate representation_) which represents the flow of data through the -neural network. +El proyecto [ONNX (Open Neural Network eXchange)](http://onnx.ai) es un +est谩ndar abierto que define un conjunto com煤n de operadores y un formato +de archivo com煤n para representar modelos de aprendizaje profundo en una +amplia variedad de _frameworks_, incluidos PyTorch y TensorFlow. Cuando un modelo +se exporta al formato ONNX, estos operadores se usan para construir un +gr谩fico computacional (a menudo llamado _representaci贸n intermedia_) que +representa el flujo de datos a trav茅s de la red neuronal. -By exposing a graph with standardized operators and data types, ONNX makes it -easy to switch between frameworks. For example, a model trained in PyTorch can -be exported to ONNX format and then imported in TensorFlow (and vice versa). +Al exponer un gr谩fico con operadores y tipos de datos estandarizados, ONNX facilita +el cambio entre frameworks. Por ejemplo, un modelo entrenado en PyTorch se puede +exportar a formato ONNX y luego importar en TensorFlow (y viceversa). 
-馃 Transformers provides a `transformers.onnx` package that enables you to -convert model checkpoints to an ONNX graph by leveraging configuration objects. -These configuration objects come ready made for a number of model architectures, -and are designed to be easily extendable to other architectures. +馃 Transformers proporciona un paquete llamado `transformers.onnx`, el cual permite convertir +puntos de control de un modelo en un gr谩fico ONNX aprovechando los objetos de configuraci贸n. +Estos objetos de configuraci贸n est谩n hechos a la medida de diferentes arquitecturas de modelos +y est谩n dise帽ados para ser f谩cilmente extensibles a otras arquitecturas. -Ready-made configurations include the following architectures: +Las configuraciones a la medida incluyen las siguientes arquitecturas: @@ -95,21 +95,21 @@ Ready-made configurations include the following architectures: - XLM-RoBERTa-XL - YOLOS -In the next two sections, we'll show you how to: +En las pr贸ximas dos secciones, te mostraremos c贸mo: -* Export a supported model using the `transformers.onnx` package. -* Export a custom model for an unsupported architecture. +* Exportar un modelo compatible utilizando el paquete `transformers.onnx`. +* Exportar un modelo personalizado para una arquitectura no compatible. -### Exporting a model to ONNX +### Exportar un model a ONNX -To export a 馃 Transformers model to ONNX, you'll first need to install some -extra dependencies: +Para exportar un modelo 馃 Transformers a ONNX, tienes que instalar primer algunas +dependencias extra: ```bash pip install transformers[onnx] ``` -The `transformers.onnx` package can then be used as a Python module: +El paquete `transformers.onnx` puede ser usado luego como un m贸dulo de Python: ```bash python -m transformers.onnx --help @@ -129,13 +129,13 @@ optional arguments: --atol ATOL Absolute difference tolerence when validating the model. ``` -Exporting a checkpoint using a ready-made configuration can be done as follows: +Exportar un punto de control usando una configuraci贸n a la medida se puede hacer de la siguiente manera: ```bash python -m transformers.onnx --model=distilbert-base-uncased onnx/ ``` -which should show the following logs: +que deber铆a mostrar los siguientes registros: ```bash Validating ONNX model... @@ -146,14 +146,14 @@ Validating ONNX model... All good, model saved at: onnx/model.onnx ``` -This exports an ONNX graph of the checkpoint defined by the `--model` argument. -In this example it is `distilbert-base-uncased`, but it can be any checkpoint on -the Hugging Face Hub or one that's stored locally. +Esto exporta un gr谩fico ONNX del punto de control definido por el argumento `--model`. +En este ejemplo, es un modelo `distilbert-base-uncased`, pero puede ser cualquier punto +de control en Hugging Face Hub o que est茅 almacenado localmente. -The resulting `model.onnx` file can then be run on one of the [many -accelerators](https://onnx.ai/supported-tools.html#deployModel) that support the -ONNX standard. For example, we can load and run the model with [ONNX -Runtime](https://onnxruntime.ai/) as follows: +El archivo `model.onnx` resultante se puede ejecutar en uno de los +[muchos aceleradores](https://onnx.ai/supported-tools.html#deployModel) +que admiten el est谩ndar ONNX. 
Por ejemplo, podemos cargar y ejecutar el
+modelo con [ONNX Runtime](https://onnxruntime.ai/) de la siguiente manera:

```python
>>> from transformers import AutoTokenizer
@@ -166,9 +166,8 @@ Runtime](https://onnxruntime.ai/) as follows:
>>> inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
>>> outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
```

-The required output names (i.e. `["last_hidden_state"]`) can be obtained by
-taking a look at the ONNX configuration of each model. For example, for
-DistilBERT we have:
+Los nombres de salida necesarios (es decir, `["last_hidden_state"]`) se pueden obtener
+echando un vistazo a la configuración ONNX de cada modelo. Por ejemplo, para DistilBERT tenemos:

```python
>>> from transformers.models.distilbert import DistilBertConfig, DistilBertOnnxConfig
@@ -176,20 +175,20 @@ DistilBERT we have:
>>> config = DistilBertConfig()
>>> onnx_config = DistilBertOnnxConfig(config)
>>> print(list(onnx_config.outputs.keys()))
["last_hidden_state"]
```

-The process is identical for TensorFlow checkpoints on the Hub. For example, we
-can export a pure TensorFlow checkpoint from the [Keras
-organization](https://huggingface.co/keras-io) as follows:
+El proceso es idéntico para los puntos de control de TensorFlow en el Hub.
+Por ejemplo, podemos exportar un punto de control puro de TensorFlow desde
+[Keras](https://huggingface.co/keras-io) de la siguiente manera:

```bash
python -m transformers.onnx --model=keras-io/transformers-qa onnx/
```

-To export a model that's stored locally, you'll need to have the model's weights
-and tokenizer files stored in a directory. For example, we can load and save a
-checkpoint as follows:
+Para exportar un modelo que está almacenado localmente, deberás tener los pesos
+y tokenizadores del modelo almacenados en un directorio. Por ejemplo, podemos cargar
+y guardar un punto de control de la siguiente manera:

@@ -204,8 +203,8 @@ checkpoint as follows:
>>> pt_model.save_pretrained("local-pt-checkpoint")
```

-Once the checkpoint is saved, we can export it to ONNX by pointing the `--model`
-argument of the `transformers.onnx` package to the desired directory:
+Una vez que se guarda el punto de control, podemos exportarlo a ONNX apuntando el argumento `--model`
+del paquete `transformers.onnx` al directorio deseado:

```bash
python -m transformers.onnx --model=local-pt-checkpoint onnx/
```

@@ -223,8 +222,8 @@ python -m transformers.onnx --model=local-pt-checkpoint onnx/
>>> tf_model.save_pretrained("local-tf-checkpoint")
```

-Once the checkpoint is saved, we can export it to ONNX by pointing the `--model`
-argument of the `transformers.onnx` package to the desired directory:
+Una vez que se guarda el punto de control, podemos exportarlo a ONNX apuntando el argumento `--model`
+del paquete `transformers.onnx` al directorio deseado:

```bash
python -m transformers.onnx --model=local-tf-checkpoint onnx/
```

@@ -232,11 +231,11 @@ python -m transformers.onnx --model=local-tf-checkpoint onnx/

-### Selecting features for different model topologies
+### Seleccionar características para diferentes topologías de un modelo

-Each ready-made configuration comes with a set of _features_ that enable you to
-export models for different types of topologies or tasks. As shown in the table
-below, each feature is associated with a different auto class:
+Cada configuración a la medida viene con un conjunto de _características_ que te permiten exportar
+modelos para diferentes tipos de topologías o tareas.
Como se muestra en la siguiente tabla, cada
+característica está asociada con una clase automática (_auto class_) diferente:

| Feature | Auto Class |
| ------------------------------------ | ------------------------------------ |
@@ -248,8 +247,8 @@ below, each feature is associated with a different auto class:
| `sequence-classification` | `AutoModelForSequenceClassification` |
| `token-classification` | `AutoModelForTokenClassification` |

-For each configuration, you can find the list of supported features via the
-`FeaturesManager`. For example, for DistilBERT we have:
+Para cada configuración, puedes encontrar la lista de características admitidas a través de `FeaturesManager`.
+Por ejemplo, para DistilBERT tenemos:

```python
>>> from transformers.onnx.features import FeaturesManager
@@ -259,16 +258,15 @@ For each configuration, you can find the list of supported features via the
["default", "masked-lm", "causal-lm", "sequence-classification", "token-classification", "question-answering"]
```

-You can then pass one of these features to the `--feature` argument in the
-`transformers.onnx` package. For example, to export a text-classification model
-we can pick a fine-tuned model from the Hub and run:
+Le puedes pasar una de estas características al argumento `--feature` en el paquete `transformers.onnx`.
+Por ejemplo, para exportar un modelo de clasificación de texto, podemos elegir un modelo ya ajustado del Hub y ejecutar:

```bash
python -m transformers.onnx --model=distilbert-base-uncased-finetuned-sst-2-english \
--feature=sequence-classification onnx/
```

-which will display the following logs:
+que mostrará los siguientes registros:

```bash
Validating ONNX model...
@@ -279,16 +277,15 @@ Validating ONNX model...
All good, model saved at: onnx/model.onnx
```

-Notice that in this case, the output names from the fine-tuned model are
-`logits` instead of the `last_hidden_state` we saw with the
-`distilbert-base-uncased` checkpoint earlier. This is expected since the
-fine-tuned model has a sequence classification head.
+Ten en cuenta que, en este caso, los nombres de salida del modelo ajustado son `logits` en lugar de `last_hidden_state`
+que vimos anteriormente con el punto de control `distilbert-base-uncased`. Esto es de esperarse ya que el modelo ajustado
+tiene un inicio de clasificación secuencial.

<Tip>

-The features that have a `with-past` suffix (e.g. `causal-lm-with-past`)
-correspond to model topologies with precomputed hidden states (key and values
-in the attention blocks) that can be used for fast autoregressive decoding.
+Las características que tienen un sufijo 'with-past' (por ejemplo, 'causal-lm-with-past') corresponden a topologías
+de modelo con estados ocultos precalculados (clave y valores en los bloques de atención) que se pueden usar para una
+decodificación autorregresiva más rápida.

@@ -339,6 +336,7 @@ Since DistilBERT is an encoder-based model, its configuration inherits from
... ("attention_mask", {0: "batch", 1: "sequence"}),
... ]
... )
+
```

Every configuration object must implement the `inputs` property and return a
@@ -366,6 +364,7 @@ providing the base model's configuration as follows:

>>> config = AutoConfig.from_pretrained("distilbert-base-uncased")
>>> onnx_config = DistilBertOnnxConfig(config)
+
```

The resulting object has several useful properties. 
For example you can view the @@ -374,6 +373,7 @@ ONNX operator set that will be used during the export: ```python >>> print(onnx_config.default_onnx_opset) 11 + ``` You can also view the outputs associated with the model as follows: @@ -381,6 +381,7 @@ You can also view the outputs associated with the model as follows: ```python >>> print(onnx_config.outputs) OrderedDict([("last_hidden_state", {0: "batch", 1: "sequence"})]) + ``` Notice that the outputs property follows the same structure as the inputs; it @@ -400,6 +401,7 @@ use: >>> onnx_config_for_seq_clf = DistilBertOnnxConfig(config, task="sequence-classification") >>> print(onnx_config_for_seq_clf.outputs) OrderedDict([('logits', {0: 'batch'})]) + ``` @@ -428,6 +430,7 @@ with the base model and tokenizer, and the path to save the exported file: >>> tokenizer = AutoTokenizer.from_pretrained(model_ckpt) >>> onnx_inputs, onnx_outputs = export(tokenizer, base_model, onnx_config, onnx_config.default_onnx_opset, onnx_path) + ``` The `onnx_inputs` and `onnx_outputs` returned by the `export()` function are @@ -440,6 +443,7 @@ formed as follows: >>> onnx_model = onnx.load("model.onnx") >>> onnx.checker.check_model(onnx_model) + ``` @@ -466,6 +470,7 @@ as follows: >>> validate_model_outputs( ... onnx_config, tokenizer, base_model, onnx_path, onnx_outputs, onnx_config.atol_for_validation ... ) + ``` This function uses the `OnnxConfig.generate_dummy_inputs()` method to generate @@ -600,6 +605,7 @@ model = BertModel.from_pretrained("bert-base-uncased", torchscript=True) # Creating the trace traced_model = torch.jit.trace(model, [tokens_tensor, segments_tensors]) torch.jit.save(traced_model, "traced_bert.pt") + ``` #### Loading a model @@ -612,6 +618,7 @@ loaded_model = torch.jit.load("traced_bert.pt") loaded_model.eval() all_encoder_layers, pooled_output = loaded_model(*dummy_input) + ``` #### Using a traced model for inference @@ -620,6 +627,7 @@ Using the traced model for inference is as simple as using its `__call__` dunder ```python traced_model(tokens_tensor, segments_tensors) + ``` ### Deploying HuggingFace TorchScript models on AWS using the Neuron SDK @@ -666,6 +674,7 @@ the components of the Neuron SDK through a Python API. from transformers import BertModel, BertTokenizer, BertConfig import torch import torch.neuron + ``` And only modify the tracing line of code @@ -673,12 +682,14 @@ from: ```python torch.jit.trace(model, [tokens_tensor, segments_tensors]) + ``` to: ```python torch.neuron.trace(model, [token_tensor, segments_tensors]) + ``` This change enables Neuron SDK to trace the model and optimize it to run in Inf1 instances. From e8aa6ed322679709fa9416c031efa9215aa7e34b Mon Sep 17 00:00:00 2001 From: Ian Castillo Date: Tue, 15 Nov 2022 08:04:18 +0100 Subject: [PATCH 3/6] Add more translated chapters. Only 3 more left. --- docs/source/es/serialization.mdx | 268 +++++++++++++++---------------- 1 file changed, 127 insertions(+), 141 deletions(-) diff --git a/docs/source/es/serialization.mdx b/docs/source/es/serialization.mdx index 0c2db999687e6b..705d93f1296748 100644 --- a/docs/source/es/serialization.mdx +++ b/docs/source/es/serialization.mdx @@ -279,7 +279,7 @@ All good, model saved at: onnx/model.onnx Ten en cuenta que, en este caso, los nombres de salida del modelo ajustado son `logits` en lugar de `last_hidden_state` que vimos anteriormente con el punto de control `distilbert-base-uncased`. Esto es de esperarse ya que el modelo ajustado -tiene un inicio de clasificaci贸n secuencial. +tiene un cabezal de clasificaci贸n secuencial. 
@@ -290,37 +290,35 @@ decodificaci贸n autorregresiva m谩s r谩pida. -### Exporting a model for an unsupported architecture +### Exportar un modelo para una arquitectura no compatible -If you wish to export a model whose architecture is not natively supported by -the library, there are three main steps to follow: +Si deseas exportar un modelo cuya arquitectura no es compatible de forma nativa +con la biblioteca, debes seguir tres pasos principales: -1. Implement a custom ONNX configuration. -2. Export the model to ONNX. -3. Validate the outputs of the PyTorch and exported models. +1. Implementa una configuraci贸n personalizada en ONNX. +2. Exporta el modelo a ONNX. +3. Valide los resultados de PyTorch y los modelos exportados. -In this section, we'll look at how DistilBERT was implemented to show what's -involved with each step. +En esta secci贸n, veremos c贸mo se implement贸 la serializaci贸n de DistilBERT +para mostrar lo que implica cada paso. -#### Implementing a custom ONNX configuration +#### Implementar una configuraci贸n personalizada en ONNX -Let's start with the ONNX configuration object. We provide three abstract -classes that you should inherit from, depending on the type of model -architecture you wish to export: +Comencemos con el objeto de configuraci贸n de ONNX. Proporcionamos tres clases abstractas +de las que debe heredar, seg煤n el tipo de arquitectura del modelo que quieras exportar: -* Encoder-based models inherit from [`~onnx.config.OnnxConfig`] -* Decoder-based models inherit from [`~onnx.config.OnnxConfigWithPast`] -* Encoder-decoder models inherit from [`~onnx.config.OnnxSeq2SeqConfigWithPast`] +* Modelos basados en el _Encoder_ inherente de [`~onnx.config.OnnxConfig`] +* Modelos basados en el _Decoder_ inherente de [`~onnx.config.OnnxConfigWithPast`] +* Modelos _Encoder-decoder_ inherente de [`~onnx.config.OnnxSeq2SeqConfigWithPast`] -A good way to implement a custom ONNX configuration is to look at the existing -implementation in the `configuration_.py` file of a similar architecture. +Una buena manera de implementar una configuraci贸n personalizada en ONNX es observar la implementaci贸n +existente en el archivo `configuration_.py` de una arquitectura similar. -Since DistilBERT is an encoder-based model, its configuration inherits from -`OnnxConfig`: +Dado que DistilBERT es un modelo de tipo _encoder-decoder_, su configuraci贸n se hereda de `OnnxConfig`: ```python >>> from typing import Mapping, OrderedDict @@ -339,25 +337,23 @@ Since DistilBERT is an encoder-based model, its configuration inherits from ``` -Every configuration object must implement the `inputs` property and return a -mapping, where each key corresponds to an expected input, and each value -indicates the axis of that input. For DistilBERT, we can see that two inputs are -required: `input_ids` and `attention_mask`. These inputs have the same shape of -`(batch_size, sequence_length)` which is why we see the same axes used in the -configuration. +Cada objeto de configuraci贸n debe implementar la propiedad `inputs` y devolver un mapeo, +donde cada clave corresponde a una entrada esperada y cada valor indica el eje de esa entrada. +Para DistilBERT, podemos ver que se requieren dos entradas: `input_ids` y `attention_mask`. +Estas entradas tienen la misma forma de `(batch_size, sequence_length)`, es por lo que vemos +los mismos ejes utilizados en la configuraci贸n. -Notice that `inputs` property for `DistilBertOnnxConfig` returns an -`OrderedDict`. 
This ensures that the inputs are matched with their relative -position within the `PreTrainedModel.forward()` method when tracing the graph. -We recommend using an `OrderedDict` for the `inputs` and `outputs` properties -when implementing custom ONNX configurations. +Observa que la propiedad `inputs` para `DistilBertOnnxConfig` devuelve un `OrderedDict`. +Esto nos asegura que las entradas coincidan con su posici贸n relativa dentro del m茅todo +`PreTrainedModel.forward()` al rastrear el gr谩fico. Recomendamos usar un `OrderedDict` +para las propiedades `inputs` y `outputs` al implementar configuraciones ONNX personalizadas. -Once you have implemented an ONNX configuration, you can instantiate it by -providing the base model's configuration as follows: +Una vez que hayas implementado una configuraci贸n ONNX, puedes crear una +instancia proporcionando la configuraci贸n del modelo base de la siguiente manera: ```python >>> from transformers import AutoConfig @@ -367,8 +363,8 @@ providing the base model's configuration as follows: ``` -The resulting object has several useful properties. For example you can view the -ONNX operator set that will be used during the export: +El objeto resultante tiene varias propiedades 煤tiles. Por ejemplo, puedes ver el conjunto de operadores ONNX que se +utilizar谩 durante la exportaci贸n: ```python >>> print(onnx_config.default_onnx_opset) @@ -376,7 +372,7 @@ ONNX operator set that will be used during the export: ``` -You can also view the outputs associated with the model as follows: +Tambi茅n puedes ver los resultados asociados con el modelo de la siguiente manera: ```python >>> print(onnx_config.outputs) @@ -384,15 +380,14 @@ OrderedDict([("last_hidden_state", {0: "batch", 1: "sequence"})]) ``` -Notice that the outputs property follows the same structure as the inputs; it -returns an `OrderedDict` of named outputs and their shapes. The output structure -is linked to the choice of feature that the configuration is initialised with. -By default, the ONNX configuration is initialized with the `default` feature -that corresponds to exporting a model loaded with the `AutoModel` class. If you -want to export a different model topology, just provide a different feature to -the `task` argument when you initialize the ONNX configuration. For example, if -we wished to export DistilBERT with a sequence classification head, we could -use: +Observa que la propiedad de salidas sigue la misma estructura que las entradas; +devuelve un objecto `OrderedDict` de salidas nombradas y sus formas. La estructura +de salida est谩 vinculada a la elecci贸n de la funci贸n con la que se inicializa la configuraci贸n. +Por defecto, la configuraci贸n de ONNX se inicializa con la funci贸n `default` que +corresponde a exportar un modelo cargado con la clase `AutoModel`. Si quieres exportar +una topolog铆a de modelo diferente, simplemente proporciona una caracter铆stica diferente +al argumento `task` cuando inicialices la configuraci贸n de ONNX. Por ejemplo, si quisi茅ramos +exportar DistilBERT con un cabezal de clasificaci贸n de secuencias, podr铆amos usar: ```python >>> from transformers import AutoConfig @@ -406,18 +401,18 @@ OrderedDict([('logits', {0: 'batch'})]) -All of the base properties and methods associated with [`~onnx.config.OnnxConfig`] and the -other configuration classes can be overriden if needed. Check out -[`BartOnnxConfig`] for an advanced example. 
+Todas las propiedades base y m茅todos asociados con [`~onnx.config.OnnxConfig`] y las +otras clases de configuraci贸n se pueden sobreescribir si es necesario. +Consulte [`BartOnnxConfig`] para ver un ejemplo avanzado. -#### Exporting the model +#### Exportar el modelo -Once you have implemented the ONNX configuration, the next step is to export the -model. Here we can use the `export()` function provided by the -`transformers.onnx` package. This function expects the ONNX configuration, along -with the base model and tokenizer, and the path to save the exported file: +Una vez que hayas implementado la configuraci贸n de ONNX, el siguiente paso es exportar el modelo. +Aqu铆 podemos usar la funci贸n `export()` proporcionada por el paquete `transformers.onnx`. +Esta funci贸n espera la configuraci贸n de ONNX, junto con el modelo base y el tokenizador, +y la ruta para guardar el archivo exportado: ```python >>> from pathlib import Path @@ -433,10 +428,9 @@ with the base model and tokenizer, and the path to save the exported file: ``` -The `onnx_inputs` and `onnx_outputs` returned by the `export()` function are -lists of the keys defined in the `inputs` and `outputs` properties of the -configuration. Once the model is exported, you can test that the model is well -formed as follows: +Los objetos `onnx_inputs` y `onnx_outputs` devueltos por la funci贸n `export()` +son listas de llaves definidas en las propiedades `inputs` y `outputs` de la configuraci贸n. +Una vez exportado el modelo, puedes probar que el modelo est谩 bien formado de la siguiente manera: ```python >>> import onnx @@ -448,21 +442,19 @@ formed as follows: -If your model is larger than 2GB, you will see that many additional files are -created during the export. This is _expected_ because ONNX uses [Protocol -Buffers](https://developers.google.com/protocol-buffers/) to store the model and -these have a size limit of 2GB. See the [ONNX -documentation](https://github.com/onnx/onnx/blob/master/docs/ExternalData.md) -for instructions on how to load models with external data. +Si tu modelo tiene m谩s de 2GB, ver谩s que se crean muchos archivos adicionales durante la exportaci贸n. +Esto es _esperado_ porque ONNX usa [B煤feres de protocolo](https://developers.google.com/protocol-buffers/) +para almacenar el modelo y 茅stos tienen un l铆mite de tama帽o de 2 GB. Consulta la +[documentaci贸n de ONNX](https://github.com/onnx/onnx/blob/master/docs/ExternalData.md) para obtener +instrucciones sobre c贸mo cargar modelos con datos externos. -#### Validating the model outputs +#### Validar los resultados del modelo -The final step is to validate that the outputs from the base and exported model -agree within some absolute tolerance. Here we can use the -`validate_model_outputs()` function provided by the `transformers.onnx` package -as follows: +El paso final es validar que los resultados del modelo base y exportado coincidan dentro +de cierta tolerancia absoluta. Aqu铆 podemos usar la funci贸n `validate_model_outputs()` +proporcionada por el paquete `transformers.onnx` de la siguiente manera: ```python >>> from transformers.onnx import validate_model_outputs @@ -473,93 +465,89 @@ as follows: ``` -This function uses the `OnnxConfig.generate_dummy_inputs()` method to generate -inputs for the base and exported model, and the absolute tolerance can be -defined in the configuration. We generally find numerical agreement in the 1e-6 -to 1e-4 range, although anything smaller than 1e-3 is likely to be OK. 
+Esta funci贸n usa el m茅todo `OnnxConfig.generate_dummy_inputs()` para generar entradas para el modelo base +y exportado, y la tolerancia absoluta se puede definir en la configuraci贸n. En general, encontramos una +concordancia num茅rica en el rango de 1e-6 a 1e-4, aunque es probable que cualquier valor menor que 1e-3 est茅 bien. -### Contributing a new configuration to 馃 Transformers +### Contribuir con una nueva configuraci贸n a 馃 Transformers -We are looking to expand the set of ready-made configurations and welcome -contributions from the community! If you would like to contribute your addition -to the library, you will need to: +隆Estamos buscando expandir el conjunto de configuraciones a la medida para usar y agradecemos las contribuciones de la comunidad! +Si deseas contribuir con su colaboraci贸n a la biblioteca, deber谩s: -* Implement the ONNX configuration in the corresponding `configuration_.py` -file -* Include the model architecture and corresponding features in [`~onnx.features.FeatureManager`] -* Add your model architecture to the tests in `test_onnx_v2.py` +* Implementa la configuraci贸n de ONNX en el archivo `configuration_.py` correspondiente +* Incluye la arquitectura del modelo y las caracter铆sticas correspondientes en [`~onnx.features.FeatureManager`] +* Agrega tu arquitectura de modelo a las pruebas en `test_onnx_v2.py` -Check out how the configuration for [IBERT was -contributed](https://github.com/huggingface/transformers/pull/14868/files) to -get an idea of what's involved. +Revisa c贸mo fue la contribuci贸n para la [configuraci贸n de IBERT](https://github.com/huggingface/transformers/pull/14868/files) +y as铆 tener una idea de lo que necesito. ## TorchScript -This is the very beginning of our experiments with TorchScript and we are still exploring its capabilities with -variable-input-size models. It is a focus of interest to us and we will deepen our analysis in upcoming releases, -with more code examples, a more flexible implementation, and benchmarks comparing python-based codes with compiled -TorchScript. +Este es el comienzo de nuestros experimentos con TorchScript y todav铆a estamos explorando sus capacidades con modelos de +tama帽o de entrada variable. Es un tema de inter茅s para nosotros y profundizaremos nuestro an谩lisis en las pr贸ximas +versiones, con m谩s ejemplos de c贸digo, una implementaci贸n m谩s flexible y puntos de referencia que comparen c贸digos +basados en Python con TorchScript compilado. -According to Pytorch's documentation: "TorchScript is a way to create serializable and optimizable models from PyTorch -code". Pytorch's two modules [JIT and TRACE](https://pytorch.org/docs/stable/jit.html) allow the developer to export -their model to be re-used in other programs, such as efficiency-oriented C++ programs. +Seg煤n la documentaci贸n de Pytorch: "TorchScript es una forma de crear modelos serializables y optimizables a partir del +c贸digo de PyTorch". Los dos m贸dulos de Pytorch [JIT y TRACE](https://pytorch.org/docs/stable/jit.html) permiten al +desarrollador exportar su modelo para reutilizarlo en otros programas, como los programas C++ orientados a la eficiencia. -We have provided an interface that allows the export of 馃 Transformers models to TorchScript so that they can be reused -in a different environment than a Pytorch-based python program. Here we explain how to export and use our models using -TorchScript. 
+Hemos proporcionado una interfaz que permite exportar modelos de 馃 Transformers a TorchScript para que puedan reutilizarse
+en un entorno diferente al de un programa Python basado en Pytorch. Aqu铆 explicamos c贸mo exportar y usar nuestros modelos
+usando TorchScript.

-Exporting a model requires two things:
+Exportar un modelo requiere de dos cosas:

-- a forward pass with dummy inputs.
-- model instantiation with the `torchscript` flag.
+- un pase hacia adelante con entradas ficticias.
+- instanciaci贸n del modelo con el indicador `torchscript`.

-These necessities imply several things developers should be careful about. These are detailed below.
+Estas necesidades implican varias cosas con las que los desarrolladores deben tener cuidado. 脡stas se detallan a continuaci贸n.

-### TorchScript flag and tied weights
+### Indicador de TorchScript y pesos atados

-This flag is necessary because most of the language models in this repository have tied weights between their
-`Embedding` layer and their `Decoding` layer. TorchScript does not allow the export of models that have tied
-weights, therefore it is necessary to untie and clone the weights beforehand.
+Este indicador es necesario porque la mayor铆a de los modelos de lenguaje en este repositorio tienen pesos vinculados entre su capa
+de `Embedding` y su capa de `Decoding`. TorchScript no permite la exportaci贸n de modelos que tengan pesos atados, por lo que es
+necesario desvincular y clonar los pesos previamente.

-This implies that models instantiated with the `torchscript` flag have their `Embedding` layer and `Decoding`
-layer separate, which means that they should not be trained down the line. Training would de-synchronize the two
-layers, leading to unexpected results.
+Esto implica que los modelos instanciados con el indicador `torchscript` tienen su capa `Embedding` y `Decoding` separadas,
+lo que significa que no deben entrenarse m谩s adelante. El entrenamiento desincronizar铆a las dos capas, lo que generar铆a
+resultados inesperados.

-This is not the case for models that do not have a Language Model head, as those do not have tied weights. These models
-can be safely exported without the `torchscript` flag.
+Este no es el caso de los modelos que no tienen un cabezal de modelo de lenguaje, ya que no tienen pesos atados.
+Estos modelos se pueden exportar de forma segura sin el indicador `torchscript`.

-### Dummy inputs and standard lengths
+### Entradas ficticias y longitudes est谩ndar

-The dummy inputs are used to do a model forward pass. While the inputs' values are propagating through the layers,
-Pytorch keeps track of the different operations executed on each tensor. These recorded operations are then used to
-create the "trace" of the model.
+Las entradas ficticias se utilizan para realizar un pase hacia adelante del modelo. Mientras los valores de las entradas se
+propagan a trav茅s de las capas, Pytorch realiza un seguimiento de las diferentes operaciones ejecutadas en cada tensor.
+Estas operaciones registradas se utilizan luego para crear el "rastro" del modelo.

-The trace is created relatively to the inputs' dimensions. It is therefore constrained by the dimensions of the dummy
-input, and will not work for any other sequence length or batch size. When trying with a different size, an error such
-as:
+El rastro se crea en relaci贸n con las dimensiones de las entradas. Por lo tanto, est谩 limitado por las dimensiones de la
+entrada ficticia y no funcionar谩 para ninguna otra longitud de secuencia o tama帽o de lote.
Al intentar con un tama帽o diferente, +un error como: `The expanded size of the tensor (3) must match the existing size (7) at non-singleton dimension 2` -will be raised. It is therefore recommended to trace the model with a dummy input size at least as large as the largest -input that will be fed to the model during inference. Padding can be performed to fill the missing values. As the model -will have been traced with a large input size however, the dimensions of the different matrix will be large as well, -resulting in more calculations. +aparecer谩. Por lo tanto, se recomienda rastrear el modelo con un tama帽o de entrada ficticia al menos tan grande como la +entrada m谩s grande que se alimentar谩 al modelo durante la inferencia. El _padding_ se puede realizar para completar los +valores que faltan. Sin embargo, como el modelo se habr谩 rastreado con un tama帽o de entrada grande, las dimensiones de +las diferentes matrices tambi茅n ser谩n grandes, lo que dar谩 como resultado m谩s c谩lculos. -It is recommended to be careful of the total number of operations done on each input and to follow performance closely -when exporting varying sequence-length models. +Se recomienda tener cuidado con el n煤mero total de operaciones realizadas en cada entrada y seguir de cerca el rendimiento +al exportar modelos de longitud de secuencia variable. -### Using TorchScript in Python +### Usar TorchScript en Python -Below is an example, showing how to save, load models as well as how to use the trace for inference. +A continuaci贸n se muestra un ejemplo que muestra c贸mo guardar, cargar modelos y c贸mo usar el rastreo para la inferencia. -#### Saving a model +#### Guardando un modelo -This snippet shows how to use TorchScript to export a `BertModel`. Here the `BertModel` is instantiated according -to a `BertConfig` class and then saved to disk under the filename `traced_bert.pt` +Este fragmento muestra c贸mo usar TorchScript para exportar un `BertModel`. Aqu铆, el `BertModel` se instancia de acuerdo +con la clase `BertConfig` y luego se guarda en el disco con el nombre de archivo `traced_bert.pt` ```python from transformers import BertModel, BertTokenizer, BertConfig @@ -608,10 +596,10 @@ torch.jit.save(traced_model, "traced_bert.pt") ``` -#### Loading a model +#### Cargar un modelo -This snippet shows how to load the `BertModel` that was previously saved to disk under the name `traced_bert.pt`. -We are re-using the previously initialised `dummy_input`. +Este fragmento muestra c贸mo cargar el `BertModel` que se guard贸 previamente en el disco con el nombre `traced_bert.pt`. +Estamos reutilizando el `dummy_input` previamente inicializado. ```python loaded_model = torch.jit.load("traced_bert.pt") @@ -621,30 +609,28 @@ all_encoder_layers, pooled_output = loaded_model(*dummy_input) ``` -#### Using a traced model for inference +#### Usar un modelo rastreado para la inferencia -Using the traced model for inference is as simple as using its `__call__` dunder method: +Usar el modelo rastreado para la inferencia es tan simple como usar su m茅todo `__call__`: ```python traced_model(tokens_tensor, segments_tensors) ``` -### Deploying HuggingFace TorchScript models on AWS using the Neuron SDK +### Implementar los modelos HuggingFace TorchScript en AWS mediante Neuron SDK -AWS introduced the [Amazon EC2 Inf1](https://aws.amazon.com/ec2/instance-types/inf1/) -instance family for low cost, high performance machine learning inference in the cloud. 
-The Inf1 instances are powered by the AWS Inferentia chip, a custom-built hardware accelerator, -specializing in deep learning inferencing workloads. -[AWS Neuron](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/#) -is the SDK for Inferentia that supports tracing and optimizing transformers models for -deployment on Inf1. The Neuron SDK provides: +AWS present贸 la familia de instancias [Amazon EC2 Inf1](https://aws.amazon.com/ec2/instance-types/inf1/) para la inferencia +de aprendizaje autom谩tico de bajo costo y alto rendimiento en la nube. Las instancias Inf1 funcionan con el chip AWS +Inferentia, un acelerador de hardware personalizado, que se especializa en cargas de trabajo de inferencia de aprendizaje +profundo. [AWS Neuron](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/#) es el kit de desarrollo para Inferentia +que admite el rastreo y la optimizaci贸n de modelos de transformers para su implementaci贸n en Inf1. El SDK de Neuron proporciona: -1. Easy-to-use API with one line of code change to trace and optimize a TorchScript model for inference in the cloud. -2. Out of the box performance optimizations for [improved cost-performance](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/benchmark/>) -3. Support for HuggingFace transformers models built with either [PyTorch](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/bert_tutorial/tutorial_pretrained_bert.html) - or [TensorFlow](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/tensorflow/huggingface_bert/huggingface_bert.html). +1. API f谩cil de usar con una l铆nea de cambio de c贸digo para rastrear y optimizar un modelo de TorchScript para la inferencia en la nube. +2. Optimizaciones de rendimiento listas para usar con un [costo-rendimiento mejorado](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/benchmark/>) +3. Soporte para modelos HuggingFace Transformers construidos con [PyTorch](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/bert_tutorial/tutorial_pretrained_bert.html) +o [TensorFlow](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/tensorflow/huggingface_bert/huggingface_bert.html). #### Implications From 40105bcfeae612be59a9d120b998fa1556b3f123 Mon Sep 17 00:00:00 2001 From: Ian Castillo Date: Tue, 15 Nov 2022 22:21:35 +0100 Subject: [PATCH 4/6] Finish translation --- docs/source/es/serialization.mdx | 47 ++++++++++++++++---------------- 1 file changed, 23 insertions(+), 24 deletions(-) diff --git a/docs/source/es/serialization.mdx b/docs/source/es/serialization.mdx index 705d93f1296748..aaa3519042560f 100644 --- a/docs/source/es/serialization.mdx +++ b/docs/source/es/serialization.mdx @@ -632,29 +632,30 @@ que admite el rastreo y la optimizaci贸n de modelos de transformers para su imp 3. Soporte para modelos HuggingFace Transformers construidos con [PyTorch](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/bert_tutorial/tutorial_pretrained_bert.html) o [TensorFlow](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/tensorflow/huggingface_bert/huggingface_bert.html). 
-#### Implications
+#### Implicaciones

-Transformers Models based on the [BERT (Bidirectional Encoder Representations from Transformers)](https://huggingface.co/docs/transformers/main/model_doc/bert)
-architecture, or its variants such as [distilBERT](https://huggingface.co/docs/transformers/main/model_doc/distilbert)
- and [roBERTa](https://huggingface.co/docs/transformers/main/model_doc/roberta)
- will run best on Inf1 for non-generative tasks such as Extractive Question Answering,
- Sequence Classification, Token Classification. Alternatively, text generation
-tasks can be adapted to run on Inf1, according to this [AWS Neuron MarianMT tutorial](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/transformers-marianmt.html).
-More information about models that can be converted out of the box on Inferentia can be
-found in the [Model Architecture Fit section of the Neuron documentation](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/models/models-inferentia.html#models-inferentia).
+Los modelos Transformers basados en la arquitectura
+[BERT (Representaciones de _Encoder_ bidireccional de Transformers)](https://huggingface.co/docs/transformers/main/model_doc/bert),
+o sus variantes, como [distilBERT](https://huggingface.co/docs/transformers/main/model_doc/distilbert) y
+[roBERTa](https://huggingface.co/docs/transformers/main/model_doc/roberta), se ejecutar谩n mejor en Inf1 para tareas no
+generativas, como la respuesta extractiva de preguntas, la clasificaci贸n de secuencias y la clasificaci贸n de tokens.
+Como alternativa, las tareas de generaci贸n de texto se pueden adaptar para ejecutarse en Inf1, seg煤n este
+[tutorial de AWS Neuron MarianMT](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/transformers-marianmt.html).
+Puedes encontrar m谩s informaci贸n sobre los modelos que est谩n listos para usarse en Inferentia en la
+[secci贸n _Model Architecture Fit_ de la documentaci贸n de Neuron](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/models/models-inferentia.html#models-inferentia).

-#### Dependencies
+#### Dependencias

-Using AWS Neuron to convert models requires the following dependencies and environment:
+Usar AWS Neuron para convertir modelos requiere las siguientes dependencias y el siguiente entorno:

-* A [Neuron SDK environment](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/pytorch-neuron/index.html#installation-guide),
- which comes pre-configured on [AWS Deep Learning AMI](https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-inferentia-launching.html).
+* Un [entorno Neuron SDK](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/pytorch-neuron/index.html#installation-guide),
+que viene preconfigurado en [AWS Deep Learning AMI](https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-inferentia-launching.html).

-#### Converting a Model for AWS Neuron
+#### Convertir un modelo a AWS Neuron

-Using the same script as in [Using TorchScript in Python](https://huggingface.co/docs/transformers/main/en/serialization#using-torchscript-in-python)
-to trace a "BertModel", you import `torch.neuron` framework extension to access
-the components of the Neuron SDK through a Python API.
+Con el mismo script usado en [Uso de TorchScript en Python](https://huggingface.co/docs/transformers/main/es/serialization#using-torchscript-in-python) 莽 +para rastrear un "BertModel", puedes importar la extensi贸n del _framework_ `torch.neuron` para acceder a los componentes +del SDK de Neuron a trav茅s de una API de Python. ```python from transformers import BertModel, BertTokenizer, BertConfig @@ -662,23 +663,21 @@ import torch import torch.neuron ``` -And only modify the tracing line of code - -from: +Y modificando la l铆nea de c贸digo de rastreo de: ```python torch.jit.trace(model, [tokens_tensor, segments_tensors]) ``` -to: +con lo siguiente: ```python torch.neuron.trace(model, [token_tensor, segments_tensors]) ``` -This change enables Neuron SDK to trace the model and optimize it to run in Inf1 instances. +Este cambio permite a Neuron SDK rastrear el modelo y optimizarlo para ejecutarse en instancias Inf1. -To learn more about AWS Neuron SDK features, tools, example tutorials and latest updates, -please see the [AWS NeuronSDK documentation](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/index.html). +Para obtener m谩s informaci贸n sobre las funciones, las herramientas, los tutoriales de ejemplo y las 煤ltimas actualizaciones +de AWS Neuron SDK, consulte la [documentaci贸n de AWS NeuronSDK](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/index.html). From 2a4172f98fe2fa628efeebeafed74c9581d63b5c Mon Sep 17 00:00:00 2001 From: Ian Castillo Date: Tue, 15 Nov 2022 22:40:48 +0100 Subject: [PATCH 5/6] Run style from doc-builder --- docs/source/es/serialization.mdx | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/docs/source/es/serialization.mdx b/docs/source/es/serialization.mdx index aaa3519042560f..49e7571ce2b0b9 100644 --- a/docs/source/es/serialization.mdx +++ b/docs/source/es/serialization.mdx @@ -334,7 +334,6 @@ Dado que DistilBERT es un modelo de tipo _encoder-decoder_, su configuraci贸n se ... ("attention_mask", {0: "batch", 1: "sequence"}), ... ] ... ) - ``` Cada objeto de configuraci贸n debe implementar la propiedad `inputs` y devolver un mapeo, @@ -360,7 +359,6 @@ instancia proporcionando la configuraci贸n del modelo base de la siguiente maner >>> config = AutoConfig.from_pretrained("distilbert-base-uncased") >>> onnx_config = DistilBertOnnxConfig(config) - ``` El objeto resultante tiene varias propiedades 煤tiles. 
Por ejemplo, puedes ver el conjunto de operadores ONNX que se @@ -369,7 +367,6 @@ utilizar谩 durante la exportaci贸n: ```python >>> print(onnx_config.default_onnx_opset) 11 - ``` Tambi茅n puedes ver los resultados asociados con el modelo de la siguiente manera: @@ -377,7 +374,6 @@ Tambi茅n puedes ver los resultados asociados con el modelo de la siguiente maner ```python >>> print(onnx_config.outputs) OrderedDict([("last_hidden_state", {0: "batch", 1: "sequence"})]) - ``` Observa que la propiedad de salidas sigue la misma estructura que las entradas; @@ -396,7 +392,6 @@ exportar DistilBERT con un cabezal de clasificaci贸n de secuencias, podr铆amos u >>> onnx_config_for_seq_clf = DistilBertOnnxConfig(config, task="sequence-classification") >>> print(onnx_config_for_seq_clf.outputs) OrderedDict([('logits', {0: 'batch'})]) - ``` @@ -425,7 +420,6 @@ y la ruta para guardar el archivo exportado: >>> tokenizer = AutoTokenizer.from_pretrained(model_ckpt) >>> onnx_inputs, onnx_outputs = export(tokenizer, base_model, onnx_config, onnx_config.default_onnx_opset, onnx_path) - ``` Los objetos `onnx_inputs` y `onnx_outputs` devueltos por la funci贸n `export()` @@ -437,7 +431,6 @@ Una vez exportado el modelo, puedes probar que el modelo est谩 bien formado de l >>> onnx_model = onnx.load("model.onnx") >>> onnx.checker.check_model(onnx_model) - ``` @@ -462,7 +455,6 @@ proporcionada por el paquete `transformers.onnx` de la siguiente manera: >>> validate_model_outputs( ... onnx_config, tokenizer, base_model, onnx_path, onnx_outputs, onnx_config.atol_for_validation ... ) - ``` Esta funci贸n usa el m茅todo `OnnxConfig.generate_dummy_inputs()` para generar entradas para el modelo base @@ -593,7 +585,6 @@ model = BertModel.from_pretrained("bert-base-uncased", torchscript=True) # Creating the trace traced_model = torch.jit.trace(model, [tokens_tensor, segments_tensors]) torch.jit.save(traced_model, "traced_bert.pt") - ``` #### Cargar un modelo @@ -606,7 +597,6 @@ loaded_model = torch.jit.load("traced_bert.pt") loaded_model.eval() all_encoder_layers, pooled_output = loaded_model(*dummy_input) - ``` #### Usar un modelo rastreado para la inferencia @@ -615,7 +605,6 @@ Usar el modelo rastreado para la inferencia es tan simple como usar su m茅todo ` ```python traced_model(tokens_tensor, segments_tensors) - ``` ### Implementar los modelos HuggingFace TorchScript en AWS mediante Neuron SDK @@ -661,20 +650,17 @@ del SDK de Neuron a trav茅s de una API de Python. from transformers import BertModel, BertTokenizer, BertConfig import torch import torch.neuron - ``` Y modificando la l铆nea de c贸digo de rastreo de: ```python torch.jit.trace(model, [tokens_tensor, segments_tensors]) - ``` con lo siguiente: ```python torch.neuron.trace(model, [token_tensor, segments_tensors]) - ``` Este cambio permite a Neuron SDK rastrear el modelo y optimizarlo para ejecutarse en instancias Inf1. From 323f38f19975fb84df7a5f56c8ff9d8008de0871 Mon Sep 17 00:00:00 2001 From: Ian Castillo Date: Sat, 19 Nov 2022 11:41:41 +0100 Subject: [PATCH 6/6] Address recommended changes from reviewer --- docs/source/es/serialization.mdx | 48 ++++++++++++++++---------------- 1 file changed, 24 insertions(+), 24 deletions(-) diff --git a/docs/source/es/serialization.mdx b/docs/source/es/serialization.mdx index 49e7571ce2b0b9..4c42fd5d830ec4 100644 --- a/docs/source/es/serialization.mdx +++ b/docs/source/es/serialization.mdx @@ -18,7 +18,7 @@ en tiempos de ejecuci贸n y hardware especializados. 
En esta gu铆a, te mostraremo exportar modelos 馃 Transformers en dos formatos ampliamente utilizados: ONNX y TorchScript. Una vez exportado, un modelo puede optimizarse para la inferencia a trav茅s de t茅cnicas -como la cuantificaci贸n y _pruning_. Si est谩s interesado en optimizar tus modelos para +como la cuantizaci贸n y _pruning_. Si est谩s interesado en optimizar tus modelos para que funcionen con la m谩xima eficiencia, consulta la [biblioteca de 馃 Optimum](https://github.com/huggingface/optimum). @@ -29,15 +29,15 @@ est谩ndar abierto que define un conjunto com煤n de operadores y un formato de archivo com煤n para representar modelos de aprendizaje profundo en una amplia variedad de _frameworks_, incluidos PyTorch y TensorFlow. Cuando un modelo se exporta al formato ONNX, estos operadores se usan para construir un -gr谩fico computacional (a menudo llamado _representaci贸n intermedia_) que +grafo computacional (a menudo llamado _representaci贸n intermedia_) que representa el flujo de datos a trav茅s de la red neuronal. -Al exponer un gr谩fico con operadores y tipos de datos estandarizados, ONNX facilita +Al exponer un grafo con operadores y tipos de datos estandarizados, ONNX facilita el cambio entre frameworks. Por ejemplo, un modelo entrenado en PyTorch se puede exportar a formato ONNX y luego importar en TensorFlow (y viceversa). 馃 Transformers proporciona un paquete llamado `transformers.onnx`, el cual permite convertir -puntos de control de un modelo en un gr谩fico ONNX aprovechando los objetos de configuraci贸n. +los checkpoints de un modelo en un grafo ONNX aprovechando los objetos de configuraci贸n. Estos objetos de configuraci贸n est谩n hechos a la medida de diferentes arquitecturas de modelos y est谩n dise帽ados para ser f谩cilmente extensibles a otras arquitecturas. @@ -102,7 +102,7 @@ En las pr贸ximas dos secciones, te mostraremos c贸mo: ### Exportar un model a ONNX -Para exportar un modelo 馃 Transformers a ONNX, tienes que instalar primer algunas +Para exportar un modelo 馃 Transformers a ONNX, tienes que instalar primero algunas dependencias extra: ```bash @@ -129,7 +129,7 @@ optional arguments: --atol ATOL Absolute difference tolerence when validating the model. ``` -Exportar un punto de control usando una configuraci贸n a la medida se puede hacer de la siguiente manera: +Exportar un checkpoint usando una configuraci贸n a la medida se puede hacer de la siguiente manera: ```bash python -m transformers.onnx --model=distilbert-base-uncased onnx/ @@ -146,9 +146,9 @@ Validating ONNX model... All good, model saved at: onnx/model.onnx ``` -Esto exporta un gr谩fico ONNX del punto de control definido por el argumento `--model`. -En este ejemplo, es un modelo `distilbert-base-uncased`, pero puede ser cualquier punto -de control en Hugging Face Hub o que est茅 almacenado localmente. +Esto exporta un grafo ONNX del checkpoint definido por el argumento `--model`. +En este ejemplo, es un modelo `distilbert-base-uncased`, pero puede ser cualquier +checkpoint en Hugging Face Hub o que est茅 almacenado localmente. El archivo `model.onnx` resultante se puede ejecutar en uno de los [muchos aceleradores](https://onnx.ai/supported-tools.html#deployModel) @@ -178,8 +178,8 @@ echando un vistazo a la configuraci贸n ONNX de cada modelo. Por ejemplo, para Di ["last_hidden_state"]s ``` -El proceso es id茅ntico para los puntos de control de TensorFlow en Hub. -Por ejemplo, podemos exportar un punto de control puro de TensorFlow desde +El proceso es id茅ntico para los checkpoints de TensorFlow en Hub. 
+Por ejemplo, podemos exportar un checkpoint puro de TensorFlow desde [Keras](https://huggingface.co/keras-io) de la siguiente manera: ```bash @@ -188,7 +188,7 @@ python -m transformers.onnx --model=keras-io/transformers-qa onnx/ Para exportar un modelo que est谩 almacenado localmente, deber谩s tener los pesos y tokenizadores del modelo almacenados en un directorio. Por ejemplo, podemos cargar -y guardar un punto de control de la siguiente manera: +y guardar un checkpoint de la siguiente manera: @@ -203,7 +203,7 @@ y guardar un punto de control de la siguiente manera: >>> pt_model.save_pretrained("local-pt-checkpoint") ``` -Una vez que se guarda el punto de control, podemos exportarlo a ONNX usando el argumento `--model` +Una vez que se guarda el checkpoint, podemos exportarlo a ONNX usando el argumento `--model` del paquete `transformers.onnx` al directorio deseado: ```bash @@ -222,7 +222,7 @@ python -m transformers.onnx --model=local-pt-checkpoint onnx/ >>> tf_model.save_pretrained("local-tf-checkpoint") ``` -Una vez que se guarda el punto de control, podemos exportarlo a ONNX usando el argumento `--model` +Una vez que se guarda el checkpoint, podemos exportarlo a ONNX usando el argumento `--model` del paquete `transformers.onnx` al directorio deseado: ```bash @@ -278,7 +278,7 @@ All good, model saved at: onnx/model.onnx ``` Ten en cuenta que, en este caso, los nombres de salida del modelo ajustado son `logits` en lugar de `last_hidden_state` -que vimos anteriormente con el punto de control `distilbert-base-uncased`. Esto es de esperarse ya que el modelo ajustado +que vimos anteriormente con el checkpoint `distilbert-base-uncased`. Esto es de esperarse ya que el modelo ajustado tiene un cabezal de clasificaci贸n secuencial. @@ -318,7 +318,7 @@ existente en el archivo `configuration_.py` de una arquitectura simi -Dado que DistilBERT es un modelo de tipo _encoder-decoder_, su configuraci贸n se hereda de `OnnxConfig`: +Dado que DistilBERT es un modelo de tipo _encoder_, su configuraci贸n se hereda de `OnnxConfig`: ```python >>> from typing import Mapping, OrderedDict @@ -337,7 +337,7 @@ Dado que DistilBERT es un modelo de tipo _encoder-decoder_, su configuraci贸n se ``` Cada objeto de configuraci贸n debe implementar la propiedad `inputs` y devolver un mapeo, -donde cada clave corresponde a una entrada esperada y cada valor indica el eje de esa entrada. +donde cada llave corresponde a una entrada esperada y cada valor indica el eje de esa entrada. Para DistilBERT, podemos ver que se requieren dos entradas: `input_ids` y `attention_mask`. Estas entradas tienen la misma forma de `(batch_size, sequence_length)`, es por lo que vemos los mismos ejes utilizados en la configuraci贸n. @@ -346,7 +346,7 @@ los mismos ejes utilizados en la configuraci贸n. Observa que la propiedad `inputs` para `DistilBertOnnxConfig` devuelve un `OrderedDict`. Esto nos asegura que las entradas coincidan con su posici贸n relativa dentro del m茅todo -`PreTrainedModel.forward()` al rastrear el gr谩fico. Recomendamos usar un `OrderedDict` +`PreTrainedModel.forward()` al rastrear el grafo. Recomendamos usar un `OrderedDict` para las propiedades `inputs` y `outputs` al implementar configuraciones ONNX personalizadas. @@ -478,18 +478,18 @@ y as铆 tener una idea de lo que necesito. Este es el comienzo de nuestros experimentos con TorchScript y todav铆a estamos explorando sus capacidades con modelos de -tama帽o de entrada variable. 
Es un tema de inter茅s para nosotros y profundizaremos nuestro an谩lisis en las pr贸ximas -versiones, con m谩s ejemplos de c贸digo, una implementaci贸n m谩s flexible y puntos de referencia que comparen c贸digos +tama帽o de entrada variable. Es un tema de inter茅s y profundizaremos nuestro an谩lisis en las pr贸ximas +versiones, con m谩s ejemplos de c贸digo, una implementaci贸n m谩s flexible y puntos de referencia que comparen c贸digos basados en Python con TorchScript compilado. -Seg煤n la documentaci贸n de Pytorch: "TorchScript es una forma de crear modelos serializables y optimizables a partir del +Seg煤n la documentaci贸n de PyTorch: "TorchScript es una forma de crear modelos serializables y optimizables a partir del c贸digo de PyTorch". Los dos m贸dulos de Pytorch [JIT y TRACE](https://pytorch.org/docs/stable/jit.html) permiten al desarrollador exportar su modelo para reutilizarlo en otros programas, como los programas C++ orientados a la eficiencia. Hemos proporcionado una interfaz que permite exportar modelos de 馃 Transformers a TorchScript para que puedan reutilizarse -en un entorno diferente al de un programa Python basado en Pytorch. Aqu铆 explicamos c贸mo exportar y usar nuestros modelos +en un entorno diferente al de un programa Python basado en PyTorch. Aqu铆 explicamos c贸mo exportar y usar nuestros modelos usando TorchScript. Exportar un modelo requiere de dos cosas: @@ -515,7 +515,7 @@ Estos modelos se pueden exportar de forma segura sin el indicador `torchscript`. ### Entradas ficticias y longitudes est谩ndar Las entradas ficticias se utilizan para crear un modelo de pase hacia adelante. Mientras los valores de las entradas se -propagan a trav茅s de las capas, Pytorch realiza un seguimiento de las diferentes operaciones ejecutadas en cada tensor. +propagan a trav茅s de las capas, PyTorch realiza un seguimiento de las diferentes operaciones ejecutadas en cada tensor. Estas operaciones registradas se utilizan luego para crear el "rastro" del modelo. El rastro se crea en relaci贸n con las dimensiones de las entradas. Por lo tanto, est谩 limitado por las dimensiones de la @@ -642,7 +642,7 @@ que viene preconfigurado en [AWS Deep Learning AMI](https://docs.aws.amazon.com/ #### Convertir un modelo a AWS Neuron -Con el mismo script usado en [Uso de TorchScript en Python](https://huggingface.co/docs/transformers/main/es/serialization#using-torchscript-in-python) 莽 +Con el mismo script usado en [Uso de TorchScript en Python](https://huggingface.co/docs/transformers/main/es/serialization#using-torchscript-in-python) para rastrear un "BertModel", puedes importar la extensi贸n del _framework_ `torch.neuron` para acceder a los componentes del SDK de Neuron a trav茅s de una API de Python.