[Relay][Frontend][ONNX] New Operators and Opsets to Support BERT #4197

jwfromm · 2019-10-24T18:55:12Z

This PR is a sizable expansion of the Relay frontend for ONNX. It adds the operators needed to convert and run BERT models that use either opset 8 or opset 10.

Here is a breakdown of all significant changes included in this PR.

Newer ONNX opsets allow considerable dynamism in inputs. As a result, some operations receive expressions where the Relay equivalent requires concrete values. To handle these cases, I use the infer_value function from the TensorFlow frontend. To avoid duplicating code, I have moved this function to common.py.
I've added an extension to infer_value (infer_value_simulated). This function allows value inference even when not all inputs are defined, by filling the undefined inputs with random dummy values. This is needed in cases where the values of the inputs don't affect the value of a shape tensor. It is very similar to what the current Reshape converter does.
Added a check to see if a MatMul operation is actually a batch_matmul and then convert it appropriately. Also added tests for batch_matmul.
Simplified opset v5 reshape by calling into infer_value_simulated.
Added opset v10 slice conversion and tests.
Added OneHot operation and tests.
Added ConstantOfShape operation and tests.
Removed ConstantFill operation as it has been deprecated for quite some time and replaced with the ConstantOfShape operation. The way we handled ConstantFill was not ideal since it was basically special-cased. Now that we have ConstantOfShape support it doesn't make sense to keep it around.
Added Unstack support to Split (ONNX uses Split for both regular split and unstack).
Added tile opset v6 and tests.
Added an opset override option to from_onnx. This is very useful for writing tests for different versions of an operator.
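
The simulated value inference described in the list above can be illustrated with a minimal sketch. It uses plain Python nested lists rather than Relay, and every name in it (`shape_of`, `random_tensor`, `infer_simulated`, `reshape_target`) is hypothetical and not part of the TVM API. It only shows why random dummy inputs are enough when a result depends on input shapes and not on input values.

```python
import random


def shape_of(tensor):
    """Return the shape of a nested-list tensor as a list of ints."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        if not tensor:
            break
        tensor = tensor[0]
    return shape


def random_tensor(shape):
    """Build a nested list with the given shape, filled with random floats."""
    if not shape:
        return random.random()
    return [random_tensor(shape[1:]) for _ in range(shape[0])]


def infer_simulated(fn, input_shapes, known_inputs=None):
    """Evaluate fn, using random data for every input without a known value."""
    known_inputs = known_inputs or {}
    feed = {}
    for name, shape in input_shapes.items():
        if name in known_inputs:
            feed[name] = known_inputs[name]
        else:
            feed[name] = random_tensor(shape)
    return fn(**feed)


def reshape_target(x):
    """Hypothetical shape subgraph: keep the batch axis, flatten the rest."""
    batch, *rest = shape_of(x)
    flat = 1
    for dim in rest:
        flat *= dim
    return [batch, flat]


# No concrete value for "x" is available, only its shape.
target = infer_simulated(reshape_target, {"x": [2, 3, 4]})
print(target)
```

Because `reshape_target` only reads the shape of `x`, the result is the same for any random fill. A subgraph whose output depends on the actual input values could not be resolved this way.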

Although I have personally confirmed that these changes allow BERT to be imported and run, I haven't added it to the ONNX frontend tests because it is very slow. If reviewers think it's worth the increased test time, I'd be happy to add it.

Note that my implementation of ConstantOfShape overlaps with #4135; however, there are some slight differences. For example, I implement it using relay.full instead of relay.tile, and I do not use a special case for it in the initial node parsing. If the reviewers prefer the other version of ConstantOfShape, I'd be happy to remove my implementation from this PR.
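
To make the full-versus-tile distinction concrete, here is a small sketch in plain Python. The `full` and `tile` functions below are simplified stand-ins written for this illustration, not the Relay operators themselves. Filling a shape directly and tiling a one-element tensor produce the same constant tensor, but the fill route needs no seed tensor.

```python
def full(fill_value, shape):
    """Build a nested list of the given shape where every entry is fill_value."""
    if not shape:
        return fill_value
    return [full(fill_value, shape[1:]) for _ in range(shape[0])]


def tile(tensor, reps):
    """Repeat a nested-list tensor reps[i] times along axis i.

    Assumes len(reps) equals the rank of the tensor.
    """
    if not reps:
        return tensor
    return [tile(item, reps[1:]) for _ in range(reps[0]) for item in tensor]


shape = [2, 3]

# Fill route: one call, driven only by the value and the target shape.
via_full = full(0.5, shape)

# Tile route: start from a seed of shape [1, 1] and repeat it along each axis.
via_tile = tile([[0.5]], shape)

print(via_full == via_tile)
```

Both routes give a 2x3 tensor of 0.5, so the choice between them is about how directly the converter expresses the operation and not about the result.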

jwfromm · 2019-10-24T19:16:47Z

@zhiics @kevinthesun @kazum, would any of you be able to take a look at this PR?

soiferj · 2019-10-24T20:30:24Z

Thanks for working on this! This is a huge improvement. Which implementation of BERT are you testing this change against?

jwfromm · 2019-10-24T21:04:00Z

This is tested against both versions of BERT-squad.

zhiics

Thanks for improving the parser. Could you please upload your test somewhere (like a branch in your personal git repo)? I believe some people would be interested in playing with BERT.

jwfromm · 2019-10-25T18:36:31Z

If anyone is interested, you can find my BERT test script here. The script shows how to run a question-answering BERT model and displays the results as human-readable text.

jwfromm · 2019-10-29T21:03:13Z

@kevinthesun, @zhiics, now that this is passing all tests, can one of you do a review so we can get it merged?

zhiics

LGTM

zhiics · 2019-10-29T21:05:09Z

@soiferj Can you take another look as well? Thanks.

jwfromm force-pushed the onnx_onehot branch from 4e92e3f to 075b147 Compare October 24, 2019 19:01

tqchen added the status: need review label Oct 24, 2019

jwfromm changed the title ~~[Relay] ONNX Frontend Extension~~ [Relay][Frontend][ONNX] New Operators and Opsets to Support BERT Oct 24, 2019

jwfromm mentioned this pull request Oct 24, 2019

[Relay][Frontend][ONNX] operator support: ConstantOfShape #4135

Closed

soiferj reviewed Oct 24, 2019

zhiics reviewed Oct 24, 2019

Josh Fromm and others added 20 commits October 28, 2019 10:29

Added slice v10

0f9e667

Added constantofshape operation and small refactor.

d7d0de3

Finished one_hot implementation.

019190c

Reshape working across all bert layers.

3826216

Fixed constantofshape and removed code duplication.

9442f6a

onnx model fully ingested.

c558ef7

Working on improving onnx tests.

01d2145

Changed onnx testing to use onnxruntime instead of caffe2, also formatted.

05d8905

Add arbitrary output nodes to onnx frontend.

bffad89

Added v6 tiling for bert squad 8 support.

94654af

Small syntax fixes

e9a2591

Reduced code duplication in split opset versions.

cb25dad

Added batch matmul test

123cae8

Added unstack split testing.

7f07ffa

Adde onehot test, needs a little cleanup probably.

7703198

Replaced deprecated constant fill with constantofshape and updated tests accordingly.

4988d33

Added tests for new opset version of slice and tile.

b7e2644

lint clean up

14737d2

Lint fixes

eea6fc4

Changed onnx dependency

0c108a7

jwfromm added 2 commits October 28, 2019 10:29

Went back to caffe2 runtime for CI integration.

89876cb

Rebase and small typo/syntax changes.

b18de8b

jwfromm force-pushed the onnx_onehot branch from bec6f4c to b18de8b Compare October 28, 2019 18:30

Added hard casting of onehot attributes to int.

bbf203c

zhiics approved these changes Oct 29, 2019

jroesch approved these changes Oct 30, 2019

jroesch merged commit 156aa59 into apache:master Oct 30, 2019

tqchen mentioned this pull request Nov 8, 2019

[RELEASE][DRAFT] TVM v0.6 Release candidate #4259

Closed

jwfromm deleted the onnx_onehot branch April 12, 2023 15:55
