Merge from upstream (#53)
* [relay][vm] Separate VM runtime with executable (apache#4100)

* [relay][vm] Separate VM runtime with executable

* Address comments

* move ctx back to vm

* make only vm related fields and methods protected

* integrate serialization/deserialization into executable

* create stream

* [Relay][Frontend][TF] Add tensor array ops (apache#3798)

* [Relay][Frontend][TF] Add tensor array ops

* rename

* delete test

* Move utility function

* Refactor

* fix tensor array ops

* fix test

* fix rebase

* Fix serializer bug

* Improve tf convert name lookup to use prelude api

* Fix lint

* Fix test

* Fix typo (apache#4144)

* [CI] Pin NNPack pthreadtools version (apache#4152)

* [QNN][TFLite] Parsing QNN Add op. Adding MobilenetV2. (apache#4142)

* Add lift_if_then_else pass (apache#3865)

* Add LiftIfThenElse pass

* Add more comments

* Rename and refactor

* Add description for internal data structure

* Rename a test

* Minor change

* Address comments

* Improve update_for
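
A conceptual illustration of the transform this pass performs (plain Python, illustrative only; the real pass rewrites TVM IR, not Python):

```python
# Conceptual sketch of LiftIfThenElse: a condition that does not depend on
# the loop variable is hoisted out of the loop, so the branch is evaluated
# once instead of on every iteration.

def before(xs, flag):
    out = []
    for x in xs:
        if flag:  # loop-invariant condition tested every iteration
            out.append(x + 1)
        else:
            out.append(x - 1)
    return out

def after(xs, flag):
    if flag:  # condition lifted above the loop
        out = [x + 1 for x in xs]
    else:
        out = [x - 1 for x in xs]
    return out

assert before([1, 2, 3], True) == after([1, 2, 3], True)
```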

* [CI] Update cpu docker (apache#4153)

* [Refactor] Rename Datatype to ADT (apache#4156)

We think this will reduce confusion about the meaning.

https://discuss.tvm.ai/t/discuss-consider-rename-vm-datatype/4339

* [Runtime] Enable option to use OpenMP thread pool (apache#4089)

* [REFACTOR][NODE][RUNTIME] Move Node to the new Object protocol. (apache#4161)

* [REFACTOR][NODE][RUNTIME] Move Node to the new Object protocol.

This PR removes the original node system and makes Node a subclass of Object.
This is a major refactor towards a better unified runtime object system; a toy sketch of the resulting protocol follows the change list below.

List of changes in the refactor:

- We now hide the data_ field; use Downcast explicitly to get a sub-class object.
- Removed the node system FFI in python.
- Removed the node C API, instead use PackedFunc for list and get attrs.
- Change relay::Op::set_attr_type_key(attr_key_name) to relay::Op::set_attr_type<AttrType>().
  - This change was necessary because of the new Object registration mechanism.
  - Subsequent changes to the op registrations
  - The change revealed a few pre-existing problems that are now fixed.
- Patched up a few missing node type registrations.
  - We now raise an error if an object type is not registered.
- The original node.h and container.h are kept in the same location.
- Calling convention: kObjectHandle now equals the old kNodeHandle, kNodeHandle is removed.
- IRFunctor now dispatches on ObjectRef.
- Update to the new type checking API: is_type, derived_from are replaced by IsInstance.
- Removed .hash member function, instead use C++ convention hasher functors.
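
A rough Python analogy of the protocol described above (illustrative only; the actual implementation is C++ with registered type indices):

```python
# Illustrative Python analogy: every node derives from a single Object base,
# and callers use IsInstance / Downcast instead of is_type / derived_from.

class Object:
    pass

class BaseExprNode(Object):
    pass

class CallNode(BaseExprNode):
    def __init__(self, op, args):
        self.op, self.args = op, args

def is_instance(ref, cls):
    # Analog of ref->IsInstance<T>() in the new API.
    return isinstance(ref, cls)

def downcast(ref, cls):
    # Analog of Downcast<T>(ref): a checked conversion that fails loudly.
    if not isinstance(ref, cls):
        raise TypeError("Downcast to %s failed" % cls.__name__)
    return ref

node = CallNode("add", [1, 2])
assert is_instance(node, BaseExprNode)
print(downcast(node, CallNode).op)
```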

* Address review comments

* [CI] Move golang tests to the end (apache#4164)

* Add support for quantized multiply to Relay (apache#4141)

This patch adds a multiply operator for quantized tensors. The details of
the quantized multiplication are outlined in the code; a sketch of the
arithmetic follows below.

This builds on pull request 3927 and includes the changes
Animesh mentions in the comments on that request.

Change-Id: I555715b53d0266a91d5c03dc3dfe8fc31e7ce4e1
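
For reference, the standard affine-quantization arithmetic behind an elementwise quantized multiply, as a numpy sketch (an illustration of the usual formulation, not the exact code in the patch):

```python
import numpy as np

# Affine quantization: q = round(real / scale) + zero_point, so
# real_out = sa * (qa - za) * sb * (qb - zb), requantized into int8.
def quantized_multiply(qa, za, sa, qb, zb, sb, zo, so):
    real = sa * (qa.astype(np.int32) - za) * sb * (qb.astype(np.int32) - zb)
    q = np.round(real / so) + zo
    return np.clip(q, -128, 127).astype(np.int8)

a = np.array([10, 20, 30], dtype=np.int8)
b = np.array([5, 6, 7], dtype=np.int8)
print(quantized_multiply(a, 0, 0.1, b, 0, 0.2, 0, 0.05))  # [20 48 84]
```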

* Fix misspelling (apache#4166)

Replaces "After connecting he usb" with "After connecting the usb".

* [Relay][Pass] Count MAC for BatchMatMul (apache#4157)

* count MAC for BatchMatMul

* update doc
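
The MAC count follows directly from the shapes: multiplying (batch, M, K) by (batch, K, N) costs batch * M * N * K multiply-accumulates. A one-line check:

```python
# MACs for batch_matmul of (batch, M, K) x (batch, K, N): batch * M * N * K.
def batch_matmul_macs(batch, m, k, n):
    return batch * m * n * k

assert batch_matmul_macs(8, 32, 64, 16) == 8 * 32 * 16 * 64
```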

* [Relay][QNN] Add unit test for int8 (apache#4159)

* [bugfix][codegen] fix casting bug in llvm codegen

* update example

* retrigger ci

* check llvm version

* [relay][vm] Reuse allocated device memory (apache#4170)

* add missing gradient check to gradient pass (apache#4169)

* merge extract_from_program and extract_from_multiple_program (apache#4173)
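
A hypothetical usage sketch of the merged entry point (the exact argument spelling of extract_from_program has varied across TVM releases, so treat this as an approximation):

```python
# Hypothetical usage sketch; `mod` and `params` are assumed to come from a
# frontend importer such as relay.frontend.from_tensorflow(...).
from tvm import autotvm, relay

tasks = autotvm.task.extract_from_program(
    mod["main"], target="llvm", params=params,
    ops=(relay.op.get("nn.conv2d"),))
print("%d tunable tasks" % len(tasks))
```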

* [TOPI] Added support for Mali Bifrost target (apache#4047)

* [Relay][Frontend][TF] Fix Size operator (apache#4175)

* [Relay][Frontend][TF] Fix Size operator

* Uncomment tests

* [Pass] Remove dead code (apache#4177)

* [rpc] use callback func to do send & recv (apache#4147)

* [rpc] use callback func to do send & recv; don't get fd from sock, as it is deprecated in Java

* fix java build

* fix min/max macro define in windows

* keep the old rpc setup for py

* add doc for CallbackChannel

* Add support and testing for tf.assert (as no-op) and tf.no_op to TF Relay frontend. (apache#4172)

* [DOCS] Add TensorFlow frontend docs (apache#4154)

* Start to update TF frontend docs

* Add rst

* Remove markdown

* Update wording

* Resolve comments

* Revert "[Relay][QNN] Add unit test for int8 (apache#4159)" (apache#4192)

This reverts commit 6f9d028.

* [cmake][ANTLR] Support setting path to ANTLR jar (apache#4176)

* Support setting path to ANTLR jar

* Update comment

* Split adaptive_pool2d_avg into sum and div (apache#4186)

* [Documentation]Fix example code in comment of tvm.build_module.build() (apache#4195)

* Fix example code in comment of tvm.build_module.build()

* Update build_module.py

* [relay] use time_evaluator for measurement (apache#4191)
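
A minimal sketch of measuring with time_evaluator against the 0.6-era graph runtime API (module and function names changed in later releases):

```python
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_runtime

# Build a tiny Relay function, then time it. API names follow the 0.6-era
# interface (relay.Module, graph_runtime); newer releases renamed these.
x = relay.var("x", shape=(1, 64))
func = relay.Function([x], relay.nn.softmax(x))
graph, lib, params = relay.build(relay.Module.from_expr(func), target="llvm")

ctx = tvm.cpu(0)
m = graph_runtime.create(graph, lib, ctx)
m.set_input("x", np.random.rand(1, 64).astype("float32"))

# time_evaluator runs "run" number*repeat times and returns robust statistics.
ftimer = m.module.time_evaluator("run", ctx, number=10, repeat=3)
print("mean: %.3f ms" % (np.mean(ftimer().results) * 1000))
```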

* Add parser support for SUM tflite operator (apache#4182)

* [Relay] Fix memory leak in the interpreter (apache#4155)

* save

lint

* address reviewer comment

* [TOPI] Tunable Template for Conv2D HWCN on CUDA (apache#4168)

* support conv2d HWCN in AutoTVM and Relay

* fix lint

* fix comments and unit tests

* TensorCore Support using Intrinsic (apache#4136)

* add tensor core support

* avoid memory bank conflict

* fix thread sync & better performance

* better performance

* add schedule test for conv2d

* extend into BatchMatMul

* support config fragment shape and layout using intrinsic

* add TensorCore tutorial

* add int support and fix lint

* address comment

* add 32*16*8 TensorCore test

* fix wmma include logic

* [NODE][REFACTOR] Refactor reflection system in node. (apache#4189)

* [NODE][REFACTOR] Refactor reflection system in node.

- Removed the old Node; Node is now just an alias of runtime::Object
- Introduce ReflectionVTable, a new columnar dispatcher to support reflection
  - This allows us to remove the vtable from most node objects
  - The VisitAttrs functions are registered via TVM_REGISTER_NODE_TYPE;
    they are no longer virtual (a toy sketch of the idea follows below).
- Consolidated serialization and reflection features into node.
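
The columnar dispatcher can be pictured as a side table of per-type visitor functions, rather than a virtual method on every object; a toy Python rendering of the idea (illustrative only):

```python
# Toy rendering of a columnar reflection vtable: visitor functions live in a
# side table keyed by node type, so the nodes themselves carry no virtual
# VisitAttrs method, and unregistered types fail loudly.

REFLECTION_VTABLE = {}

def register_node_type(cls, visit_attrs):
    # Rough analog of what TVM_REGISTER_NODE_TYPE wires up.
    REFLECTION_VTABLE[cls] = visit_attrs

def visit_attrs(obj, visitor):
    fn = REFLECTION_VTABLE.get(type(obj))
    if fn is None:
        raise TypeError("%s is not registered" % type(obj).__name__)
    fn(obj, visitor)

class AddNode:
    def __init__(self, a, b):
        self.a, self.b = a, b

def visit_add(node, visitor):
    visitor("a", node.a)
    visitor("b", node.b)

register_node_type(AddNode, visit_add)
visit_attrs(AddNode(1, 2), lambda name, value: print(name, "=", value))
```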

* Explicit type qualification when calling destructor.

* Fix SPIRV, more comments

* hotfix the ci (apache#4199)

* [TOPI][x86] Legalize - Support int8xint8 convolution to use VNNI instructions. (apache#4196)

* [Relay] crossentropy_with_logits and its gradient (apache#4075)

* save

* lint

* [hotfix] missing include headers (apache#4204)

* [Relay][Training] Add checkpoint annotation for checkpointing memory optimization (apache#4146)

* add checkpoint annotation for checkpointing memory optimization

* add alpha-equivalence checkpoint test and fix gradient type issue

* fix build issues

* ignore checkpoint annotation when checking missing gradients

* refactor, fix checkpoint compute for tuple and add tests
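
A minimal sketch of the annotation from Python, assuming it is exposed as relay.annotation.checkpoint as this PR describes:

```python
from tvm import relay

# Wrap a subexpression in a checkpoint: the gradient pass recomputes it
# during the backward phase instead of keeping the intermediate alive.
x = relay.var("x", shape=(4,), dtype="float32")
y = relay.var("y", shape=(4,), dtype="float32")
z = relay.annotation.checkpoint(x * y)
func = relay.Function([x, y], z + z)
print(func)
```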

* [Relay][Params] Add APIs for storing and retrieving parameters from individual functions. (apache#4194)

* Add support for attaching params

* Fix types

* Fix test

* [Relay][Frontend][ONNX] Add support for op Where (apache#4184)

* Add support for op Where

* Update impl version
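
Where follows numpy.where semantics: elementwise selection from x or y by a boolean condition. A small sketch of the corresponding Relay op:

```python
from tvm import relay

# ONNX Where(cond, x, y) maps onto relay.where: pick x[i] where cond[i]
# is true, else y[i] (same semantics as np.where).
cond = relay.var("cond", shape=(3,), dtype="bool")
x = relay.var("x", shape=(3,), dtype="float32")
y = relay.var("y", shape=(3,), dtype="float32")
print(relay.Function([cond, x, y], relay.where(cond, x, y)))
```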

* [VTA][Chisel] TSIM VTA Source Refactor (apache#4163)

* app init push

* fix on readme

* change name, add bit serial explanation

* rm serialLoadMM, change doc

* syntax change for readme

* add parallel test functionality

* fix readme

* add python doc

* syntax

* init commit

* fix empty line

* fix typo

* [RUNTIME] Separate runtime related contrib into runtime/contrib (apache#4207)

* Fix type var docs (apache#4208)

* [Relay] Setting Legalize opt_level to 1. (apache#4198)

* [TOPI] Fix flaky testcase for check round (apache#4211)

* [Relay][Op] Enhance Upsample Operator to support float scales (apache#4206)

* add scale2 for upsample

* update unit test for upsampling

* support latest upsample op for multiple frontends

* fix lint

* fix lint

* fix lint

* fix lint

* update scale description and rebase

* [Relay][Quantize] Use fixed point multiplications (apache#4160)
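
The idea is to replace a float scale s with an integer mantissa and a shift so that s is approximately m * 2^(exp - 31); a numpy sketch of this standard decomposition (not the exact code in the patch):

```python
import math
import numpy as np

# Decompose a positive float scale into (int32 mantissa, exponent) so that
# scale ~= m * 2**(exp - 31); runtime then needs only integer multiply+shift.
def to_fixed_point(scale):
    mantissa, exp = math.frexp(scale)  # scale = mantissa * 2**exp, mantissa in [0.5, 1)
    return int(round(mantissa * (1 << 31))), exp

def fixed_point_multiply(x, scale):
    m, exp = to_fixed_point(scale)
    # Arithmetic right shift applies the 2**(exp - 31) factor; a production
    # kernel would also add a rounding nudge before shifting.
    return ((x.astype(np.int64) * m) >> (31 - exp)).astype(np.int32)

x = np.array([100, -200, 300], dtype=np.int32)
print(fixed_point_multiply(x, 0.25))  # [ 25 -50  75], i.e. x * 0.25
```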

* Update have_int8 condition to run on compute capability 7.x devices (apache#4214)

* Optimizing autotvm task extraction speed (apache#4138)

* Optimize task extraction speed

* correct pylint errors

* Delete unused function

* remove unnecessary argument

* resolve code review comments

* correct cpp lint errors

* remove one more graph_json return value

* fix test bugs

* [Relay] Add Python type functor and tests (apache#4209)

* Add Python type functor and tests

* Lint roller

* Fix typo in packed_func.h (apache#4219)

* Improve the lowering of Qnn Dense (apache#4213)

* [QNN] Improving Dense lowering.

* - Moving get_shape method to util
- Finalizing the test cases and the code structure for optimized dense computation.

* - Fixing cpplint.

* - Addressing review comments.

* - Renaming the variables correctly.

* - Renaming the variables correctly.

* [ARITH] Fix the rule y < x && x <= y (apache#4220)

* [PYTHON] Add __init__ to the generated grammar so that it can be installed properly (apache#4223)

* [Relay][Frontend][ONNX] New Operators and Opsets to Support BERT (apache#4197)

* Added slice v10

* Added constantofshape operation and small refactor.

* Finished one_hot implementation.

* Reshape working across all bert layers.

* Fixed constantofshape and removed code duplication.

* onnx model fully ingested.

* Working on improving onnx tests.

* Changed onnx testing to use onnxruntime instead of caffe2, also formatted.

* Add arbitrary output nodes to onnx frontend.

* Added v6 tiling for bert squad 8 support.

* Small syntax fixes

* Reduced code duplication in split opset versions.

* Added batch matmul test

* Added unstack split testing.

* Added onehot test, needs a little cleanup probably.

* Replaced deprecated constant fill with constantofshape and updated tests accordingly.

* Added tests for new opset version of slice and tile.

* lint clean up

* Lint fixes

* Changed onnx dependency

* Went back to caffe2 runtime for CI integration.

* Rebase and small typo/syntax changes.

* Added hard casting of onehot attributes to int.
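
For orientation, models exercising these new ops go through the usual ONNX import path; a usage sketch (the model path and input name below are placeholders):

```python
import onnx
from tvm import relay

# Placeholder model path and input name; any ONNX graph using the newly
# supported ops (Slice-10, ConstantOfShape, OneHot, ...) imports the same way.
model = onnx.load("bert_squad.onnx")
mod, params = relay.frontend.from_onnx(model, {"input_ids": (1, 256)})
```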

* [Relay][Topi][TensorFlow][ONNX][Lang] Add support for Any op (apache#4205)

* Add support for Any op

* Support ONNX frontend

* Add doc

* Add to relay docs

* Dummy change to retrigger CI
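
A small sketch of what Any enables at the Relay level, assuming the relay.Any API added here:

```python
from tvm import relay

# relay.Any() marks a dimension whose size is only known at runtime,
# e.g. a dynamic batch dimension.
x = relay.var("x", shape=(relay.Any(), 3), dtype="float32")
func = relay.Function([x], relay.nn.relu(x))
print(func)  # the first dimension prints as unknown
```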

* Update dmlc_tvm_commit_id.txt

* Merge from upstream
kevinthesun authored Oct 31, 2019
1 parent ea3f8a2 commit 5f7448c
Showing 404 changed files with 11,925 additions and 6,095 deletions.
5 changes: 5 additions & 0 deletions CMakeLists.txt
@@ -7,6 +7,7 @@ include(cmake/util/FindCUDA.cmake)
include(cmake/util/FindVulkan.cmake)
include(cmake/util/FindLLVM.cmake)
include(cmake/util/FindROCM.cmake)
+include(cmake/util/FindANTLR.cmake)

if(EXISTS ${CMAKE_CURRENT_BINARY_DIR}/config.cmake)
include(${CMAKE_CURRENT_BINARY_DIR}/config.cmake)
@@ -33,6 +34,7 @@ tvm_option(USE_LLVM "Build with LLVM, can be set to specific llvm-config path" O
tvm_option(USE_STACKVM_RUNTIME "Include stackvm into the runtime" OFF)
tvm_option(USE_GRAPH_RUNTIME "Build with tiny graph runtime" ON)
tvm_option(USE_GRAPH_RUNTIME_DEBUG "Build with tiny graph runtime debug mode" OFF)
+tvm_option(USE_OPENMP "Build with OpenMP thread pool implementation" OFF)
tvm_option(USE_RELAY_DEBUG "Building Relay in debug mode..." OFF)
tvm_option(USE_SGX "Build with SGX" OFF)
tvm_option(USE_RTTI "Build with RTTI" ON)
@@ -155,6 +157,7 @@ list(APPEND COMPILER_SRCS ${RELAY_BACKEND_SRCS})
list(APPEND COMPILER_SRCS ${RELAY_IR_SRCS})
list(APPEND COMPILER_SRCS ${RELAY_QNN_SRCS})


if(USE_VM_PROFILER)
message(STATUS "Build compiler with Relay VM profiler support...")
file(GLOB BACKEND_VM_PROFILER_SRCS src/relay/backend/vm/profiler/*.cc)
@@ -234,6 +237,7 @@ include(cmake/modules/VTA.cmake)
include(cmake/modules/CUDA.cmake)
include(cmake/modules/OpenCL.cmake)
include(cmake/modules/OpenGL.cmake)
+include(cmake/modules/OpenMP.cmake)
include(cmake/modules/Vulkan.cmake)
include(cmake/modules/Metal.cmake)
include(cmake/modules/ROCM.cmake)
@@ -267,6 +271,7 @@ add_library(tvm_topi SHARED ${TOPI_SRCS})
add_library(tvm_runtime SHARED ${RUNTIME_SRCS})
add_library(tvm_runtime_static STATIC ${RUNTIME_SRCS})


if(USE_RELAY_DEBUG)
message(STATUS "Building Relay in debug mode...")
set_target_properties(tvm PROPERTIES COMPILE_DEFINITIONS "USE_RELAY_DEBUG")
10 changes: 8 additions & 2 deletions Jenkinsfile
@@ -38,9 +38,15 @@
// - Tag the new version as the lates
// - Periodically cleanup the old versions on local workers
//

+// Hashtag in the source to build current CI docker builds
+//
+// - ci-cpu:v0.54: e7c88a99f830de30814df14eaa980547ecbd61c1
+//

ci_lint = "tvmai/ci-lint:v0.51"
ci_gpu = "tvmai/ci-gpu:v0.54"
ci_cpu = "tvmai/ci-cpu:v0.52"
ci_cpu = "tvmai/ci-cpu:v0.54"
ci_i386 = "tvmai/ci-i386:v0.52"

// tvm libraries
@@ -195,10 +201,10 @@ stage('Build') {
make(ci_cpu, 'build', '-j4')
pack_lib('cpu', tvm_lib)
timeout(time: max_time, unit: 'MINUTES') {
sh "${docker_run} ${ci_cpu} ./tests/scripts/task_golang.sh"
sh "${docker_run} ${ci_cpu} ./tests/scripts/task_python_unittest.sh"
sh "${docker_run} ${ci_cpu} ./tests/scripts/task_python_integration.sh"
sh "${docker_run} ${ci_cpu} ./tests/scripts/task_python_vta.sh"
sh "${docker_run} ${ci_cpu} ./tests/scripts/task_golang.sh"
}
}
}
8 changes: 8 additions & 0 deletions cmake/config.cmake
@@ -115,6 +115,10 @@ set(USE_BLAS none)
# set(USE_MKL_PATH <path to venv or site-packages directory>) if using `pip install mkl`
set(USE_MKL_PATH none)

+# Whether use OpenMP thread pool, choices: gnu, intel
+# Note: "gnu" uses gomp library, "intel" uses iomp5 library
+set(USE_OPENMP none)

# Whether use contrib.random in runtime
set(USE_RANDOM OFF)

@@ -143,6 +147,10 @@ set(USE_SORT ON)
# /path/to/tensorrt that contains include and lib dirs
set(USE_TENSORRT OFF)
# Build ANTLR parser for Relay text format
+# Possible values:
+# - ON: enable ANTLR by searching default locations (cmake find_program for antlr4 and /usr/local for jar)
+# - OFF: disable ANTLR
+# - /path/to/antlr-*-complete.jar: path to specific ANTLR jar file
set(USE_ANTLR OFF)

# Whether use Relay debug mode
24 changes: 1 addition & 23 deletions cmake/modules/ANTLR.cmake
@@ -15,29 +15,7 @@
# specific language governing permissions and limitations
# under the License.
if(USE_ANTLR)
-find_program(ANTLR4 antlr4)
-
-if (NOT ANTLR4)
-file(GLOB_RECURSE ANTLR4JAR
-/usr/local/lib/antlr-*-complete.jar
-/usr/local/Cellar/*antlr-*-complete.jar)
-
-# Get the first element of the list of antlr jars.
-# Sort and reverse the list so the item selected is the highest
-# version in lib or else in Cellar if no lib installation exists.
-list(SORT ANTLR4JAR)
-list(REVERSE ANTLR4JAR)
-list(GET ANTLR4JAR 0 ANTLR4JAR)
-
-set(JAVA_HOME $ENV{JAVA_HOME})
-if (NOT DEFINED JAVA_HOME)
-# Hack to get system to search for Java itself.
-set(JAVA_HOME "/usr")
-endif()
-
-set(ANTLR4 ${JAVA_HOME}/bin/java -jar ${ANTLR4JAR})
-endif()
-
+find_antlr(${USE_ANTLR})
if(ANTLR4)

set(RELAY_PARSER_DIR
4 changes: 2 additions & 2 deletions cmake/modules/CUDA.cmake
@@ -40,15 +40,15 @@ if(USE_CUDA)

if(USE_CUDNN)
message(STATUS "Build with cuDNN support")
-file(GLOB CONTRIB_CUDNN_SRCS src/contrib/cudnn/*.cc)
+file(GLOB CONTRIB_CUDNN_SRCS src/runtime/contrib/cudnn/*.cc)
list(APPEND RUNTIME_SRCS ${CONTRIB_CUDNN_SRCS})
list(APPEND TVM_RUNTIME_LINKER_LIBS ${CUDA_CUDNN_LIBRARY})
include_directories(${USE_CUDNN}/include)
endif(USE_CUDNN)

if(USE_CUBLAS)
message(STATUS "Build with cuBLAS support")
-file(GLOB CONTRIB_CUBLAS_SRCS src/contrib/cublas/*.cc)
+file(GLOB CONTRIB_CUBLAS_SRCS src/runtime/contrib/cublas/*.cc)
list(APPEND RUNTIME_SRCS ${CONTRIB_CUBLAS_SRCS})
list(APPEND TVM_RUNTIME_LINKER_LIBS ${CUDA_CUBLAS_LIBRARY})
endif(USE_CUBLAS)
2 changes: 1 addition & 1 deletion cmake/modules/Metal.cmake
@@ -24,7 +24,7 @@ if(USE_METAL)
list(APPEND RUNTIME_SRCS ${RUNTIME_METAL_SRCS})

if(USE_MPS)
-file(GLOB MPS_CONTRIB_SRC src/contrib/mps/*.mm)
+file(GLOB MPS_CONTRIB_SRC src/runtime/contrib/mps/*.mm)
list(APPEND RUNTIME_SRCS ${MPS_CONTRIB_SRC})
find_library(MPS_CONTRIB_LIB MetalPerformanceShaders)
list(APPEND TVM_RUNTIME_LINKER_LIBS ${MPS_CONTRIB_LIB})
48 changes: 48 additions & 0 deletions cmake/modules/OpenMP.cmake
@@ -0,0 +1,48 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# OpenMP Module
if(USE_OPENMP STREQUAL "gnu")
find_package(OpenMP)
if(OPENMP_FOUND)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
list(APPEND TVM_RUNTIME_LINKER_LIBS ${OpenMP_CXX_LIBRARIES})
add_definitions(-DTVM_THREADPOOL_USE_OPENMP=1)
message(STATUS "Build with OpenMP ${OpenMP_CXX_LIBRARIES}")
else()
add_definitions(-DTVM_THREADPOOL_USE_OPENMP=0)
message(WARNING "OpenMP cannot be found, use TVM threadpool instead.")
endif()
elseif(USE_OPENMP STREQUAL "intel")
find_package(OpenMP)
if(OPENMP_FOUND)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
if (MSVC)
find_library(OMP_LIBRARY NAMES libiomp5md)
else()
find_library(OMP_LIBRARY NAMES iomp5)
endif()
list(APPEND TVM_RUNTIME_LINKER_LIBS ${OMP_LIBRARY})
add_definitions(-DTVM_THREADPOOL_USE_OPENMP=1)
message(STATUS "Build with OpenMP " ${OMP_LIBRARY})
else()
add_definitions(-DTVM_THREADPOOL_USE_OPENMP=0)
message(WARNING "OpenMP cannot be found, use TVM threadpool instead.")
endif()
else()
add_definitions(-DTVM_THREADPOOL_USE_OPENMP=0)
endif()
4 changes: 2 additions & 2 deletions cmake/modules/ROCM.cmake
@@ -37,14 +37,14 @@ if(USE_ROCM)

if(USE_MIOPEN)
message(STATUS "Build with MIOpen support")
-file(GLOB MIOPEN_CONTRIB_SRCS src/contrib/miopen/*.cc)
+file(GLOB MIOPEN_CONTRIB_SRCS src/runtime/contrib/miopen/*.cc)
list(APPEND RUNTIME_SRCS ${MIOPEN_CONTRIB_SRCS})
list(APPEND TVM_RUNTIME_LINKER_LIBS ${ROCM_MIOPEN_LIBRARY})
endif(USE_MIOPEN)

if(USE_ROCBLAS)
message(STATUS "Build with RocBLAS support")
-file(GLOB ROCBLAS_CONTRIB_SRCS src/contrib/rocblas/*.cc)
+file(GLOB ROCBLAS_CONTRIB_SRCS src/runtime/contrib/rocblas/*.cc)
list(APPEND RUNTIME_SRCS ${ROCBLAS_CONTRIB_SRCS})
list(APPEND TVM_RUNTIME_LINKER_LIBS ${ROCM_ROCBLAS_LIBRARY})
endif(USE_ROCBLAS)
2 changes: 1 addition & 1 deletion cmake/modules/contrib/BLAS.cmake
@@ -16,7 +16,7 @@
# under the License.

# Plugin rules for cblas
-file(GLOB CBLAS_CONTRIB_SRC src/contrib/cblas/*.cc)
+file(GLOB CBLAS_CONTRIB_SRC src/runtime/contrib/cblas/*.cc)

if(USE_BLAS STREQUAL "openblas")
find_library(BLAS_LIBRARY openblas)
2 changes: 1 addition & 1 deletion cmake/modules/contrib/NNPack.cmake
@@ -20,7 +20,7 @@ if(USE_NNPACK)
set(NNPACK_PATH ${CMAKE_CURRENT_SOURCE_DIR}/NNPack)
endif()
set(PTHREAD_POOL_PATH ${NNPACK_PATH}/deps/pthreadpool)
-file(GLOB NNPACK_CONTRIB_SRC src/contrib/nnpack/*.cc)
+file(GLOB NNPACK_CONTRIB_SRC src/runtime/contrib/nnpack/*.cc)
list(APPEND RUNTIME_SRCS ${NNPACK_CONTRIB_SRC})
include_directories(${NNPACK_PATH}/include)
include_directories(${PTHREAD_POOL_PATH}/include)
2 changes: 1 addition & 1 deletion cmake/modules/contrib/Random.cmake
@@ -17,6 +17,6 @@

if(USE_RANDOM)
message(STATUS "Build with contrib.random")
-file(GLOB RANDOM_CONTRIB_SRC src/contrib/random/random.cc)
+file(GLOB RANDOM_CONTRIB_SRC src/runtime/contrib/random/random.cc)
list(APPEND RUNTIME_SRCS ${RANDOM_CONTRIB_SRC})
endif(USE_RANDOM)
2 changes: 1 addition & 1 deletion cmake/modules/contrib/Sort.cmake
@@ -17,6 +17,6 @@

if(USE_SORT)
message(STATUS "Build with contrib.sort")
-file(GLOB SORT_CONTRIB_SRC src/contrib/sort/*.cc)
+file(GLOB SORT_CONTRIB_SRC src/runtime/contrib/sort/*.cc)
list(APPEND RUNTIME_SRCS ${SORT_CONTRIB_SRC})
endif(USE_SORT)
65 changes: 65 additions & 0 deletions cmake/util/FindANTLR.cmake
@@ -0,0 +1,65 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

#######################################################
# Enhanced version of find ANTLR.
#
# Usage:
# find_antlr(${USE_ANTLR})
#
# - When USE_ANTLR=ON, use auto search by first trying to find antlr4 program,
# then trying to find antlr-*-complete.jar
# - When USE_ANTLR=/path/to/antlr-*-complete.jar, use provided jar
#
# Provide variables:
# - ANTLR4
#
macro(find_antlr use_antlr)
set(JAVA_HOME $ENV{JAVA_HOME})
if (NOT DEFINED JAVA_HOME)
# Hack to get system to search for Java itself.
message(STATUS "JAVA_HOME is not defined. Set it to ensure proper use")
set(JAVA_HOME "/usr")
endif()
if(MSVC)
set(JAVA_PROGRAM ${JAVA_HOME}/java.exe)
else()
set(JAVA_PROGRAM ${JAVA_HOME}/bin/java)
endif()
message(STATUS "Using Java at " ${JAVA_PROGRAM})

if (${use_antlr} STREQUAL "ON")
find_program(ANTLR4 antlr4)
if (NOT ANTLR4)
file(GLOB_RECURSE ANTLR4JAR
/usr/local/lib/antlr-*-complete.jar
/usr/local/Cellar/*antlr-*-complete.jar)

# Get the first element of the list of antlr jars.
# Sort and reverse the list so the item selected is the highest
# version in lib or else in Cellar if no lib installation exists.
list(SORT ANTLR4JAR)
list(REVERSE ANTLR4JAR)
list(GET ANTLR4JAR 0 ANTLR4JAR)

set(ANTLR4 ${JAVA_PROGRAM} -jar ${ANTLR4JAR})
endif()
elseif(NOT ${use_antlr} STREQUAL "OFF")
set(ANTLR4 ${JAVA_PROGRAM} -jar ${use_antlr})
endif()
message(STATUS "ANTLR4="${ANTLR4})
endmacro(find_antlr)
2 changes: 1 addition & 1 deletion dmlc_tvm_commit_id.txt
@@ -1 +1 @@
-cf046972eb5602c2d1b67edea230f6ca07c966b1
+76c8ead492b7646d1c531a78314174761093510d
11 changes: 7 additions & 4 deletions docker/install/ubuntu_install_nnpack.sh
@@ -6,9 +6,9 @@
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
-#
+#
# http://www.apache.org/licenses/LICENSE-2.0
-#
+#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -22,11 +22,14 @@ set -o pipefail

apt-get update && apt-get install -y --no-install-recommends git cmake

-# TODO: specific tag?
git clone https:/Maratyszcza/NNPACK NNPACK
+git clone https:/Maratyszcza/pthreadpool NNPACK/pthreadpool

+# Use specific versioning tag.
+(cd NNPACK && git checkout 1e005b0c2)
+(cd NNPACK/pthreadpool && git checkout 13da0b4c)

mkdir -p NNPACK/build
cd NNPACK/build
-cmake -DCMAKE_INSTALL_PREFIX:PATH=. -DNNPACK_INFERENCE_ONLY=OFF -DNNPACK_CONVOLUTION_ONLY=OFF -DNNPACK_BUILD_TESTS=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=ON .. && make -j4 && make install
+cmake -DCMAKE_INSTALL_PREFIX:PATH=. -DNNPACK_INFERENCE_ONLY=OFF -DNNPACK_CONVOLUTION_ONLY=OFF -DNNPACK_BUILD_TESTS=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DPTHREADPOOL_SOURCE_DIR=pthreadpool .. && make -j4 && make install
cd -
2 changes: 2 additions & 0 deletions docs/api/python/topi.rst
@@ -91,6 +91,7 @@ List of operators
topi.greater_equal
topi.less_equal
topi.all
+topi.any
topi.logical_and
topi.logical_or
topi.logical_not
@@ -151,6 +152,7 @@ topi
.. autofunction:: topi.full
.. autofunction:: topi.full_like
.. autofunction:: topi.all
+.. autofunction:: topi.any
.. autofunction:: topi.max
.. autofunction:: topi.sum
.. autofunction:: topi.min
10 changes: 5 additions & 5 deletions docs/dev/virtual_machine.rst
@@ -121,7 +121,7 @@ AllocTensor
Allocate a tensor value of the appropriate shape (stored in `shape_register`) and `dtype`. The result
is saved to register `dst`.

-AllocDatatype
+AllocADT
^^^^^^^^^^^^^
**Arguments**:
::
@@ -176,7 +176,7 @@ GetTag
RegName object
RegName dst

-Get the object tag for Datatype object in register `object`. And saves the reult to register `dst`.
+Get the object tag for ADT object in register `object`. And saves the reult to register `dst`.
Get the object tag for ADT object in register `object`. And saves the reult to register `dst`.

Fatal
^^^^^
@@ -251,9 +251,9 @@ Currently, we support 3 types of objects: tensors, data types, and closures.

::

-VMObject VMTensor(const tvm::runtime::NDArray& data);
-VMObject VMDatatype(size_t tag, const std::vector<VMObject>& fields);
-VMObject VMClosure(size_t func_index, std::vector<VMObject> free_vars);
+Object Tensor(const tvm::runtime::NDArray& data);
+Object ADT(size_t tag, const std::vector<Object>& fields);
+Object Closure(size_t func_index, std::vector<Object> free_vars);


Stack and State