llava-single on HPU #300

Open
Delaunay opened this issue Oct 4, 2024 · 2 comments
Delaunay commented Oct 4, 2024

Eager Mode

PT_HPU_LAZY_MODE=0 bash $MILABENCH_SOURCE/scripts/article/run_hpu.sh --select llava-single > out.out 2>&1
Breakdown
---------
bench        | fail |   n | ngpu |       perf |   sem% |   std% | peak_memory |      score | weight
llava-single |    0 |   8 |    1 |       0.72 |   0.3% |   6.9% |       55786 |       5.77 |   1.00

Delaunay commented Oct 4, 2024

Lazy Mode

--- a/benchmarks/llava/main.py
+++ b/benchmarks/llava/main.py
@@ -124,7 +123,9 @@ def main():
             if accelerator.sync_gradients:
                 accelerator.clip_grad_norm_(model.parameters(), 1.0)
 
+            compat.mark_step()
             optimizer.step()
+            compat.mark_step()
             optimizer.zero_grad()
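
For readers who do not have milabench's `compat` module at hand, here is a minimal sketch of the same placement. It assumes `compat.mark_step()` forwards to `habana_frameworks.torch.core.mark_step()` on HPU, which is a guess and not confirmed by the diff; `resolve_mark_step` and `training_step` are hypothetical names used only for illustration.

```python
def resolve_mark_step():
    """Return the HPU mark_step function if the Habana bridge is installed,
    otherwise a no-op so the same loop still runs on other backends.
    (Assumption: compat.mark_step behaves roughly like this.)"""
    try:
        import habana_frameworks.torch.core as htcore
        return htcore.mark_step
    except ImportError:
        return lambda: None


mark_step = resolve_mark_step()


def training_step(optimizer):
    """Hypothetical helper mirroring the placement used in the diff."""
    mark_step()            # cut the graph before the optimizer update
    optimizer.step()
    mark_step()            # cut the graph again after the optimizer update
    optimizer.zero_grad()
```

In lazy mode, ops are accumulated into a graph until a step marker is reached, so where the markers sit determines what gets compiled and launched together.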

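Lazy mode is much slower than eager mode, though: perf drops from 0.72 to 0.55, std% jumps from 6.9% to 115.8%, and peak_memory rises from 55786 to 98304.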

Breakdown
---------
bench        | fail |   n | ngpu |       perf |   sem% |   std% | peak_memory |      score | weight
llava-single |    0 |   8 |    1 |       0.55 |   4.9% | 115.8% |       98304 |       4.31 |   1.00
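
To put the two runs side by side, the ratios can be computed directly from the breakdown tables. This is only a sketch over the numbers reported above; the dictionary keys are illustrative names, not milabench fields.

```python
# Values copied from the eager-mode and lazy-mode breakdown tables.
eager = {"perf": 0.72, "std_pct": 6.9, "peak_memory": 55786, "score": 5.77}
lazy = {"perf": 0.55, "std_pct": 115.8, "peak_memory": 98304, "score": 4.31}


def lazy_over_eager(metric):
    """Ratio of the lazy-mode value to the eager-mode value for one metric."""
    return lazy[metric] / eager[metric]


perf_ratio = lazy_over_eager("perf")
memory_ratio = lazy_over_eager("peak_memory")

print(f"perf (lazy / eager):        {perf_ratio:.2f}")    # 0.76
print(f"peak_memory (lazy / eager): {memory_ratio:.2f}")  # 1.76
```

So lazy mode reaches roughly three quarters of the eager-mode perf while using about 1.76 times the peak memory. With std% at 115.8% over 8 samples, the lazy-mode figure is also far noisier than the eager one.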

@Delaunay Delaunay added the HPU label Oct 4, 2024
@Delaunay Delaunay changed the title llava-single llava-single on HPU Oct 4, 2024

Delaunay commented Oct 4, 2024

Eager Mode + Compile

llava-single.D6
===============
  * Error codes = 1
  * 1 exceptions found
    * 1 x [Rank:0] Habana exception raised from LaunchRecipe at graph_exec.cpp:440
        | Traceback (most recent call last):
        |   File "/homes/delaunap/milabench/benchmarks/llava/main.py", line 147, in <module>
        |     main()
        |   File "/homes/delaunap/milabench/benchmarks/llava/main.py", line 121, in main
        |     outputs = model(**inputs)
        |   File "/homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1535, in _wrapped_call_impl
        |     return self._call_impl(*args, **kwargs)
        |   File "/homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1544, in _call_impl
        |     return forward_call(*args, **kwargs)
        |   File "/homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
        |     return fn(*args, **kwargs)
        |   File "/homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1535, in _wrapped_call_impl
        |     return self._call_impl(*args, **kwargs)
        |   File "/homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1544, in _call_impl
        |     return forward_call(*args, **kwargs)
        |   File "/homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/transformers/models/llava/modeling_llava.py", line 443, in forward
        |     inputs_embeds, attention_mask, labels, position_ids = self._merge_input_ids_with_image_features(
        |   File "/homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/transformers/models/llava/modeling_llava.py", line 284, in _merge_input_ids_with_image_features
        |     left_padding = not torch.sum(input_ids[:, -1] == torch.tensor(self.pad_token_id))
        | RuntimeError: [Rank:0] FATAL ERROR :: MODULE:PT_BRIDGE Exception in Lowering thread...
        | [Rank:0] FATAL ERROR :: MODULE:PT_EAGER HabanaLaunchOpPT Run returned exception....
        | The expanded size of the tensor (0) must match the existing size (1024) at non-singleton dimension 2.  Target sizes: [1, 1, 0].  Tensor sizes: [1024]
        | Exception raised from inferExpandGeometryImpl at /npu-stack/pytorch-fork/aten/src/ATen/ExpandUtils.cpp:95 (most recent call first):
        | frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0xae (0x7f17f0d883fe in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/torch/lib/libc10.so)
        | frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xf3 (0x7f17f0d322f1 in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/torch/lib/libc10.so)
        | frame #2: at::inferExpandGeometry(c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>) + 0x74e (0x7f17f22f7cce in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
        | frame #3: BroadcastOperator::AllocateAndAddSynapseNode(synapse_helpers::graph&, std::vector<c10::IValue, std::allocator<c10::IValue> >&, std::vector<habana::OutputMetaData, std::allocator<habana::OutputMetaData> > const&) + 0x232 (0x7f178cb864d2 in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/habana_frameworks/torch/lib/libhabana_pytorch2_plugin.so)
        | frame #4: habana::HabanaLaunchOpPT::BuildSynapseGraph(std::shared_ptr<synapse_helpers::graph>&, habana::SynBuildCache&, bool) + 0x1ee4 (0x7f178b4dcb24 in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/habana_frameworks/torch/lib/libhabana_pytorch_backend.so)
        | frame #5: habana::HabanaLaunchOpPT::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&, std::optional<std::vector<at::Tensor, std::allocator<at::Tensor> > >, std::optional<std::vector<std::vector<long, std::allocator<long> >, std::allocator<std::vector<long, std::allocator<long> > > > >, bool, habana::HabanaLaunchOpPipeline::PipelineCallBase&) + 0x98e (0x7f178b4ecc7e in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/habana_frameworks/torch/lib/libhabana_pytorch_backend.so)
        | frame #6: habana::HabanaLaunchOpPipeline::LoweringTask(std::unique_ptr<habana::HabanaLaunchOpPT, std::default_delete<habana::HabanaLaunchOpPT> >&&, std::vector<c10::IValue, std::allocator<c10::IValue> >&, std::optional<std::vector<at::Tensor, std::allocator<at::Tensor> > >, std::optional<std::vector<std::vector<long, std::allocator<long> >, std::allocator<std::vector<long, std::allocator<long> > > > >) + 0xda (0x7f178b4f020a in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/habana_frameworks/torch/lib/libhabana_pytorch_backend.so)
        | frame #7: habana::graph::GraphExec::LaunchRecipe(std::vector<c10::IValue, std::allocator<c10::IValue> >, std::optional<std::vector<at::Tensor, std::allocator<at::Tensor> > >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<double>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<double> > > >) + 0xaac (0x7f178c6b195c in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/habana_frameworks/torch/lib/libhabana_pytorch2_plugin.so)
        | frame #8: habana::graph::GraphExec::LaunchRecipeTask(habana::graph::GraphExec*, std::vector<c10::IValue, std::allocator<c10::IValue> >&&, std::vector<at::Tensor, std::allocator<at::Tensor> >&&, habana::graph::LaunchDynamicShapes, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<double>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<double> > > >&&) + 0x24d (0x7f178c6b1f6d in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/habana_frameworks/torch/lib/libhabana_pytorch2_plugin.so)
        | frame #9: habana_helpers::move_only_function_void::Wrapper<habana_helpers::ThreadPoolBase<habana_helpers::BlockingQueue, habana_helpers::move_only_function_void>::enqueue<void (&)(habana::graph::GraphExec*, std::vector<c10::IValue, std::allocator<c10::IValue> >&&, std::vector<at::Tensor, std::allocator<at::Tensor> >&&, habana::graph::LaunchDynamicShapes, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<double>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<double> > > >&&), habana::graph::GraphExec*, std::vector<c10::IValue, std::allocator<c10::IValue> >, std::vector<at::Tensor, std::allocator<at::Tensor> >, habana::graph::LaunchDynamicShapes, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<double>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<double> > > >, habana_helpers::move_only_function_void, true>(void (&)(habana::graph::GraphExec*, std::vector<c10::IValue, std::allocator<c10::IValue> >&&, std::vector<at::Tensor, std::allocator<at::Tensor> >&&, habana::graph::LaunchDynamicShapes, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<double>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<double> > > >&&), habana::graph::GraphExec*&&, std::vector<c10::IValue, std::allocator<c10::IValue> >&&, std::vector<at::Tensor, std::allocator<at::Tensor> >&&, habana::graph::LaunchDynamicShapes&&, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<double>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<double> > > >&&)::{lambda()#1}>::invoke() + 0x6e (0x7f178c6b469e in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/habana_frameworks/torch/lib/libhabana_pytorch2_plugin.so)
        | frame #10: habana_helpers::ThreadPoolBase<habana_helpers::BlockingQueue, habana_helpers::move_only_function_void>::executePendingTask(habana_helpers::move_only_function_void&&) + 0x72 (0x7f178ca44752 in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/habana_frameworks/torch/lib/libhabana_pytorch2_plugin.so)
        | frame #11: habana_helpers::ThreadPoolBase<habana_helpers::BlockingQueue, habana_helpers::move_only_function_void>::main_loop() + 0xbe (0x7f178ca44f4e in /homes/delaunap/hpu/results/venv/torch/lib/python3.10/site-packages/habana_frameworks/torch/lib/libhabana_pytorch2_plugin.so)
        | frame #12: <unknown function> + 0xdc253 (0x7f17f0bad253 in /lib/x86_64-linux-gnu/libstdc++.so.6)
        | frame #13: <unknown function> + 0x94ac3 (0x7f1805c4dac3 in /lib/x86_64-linux-gnu/libc.so.6)
        | frame #14: <unknown function> + 0x126850 (0x7f1805cdf850 in /lib/x86_64-linux-gnu/libc.so.6)
        | [Rank:0] Habana exception raised from LaunchRecipe at graph_exec.cpp:440
