This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Distributed VQA metric #250

Merged: dirkgr merged 9 commits into main from VqaMetric on Apr 15, 2021

@dirkgr dirkgr commented Apr 14, 2021

No description provided.

@dirkgr dirkgr requested a review from AkshitaB April 14, 2021 23:30
)


def multiple_runs(
Contributor

Can we update the global_distributed_metric function to take the number_of_runs as a parameter instead of defining a new function? The default value could just be 1.
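A minimal sketch of what that suggestion could look like. The helper below is hypothetical: `run_metric_check` and its `compute` argument are illustrative stand-ins, not the actual signature of `global_distributed_metric`. It only shows the pattern of folding the repeat count into one function with a default of 1.

```python
from typing import Callable, List


def run_metric_check(compute: Callable[[], float], number_of_runs: int = 1) -> List[float]:
    """Hypothetical test helper: evaluate `compute` `number_of_runs` times.

    With the default of 1, existing callers keep their single-run behavior,
    so no separate multiple-runs function is needed.
    """
    if number_of_runs < 1:
        raise ValueError("number_of_runs must be at least 1")
    # Collect one result per run so the caller can compare them.
    return [compute() for _ in range(number_of_runs)]


single = run_metric_check(lambda: 0.5)                     # default: one run
repeated = run_metric_check(lambda: 0.5, number_of_runs=3)  # explicit repeat count
```

Existing call sites stay unchanged, and tests that need repeated runs pass `number_of_runs` explicitly.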

Member Author

We can. It's been a loooong time since I had a change that affected only a single repo ...

dirkgr commented Apr 15, 2021

😟 On my Mac, that test segfaults. I was hoping it was just a Mac/Python thing, but it looks like it's also happening on the test servers.

dirkgr commented Apr 15, 2021

This is the error:

terminate called after throwing an instance of 'gloo::EnforceNotMet'
  what():  [enforce fail at /opt/conda/conda-bld/pytorch_1616554788289/work/third_party/gloo/gloo/transport/tcp/pair.cc:490] op.preamble.length <= op.nbytes. 8 vs 4

dirkgr commented Apr 15, 2021

Looks like this happens when you try to sum tensors with different data types?

Why do we put up with APIs like that?

Edit: The phrasing of the error message leads me to believe that if the two data types had been the same size, it would have just added them anyway, reinterpreting the raw bits of one type as the other.
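To make the concern concrete, here is a small standard-library sketch. It does not use gloo or PyTorch. Reading "8 vs 4" as an 8-byte element meeting a 4-byte one (for example, a 64-bit integer and a 32-bit float) is an assumption; the thread does not say which data types were involved. The second half shows what reinterpreting same-size raw bits would produce.

```python
import struct

# Element sizes in bytes for a 64-bit integer and a 32-bit float.
# Assumption: the "8 vs 4" in the error corresponds to a mismatch like this one.
size_int64 = struct.calcsize("<q")
size_float32 = struct.calcsize("<f")

# Same-size case: take the 4 raw bytes of the float 1.0 and read them back
# as a 32-bit integer. No size check could catch this; the value is simply wrong.
float_bits = struct.pack("<f", 1.0)
as_int32 = struct.unpack("<i", float_bits)[0]

print(size_int64, size_float32, as_int32)  # 8 4 1065353216
```

Casting every tensor to one agreed data type before the reduction avoids both failure modes.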

There are _multiple_ labels per instance. That's the whole point of this metric.
@dirkgr dirkgr enabled auto-merge (squash) April 15, 2021 01:44
dirkgr commented Apr 15, 2021

@AkshitaB, this is ready for another review.

@dirkgr dirkgr merged commit 419bc90 into main Apr 15, 2021
@dirkgr dirkgr deleted the VqaMetric branch April 15, 2021 20:58
2 participants