
Add tests for remote examples #800

Merged: Etesam913 merged 27 commits into main from add-tests-for-remote-examples on Jul 15, 2022

Conversation

@Etesam913 (Contributor) commented on Jun 18, 2022

Summary

Adds Cypress tests for the remote_procedure/template, remote_procedure/mnist, and remote_procedure/toxicity_detection examples.

Three GitHub Actions jobs are added, one for each of the remote_procedure examples; each job installs the necessary dependencies and runs the Cypress tests.

Videos of Tests

Toxicity Detection

toxicity-detection-tests.mov

Template

template-tests.mov

MNIST (updated with drawing capability)

mnist-new-test.mov

@facebook-github-bot added the CLA Signed label on Jun 18, 2022. (This label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed.)
@Etesam913 changed the title from "Add tests for remote examples" to "[WIP] Add tests for remote examples" on Jun 18, 2022
@codecov-commenter commented on Jun 18, 2022

Codecov Report

Merging #800 (93c6311) into main (20f87e1) will decrease coverage by 0.06%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #800      +/-   ##
==========================================
- Coverage   64.69%   64.63%   -0.06%     
==========================================
  Files         107      107              
  Lines        9259     9259              
==========================================
- Hits         5990     5985       -5     
- Misses       3269     3274       +5     
Impacted Files                                           Coverage Δ
...tractions/architects/channels/websocket_channel.py    67.96% <0.00%> (-8.60%) ⬇️
mephisto/data_model/unit.py                              78.14% <0.00%> (+0.54%) ⬆️
mephisto/data_model/assignment.py                        61.71% <0.00%> (+3.90%) ⬆️

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Powered by Codecov. Last update 20f87e1...93c6311.

@Etesam913 changed the title from "[WIP] Add tests for remote examples" to "Add tests for remote examples" on Jun 24, 2022
Comment on lines +27 to +33
cy.get('[data-cy="toxicity-alert"]', { timeout: 25000 }).as(
"toxicityAlert"
);
cy.get("@toxicityAlert").contains(
'The statement, "I hate bob!," has a toxicity of:'
);
});
@Etesam913 (Contributor Author):
A long timeout is required here because the detoxify model takes a long time to run on GitHub Actions (the runners are slow machines).
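For context (an editorial sketch, not code from this PR): the { timeout } option applies to the cy.get() query and to assertions chained off it, so only this one lookup waits up to 25 seconds while the rest of the suite keeps Cypress's default command timeout. An equivalent form chains the text assertion directly so it retries within the same window:

cy.get('[data-cy="toxicity-alert"]', { timeout: 25000 }).should(
  "contain",
  'The statement, "I hate bob!," has a toxicity of:'
);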

@Etesam913 Etesam913 requested a review from JackUrb July 14, 2022 18:47
Comment on lines 56 to 60
cy.get('[data-cy="canvas-mouse-down-container-1"]')
.trigger("mouseover")
.trigger("mousedown", 100, 20)
.trigger("mousedown", 120, 200)
.trigger("mouseup", 120, 200);
@Etesam913 (Contributor Author):
This draws a 1, but in the GitHub Action the model sometimes classifies it as an 8, causing the test to fail about 1 in every 4 runs. It might make more sense to draw a digit that the model recognizes more reliably.
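One option (a hedged sketch, not code from this PR; the selector is copied from the diff above): drive the stroke through intermediate mousemove events so the canvas receives a continuous path rather than two disconnected points, which should rasterize into a cleaner digit:

// Draw a roughly vertical stroke with intermediate points.
cy.get('[data-cy="canvas-mouse-down-container-1"]')
  .trigger("mousedown", 100, 20)
  .trigger("mousemove", 105, 80)
  .trigger("mousemove", 112, 140)
  .trigger("mousemove", 120, 200)
  .trigger("mouseup", 120, 200);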

@Etesam913 (Contributor Author):
It feels kind of weird, though: given the exact same mousedown values, I would expect the model to give the exact same result, but I guess that is not the case. AI stuff, I guess 🤷

@Etesam913 (Contributor Author) commented on Jul 14, 2022:

Changed the 1 to a 3; hopefully this is more consistent.

Edit: It seems to be consistent; none of the GitHub Actions runs have failed since.

@Etesam913 (Contributor Author):

I don't really know if there is a way to fully alleviate this, but some of the tests click the submit button, which makes a second local run of the tests fail. The task understandably isn't expected to work after it has been submitted. One workaround is to comment out the line in the test that clicks the submit button; that lets you run the test against the task many times without it failing because of a prior submission.
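A hedged alternative to commenting the line out (a sketch only; the submit-button selector below is hypothetical, not from this PR): gate the destructive click behind a Cypress environment flag so local reruns can skip submission without editing the spec:

// Only submit when explicitly requested, e.g.:
//   npx cypress run --env submit=true
if (Cypress.env("submit")) {
  cy.get('[data-cy="submit-button"]').click(); // hypothetical selector
}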

@JackUrb (Contributor) commented on Jul 15, 2022:

> Some of the tests click the submit button, which makes a second local run of the tests fail. The task understandably isn't expected to work after it has been submitted. One workaround is to comment out the line in the test that clicks the submit button.

Another possible approach is to launch multiple tasks and make "submit" the last step of each test; later tests can then use a different worker_id in the URL.
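As an illustration of that idea (a sketch only; the URL shape and query parameters are assumptions about how the local task server identifies workers, so adjust them to the real format):

// Each spec visits the task as a distinct worker, so an earlier spec's
// submission does not consume the unit this spec needs.
cy.visit("http://localhost:3000/?worker_id=cypress_worker_2&assignment_id=2");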

@JackUrb (Contributor) left a review comment:

Overall these tests make sense to me!

@Etesam913 (Contributor Author) commented on Jul 15, 2022:

> I don't really know if there is a way to fully alleviate this, but some of the tests click the submit button, which makes a second local run of the tests fail. […]

> Another possible approach is to launch multiple tasks and make "submit" the last step of each test; later tests can then use a different worker_id in the URL.

To be honest, it may not even be related to what I said; I'm not exactly sure.

Here's a video of the problem (occurs at the end).

Screen.Recording.2022-07-15.at.1.35.41.PM.mov

I think if you submit the task enough times, it will start to fail, or something like that.

@JackUrb (Contributor) commented on Jul 15, 2022:

Ah, okay, so the backend server ends up shutting down because it thinks it has "finished" collecting tasks. I don't think we have a great way of dealing with this beyond launching a "reasonable" number of units to cover all of the testing, and noting that too many submissions will eventually cause the tests to fail once the server shuts down.

@JackUrb (Contributor) left a review comment:

Implementation looks good to me, excited to have these tests in!

@Etesam913 merged commit 1c14cf3 into main on Jul 15, 2022
@JackUrb deleted the add-tests-for-remote-examples branch on July 15, 2022 at 18:53