Making MTurk cleanup run daily per requester #918

Merged
JackUrb merged 3 commits into main from force-cleanup-script-daily
Oct 17, 2022

Conversation

JackUrb
Contributor

@JackUrb JackUrb commented Oct 12, 2022

Overview

Mephisto users and workers are frequently left in situations where tasks appear to be broken. Usually this is the result of stale tasks, which arise from hasty shutdowns or odd expiration issues through botocore. The mephisto scripts mturk cleanup script exists to alleviate this problem, but given how often the issue arises, I think it makes sense to build this functionality into the Mephisto run script when using an MTurk requester.

Implementation

  • Adds a new try_prerun_cleanup script to the mturk_utils that queries for presumed active (or broken) HITs and surfaces them if more than 24 hours have passed since the last such query was run for the given requester. It lets the user select which HITs, if any, need disposing, then continues. It updates a file in ~/.mephisto/ to keep track of these times.
  • Augments the augment_config_from_db helper function to run try_prerun_cleanup whenever an MTurk requester is being used.
  • Adds a new option to the cleanup script to remove old tasks (> 2 weeks).
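The once-per-day gate described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `should_run_cleanup` and `record_cleanup_check`, the JSON layout, and the state file path are all assumptions, since the description only says that a file in ~/.mephisto/ tracks the last check time per requester.

```python
import json
import time
from pathlib import Path
from typing import Dict, Optional

# One day, matching the 24-hour threshold in the description above.
CHECK_INTERVAL_SECONDS = 24 * 60 * 60


def _load_times(state_file: Path) -> Dict[str, float]:
    """Read the requester -> last-check-timestamp map (hypothetical JSON format)."""
    if state_file.exists():
        return json.loads(state_file.read_text())
    return {}


def should_run_cleanup(
    requester_name: str, state_file: Path, now: Optional[float] = None
) -> bool:
    """Return True if more than 24 hours have passed since the last recorded check."""
    now = time.time() if now is None else now
    last_check = _load_times(state_file).get(requester_name, 0.0)
    return now - last_check > CHECK_INTERVAL_SECONDS


def record_cleanup_check(
    requester_name: str, state_file: Path, now: Optional[float] = None
) -> None:
    """Store the current time as the last check time for this requester."""
    now = time.time() if now is None else now
    times = _load_times(state_file)
    times[requester_name] = now
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps(times))
```

A caller would check `should_run_cleanup(...)` before launching, prompt the user about outstanding HITs if it returns True, and then call `record_cleanup_check(...)` so the prompt does not reappear for another day.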

Testing

I cleaned up some of my old tasks on one of my requesters using the new (o)ld option.
I tried launching a task using one of my requesters:
First time

python static_test_script.py mephisto/provider=mturk mephisto.provider.requester_name=MYREQUESTER
...
It's been more than a day since you last ran a job with MYREQUESTER, checking for outstanding tasks...
That took a while! You may want to run `mephisto scripts cleanup mturk` later to clear out some of the older HIT types.

The requester MYREQUESTER has 2 outstanding HIT types, with 100 suspected active or broken HITs.
This may include tasks that are still in-flight, but also tasks have been improperly shut down and need cleanup.
 Please review and dispose HITs below.
HIT TITLE: ONE OF MY TASKS
LAUNCH TIME: 10/12/2022, 17:24:34
HIT COUNT: 20
Should we cleanup this hit type? (y)es or (n)o: 
>> n
HIT TITLE: ANOTHER ONE OF MY TASKS
LAUNCH TIME: 10/12/2022, 17:16:29
HIT COUNT: 80
Should we cleanup this hit type? (y)es or (n)o: 
>> y
Enter anything to confirm removal of the following HITs:
80 hits from 10/12/2022, 17:16:29 for HIT Type: ANOTHER ONE OF MY TASKS
# ctrl C here because I didn't want to remove this

After first run

python static_test_script.py mephisto/provider=mturk mephisto.provider.requester_name=MYREQUESTER
...
This task is going to launch live on mturk, press enter to continue: 

Note

One can disable this functionality until a future date by updating the "last reviewed time" for a given requester to some time in the future. If it becomes an issue, we will provide an option to (d)efer the check; however, I think doing this check is important for worker quality.
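As a rough sketch of that workaround, the snippet below pushes a requester's last reviewed time into the future. The function name `defer_cleanup_check`, the JSON layout (requester name mapped to a Unix timestamp), and the file location are assumptions for illustration; the actual file name and format under ~/.mephisto/ are not stated here and may differ.

```python
import json
import time
from pathlib import Path
from typing import Optional


def defer_cleanup_check(
    requester_name: str, state_file: Path, days: float, now: Optional[float] = None
) -> float:
    """Set the last reviewed time `days` into the future and return the new timestamp.

    Assumes a hypothetical JSON file mapping requester names to Unix timestamps.
    """
    now = time.time() if now is None else now
    times = json.loads(state_file.read_text()) if state_file.exists() else {}
    deferred_until = now + days * 24 * 60 * 60
    times[requester_name] = deferred_until
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps(times))
    return deferred_until
```

With a gate that compares the current time against the stored value, a timestamp in the future keeps the elapsed time negative, so the check stays suppressed until that date passes.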

@facebook-github-bot added the CLA Signed label Oct 12, 2022
codecov-commenter commented Oct 12, 2022

Codecov Report

Base: 64.61% // Head: 64.18% // Decreases project coverage by -0.43% ⚠️

Coverage data is based on head (ac5ed05) compared to base (fdf633d).
Patch coverage: 9.23% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #918      +/-   ##
==========================================
- Coverage   64.61%   64.18%   -0.44%     
==========================================
  Files         108      108              
  Lines        9329     9392      +63     
==========================================
  Hits         6028     6028              
- Misses       3301     3364      +63     
Impacted Files                                          Coverage Δ
...phisto/abstractions/providers/mturk/mturk_utils.py   15.70% <9.23%> (-1.97%) ⬇️
mephisto/abstractions/architects/mock_architect.py      88.23% <0.00%> (-2.62%) ⬇️


@JackUrb JackUrb merged commit 903e90f into main Oct 17, 2022
@JackUrb JackUrb deleted the force-cleanup-script-daily branch October 17, 2022 21:34