Blacklist requests that are duplicates of existing resources or bound to fail #28
Comments
Can you move your comment to #25 and close this? This is the scraper's repo.
@rgaudin Moved it, but I'd keep it open as this ticket is a little bit different.
This one's better; closing the other one, but the problem raised there remains: where do we point to for stuff that we know exists?
Is your question "in case there are several versions of the same zim" (e.g., Wikipedia mini/nopic/maxi)? The basic assumption here is that zimit provides a copy of the real thing, so we should send them the
This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.
See also #33
Following openzim/zimit#113, we should think about implementing an easily editable list (hosted on drive.kiwix.org?) of blacklisted sites that cannot be requested on zimit, e.g.
It's probably a matter for a separate ticket, but requests for websites we already have a scraper for (Wikipedia, StackOverflow, etc.) should also be soft-blocked, with the user offered a direct link to the ZIM file instead.
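A minimal sketch of what such a check could look like, assuming a plain-text blocklist format (one hostname per line, `#` comments) that would in practice be fetched from the editable file; the list contents, helper names, and domain-matching rule here are illustrative assumptions, not an existing zimit API:

```python
from urllib.parse import urlparse

# Hypothetical blocklist contents; the real list would live in an
# easily editable file (e.g. hosted on drive.kiwix.org).
BLACKLIST_TEXT = """
# sites that cannot be zimited
example-login-only.com
# sites we already have a dedicated scraper for
wikipedia.org
stackoverflow.com
"""

def parse_blacklist(text):
    """Parse one-hostname-per-line text, skipping blanks and # comments."""
    hosts = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            hosts.add(line.lower())
    return hosts

def is_blacklisted(url, hosts):
    """True if the URL's host, or any parent domain of it, is blacklisted.

    Parent-domain matching means en.wikipedia.org is caught by a
    wikipedia.org entry, so the list stays short.
    """
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    return any(".".join(parts[i:]) in hosts for i in range(len(parts)))

hosts = parse_blacklist(BLACKLIST_TEXT)
```

On a blacklisted match for a site we already scrape, the request form could then show the direct library link rather than queuing a zimit job.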