Badge request: Total PyPI downloads #4319
Comments
We'd love to be able to support all downloads from PyPI like we do for many of our other download-related badges. However, the challenge is that the upstream API we use to get download stats from PyPI does not support the all-time download count in the API response. In order to support this, there'd need to be a viable endpoint where Shields could feasibly retrieve the data. PyPI Stats does provide an API that returns a massive response with the download count for each day since the package was first published, but I suspect that would be problematic for Shields due to having to aggregate the data. There's also the raw data in GCP, though I'm not sure how feasible that would be for Shields either. If anyone is interested in seeing this implemented, a great way to help would be to try to find such an endpoint! |
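To illustrate the aggregation concern above, here is a minimal sketch of summing per-day counts into a single total. The row shape (`category`, `date`, `downloads`) and the category names `with_mirrors` / `without_mirrors` are assumptions about what a per-day stats response might look like, and the sample rows are made up for illustration. This is not Shields code.

```python
from typing import Iterable, Mapping


def sum_daily_downloads(
    rows: Iterable[Mapping[str, object]],
    category: str = "without_mirrors",
) -> int:
    """Sum the per-day download counts for one category.

    The row layout is an assumption for illustration, not a documented contract.
    """
    return sum(
        int(row["downloads"])
        for row in rows
        if row.get("category") == category
    )


# Made-up sample data standing in for a (potentially very long) per-day response.
sample_rows = [
    {"category": "with_mirrors", "date": "2020-01-01", "downloads": 120},
    {"category": "without_mirrors", "date": "2020-01-01", "downloads": 100},
    {"category": "with_mirrors", "date": "2020-01-02", "downloads": 80},
    {"category": "without_mirrors", "date": "2020-01-02", "downloads": 70},
]

total_without_mirrors = sum_daily_downloads(sample_rows)
total_with_mirrors = sum_daily_downloads(sample_rows, category="with_mirrors")
```

The summation itself is cheap. The cost is in fetching and parsing one row per category per day for every badge request, which is why a ready-made total from the upstream API would be preferable.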
How about this website? |
For reference, here's the pepy JSON endpoint: https://api.pepy.tech/api/v2/projects/django The problem with getting our daily/weekly/monthly stats from pypistats and using pepy for the total is that they count slightly different things. Pypistats presents its summary statistics only with mirrors excluded, whereas pepy only provides stats that include mirrors, so they aren't quite a like-for-like comparison. For some packages, the numbers can be quite different. In retrospect, I think we should have made the existing pypistats badges Another consideration here is traffic: the PyPI Downloads badges get a lot of it. We know we are the single largest source of traffic to pypistats. In the last hour we sent over 8,000 requests their way, and that is by no means the peak. Pepy is a volunteer-run service, and they've indicated in psincraian/pepy#477 that although they are happy for people to use their API, they may not be able or willing to handle a large amount of traffic. Given that, I also wouldn't like to switch completely from pypistats to pepy for this data. We know pypistats can handle the traffic we throw their way reasonably reliably. So I think there are a few different ways we could go with this...
I think all in all, I'm in favour of 1. It is the simplest and most performant option, even if there is a bit of an apples-and-oranges comparison going on there. Has anyone else got strong opinions on this? Given the potential amount of traffic involved, I think I'd still want to open an issue on psincraian/pepy before someone works on a PR to add this. As I say:
|
We have been using it, but it quite often returns 404 :( |
@chris48s Not that I'm the biggest expert on JavaScript, but can't you just take the number from the part of the project page that says "Total downloads"? |
We don't do screen scraping of websites to get data, for myriad reasons. We need a well-formed API from which to get the data. |
Just to be clear: Getting the data is not one of the issues here. Pepy exposes a json API https://api.pepy.tech/api/v2/projects/rlvoice-1 |
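As a rough sketch of what consuming such a JSON response could look like, the snippet below reads a total from an already-fetched payload. The field name `total_downloads` and the sample payload are assumptions made for illustration. Check the real response before relying on either. No network request is made here.

```python
import json


def extract_total_downloads(payload: str) -> int:
    """Return the total download count from a JSON document.

    Assumes (hypothetically) a top-level integer field named "total_downloads".
    """
    data = json.loads(payload)
    total = data.get("total_downloads")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(total, bool) or not isinstance(total, int):
        raise ValueError("payload has no integer 'total_downloads' field")
    return total


# Made-up payload for illustration only; real values and fields may differ.
sample_payload = '{"id": "example-package", "total_downloads": 1234567}'
total = extract_total_downloads(sample_payload)
```

Validating the field type up front means a changed or missing field is reported as an error instead of being rendered as a wrong number.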
It's unclear whether this was intended as a response to my prior comment, @chris48s, but in case it was: my comment was specifically a response to @Akul2010's comment in #4319 (comment), which I understood to be suggesting an alternative way (screen scraping) of getting the download data instead of the specific APIs we've discussed on this issue (pepy and pypistats), given the limitations/tradeoffs that exist with those. |
I opened an issue over on pepy about API usage psincraian/pepy#573 |
Hey, I just noticed your discussion here. I want to raise another issue for discussion: the current PePy API doesn't provide any abbreviation of total downloads, e.g. https://api.pepy.tech/api/v2/projects/pypots. As you can see, the total downloads of the pypots package is |
@WenjieDu - if you'd like a feature implemented in a service like PePy then that's best directed to that service/platform, as no such change/decision could be made by the Shields team. However, that's not something Shields actually needs or would really want. We have a strong preference for the APIs to return the raw, unabbreviated data points so that we can apply our own standard abbreviation/rounding logic for consistency across our other badges |
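To show why raw, unabbreviated numbers are the more useful input, here is a minimal sketch of client-side abbreviation using metric suffixes. This is an illustrative formatter written for this example, not the actual Shields implementation. The function name, the suffix set, and the rounding rules are all assumptions.

```python
def abbreviate(n: int) -> str:
    """Format a non-negative count with a metric suffix (k, M, G).

    Illustrative rules: one decimal place below 10 units, none at or above.
    """
    units = [(10**9, "G"), (10**6, "M"), (10**3, "k")]
    for threshold, suffix in units:
        if n >= threshold:
            scaled = n / threshold
            if scaled < 10:
                # Keep one decimal, then drop a trailing ".0" (e.g. "2.0" -> "2").
                text = f"{scaled:.1f}".rstrip("0").rstrip(".")
            else:
                text = f"{scaled:.0f}"
            return f"{text}{suffix}"
    return str(n)


small = abbreviate(999)
thousands = abbreviate(1200)
millions = abbreviate(2_000_000)
```

Because the formatter starts from the exact count, the same rule can be applied to every download badge regardless of which upstream service supplied the number.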
@calebcartwright So do you provide "your own standard abbreviation/rounding logic" as a parameter or an argument to let users round raw numbers in your service like https://shields.io/badges/dynamic-json-badge? I didn't find it in your docs. Maybe you can help me with it? Thanks. |
This is quickly drifting away from the PyPI badge request that this issue is about, but the answer is "No". Use the Custom Endpoint Badge if you want that level of control over the message value, especially in cases where the dynamic badge query doesn't provide the transformation functions/utilities you want (e.g. #6071). |
A point that comes out of psincraian/pepy#573 (comment): a total downloads figure based on pepy's number (including mirrors) isn't really "PyPI total downloads"; it is "python package total downloads" (most of those downloads being from PyPI, but a small number being from outside PyPI). I think that's leading me towards saying "python package downloads from pepy" shouldn't be |
It would be nice if I could get all downloads from PyPI since the project was first uploaded, and not only those for a specific period.