Badge request: Total PyPI downloads #4319

JakobDev · 2019-11-11T15:17:57Z

It would be nice if I could get the total downloads from PyPI since the project was first uploaded, not only for a specific period.

calebcartwright · 2019-11-21T01:44:10Z

We'd love to be able to support all downloads from PyPI like we do for many of our other download-related badges. However, the challenge is that the upstream API we use to get download stats from PyPI does not support the all-time download count in the API response.

In order to support this, there'd need to be a viable endpoint where Shields could feasibly retrieve the data. PyPI Stats does provide an API that returns a massive response with the download count for each day since the package was first published, but I suspect that would be problematic for Shields due to having to aggregate the data. There's also the raw data in GCP, though I'm not sure how feasible that would be for Shields either.

If anyone is interested in seeing this implemented, a great way to help would be to try to find such an endpoint!

Akul2010 · 2023-08-15T18:44:45Z

How about this website?
https://www.pepy.tech/

chris48s · 2023-08-17T20:18:55Z

For reference, here's the pepy JSON endpoint: https://api.pepy.tech/api/v2/projects/django

The problem with getting our day/week/monthly stats from pypistats and using pepy for the total is that they count slightly different things. Pypistats presents summary statistics that exclude mirrors, while Pepy only provides stats that include mirrors, so they aren't quite a like-for-like comparison. For some packages, the numbers can be quite different.

In retrospect, I think we should have made the existing pypistats badges /pypistats/(dm|dw|dd) rather than /pypi/(dm|dw|dd). That would have kind of made it easier to just add pepy as another service. There isn't really a good reason why one source is any more valid than the other. They're both third parties making slightly different assumptions over the same source data. For historical reasons we kind of blessed pypistats as the "official" one, but that ship has now sailed.

Another consideration here is that the PyPI Downloads badges get a lot of traffic. We know we are the single largest source of traffic to pypistats. In the last hour we sent over 8,000 requests their way, and that is by no means peak. Pepy is a volunteer-run service and they've indicated in psincraian/pepy#477 that although they are happy for people to use their API, they may not be able or happy to handle a large amount of traffic. Given that, I also wouldn't like to completely switch from pypistats to pepy for this data. We know pypistats can reasonably reliably handle the traffic we throw their way.

So I think there are a few different ways we could go with this...

Mix and match 1: Add a "PyPI Total Downloads" badge using pepy. Change nothing else. Accept that the day/week/monthly badges are counting a slightly different thing from the total downloads badge. Maybe it doesn't matter and I should just stop being a pedant about mirrors and move on.
Mix and match 2: Add a "PyPI Total Downloads" badge using pepy. Switch to using a different API endpoint on pypistats to include mirrors on the day/week/monthly badges so they are more comparable. I haven't tested this, but I think if we switched from using https://pypistats.org/api/packages/django/recent to using https://pypistats.org/api/packages/django/overall we could assemble with_mirrors totals. This would be conceptually more consistent, but it's a bigger API response to download and parse each time, and we'd have to sum the values up ourselves. Not the end of the world, but as I say we serve a lot of these badges.
Add pepy as a separate service.
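
For option 2, the summing step could look roughly like the sketch below. It assumes the `/overall` response carries a `data` list whose rows have `category`, `date`, and `downloads` fields; that shape is an assumption, not something confirmed in this thread. `sum_with_mirrors` and the sample payload are hypothetical and the numbers are made up.

```python
from datetime import date, timedelta


def sum_with_mirrors(payload: dict, today: date, days: int) -> int:
    """Sum with_mirrors downloads over the `days` days ending the day before `today`."""
    start = today - timedelta(days=days)
    total = 0
    for row in payload["data"]:
        # Skip the without_mirrors rows; only the with_mirrors series is wanted.
        if row["category"] != "with_mirrors":
            continue
        day = date.fromisoformat(row["date"])
        if start <= day < today:
            total += row["downloads"]
    return total


# Made-up sample shaped like the assumed response; not real download numbers.
sample = {
    "package": "django",
    "data": [
        {"category": "with_mirrors", "date": "2023-08-01", "downloads": 50},
        {"category": "with_mirrors", "date": "2023-08-15", "downloads": 100},
        {"category": "without_mirrors", "date": "2023-08-15", "downloads": 90},
        {"category": "with_mirrors", "date": "2023-08-16", "downloads": 120},
        {"category": "without_mirrors", "date": "2023-08-16", "downloads": 110},
    ],
}

today = date(2023, 8, 17)
daily = sum_with_mirrors(sample, today, 1)
weekly = sum_with_mirrors(sample, today, 7)
monthly = sum_with_mirrors(sample, today, 30)
```

The loop touches every row in the response on every request, which is the extra parsing cost the option describes.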

I think all in all, I'm in favour of 1. It is the simplest and most performant option, even if there is a bit of an apples and oranges comparison going on there. Anyone else got strong opinions on this?

Given the potential amount of traffic involved, I think I'd still want to open an issue on https://github.com/psincraian/pepy before someone works on a PR to add this. As I say:

they are a volunteer run service
this specific badge has the potential to become quite popular
although I don't think they explicitly rate limit, [Document] pepy api endpoint's psincraian/pepy#477 indicates that a large amount of traffic could be a problem

Borda · 2023-08-18T14:55:25Z

> How about this website? pepy.tech

We have been using it, but it quite often returns a 404 :(

Akul2010 · 2023-08-18T19:56:15Z

@chris48s Not that I'm the biggest expert on JavaScript, but can't you just take the number from the part of the project page that says "Total downloads"?

(Showed what I'm talking about for one of my own packages)

calebcartwright · 2023-08-18T20:20:04Z

> @chris48s Not that I'm the biggest expert on JavaScript, but can't you just take the number from the part of the project page that says "Total downloads"?

We don't do screen scraping of websites to get data, for myriad reasons. We need a well-formed API from which to get the data.

chris48s · 2023-08-18T21:10:08Z

Just to be clear: getting the data is not one of the issues here. Pepy exposes a JSON API: https://api.pepy.tech/api/v2/projects/rlvoice-1

calebcartwright · 2023-08-18T22:09:17Z

> Just to be clear: Getting the data is not one of the issues here. Pepy exposes a json API https://api.pepy.tech/api/v2/projects/rlvoice-1

Unclear if this was intended as a response to my prior comment @chris48s, but in case it was, I'll clarify that my comment was specifically a response to @Akul2010's comment in #4319 (comment). As I understood it, that comment suggested an alternative way of getting the download data (screen scraping) instead of the specific APIs we've discussed on this issue (pepy and pypistats), because of the limitations and tradeoffs that exist with those.

chris48s · 2023-08-20T14:57:08Z

I opened an issue over on pepy about API usage psincraian/pepy#573

WenjieDu · 2023-08-25T15:40:25Z

Hey, I just noticed your discussion here. I want to raise another point: the current PePy API doesn't provide an abbreviated total, e.g. https://api.pepy.tech/api/v2/projects/pypots reports the total downloads of the pypots package as 26159 rather than 26k. Although we can use data from PePy to build dynamic JSON badges with Shields, the badges for packages with a large number of downloads may be very wide. Could PePy be asked to provide an abbreviated download count?

calebcartwright · 2023-08-25T15:46:24Z

@WenjieDu - if you'd like a feature implemented in a service like PePy then that's best directed to that service/platform, as no such change/decision could be made by the Shields team.

However, that's not something Shields actually needs or would really want. We have a strong preference for APIs to return the raw, unabbreviated data points so that we can apply our own standard abbreviation/rounding logic for consistency across our other badges.
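
To illustrate why a raw count is enough, here is a minimal sketch of client-side abbreviation that turns 26159 into 26k. It is not the Shields implementation: `abbreviate` is a hypothetical helper, the suffix list is an assumption, and it truncates rather than rounds to keep the arithmetic in integers.

```python
def abbreviate(n: int) -> str:
    """Abbreviate a non-negative count with a metric suffix (hypothetical helper)."""
    if n < 0:
        raise ValueError("count must be non-negative")
    suffixes = ["", "k", "M", "G", "T"]  # assumed suffix set
    index = 0
    divisor = 1
    # Find the largest power of 1000 that does not exceed n.
    while n >= divisor * 1000 and index < len(suffixes) - 1:
        divisor *= 1000
        index += 1
    whole = n // divisor
    if index == 0 or whole >= 10:
        return f"{whole}{suffixes[index]}"
    # For scaled values below 10, keep one truncated decimal digit.
    tenths = (n % divisor) * 10 // divisor
    if tenths == 0:
        return f"{whole}{suffixes[index]}"
    return f"{whole}.{tenths}{suffixes[index]}"


short = abbreviate(26159)
```

Because the formatting happens after the data is fetched, every badge can share the same rules no matter which upstream API supplied the number.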

WenjieDu · 2023-08-25T15:55:31Z

@calebcartwright So do you provide "your own standard abbreviation/rounding logic" as a parameter or an argument to let users round raw numbers in your service like https://shields.io/badges/dynamic-json-badge? I didn't find it in your docs. Maybe you can help me with it? Thanks.

calebcartwright · 2023-08-25T16:00:34Z

> @calebcartwright So do you provide "your own standard abbreviation/rounding logic" as a parameter or an argument to let users round raw numbers in your service like https://shields.io/badges/dynamic-json-badge? I didn't find it in your docs. Maybe you can help me with it? Thanks.

This is quickly getting off topic of the PyPI badge request of this issue, but the answer is "No".

Use the Custom Endpoint Badge if you want to have that level of control of the message value, especially in cases where the dynamic badge query doesn't provide one's desired transformation functions/utility (e.g. #6071)
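
As a rough sketch of that approach, a small service could read the raw total and return the JSON document the endpoint badge expects (`schemaVersion`, `label`, `message`, and optionally `color`). The `total_downloads` field name on the pepy response is an assumption, and `endpoint_badge`, the label, and the colour are hypothetical choices for illustration.

```python
import json
from typing import Callable


def endpoint_badge(pepy_payload: dict, fmt: Callable[[int], str] = str) -> str:
    """Build an endpoint-badge JSON body from a pepy-style payload."""
    # Assumed field name; adjust if the upstream response differs.
    total = pepy_payload["total_downloads"]
    badge = {
        "schemaVersion": 1,
        "label": "downloads",
        "message": fmt(total),  # the caller decides how the number is shown
        "color": "blue",
    }
    return json.dumps(badge)


# Sample payload using the pypots total quoted earlier in this thread.
sample = {"id": "pypots", "total_downloads": 26159}
body = endpoint_badge(sample, lambda n: f"{n // 1000}k" if n >= 1000 else str(n))
```

The formatter is passed in as a parameter so the message can use whatever rounding the badge owner prefers.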

chris48s · 2023-08-29T19:16:08Z

A point that comes out of psincraian/pepy#573 (comment)

A total downloads figure based on pepy's number (including mirrors) isn't really "PyPI total downloads" - it is "python package total downloads" (most of those downloads being from PyPI, but a small number coming from sources other than PyPI).

I think that's leading me towards saying "python package downloads from pepy" shouldn't be /pypi/dt/:packageName.

JakobDev added the service-badge label (Accepted and actionable changes, features, and bugs) Nov 11, 2019

paulmelnikow changed the title ~~Get Downloads from PyPI without Period~~ Badge request: Total PyPI downloads Apr 5, 2020

paulmelnikow added the needs-upstream-help label (Not actionable without help from a service provider) Apr 5, 2020

chris48s mentioned this issue Jan 23, 2021

pypi total downloads badge? #6097

Closed

jaiakash mentioned this issue Apr 6, 2022

add: download badges in readme interviewstreet/ghs#5

Merged

chris48s mentioned this issue Aug 20, 2023

API endpoint usage psincraian/pepy#573

Closed

chris48s mentioned this issue Sep 11, 2023

add python package total downloads from [pepy] badge #9564

Merged

calebcartwright closed this as completed in #9564 Sep 26, 2023

evelina-gudauskayte mentioned this issue Jun 1, 2024

[SYSTEMDS-3529] codecov badge + PyPI downloads badge apache/systemds#2029

Closed

LeeDongGeon1996 mentioned this issue Aug 7, 2024

PyPI downloads badge facioquo/stock-indicators-python#391

Open
