Automated Zenodo uploader #148

Open
graeme-winter opened this issue Jan 13, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@graeme-winter
Contributor

  • DIALS Regression Data version: current
  • Python version: 3.6
  • Operating System: UNIX based

Description

Propose to add an automated Zenodo data uploader which could also generate the appropriate JSON text for the new data set. There is a REST API which appears to work simply enough. This will require the user to generate an upload token using the instructions at:

https://zenodo.org/account/settings/applications/tokens/new/

What I Did

import requests
import os
import sys
import pprint

# get yourself an access token from:
#
# https://zenodo.org/account/settings/applications/tokens/new/

ACCESS_TOKEN = "aaaaaaaaa"

headers = {"Content-Type": "application/json"}
r = requests.post(
    "https://zenodo.org/api/deposit/depositions",
    params={"access_token": ACCESS_TOKEN},
    json={},
    headers=headers,
)
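print(r.status_code)
print(r.json())
r.raise_for_status()  # stop here if the deposition was not created

# deposition id, used to build the file upload URL below
d_id = r.json()["id"]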

for directory in sys.argv[1:]:
    for filename in os.listdir(directory):
        path = os.path.join(directory, filename)
        if not os.path.isfile(path):
            continue  # skip subdirectories, open() would fail on them
        print(filename)
        data = {"name": filename}
        # use a with-block so each file handle is closed after its upload
        with open(path, "rb") as fh:
            r = requests.post(
                "https://zenodo.org/api/deposit/depositions/%s/files" % d_id,
                params={"access_token": ACCESS_TOKEN},
                data=data,
                files={"file": fh},
            )
        pprint.pprint(r.json())

As an example, this allows automated upload of every file in a directory. The token can also be given permission to complete the upload and publish, but I did not test that in my test case; I just used it to upload 3,450 files.

@graeme-winter added the enhancement label on Jan 13, 2020
@graeme-winter
Contributor Author

Turns out that this jams up and starts returning HTTP 500 errors after ~1,000 files. Chatting to the Zenodo developers about this. Should probably work out a way to upload data sets incrementally.
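
A rough sketch of what an incremental upload could look like, assuming we keep a record of which file names have already gone up and retry when the server answers with a 5xx. The names `upload_incrementally` and `upload_one` are made up for illustration; `upload_one` would be a thin wrapper around the `requests.post` call in the script above that returns the HTTP status code. Whether retrying actually helps with the 500 errors seen here is untested.

```python
import time


def upload_incrementally(names, already_uploaded, upload_one, retries=3, pause=0.0):
    """Upload the names not in already_uploaded; return (done, failed) lists.

    upload_one(name) is a caller-supplied (hypothetical) function which
    uploads one file and returns the HTTP status code of the response.
    """
    done, failed = [], []
    # only consider files which are not yet recorded as uploaded
    for name in sorted(set(names) - set(already_uploaded)):
        for attempt in range(1, retries + 1):
            status = upload_one(name)
            if status < 500:
                break  # success or a client error, retrying will not help
            time.sleep(pause * attempt)  # back off before retrying a 5xx
        if status < 400:
            done.append(name)
        else:
            failed.append(name)
    return done, failed
```

Re-running with `already_uploaded` set to the `done` list from the previous run (plus anything uploaded earlier) would then pick up where the last attempt stopped.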

@graeme-winter
Contributor Author

Ah, they know they have some n² loops or something in the way they handle things and are looking to add an explicit limit to the number of files in a data set. May need a better way to do this.
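
If a per-data-set file limit does appear, one possible workaround (an assumption on my part, not something Zenodo have suggested) would be to split a large data set into batches and create one deposition per batch. A minimal sketch, where `split_into_batches` and the `max_files` value are hypothetical, since the real limit is not known yet:

```python
def split_into_batches(names, max_files):
    """Split file names into sorted batches of at most max_files each."""
    if max_files < 1:
        raise ValueError("max_files must be at least 1")
    names = sorted(names)
    return [names[i : i + max_files] for i in range(0, len(names), max_files)]


# example: 3,450 files with a made-up limit of 500 files per deposition
batches = split_into_batches(["image_%04d" % i for i in range(3450)], 500)
print(len(batches), len(batches[-1]))
```

Each batch would then go through the same create-deposition and upload loop as in the script above, at the cost of the data set being spread over several Zenodo records.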

One idea which occurs to me is whether we can make data public in iCAT? 🤔
