download_lahman() failing #391

Open
double-dose-larry opened this issue Nov 20, 2023 · 9 comments · May be fixed by #449

Comments

@double-dose-larry

Hi All,

I'm running pybaseball 2.2.7

I'm trying to run pybaseball.people() and getting the following stack trace:

---------------------------------------------------------------------------
BadZipFile                                Traceback (most recent call last)
Cell In[12], line 1
----> 1 download_lahman()

File ~/.local/lib/python3.10/site-packages/pybaseball/lahman.py:30, in download_lahman()
     28 def download_lahman():
     29     # download entire lahman db to present working directory
---> 30     z = get_lahman_zip()
     31     if z is not None:
     32         z.extractall(cache.config.cache_directory)

File ~/.local/lib/python3.10/site-packages/pybaseball/lahman.py:25, in get_lahman_zip()
     23 elif not _handle:
     24     s = requests.get(url, stream=True)
---> 25     _handle = ZipFile(BytesIO(s.content))
     26 return _handle

File /usr/lib/python3.10/zipfile.py:1269, in ZipFile.__init__(self, file, mode, compression, allowZip64, compresslevel, strict_timestamps)
   1267 try:
   1268     if mode == 'r':
-> 1269         self._RealGetContents()
   1270     elif mode in ('w', 'x'):
   1271         # set the modified flag so central directory gets written
   1272         # even if no files are added to the archive
   1273         self._didModify = True

File /usr/lib/python3.10/zipfile.py:1336, in ZipFile._RealGetContents(self)
   1334     raise BadZipFile("File is not a zip file")
   1335 if not endrec:
-> 1336     raise BadZipFile("File is not a zip file")
   1337 if self.debug > 1:
   1338     print(endrec)

BadZipFile: File is not a zip file

I dug around and saw that the data is retrieved from here:
https://github.com/chadwickbureau/baseballdatabank/archive/master.zip

That link is now dead. Perhaps there was a change upstream.
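The traceback above comes from passing a non-zip response body (most likely an error page) straight into `ZipFile`. A minimal sketch of a guard that would surface this more clearly, assuming the downloaded bytes are already in hand; the helper name `zip_from_bytes` is hypothetical and not part of pybaseball:

```python
from io import BytesIO
from typing import Optional
from zipfile import ZipFile, is_zipfile


def zip_from_bytes(content: bytes) -> Optional[ZipFile]:
    """Return a ZipFile for `content`, or None if it is not a zip archive.

    Hypothetical helper for illustration; pybaseball does not ship it.
    """
    buffer = BytesIO(content)
    # is_zipfile accepts a file-like object and looks for the zip
    # end-of-central-directory record without raising BadZipFile.
    if not is_zipfile(buffer):
        return None
    buffer.seek(0)
    return ZipFile(buffer)
```

In `get_lahman_zip()` something like this could be fed `s.content` after checking the HTTP status, so a dead URL produces a clear message instead of `BadZipFile`.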

@JSCjr

JSCjr commented Nov 22, 2023

Similar issues here. The code will need an update to handle the new Chadwick register location and file structure (the people table has been split into multiple files).

@blue-shoes

This is a separate issue from the Chadwick register (which I believe has been handled in PR #309). The problem here appears to be that the chadwickbureau/baseballdatabank repository no longer exists, at least not publicly.

@agpolivka

Has this issue been fixed? I dug into the code and came to the same conclusion, which is what finally got me to this page, but I don't see any follow-up or fix. I've pulled the code pretty recently, so I was wondering if anyone had fixed it or come up with a workaround.

@JSCjr

JSCjr commented Apr 11, 2024

Sean Lahman just posted an updated version of the database files at his own site, so this could presumably be fixed by pointing the code at those files instead.

@blue-shoes

Linking to the files on his site looks fragile to me, since it relies on the naming convention in his personal Dropbox. The file is currently called lahman_1871-2023.csv, so one assumes this is not a static file name/path.
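One way to soften that fragility, sketched under assumptions: keep the download location overridable so a renamed file does not require a new release. The function `resolve_lahman_url` and the environment variable `PYBASEBALL_LAHMAN_URL` are hypothetical names for illustration, not existing pybaseball settings, and the default below is the URL from the original report, which no longer resolves.

```python
import os
from typing import Mapping, Optional

# URL quoted in the original report. It is currently dead, so it is only a
# placeholder default here, not a working location.
DEFAULT_LAHMAN_URL = (
    "https://github.com/chadwickbureau/baseballdatabank/archive/master.zip"
)


def resolve_lahman_url(
    override: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the download URL: explicit argument, then env var, then default.

    PYBASEBALL_LAHMAN_URL is a hypothetical variable name.
    """
    env = os.environ if env is None else env
    return override or env.get("PYBASEBALL_LAHMAN_URL") or DEFAULT_LAHMAN_URL
```

With this shape, a user could point the download at whatever the current file location is without waiting for a code change.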

@StuffbyYuki

I see the same error, and it looks like the file location changed, as @JSCjr mentioned.

@SushiInYourFace

I'm also seeing this error. If this isn't important functionality or a priority to maintain, it might be a good idea to just remove it instead of keeping a broken function around.

@bdilday
Contributor

bdilday commented Jun 26, 2024

Note that there is a proposed fix in #435.

@efitton

efitton commented Aug 1, 2024

Would love to see this fixed in the main repo.

@mlinenweber linked a pull request on Sep 3, 2024 that will close this issue