Cannot create zim file for an old, but unusual site #395

Open
gordon-matt opened this issue Sep 15, 2024 · 4 comments
Labels
bug custom_fix Problem occurs on a specific website which needs a custom fix (fuzzy rule, custom behavior, ...)

Comments

@gordon-matt

There's a site I tried to "zim" that seems like it should be relatively small, but I have not been able to create a working ZIM file for it, either with the Docker image I use (which has worked for other sites) or with https://zimit.kiwix.org/. In both cases I end up with a failure.

The site is a very old one - a kind of map for an old online game: http://enbmaps.de/?breite=1920&hoehe=927

I figure it has something to do with the fact that those query string parameters change every time the browser window is resized. I guess ZimIt doesn't know how to handle that, but that's just conjecture. I have no idea what's really going on.

@benoit74
Collaborator

Please provide the logs (at least the last hundred lines or so). Without them, it is impossible to help explain the issue. They may not be sufficient, but they are a prerequisite.

@gordon-matt
Author

gordon-matt commented Sep 15, 2024

Yeah, I should've added more detail. I did this a while back and have been meaning to log an issue, but didn't get around to it until now. I'm going to try rerunning ZimIt and see what happens, but to clarify what I saw previously:

  1. Running ZimIt myself: it wouldn't end. It just kept going, and eventually I stopped it after 3 days. I figure it must have been stuck in some kind of infinite loop, because there's no way a site that small should take that long.

  2. Running it via zimit.kiwix.org: it only took an hour or two, if I remember right, and I did get a ZIM file to download. However, I got this error:

image

That's from the ZIM still loaded in my Kiwix server.

Anyway, I will try again and then check the logs and provide them here, as requested.

NOTE: This is the docker command I used for running it locally:

docker run -v /volume1/docker/kiwix/zim:/output \
  --shm-size=1gb ghcr.io/openzim/zimit zimit \
  --url http://enbmaps.de/ \
  --name enbmaps \
  --workers 2 \
  --waitUntil domcontentloaded

@gordon-matt
Author

I ran it with zimit.kiwix.org and got a 9 MB file. I loaded it in Kiwix, and this time the page shows for a split second and then gives an error message. Got a video for you:

Load.then.fail.mp4

I could try running ZimIt locally again, but like I said, last time it took 3 days. Thinking about it now, I couldn't even view the logs: there were so many that the Container Manager on my Synology NAS must have been overwhelmed, as it refused to show any logs at all at that point. That's when I gave up and cancelled it.

@benoit74 added the custom_fix label (Problem occurs on a specific website which needs a custom fix (fuzzy rule, custom behavior, ...)) and removed the recipe label Sep 17, 2024
@benoit74
Collaborator

OK, so this is probably the kind of website for which it is hard (if not impossible) to create a ZIM with zimit, because it relies heavily on dynamic loading of web resources that the crawler cannot easily fetch:

  • the home page URL changes dynamically as soon as the page is fully loaded, to adapt to the screen resolution (with width and height passed in the URL as query parameters)
  • many icons are dynamically loaded only once you hover over a place on the map

While there might be some ways to programmatically work around these problems, this is definitely not something easy to do or explain. As-is, zimit is not capable of creating a ZIM of such a website, and the solution will be specific to this website, so I will flag it as custom_fix. Be aware that there is very little chance this will be solved in the coming months. Thank you for reporting anyway!
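
To make the first point concrete, here is a minimal Python sketch of why viewport-dependent query parameters are a problem for any crawler: every window size produces a different URL for the same page, so the set of "distinct" pages never stops growing unless those parameters are ignored. This is only an illustration and not something zimit does or exposes. The function name `canonical_url` and the constant `VIEWPORT_PARAMS` are hypothetical, and the parameter names `breite` and `hoehe` are simply taken from the URL in the report.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these are the only viewport-dependent parameters,
# as seen in the URL reported above (breite = width, hoehe = height).
VIEWPORT_PARAMS = {"breite", "hoehe"}


def canonical_url(url: str) -> str:
    """Return `url` with the viewport-size query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in VIEWPORT_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# The same page as seen with two different window sizes, plus the bare URL.
variants = [
    "http://enbmaps.de/?breite=1920&hoehe=927",
    "http://enbmaps.de/?breite=1280&hoehe=720",
    "http://enbmaps.de/",
]

# Without normalization these are three different URLs; with it, only one.
seen = {canonical_url(url) for url in variants}
print(seen)
```

A sketch like this only covers the changing home page URL. It does nothing for the icons that load on hover, which would still need the crawler to trigger those requests somehow.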
