wget throws "Unable to delete" and "Input/output error" messages when mirroring a site #2012

Closed
DJviolin opened this issue Apr 26, 2017 · 7 comments

@DJviolin

  • Your Windows build number:

Windows 10.0.15063

Ubuntu 16.04.2 LTS

  • What you're doing and what's happening:

When I run the following wget command to generate static files from my local WordPress site, it prints "Unable to delete" and "Input/output error" messages at the end. The same command runs without problems in MSYS2.

wget \
  --mirror \
  --adjust-extension \
  --page-requisites \
  --convert-links \
  --span-hosts \
  --domains=127.0.0.1 \
  \
  --execute robots=off \
  --continue \
  --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:53.0) Gecko/20100101 Firefox/53.0" \
  --html-extension \
  --no-host-directories \
  --no-cookies \
  --no-cache \
  \
  --content-disposition \
  \
  --restrict-file-names=nocontrol \
  --header="accept-encoding: gzip" \
  --header="Accept-Charset: utf-8" \
  \
  --directory-prefix=./static \
  127.0.0.1/public_html/mysite/
  • What's wrong / what should be happening instead:

A few last lines:

FINISHED --2017-04-26 19:09:23--
Total wall clock time: 37s
Downloaded: 280 files, 38M in 0.3s (127 MB/s)
Converting links in ./static/127.0.0.1/public_html/mysite/tag/kecskemet/index.html... Unable to delete ‘./static/127.0.0.1/public_html/mysite/tag/kecskemet/index.html’: Input/output error

The downloaded data size is also incorrect: it should be 184M instead of 38M. The correct output in MSYS2:

FINISHED --2017-04-26 19:36:47--
Total wall clock time: 54s
Downloaded: 2081 files, 184M in 1.3s (137 MB/s)
Converting links in ./static/public_html/mysite/tag/kecskemet/index.html... 29-1
Converting links in ./static/public_html/mysite/tag/eskuvo/index.html... 29-1
@rodrymbo

Input/output error is pretty generic. Sounds like the strace step from the template might be helpful...

There was an earlier issue #338 where wget could not update the timestamp properly. Maybe yours is similar?

Would also be good to know whether this is in the /mnt/ (Windows) filesystem, or the linuxy filesystem (e.g. /home), and what kind of permissions are there.

If you are trying to update an existing directory tree created by MSYS2, there might be permission oddities, so it would be helpful to confirm that.
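
One way to answer the filesystem question is a small check like the one below. It is only a sketch: it assumes WSL's default automount layout, where Windows drives appear as /mnt/<drive letter>, and the function names `is_windows_mount_path` and `describe_location` are made up for illustration. The `stat -f` and `ls -ld` calls simply print what the kernel reports for the directory.

```shell
#!/bin/bash

# Heuristic: treat /mnt/<single letter> and anything below it as a
# Windows drive mount. Assumes WSL's default automount layout.
is_windows_mount_path() {
  case "$1" in
    /mnt/[a-zA-Z] | /mnt/[a-zA-Z]/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print where a directory lives, the filesystem type reported by stat,
# and the directory's own permissions.
describe_location() {
  local dir
  dir="$(cd "$1" && pwd -P)" || return 1
  if is_windows_mount_path "$dir"; then
    echo "$dir: Windows-side mount"
  else
    echo "$dir: Linux-side filesystem"
  fi
  stat -f -c 'filesystem type: %T' "$dir"
  ls -ld "$dir"
}
```

Running `describe_location .` from the directory that holds the script, and again for ./static, should show both the location and the permissions in one go.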

@DJviolin
Author

DJviolin commented Apr 26, 2017

strace.zip

So far in the wget output I only found these lines regarding timestamps:

Last-modified header missing -- time-stamps turned off.

This line is the same in MSYS2.

Yes, this is in the Windows filesystem:

root@LANTI-DESKTOP:/mnt/d/xampp/htdocs/public_html/mysite# ls -al
total 2119686
drwxrwxrwx 0 root root    4096 Apr 26 20:26 .
drwxrwxrwx 0 root root    4096 Apr 25 17:16 ..
-rwxrwxrwx 1 root root    1742 Apr 26 19:54 find.sh
drwxrwxrwx 0 root root    4096 Apr 26 20:24 static
-rwxrwxrwx 1 root root     820 Apr 26 20:01 static.sh

The static folder is created by the script itself. This is in the shell script before the wget command:

#!/bin/bash

set -e

rm -rf ./static
mkdir ./static

wget \
  ...

@therealkenc
Collaborator

Thanks for the strace. Here's the relevant failure: wget under WSL is trying to unlink() a file it still has mmap()ed, which doesn't work on DrvFS (aka /mnt/c). This is the perennial #966, #1357, etc. Do whatever you are doing in /home and you'll probably be okay.

write(2, "Converting links in static/publi"..., 110) = 110
open("static/public_html/lantosistvan/wp-content/themes/matte/css/icons/css/font-awesome.css", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0777, st_size=26314, ...}) = 0
mmap(NULL, 26314, PROT_READ|PROT_WRITE, MAP_PRIVATE, 4, 0) = 0x7f5159b52000
close(4)                                = 0
unlink("static/public_html/lantosistvan/wp-content/themes/matte/css/icons/css/font-awesome.css") = -1 EIO (Input/output error)
write(2, "Unable to delete \342\200\230static/publi"..., 130) = 130
munmap(0x7f5159b52000, 26314)           = 0
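
The "do it in /home" advice could be scripted roughly as follows. This is a sketch under assumptions, not a tested fix for this setup: `run_in_linux_scratch` and the `SCRATCH_BASE` variable are hypothetical names, the scratch directory defaults to $HOME, and the command is assumed to write its output into ./static as the script in this thread does. The idea is that the unlink() of the mmap()ed file happens on the Linux-side filesystem, and only the finished tree is copied to the Windows drive.

```shell
#!/bin/bash

# Run a command in a scratch directory on the Linux side, then copy the
# resulting "static" tree to DEST (for example a directory under /mnt/d).
# Usage: run_in_linux_scratch DEST COMMAND [ARGS...]
run_in_linux_scratch() {
  local dest="$1"; shift
  local base="${SCRATCH_BASE:-$HOME}"   # hypothetical override for the scratch location
  local scratch
  scratch="$(mktemp -d "$base/mirror.XXXXXX")" || return 1

  # Run the command inside the scratch directory, in a subshell so the
  # caller's working directory is left untouched.
  if ! ( cd "$scratch" && "$@" ); then
    rm -rf "$scratch"
    return 1
  fi

  # Copy the tree out only after the command has finished, so any
  # rewriting of downloaded files happens in the scratch directory.
  local status=0
  mkdir -p "$dest" && cp -R "$scratch/static/." "$dest/" || status=$?
  rm -rf "$scratch"
  return "$status"
}
```

It would be called with the destination first and the original wget command after it, keeping --directory-prefix=./static so the output lands where the function expects it.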

@DJviolin
Author

Thank you for the info. Will there be a fix for this in the future?

@therealkenc
Collaborator

The quote from #966 is that they "...realize this is a very unfortunate limitation, and we're working on ways to improve our file system support to fix this issue in a future insider build". I'm curious to see what they come up with. It seems to me a tough nut to crack without considerable cooperation from the NT kernel people, because on DrvFS that file might have been opened by a Win32 process. But if they deliberately limit the problem space to files only ever opened by WSL in /mnt/c, I can think of a way they might do it. The issue is marked as a bug (as opposed to feature/backlog/bydesign), so the team is taking the problem seriously. It is just not a straightforward fix.

@maeni70

maeni70 commented Jul 5, 2018

I ran into the same issue. Any updates?

@therealkenc
Collaborator

This is perennial #966 #1357 etc. Do whatever you are doing in /home and you'll probably be okay.

This submission was a de facto duplicate of #966. If the EIO on unlink(2) persists on build 19043, please open a new ticket.
