Add ignore keyword argument to copytree (#272) (#273)
* Add ignore keyword argument to copytree (#272)

* Add `ignore` keyword argument to `copytree`

The `ignore` argument expects a callable that returns a list of
names that will be ignored while copying. It supports
`shutil.ignore_patterns` [1] to allow using a similar interface.

Also, include proper typing according to the behaviour described in
the Python docs on `shutil.copytree` and expanding it to support the
`cloudpathlib` environment.

[1]: https://docs.python.org/3/library/shutil.html#shutil.ignore_patterns

Signed-off-by: Antonio Ossa Guerra <[email protected]>
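
As a rough illustration of the contract described above, the sketch below calls the callable returned by `shutil.ignore_patterns` directly. The directory string and the `names` list are made up for the example; only `ignore_patterns` itself comes from the standard library.

```python
from shutil import ignore_patterns

# ignore_patterns builds a callable with the signature (path, names) -> ignored names.
ignore = ignore_patterns("*.py", "dir*")

# Hypothetical directory listing, not taken from the test suite.
names = ["ignored.py", "dir1", "dir2", "keep.txt"]

# The first argument is the directory being copied; ignore_patterns does not inspect it.
ignored = ignore("some/source/dir", names)

print(sorted(ignored))  # ['dir1', 'dir2', 'ignored.py']
```

Any callable with the same two-argument shape should be usable in its place, which is what lets `copytree` accept both `ignore_patterns` and hand-written functions.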

* Add tests for `ignore` argument on `copytree`

When the argument is not used (it defaults to `None`), the function
should work as before. The argument is expected to be a callable that,
given the directory being copied and a list of the names it contains,
returns the names to skip while performing the copy.

The tests create additional files in the reference path (`p`): a
Python file (`ignored.py`) and two directories (`dir1/` and `dir2/`).
These files are ignored in two different ways, tested separately:
using `shutil.ignore_patterns` and using a custom ignore function.

The tests are performed by copying the tree (and ignoring the files)
and then comparing the source and destination (checking that every file
in the destination is also in the source), and asserting that the
ignored files do not exist in the destination.

Signed-off-by: Antonio Ossa Guerra <[email protected]>
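
The checking strategy described above can be sketched on the local filesystem with `shutil.copytree`, which accepts the same kind of `ignore` callable. This is an illustration under assumptions, not the project's test: `relative_files`, `build_and_copy`, and the `kept/` directory are hypothetical names, and temporary local directories stand in for the cloud rig.

```python
import shutil
import tempfile
from pathlib import Path


def relative_files(root: Path) -> set:
    """Return every file under root as a path relative to root."""
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def build_and_copy() -> tuple:
    """Create a small source tree and copy it while ignoring *.py and dir*."""
    tmp = Path(tempfile.mkdtemp())
    src = tmp / "src"
    (src / "dir1").mkdir(parents=True)
    (src / "dir2").mkdir()
    (src / "kept").mkdir()
    (src / "ignored.py").write_text("print('ignore')")
    (src / "dir1" / "file1.txt").write_text("ignore")
    (src / "dir2" / "file2.txt").write_text("ignore")
    (src / "kept" / "data.txt").write_text("keep")

    dst = tmp / "dst"
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.py", "dir*"))
    return src, dst


src, dst = build_and_copy()

# Every file in the destination must also exist in the source...
assert relative_files(dst) <= relative_files(src)

# ...and none of the ignored entries may have been copied.
assert not (dst / "ignored.py").exists()
assert not (dst / "dir1").exists()
assert not (dst / "dir2").exists()
```

The subset check alone would pass for an empty destination, so the explicit "does not exist" assertions are what actually exercise the `ignore` argument.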

* Update changelog and version

Signed-off-by: Antonio Ossa Guerra <[email protected]>
Co-authored-by: Antonio Ossa-Guerra <[email protected]>
pjbull and aaossa authored Sep 27, 2022
1 parent 203f6aa commit b7f6010
Showing 4 changed files with 71 additions and 4 deletions.
5 changes: 5 additions & 0 deletions HISTORY.md
@@ -1,5 +1,10 @@
 # cloudpathlib Changelog

+## v0.11.0 (UNRELEASED)
+
+- API change: Add `ignore` parameter to `CloudPath.copytree` in order to match `shutil` API. ([Issue #145](https://github.com/drivendataorg/cloudpathlib/issues/145), [PR #272](https://github.com/drivendataorg/cloudpathlib/pull/272))
+
+
 ## v0.10.0 (2022-08-18)

 - API change: Make `stat` on base class method instead of property to follow `pathlib` ([Issue #234](https://github.com/drivendataorg/cloudpathlib/issues/234), [PR #250](https://github.com/drivendataorg/cloudpathlib/pull/250))
36 changes: 33 additions & 3 deletions cloudpathlib/cloudpath.py
@@ -12,7 +12,19 @@
     _posix_flavour,
     _PathParents,
 )
-from typing import Any, IO, Iterable, Dict, Optional, Type, TYPE_CHECKING, TypeVar, Union
+from typing import (
+    Any,
+    Callable,
+    IO,
+    Iterable,
+    Dict,
+    List,
+    Optional,
+    Type,
+    TYPE_CHECKING,
+    TypeVar,
+    Union,
+)
 from urllib.parse import urlparse
 from warnings import warn

@@ -757,6 +769,13 @@ def copytree(
         self,
         destination: Union[str, os.PathLike, "CloudPath"],
         force_overwrite_to_cloud: bool = False,
+        ignore: Callable[
+            [
+                Union[str, os.PathLike, "CloudPath"],
+                List[Union[str, os.PathLike, "CloudPath"]],
+            ],
+            collections.abc.Iterable,
+        ] = None,
     ) -> Union[Path, "CloudPath"]:
         """Copy self to a directory, if self is a directory."""
         if not self.is_dir():
@@ -773,16 +792,27 @@ def copytree(
                 "Destination path {destination} of copytree must be a directory."
             )

+        contents = list(self.iterdir())
+
+        if ignore is not None:
+            ignored_names = ignore(self._no_prefix_no_drive, [x.name for x in contents])
+        else:
+            ignored_names = set()
+
         destination.mkdir(parents=True, exist_ok=True)

-        for subpath in self.iterdir():
+        for subpath in contents:
+            if subpath.name in ignored_names:
+                continue
             if subpath.is_file():
                 subpath.copy(
                     destination / subpath.name, force_overwrite_to_cloud=force_overwrite_to_cloud
                 )
             elif subpath.is_dir():
                 subpath.copytree(
-                    destination / subpath.name, force_overwrite_to_cloud=force_overwrite_to_cloud
+                    destination / subpath.name,
+                    force_overwrite_to_cloud=force_overwrite_to_cloud,
+                    ignore=ignore,
                 )

         return destination
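
The control flow of this hunk can be tried out without any cloud backend. The sketch below mirrors it on local `pathlib.Path` objects; it is an approximation under assumptions, not the `cloudpathlib` implementation. `copytree_sketch`, `IgnoreFn`, and `demo` are hypothetical names, `shutil.copy2` stands in for `CloudPath.copy`, and wrapping the callable's result in `set` and annotating the parameter as `Optional` are additions of this sketch.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List, Optional

# Hypothetical alias; the diff spells the annotation out inline instead.
IgnoreFn = Callable[[str, List[str]], Iterable[str]]


def copytree_sketch(src: Path, dst: Path, ignore: Optional[IgnoreFn] = None) -> Path:
    """Recursively copy src into dst, skipping names reported by ignore."""
    # List the directory once so the same entries feed both the callable and the loop.
    contents = list(src.iterdir())

    if ignore is not None:
        ignored_names = set(ignore(str(src), [x.name for x in contents]))
    else:
        ignored_names = set()

    dst.mkdir(parents=True, exist_ok=True)

    for subpath in contents:
        if subpath.name in ignored_names:
            continue
        if subpath.is_file():
            shutil.copy2(subpath, dst / subpath.name)
        elif subpath.is_dir():
            # Pass the callable down so it is consulted again for every subdirectory.
            copytree_sketch(subpath, dst / subpath.name, ignore=ignore)

    return dst


def demo() -> List[str]:
    """Build a small tree, copy it while ignoring *.py files, and list the result."""
    tmp = Path(tempfile.mkdtemp())
    src = tmp / "src"
    (src / "nested").mkdir(parents=True)
    (src / "top.py").write_text("x = 1")
    (src / "top.txt").write_text("keep")
    (src / "nested" / "deep.py").write_text("y = 2")
    (src / "nested" / "deep.txt").write_text("keep")

    dst = copytree_sketch(src, tmp / "dst", ignore=shutil.ignore_patterns("*.py"))
    return sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*") if p.is_file())


print(demo())  # ['nested/deep.txt', 'top.txt']
```

Because `ignore` is forwarded in the recursive call, `nested/deep.py` is skipped as well as `top.py`; without that forwarding only the top level would be filtered.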
2 changes: 1 addition & 1 deletion setup.py
@@ -61,5 +61,5 @@ def load_requirements(path: Path):
         "Source Code": "https://github.com/drivendataorg/cloudpathlib",
     },
     url="https://github.com/drivendataorg/cloudpathlib",
-    version="0.10.0",
+    version="0.11.0-alpha",
 )
32 changes: 32 additions & 0 deletions tests/test_cloudpath_upload_copy.py
@@ -1,4 +1,5 @@
 from pathlib import Path
+from shutil import ignore_patterns
 from time import sleep

 import pytest
@@ -243,3 +244,34 @@ def test_copytree(rig, tmpdir):

     p.copytree(p2, force_overwrite_to_cloud=True)
     assert assert_mirrored(p2, p, check_no_extra=False)
+
+    # additional files that will be ignored using the ignore argument
+    (p / "ignored.py").write_text("print('ignore')")
+    (p / "dir1" / "file1.txt").write_text("ignore")
+    (p / "dir2" / "file2.txt").write_text("ignore")
+
+    # cloud dir to local dir but ignoring files (shutil.ignore_patterns)
+    p3 = rig.create_cloud_path("new_dir3")
+    p.copytree(p3, ignore=ignore_patterns("*.py", "dir*"))
+    assert assert_mirrored(p, p3, check_no_extra=False)
+    assert not (p3 / "ignored.py").exists()
+    assert not (p3 / "dir1").exists()
+    assert not (p3 / "dir2").exists()
+
+    # cloud dir to local dir but ignoring files (custom function)
+    p4 = rig.create_cloud_path("new_dir4")
+
+    def _custom_ignore(path, names):
+        ignore = []
+        for name in names:
+            if name.endswith(".py"):
+                ignore.append(name)
+            elif name.startswith("dir"):
+                ignore.append(name)
+        return ignore
+
+    p.copytree(p4, ignore=_custom_ignore)
+    assert assert_mirrored(p, p4, check_no_extra=False)
+    assert not (p4 / "ignored.py").exists()
+    assert not (p4 / "dir1").exists()
+    assert not (p4 / "dir2").exists()
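
For the names used in this test, the custom function and the `ignore_patterns` call should select the same entries. A quick standalone check, assuming a module-level `custom_ignore` rewritten from the test's helper and a made-up `names` list:

```python
from shutil import ignore_patterns


def custom_ignore(path, names):
    """Module-level rewrite of the test's _custom_ignore helper."""
    return [name for name in names if name.endswith(".py") or name.startswith("dir")]


# Hypothetical listing; "subdir" is included to show that "dir*" only matches a prefix.
names = ["ignored.py", "dir1", "dir2", "file.txt", "subdir"]

from_patterns = ignore_patterns("*.py", "dir*")("unused", names)
from_custom = custom_ignore("unused", names)

# ignore_patterns returns a set and the custom helper a list, so compare as sets.
assert set(from_custom) == from_patterns
print(sorted(from_custom))  # ['dir1', 'dir2', 'ignored.py']
```

The two are not interchangeable in general: `ignore_patterns` relies on `fnmatch`, whose case sensitivity follows the platform, while `str.startswith` and `str.endswith` are always case-sensitive.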
