Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve path join test assertion messages and fix minor test bugs #1528

Merged
merged 8 commits into from
Aug 19, 2024

Conversation

EliahKagan
Copy link
Member

@EliahKagan EliahKagan commented Aug 19, 2024

This fixes inaccurate assertion messages in the path-joining tests, improves some messages that were okay but capable of being clarified, and renames a couple of variables to avoid making it seem like relative paths are absolute. Most of the changes are to the Windows tests. This relates to #1374 (review).

It also corrects a few tests where the same thing was asserted multiple times with a different message rather than asserting the subtly different claim that the message showed was intended, or where it tested paths with a different number of backslashes than the context and message show were meant.

Finally, though this is only a refactoring, it changes all string literals with backslash-containing paths to be raw string literals. I think this makes it easier and also faster to understand what paths are being tested, and generally improves readability a bit.

(Sometimes it makes sense to write all Windows paths as raw string literals, even those without backslashes, but I didn't do that, because in this case I think it would produce an effect that, one way or another, would be confusing or at least feel strange, when comparing corresponding tests in the Windows and non-Windows build configurations.)

Several assertion messages had contained expressions of surprise at the effects. When rewording made this clear, I removed such wording. Elsewhere I kept it. In one place, I even added it: I do not know why concatenating \ and \\localhost on Windows produces \localhost, and it seems like the intent had even been to assert that it produces \\localhost (because it characterized the result as being UNC). But it does produce \localhost for some reason. Hopefully the message there can be improved further, but unless more is known about that now, I think my characerization as "strangely, single-backslashed host" may still be an improvement.

Some more details are available in commit messages and, in what may be an easier form to work with, in review comments I've posted below on particular changes.

This rewrites the custom message in that assertion so it states the
applicable semantics and avoids claiming that `c:relative` and
`c:\relative` are equivalent, which is not generally, nor usually,
the case.

(The message was also written with an unescaped `\r` in a non-raw
string literal, but a literal `\r` was intended. The new wording
no longer includes an explicit path, but if that is added then the
problem can be avoided by writing `\\r` or writing a raw string.)

This relates to:

- https:/Byron/gitoxide/pull/1374/files/e955770c0b5c06a6c8518a06df4aa0cc3b506f16#r1721045107
Although the path `C:` is relative because it indicates the current
directory on the C: drive, it still "takes over" when concatenated
after at most if not all paths (including absolute paths).

When the paths it is concatenated with are absolute paths with
drive letters other than C:, this behavior, while not obvious, is
conceptually clear such that no other behavior could be correct.
This is because the (drive-specific) current directory on the C:
drive is not located on another drive, and therefore not found
under any path that specifies a drive that is not C:.

(There are other situations where `C:` on the right takes
precedence where the explanation is less elegant, but they are
mostly not relevant to the assertions currently present here.)

This relates to:

- https:/Byron/gitoxide/pull/1374/files/e955770c0b5c06a6c8518a06df4aa0cc3b506f16#r1721055278
- https:/Byron/gitoxide/pull/1374/files/e955770c0b5c06a6c8518a06df4aa0cc3b506f16#r1721079186
Since the paths `/absolute` and `\absolute` are not absolute paths
on Windows.
This abbreviates the significance of `\\localhost` more
specifically, as win-absolute-unc-host rather than win-absolue-unc.

It also points out what seems unusual about joining `\` and
`\\localhost` to get `\localhost` with just one backslash. This
message should be further revised to state why this behavior
happens or otherwise describe it clearly, once known. (Or if this
turns out to be a bug in the standard library, then that can be
stated, with an issue link either in the code or commit message.)

This relates to:

- https:/Byron/gitoxide/pull/1374/files/e955770c0b5c06a6c8518a06df4aa0cc3b506f16#r1721079809
So they test what they strongly appear to be intending to test: the
behavior of a UNC-style path `\\localhost`. They were instead
testing `\localhost` because `\\` in a non-raw string becomes a
single backslash.

Although the symbolic representations in the assertion messages
were clear enough to identify what the tests intended to test, this
also adjusts those slightly so they remain comparable to the
corresponding Windows tests whose assertion messages were recently
changed.
Copy link
Member Author

@EliahKagan EliahKagan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These identify specific comments in #1374 (review) to which these changes relate.

gix-fs/tests/stack/mod.rs Show resolved Hide resolved
gix-fs/tests/stack/mod.rs Show resolved Hide resolved
gix-fs/tests/stack/mod.rs Show resolved Hide resolved
gix-fs/tests/stack/mod.rs Show resolved Hide resolved
@EliahKagan EliahKagan mentioned this pull request Aug 19, 2024
23 tasks
Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for digging into this again and for the clarifications.

Indeed, these tests only exists to 'test' the std::path::Path implementation and get a feeling what happens in varying scenarios. Hopefully having written these down gives enough confidence that joining two relative paths won't accidentally create an absolute path, which could certainly be exploited.

@Byron Byron merged commit 5ac3abb into GitoxideLabs:main Aug 19, 2024
19 checks passed
@EliahKagan EliahKagan deleted the join branch August 19, 2024 08:39
@EliahKagan
Copy link
Member Author

EliahKagan commented Aug 23, 2024

Thanks a lot for digging into this again and for the clarifications.

I had not mentioned why I did not also add a clarifying comment for this (which was present before, not introduced here, but which I had considered clarifying here):

https:/Byron/gitoxide/blob/c3f173c94968907ecc685c5383f2fe24c02dcfbf/gix-fs/tests/stack/mod.rs#L1

I had originally planned to comment that this was a workaround for rust-lang/rust-clippy#12244. This would also add another source of information about how paths that start with / or \ on Windows are not really absolute, except when they start with \\. Therefore I didn't further clarify that in any other ways, except through small adjustments to the custom assertion messages.

I didn't include this because the clippy::join_absolute_paths suppression does not have an affect on that, though it might once have, in an earlier draft. So it would be wrong, or at least misleading, to say it is a workaround for that clippy issue.

In the tests, the leading-/ and leading-\ paths on Windows are placed in variables:

https:/Byron/gitoxide/blob/c3f173c94968907ecc685c5383f2fe24c02dcfbf/gix-fs/tests/stack/mod.rs#L38

https:/Byron/gitoxide/blob/c3f173c94968907ecc685c5383f2fe24c02dcfbf/gix-fs/tests/stack/mod.rs#L43

These variables were originally named absolute and bs_absolute, which was misleading since the paths are not absolute on Windows, and I wonder if the reason for putting them in variables had been that clippy diagnostic. But with the new variable names looks_absolute and bs_looks_absolute, I think it is clearer even than the use of literal strings for them.

I think that diagnostic is only for literals. Commenting out the suppression on Windows does not seem to give any new cargo clippy warnings. So at least with clippy 0.1.80, this seems to have an affect only on non-Windows systems, since in the non-Windows tests, some such literals are used (for comparison to the Windows tests).

Because the tests are supposed to speak for themselves, I am reluctant to add a big commented explanation about these issues to the file. But if I can think of something that would improve clarity without making the code more cumbersome to read or work with, then I'll open another PR.

Indeed, these tests only exists to 'test' the std::path::Path implementation and get a feeling what happens in varying scenarios. Hopefully having written these down gives enough confidence that joining two relative paths won't accidentally create an absolute path, which could certainly be exploited.

Some relative paths would of course also be exploitable, such as if permitted paths could produce a relative path with a .. component in it, or a more dotted component like ... in it in a scenario where such a component performs upward traversal, or a relative path like C: or C:foo that specifies a drive, or a relative path like / or /foo or \ or \foo that is from the root of a drive, or a path that contains a symlink component that is used in a way that traverses symlinks, or almost any path coupled with the ability to create new hard links to existing files or (on Windows) other non-symlink reparse points such as junctions. I think we cover all that though, except the latter which we need not cover in paths because we never provide that ability.

For paths like .. as a tree with that name, #1374 covers that explicitly. For paths like ... I think we do not ever use them in a way that treats that specially on any operating system; they may also be directly blocked in some situations. For paths like C: and C:foo, I believe we are disallowing them since #1374 in the same way we disallow NTFS streams (alternate data streams, NTFS directory streams, etc.), which also use a : notation. For paths like / and /foo represented in the expected way as trees, we were already blocking that before CVE-2024-35186 was fixed in #1374. For paths like /, /foo, as well as \ and \foo on Windows, represented as single filenames, as well as x/../y and all its variations represented as single filenames, this was another aspect of CVE-2024-35186 that #1374 targeted explicitly.

With all that said, on Windows, you can join two relative paths to produce an absolute path, because C:, /, and \ are all relative paths on Windows, but C:/ and C:\ are absolute paths. Consider this program:

use std::path::Path;

fn main() {
    let drive = Path::new("C:");
    let slash = Path::new("/");
    let backslash = Path::new(r"\");
    let drive_slash = drive.join(slash);
    let drive_backslash = drive.join(backslash);

    for path in [drive, slash, backslash, drive_slash.as_path(), drive_backslash.as_path()] {
        let path_str = path.to_str().expect("valid Unicode");
        println!("{:3}  relative? {:5}  absolute? {}", path_str, path.is_relative(), path.is_absolute());
    }
}

On Windows, that outputs:

C:   relative? true   absolute? false
/    relative? true   absolute? false
\    relative? true   absolute? false
C:/  relative? false  absolute? true 
C:\  relative? false  absolute? true

Conceptually (and informally), what is happening is that C: is relative because only the drive it specifies is absolute, and / and \ are relative on Windows because only the drive-relative location they specify is absolute. So C:/ (or C:\) is absolute because the relative piece in each of C: and / (or \) has been resolved using corresponding absolute information from the other.

But because we are not allowing paths like C:, and not allowing paths like / or \, I think we are still okay.

@Byron
Copy link
Member

Byron commented Aug 24, 2024

Thanks for sharing! Paths are…fascinating :).

Because the tests are supposed to speak for themselves, I am reluctant to add a big commented explanation about these issues to the file. But if I can think of something that would improve clarity without making the code more cumbersome to read or work with, then I'll open another PR.

Since these tests are merely for documenting current behaviour, I think making it more obvious what they are testing (or what one should or could learn) is in their interest. Please feel free to open a PR with additional remarks, it's valuable to write it down while it's in active memory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants