Match GNU semantics for missing EOF #4009

Merged
merged 1 commit into from
Oct 17, 2022
30 changes: 25 additions & 5 deletions src/uu/split/src/split.rs
@@ -37,8 +37,9 @@ static OPT_HEX_SUFFIXES: &str = "hex-suffixes";
 static OPT_SUFFIX_LENGTH: &str = "suffix-length";
 static OPT_DEFAULT_SUFFIX_LENGTH: &str = "0";
 static OPT_VERBOSE: &str = "verbose";
-//The ---io-blksize parameter is consumed and ignored.
+//The ---io and ---io-blksize parameters are consumed and ignored.
 //The parameter is included to make GNU coreutils tests pass.
+static OPT_IO: &str = "-io";
 static OPT_IO_BLKSIZE: &str = "-io-blksize";
 static OPT_ELIDE_EMPTY_FILES: &str = "elide-empty-files";

@@ -159,6 +160,13 @@ pub fn uu_app<'a>() -> Command<'a> {
                 .long(OPT_VERBOSE)
                 .help("print a diagnostic just before each output file is opened"),
         )
+        .arg(
+            Arg::new(OPT_IO)
+                .long(OPT_IO)
+                .alias(OPT_IO)
+                .takes_value(true)
+                .hide(true),
+        )
         .arg(
             Arg::new(OPT_IO_BLKSIZE)
                 .long(OPT_IO_BLKSIZE)
@@ -922,10 +930,22 @@ impl<'a> Write for LineBytesChunkWriter<'a> {
                 // then move on to the next chunk if necessary.
                 None => {
                     let end = self.num_bytes_remaining_in_current_chunk;
-                    let num_bytes_written = self.inner.write(&buf[..end.min(buf.len())])?;
-                    self.num_bytes_remaining_in_current_chunk -= num_bytes_written;
-                    total_bytes_written += num_bytes_written;
-                    buf = &buf[num_bytes_written..];
+
+                    // This is ugly but here to match GNU behavior. If the input
+                    // doesn't end with a \n, pretend that it does for handling
+                    // the second to last segment chunk. See `line-bytes.sh`.
+                    if end == buf.len()
+                        && self.num_bytes_remaining_in_current_chunk
+                            < self.chunk_size.try_into().unwrap()
+                        && buf[buf.len() - 1] != b'\n'
+                    {
+                        self.num_bytes_remaining_in_current_chunk = 0;
+                    } else {
+                        let num_bytes_written = self.inner.write(&buf[..end.min(buf.len())])?;
+                        self.num_bytes_remaining_in_current_chunk -= num_bytes_written;
+                        total_bytes_written += num_bytes_written;
+                        buf = &buf[num_bytes_written..];
+                    }
                 }

                 // If there is a newline character and the line
16 changes: 16 additions & 0 deletions tests/by-util/test_split.rs
@@ -659,6 +659,22 @@ fn test_line_bytes_no_empty_file() {
     assert!(!at.plus("xak").exists());
 }

+#[test]
Contributor

I am confused: why isn't this testing the "-io" option too?

Contributor Author

The ---io option has unspecified behavior in GNU split. The code currently accepts and ignores any ---io flag. It is not clear how the flag should be handled, because the "chunking" of data is very different.

+fn test_line_bytes_no_eof() {
+    let (at, mut ucmd) = at_and_ucmd!();
+    ucmd.args(&["-C", "3"])
+        .pipe_in("1\n2222\n3\n4")
+        .succeeds()
+        .no_stdout()
+        .no_stderr();
+    assert_eq!(at.read("xaa"), "1\n");
+    assert_eq!(at.read("xab"), "222");
+    assert_eq!(at.read("xac"), "2\n");
+    assert_eq!(at.read("xad"), "3\n");
+    assert_eq!(at.read("xae"), "4");
+    assert!(!at.plus("xaf").exists());
+}
+
 #[test]
 fn test_guard_input() {
     let ts = TestScenario::new(util_name!());
4 changes: 4 additions & 0 deletions tests/fixtures/split/noeof.txt
@@ -0,0 +1,4 @@
+1
+2222
+3
+4
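
For readers who want to trace the expected output of `test_line_bytes_no_eof` by hand, here is a minimal sketch of the chunking decision as a pure function over a whole in-memory buffer. It is not the code from this PR: `split_line_bytes` is a hypothetical name, and the real `LineBytesChunkWriter` is a streaming `Write` implementation that sees the input in pieces. The `None` arm mirrors the condition added in the `split.rs` hunk (the unterminated tail would exactly fill the space left in a partly filled chunk); the other arms are assumptions about the surrounding branches that are not shown in the diff.

```rust
/// Hypothetical whole-buffer model of `split -C <chunk_size>`.
/// Returns the contents of each output chunk in order.
fn split_line_bytes(input: &[u8], chunk_size: usize) -> Vec<Vec<u8>> {
    assert!(chunk_size > 0, "chunk size must be positive");
    let mut chunks: Vec<Vec<u8>> = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    let mut buf = input;

    while !buf.is_empty() {
        let remaining = chunk_size - current.len();
        if remaining == 0 {
            // The current chunk is full: close it and start a new one.
            chunks.push(std::mem::take(&mut current));
            continue;
        }
        match buf.iter().position(|&b| b == b'\n') {
            // The line, including its newline, fits in the current chunk.
            Some(i) if i < remaining => {
                current.extend_from_slice(&buf[..=i]);
                buf = &buf[i + 1..];
            }
            // The line is longer than a whole chunk and the chunk is
            // empty: fill the chunk with as much of the line as fits.
            Some(_) if current.is_empty() => {
                current.extend_from_slice(&buf[..remaining]);
                buf = &buf[remaining..];
            }
            // The line does not fit and the chunk already has content:
            // close the chunk and retry the line in a fresh one.
            Some(_) => {
                chunks.push(std::mem::take(&mut current));
            }
            // No newline left, so `buf` is an unterminated final line.
            None => {
                if remaining == buf.len() && !current.is_empty() {
                    // Mirrors the condition in the hunk: pretend the tail
                    // ends with a newline, so it no longer fits here.
                    chunks.push(std::mem::take(&mut current));
                } else {
                    let n = remaining.min(buf.len());
                    current.extend_from_slice(&buf[..n]);
                    buf = &buf[n..];
                }
            }
        }
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

Calling `split_line_bytes(b"1\n2222\n3\n4", 3)` on the `noeof.txt` contents yields the five chunks `"1\n"`, `"222"`, `"2\n"`, `"3\n"`, `"4"`, matching the files `xaa` through `xae` asserted in the test. Without the special case in the `None` arm, the trailing `4` would be appended to the `"3\n"` chunk.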