-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CompressionLevel.SmallestSize #41960
Conversation
Note regarding the This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change. |
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
src/libraries/System.IO.Compression/src/System/IO/Compression/DeflateZLib/ZLibNative.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.IO.Compression/src/System/IO/Compression/CompressionLevel.cs
Outdated
Show resolved
Hide resolved
Can you please add a test(s) for this new value? I see our existing tests are not great. For deflate, we apparently don't test passing |
I'd want to ask how would I test it. Adding new items to |
Why does it need new items in testdata? The tests can compress any existing test data and decompress again. The goal is simply to check that it works with the specified value. |
If I read everything correctly, the unit test compresses the sample files, and compare them with precompressed result. If we add a new compression level, the compressed results should be added to testdata. |
Right, but given the limited coverage of these flags anyway simply decompressing again would have value and a future PR could update test data to verify the compressed result as well. I can certainly dig up instructions for updating it but figured it could be done separately |
There is coverage here and it needs to be updated: https:/dotnet/runtime/blob/master/src/libraries/Common/tests/System/IO/Compression/CompressionStreamUnitTestBase.cs#L344. Search for CompressionLevel in this directory and you should find multiple places to update. To update test-data submit a PR here https:/dotnet/runtime-assets/tree/master/src/System.IO.Compression.TestData/GZipTestData. However actually validating binary content of the compression algorithm leads to fragile tests. We don’t guarantee the results are binary identical as we call different zlib implementations on different platforms. I think it’s fair to assert in a test that the new enum does better on a set of payloads than other flags, since that is its promise. We should add a case to cover this. |
@ericstj would it make sense to iterate through all the
|
We could try - maybe a weaker ordering in that eg Optimal is no smaller than Smallest. BTW, since nobody is testing the algorithm implementations, I would imagine that you could simply compress a file generated in memory in some creative way (not too much entropy not too little..). |
Agree with @danmosemsft -- there's no claims about sizes between optimal and fastest. Those are statements about time it takes and balance of the parameters. SmallestSize is making a claim about size and we better back that up with tests. |
It sounds like we are testing the behavior of zlib which we don't own. |
@huoyaoyuan we have no interest in testing zlib. However we would like to know that “something” sensible happens when we are passed each of these settings. At a minimum that the operation succeeds. Ideally to also verify that we are passing the value through. The size test was an idea for doing that which hopefully would be both simple and stable in the face of any zlib change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/ref changes look good. Needs tests as highlighted in other comments. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's no claims about sizes between optimal and fastest.
I see what you mean, @ericstj. Maybe the tests could do something like this:
- Iterate through all the
CompressionLevel
enum values. - For each enum value, generate a compressed file containing a few text files.
- Verify the compressed files were created.
- Open the compressed files, open the text files inside them.
- Verify that the text files are not corrupted (their contents are the same as the original text files).
Right, where presumably nothing need hit the disk. |
@dotnet/dnceng is there a way to figure out why we got |
This returned false because the job (and therefore the execution of the "WaitForHelixJobCompletion" MSBuild task) was cancelled. It did not log an error because it had not encountered an error by the time this occurred. I'll also poke at your jobs to see if the timeout was unusual because 2.5 hours is a long time. |
Right, I'm just wondering how we might tell why it ran so slow (or hung) |
It's definitely something going catastrophically, machine-tearing-down bad with System.Threading.Tests in the windows.10.amd64.server19h1.es.open run but it's just so bad I haven't got anything useful to say about the whats and the whys yet. |
@danmosemsft the best guess I have to this is:
All sorts of funky workloads get run on these machines so I don't necessarily think System.Threading caused a memory access panic, it just was the work item running when something so catastrophic it broke python 3's memory management happened. |
How curious @MattGal . OK it sounds like watchful waiting to see whether we see that again unless you see some obvious step we should take now that would give more info next time. |
Test added for validation of the enum member. Does this PR still need more tests? Comparing sizes may be added as an enhancement. |
Co-authored-by: Dan Moseley <[email protected]>
…reamUnitTestBase.cs
3b71109
to
1b990d1
Compare
@danmosemsft, FYI, unless things have changed recently, re-running via devops doesn't rebase on top of changes that have come in since the last commit, so rerunning in that manner doesn't help to pick up fixes that have since been merged. I sync'd the change and pushed a rebase. |
@stephentoub ah yes -- does close/reopen rebase, though? (@MattGal ?) |
Sorry I don't know the answer here, someone else on @dnceng might. |
Let's see! |
OK, it's mergeable. @carlossanlop could you or someone please review? |
5188ab9
to
0476a85
Compare
0476a85
to
d3adc38
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a comment for the unit test.
return mms.Length; | ||
} | ||
|
||
long noCompressionLength = await GetLengthAsync(CompressionLevel.NoCompression); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From this comment:#41960 (comment)
We cannot guarantee size order. This test should instead just verify that the contents of the generated compressed files are not corrupt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danmosemsft any agreement please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does seem likely though that size levels increase monotonically -- that would otherwise break my expectations -- my suggestion is to keep the test as is, and if it breaks at some future point, we can just loosen the test then. does that seem reasonable @carlossanlop ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine by me. @ericstj ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ericstj is this OK to merge? |
/azp run runtime |
Azure Pipelines successfully started running 1 pipeline(s). |
Hello @ericstj! Because this pull request has the p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (
|
Closes #1549 .
pal_zlib.c
passes it as an integer directly to zlib, so no additional change required?