ZipArchive.Open will read non-seekable stream to memory & fail with bad error on large streams #59027
Comments
Tagging subscribers to this area: @dotnet/area-system-io-compression
@carlossanlop this is something that we should consider for the .NET 7 Compression work, as it's related to supporting large files (> 2GB).
This is particularly bad when the stream being passed to the ZipArchive constructor is a 10-15 minute download and you want to be processing the contents as they become available. Please consider supporting this scenario. The first thing ZipArchive currently does with a seekable stream is shown at runtime/src/libraries/System.IO.Compression/src/System/IO/Compression/ZipArchive.cs, lines 520 to 522 in 78593b9.
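Why readers seek first is visible in the ZIP layout itself. Here is a language-agnostic sketch (in Python, since the on-disk format is the same for every reader; the file names and contents are arbitrary): the end-of-central-directory (EOCD) record, which locates the authoritative entry list, sits at the very end of the archive, so a reader that starts from the central directory must seek to the end first.

```python
import io
import struct
import zipfile

# Build a tiny archive in memory (names/contents are arbitrary).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
    zf.writestr("b.txt", "world")
data = buf.getvalue()

# The EOCD record (signature b"PK\x05\x06") is the last structure in the file.
eocd_pos = data.rfind(b"PK\x05\x06")

# Fixed 22-byte EOCD layout per APPNOTE: signature, disk numbers, entry
# counts, central directory size, central directory offset, comment length.
sig, _, _, _, total_entries, cd_size, cd_offset, comment_len = \
    struct.unpack("<IHHHHIIH", data[eocd_pos:eocd_pos + 22])

# The central directory it points at starts with signature b"PK\x01\x02".
print(total_entries, data[cd_offset:cd_offset + 4])
```

Nothing before the EOCD tells the reader where the central directory is, which is exactly why a forward-only stream defeats this strategy.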
https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT mentions streaming .zip files and says that every local file header must have an entry in the central directory. So that takes care of the deleted files concern and shows that streaming is a known use case.
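The streaming read that APPNOTE permits can be sketched like this (Python again, as a runnable format-level illustration; the sample entries are made up). Each entry's data is preceded by a local file header, so a reader can walk the stream front to back without ever seeking. The caveat a real streaming reader must handle is that archives written to a non-seekable output set general-purpose-flag bit 3 and record the sizes in a trailing data descriptor instead; this sketch assumes the simple case where sizes are present in the local header.

```python
import io
import struct
import zipfile

# A tiny archive; zipfile stores entries uncompressed by default, and when
# writing to a seekable target it records sizes in the local headers.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
    zf.writestr("b.txt", "world!")

stream = io.BytesIO(buf.getvalue())  # stand-in for a network stream

entries = []
while True:
    fixed = stream.read(30)  # fixed-size part of a local file header
    if fixed[:4] != b"PK\x03\x04":
        break  # central directory reached; a streaming reader stops here
    (_, _, flags, method, _, _, _, csize, usize, name_len, extra_len) = \
        struct.unpack("<IHHHHHIIIHH", fixed)
    name = stream.read(name_len).decode()
    stream.read(extra_len)        # skip the extra field
    payload = stream.read(csize)  # entry data follows immediately
    entries.append((name, payload))

print(entries)
```

Entries deleted from an archive still appear as orphaned local headers, which is why the central directory remains authoritative; a streaming reader trades that guarantee for incremental processing.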
To get myself unblocked, I created a proof of concept which successfully reads a streaming .zip file: https://gist.github.com/jnm2/31bdf08357a44c91d01736ad43b9c447

```csharp
await using var reader = new StreamingZipReader(downloadStream);
while (await reader.MoveToNextEntryAsync(skipDirectories: true, CancellationToken.None))
{
    Console.WriteLine($"{reader.CurrentEntry.Name}: {reader.CurrentEntry.Length} bytes");
    using var stream = reader.GetCurrentEntryStream();
    using var testReader = new StreamReader(stream);
    var test = await testReader.ReadToEndAsync();
    // (my test download had only text files, and they all looked right!)
}
```
FWIW, with permission from @jnm2 I've published StreamingZipReader as a NuGet package (https://www.nuget.org/packages/StreamingZipReader) and recently fixed a bug with regard to ZIP64 support.
Description

When we pass a non-seekable stream to ZipArchive, it will read it into a MemoryStream. That will only work if the data can fit in a MemoryStream. I assume that this is because the Zip format requires seeking (central directory, etc.).

However, this is very surprising and can cause both performance issues, due to loading all contents into memory, and unexpected failures when the data is larger than 2 GB.

Analysis

The reason for this is here: runtime/src/libraries/System.IO.Compression/src/System/IO/Compression/ZipArchive.cs, line 144 in 00c38c7.

I'm not sure if there is a way to fix this, given backward compatibility issues.
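The buffering behavior described above can be sketched like this (in Python, as a format-neutral stand-in for what the .NET code does; `NonSeekable` is a hypothetical wrapper introduced only for illustration). The whole stream is copied into memory before the archive is opened, so memory use equals the archive size, and a fixed-capacity buffer like .NET's MemoryStream (backed by a single array indexed by Int32) cannot hold more than 2 GB.

```python
import io
import zipfile

class NonSeekable(io.RawIOBase):
    """Hypothetical forward-only stream, e.g. a network download."""
    def __init__(self, data):
        self._inner = io.BytesIO(data)
    def readable(self):
        return True
    def seekable(self):
        return False
    def readinto(self, b):
        return self._inner.readinto(b)

# Build a small archive to play the role of the download.
src = io.BytesIO()
with zipfile.ZipFile(src, "w") as zf:
    zf.writestr("a.txt", "hello")

download = NonSeekable(src.getvalue())

# What ZipArchive effectively does today with a non-seekable input: drain
# the whole stream into an in-memory buffer, then open that seekable copy.
buffered = io.BytesIO(download.read())
with zipfile.ZipFile(buffered) as zf:
    content = zf.read("a.txt")
print(content)
```

The copy makes every byte arrive before the first entry can be read, which is the latency problem described in the comments above, independent of the 2 GB failure mode.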