GLB file size is limited to 2^32 bytes because of uint32 length #2114

Open
cansik opened this issue Jan 30, 2022 · 11 comments
@cansik

cansik commented Jan 30, 2022

The GLB file size is currently limited to 2^32 bytes (approximately 4 GB), which could be limiting for some applications. For example, we would like to store a mesh sequence in the glTF format, which quickly exceeds the 2^32 byte limit.

It is understandable to limit the individual chunk size to 2^32, but limiting the total file size to 2^32 seems to restrict future applications. Is there a technical limitation to only support datatypes with size of an unsigned int? Or is there a workaround for this limitation that integrates everything into a single file?

Here is the current binary layout definition of a GLB file:
[Image: GLB binary layout diagram. Source: Binary glTF Layout]
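
To make the limit concrete, here is a minimal sketch of the 12-byte GLB header (magic, version, length, each a little-endian uint32). The helper name `pack_glb_header` is made up for illustration and is not part of any glTF library; the point is only that the total file length has to fit into the third uint32 field.

```python
import struct

# GLB header: three little-endian uint32 fields (magic, version, length).
GLB_MAGIC = 0x46546C67  # the ASCII bytes "glTF" when written little-endian
GLB_HEADER = struct.Struct("<III")

MAX_GLB_LENGTH = 2**32 - 1  # largest value the uint32 length field can hold


def pack_glb_header(total_length: int, version: int = 2) -> bytes:
    """Hypothetical helper: build the 12-byte header.

    Raises struct.error when total_length does not fit into a uint32.
    """
    return GLB_HEADER.pack(GLB_MAGIC, version, total_length)


header = pack_glb_header(MAX_GLB_LENGTH)
magic, version, length = GLB_HEADER.unpack(header)

try:
    pack_glb_header(2**32)  # one byte past the limit
    overflow_rejected = False
except struct.error:
    overflow_rejected = True
```

Any writer that wants to stay within the current GLB layout has to reject or split data once the combined header and chunks exceed `MAX_GLB_LENGTH`.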

@lexaknyazev
Member

The current GLB format has several well-known shortcomings, and the 4 GiB size limitation is one of them. This will likely be addressed in future container format revisions, be it GLBv3 or something completely new.

/cc @emackey @javagl

@emackey
Member

emackey commented Jan 30, 2022

@cansik Is there a benefit to storing > 4GB in a single file? It's a heavy lift in that form.

One possibility is to use *.gltf (not GLB) with multiple buffers, such that each buffer can be a more readable size.

Some systems such as 3D Tiles break up their dataset into a hierarchy, with lots of self-contained GLBs storing sections of the data, along with a folder system that makes it easy to find and download-on-demand a couple of small GLBs representing what the user is looking at in the moment.
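
As a rough sketch of the multi-buffer *.gltf option mentioned above: the `buffers` array of a glTF asset can list several external `.bin` files, each with its own `uri` and `byteLength`. The function `plan_buffers`, the file naming scheme, and the 1 GiB cap are assumptions chosen for illustration. A real exporter would also have to split at bufferView boundaries, because each bufferView refers to exactly one buffer.

```python
import json


def plan_buffers(total_bytes: int, max_buffer_bytes: int, stem: str = "data") -> list:
    """Hypothetical helper: split a payload size into glTF buffer entries.

    Only sizes and file names are planned here; no binary data is written.
    """
    buffers = []
    remaining = total_bytes
    index = 0
    while remaining > 0:
        size = min(remaining, max_buffer_bytes)
        buffers.append({"uri": f"{stem}_{index}.bin", "byteLength": size})
        remaining -= size
        index += 1
    return buffers


# 10 GiB of geometry split into buffers of at most 1 GiB each (assumed cap).
gltf = {
    "asset": {"version": "2.0"},
    "buffers": plan_buffers(10 * 2**30, 2**30),
}
gltf_json = json.dumps(gltf, indent=2)
```

The resulting `.gltf` file stays small, and a client can fetch only the `.bin` files it needs.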

@javagl
Contributor

javagl commented Jan 30, 2022

The limitation itself might be a tribute to glTF's origin as a transmission format: A >4GB file will likely not be transferred to the client as a single, huge blob. So I'd hesitate to say that glTF is the right "layer" to address this, but might be convinced otherwise. In any case, this cannot easily be changed in a backward-compatible way, except by defining the GLB version to be 3 and storing the length somewhere else.

A container format could alleviate the problem. And as long as there is no dedicated container format above glTF, there could even be very use-case specific solutions. A "mesh sequence" could be many things (and it sounds like something clients will usually not know what to do with, unless there is an extension for this), but maybe some JSON file with { meshSequence: [ "0.glb", "1.glb"...] } could be a first shot...
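
A minimal sketch of that "first shot", assuming one GLB per frame named `0.glb`, `1.glb`, and so on. The `meshSequence` key comes from the suggestion above; the helper names are hypothetical, and no existing client or extension is known to read such a file.

```python
import json


def build_sequence_manifest(frame_count: int) -> dict:
    """Hypothetical helper: list one GLB file per frame, in playback order."""
    return {"meshSequence": [f"{i}.glb" for i in range(frame_count)]}


def frame_uri(manifest: dict, frame: int) -> str:
    """Resolve a frame index to the GLB that stores it."""
    return manifest["meshSequence"][frame]


manifest = build_sequence_manifest(3)
manifest_json = json.dumps(manifest)
```

Each frame stays an ordinary GLB well under the uint32 limit, and only the manifest needs custom handling.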

An aside:

> One possibility is to use *.gltf (not GLB)

I also prefer the idea of a GLB being "self-contained", but I think it is not disallowed by the spec for a GLB to refer to external resources (and cannot remember that it has been explicitly discouraged anywhere - maybe somewhere hidden in #1117 ...?).

@donmccurdy
Contributor

> I also prefer the idea of a GLB being "self-contained", but ... cannot remember that it has been explicitly discouraged anywhere

I'd summarize #1117 as: public user-facing tools should try to provide .gltf with external resources AND/OR .glb with embedded resources. Other arrangements — .gltf with embedded resources or .glb with external resources — are allowed and valid where appropriate for a project, but I think it's better for most users if these don't proliferate in the 3D ecosystem.

@bghgary
Contributor

bghgary commented Feb 7, 2022

FWIW, I asked about self-contained glb a while back and there is a valid reason for using external resources with a glb for servers.

Question: #828 (comment)
Response: #828 (comment)

@vpenades
Contributor

vpenades commented Dec 8, 2023

As a workaround, I would suggest using zip as a plain container, and maybe the extension .GLZ could be used to distinguish it from GLB.

This would allow limitless file sizes, and clients and tools would easily adopt it. The worst-case scenario could be resolved by unzipping to a directory.

The recommendation would be to keep using GLB for transmission scenarios and GLZ for large files, backup, and intermediate workflows.

@javagl
Contributor

javagl commented Dec 8, 2023

A plain ZIP (as an archive) could have some issues because it does not allow random access. There are approaches for extending ZIP files with some sort of "index" (basically: A file that is always stored as the last entry in the ZIP, and stores a mapping of "file name" to "byte offset in the ZIP"), but no real 'standard' for that, as far as I know.

@vpenades
Contributor

vpenades commented Dec 8, 2023

> it does not allow random access

That's not completely true. The USDZ format, which is a glTF competitor, uses a ZIP file with the restriction of forbidding file compression, which results in a plain file with a TOC and randomly accessible files.

Krita .KRA files, and OpenRaster .ORA use a similar approach: the entries in the ZIP must be stored uncompressed to allow random access. So to some degree, we could say that uncompressed ZIPs are becoming a thing.

But it all depends on how you are going to consume the files. My SharpGLTF library already supports zipped glTFs, and I don't see any problem handling compressed ZIPs.
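
To illustrate why uncompressed (stored) entries are randomly accessible: once the entry's local header offset is known from the central directory, the payload is a plain byte range inside the archive. The sketch below uses Python's standard `zipfile` module; the helper names and the entry names are made up for the example, and the archive is held in memory only to keep it short (a real reader would seek in or memory-map the file).

```python
import io
import struct
import zipfile


def write_stored_zip(entries: dict) -> bytes:
    """Write all entries uncompressed (ZIP_STORED) and return the archive bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    return buf.getvalue()


def read_stored_entry(archive: bytes, name: str) -> bytes:
    """Slice a stored entry straight out of the archive, without decompression."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        info = zf.getinfo(name)  # metadata comes from the central directory
    if info.compress_type != zipfile.ZIP_STORED:
        raise ValueError(f"{name} is compressed; it cannot be sliced directly")
    # The local file header has 30 fixed bytes; the file name length and the
    # extra field length are the two uint16 values at byte offsets 26 and 28.
    name_len, extra_len = struct.unpack_from("<HH", archive, info.header_offset + 26)
    start = info.header_offset + 30 + name_len + extra_len
    return archive[start:start + info.file_size]


payload = bytes(range(256))
archive = write_stored_zip({"scene.gltf": b"{}", "data_0.bin": payload})
recovered = read_stored_entry(archive, "data_0.bin")
```

Note that plain ZIP also uses 32-bit size and offset fields, so archives beyond 4 GiB rely on the ZIP64 extension (which `zipfile` enables by default).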

@javagl
Contributor

javagl commented Dec 8, 2023

> a ZIP file with the restriction of forbidding file compression, which results in a plain file with a TOC and randomly accessible files.

Yes, this could be an option. There's still the small caveat that on the consuming side, people will usually have to invest some effort there: I think that most "ZIP libraries" (as 'common libraries for zip handling in different programming languages') tend to hide that as an abstraction, and only offer functionalities like "iterating over all entries", or "looking up a certain entry (with a linear search under the hood)".

More technically/specifically: These libraries do not necessarily provide the low-level access to the ZIP central directory that would allow constant-time lookups.

(All this does not prevent this approach, but should be kept in mind when making this an integral part of a specification.)

@vpenades
Contributor

vpenades commented Dec 8, 2023

The ZIP file format has a table of contents with offsets and sizes that is read before accessing the rest of the content. I know a few ZIP libraries and all of them give you random access to the contents, even when compressed.

In fact, random access is available in most archive formats, the only ones not supporting random access are Tar.GZ and Rar/7z when compressed as solid archives.
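
As one data point for the per-entry access described above, here is a small sketch with Python's standard `zipfile` module. In CPython, the central directory is read when the archive is opened, `getinfo` looks the name up in an in-memory table, and `read` seeks to that single entry and decompresses only it. Other libraries may behave differently, so this is an example of one implementation and not a statement about ZIP libraries in general. The entry names and frame payloads are made up.

```python
import io
import zipfile

# Build an archive with many compressed entries (one fake "frame" per file).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for i in range(1000):
        zf.writestr(f"{i}.glb", f"frame {i}".encode() * 100)

with zipfile.ZipFile(buf) as zf:
    # Name lookup against the table built from the central directory.
    info = zf.getinfo("742.glb")
    # Reads and decompresses this one entry; the other 999 are not touched.
    frame = zf.read("742.glb")
```

Because each entry is compressed independently, reading one entry does not require decompressing the ones before it, which is the difference from solid archives such as .tar.gz.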

@vpenades
Contributor

In addition to my proposal of a zipped version, I recently had the opportunity to tinker with OpenRaster and some other ZIP-based documents, so I wanted to ask:

Would it be fine to open a new issue with a proposal for GLZ? Or is it better to keep the discussion here?
