Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow handling large external buffers #396

Merged
merged 5 commits into from
Dec 13, 2023

Conversation

ShaddyDC
Copy link
Contributor

Closes #393

I've added a new error type for it to make failures explicit, but you could arguably also just use as usize and live with potentially weird behaviour when large values are truncated. Presumably, in those cases you'd crash anyway when trying to load the data into memory, though I haven't looked into it more.

I've also silenced some warnings in a separate commit by hiding them behind a cfg tag because they annoyed me during development, but I'm happy to remove those changes if you prefer.

examples/export/main.rs Outdated Show resolved Hide resolved
let start = view.offset();
let end = start + view.length();
let start = usize::try_from(view.offset()).ok()?;
let end = usize::try_from(start as u64 + view.length()).ok()?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this going to fail on 32bit systems?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this will fail on 32bit systems when the buffer is too large to fit into the 32bit address space, which seems fair to me. I guess technically the part of the buffer to be accessed in this function specifically may fit, but I'm not sure that edge case is worth handling. Would you like to handle it differently?

src/buffer.rs Outdated
@@ -91,8 +92,8 @@ impl<'a> Buffer<'a> {
}

/// The length of the buffer in bytes.
pub fn length(&self) -> usize {
self.json.byte_length as usize
pub fn length(&self) -> u64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just going to lead to casts every time it is used. I think usize was the right choice. Buffers exceeding u32::MAX length become problematic on 32bit systems. I don't think you can even address a slice in Rust beyond u32::MAX on 32bit systems.

I agree that internally, the length should be stored as u64 but I think the offsets and lengths should be reported as usize.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loading the gltf file should fail on 32bit systems if there are buffers with lengths larger than u32::MAX.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I figured it's worth preserving the actual information losslessly as long as possible so users of the interface can decide how they want to deal with it. Do you want to truncate the information here by casting to usize, which may lead to unexpected behaviour, or should this function be fallible?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An application might want to be able to access a glTF that is too big to fit into memory …

  • as a viewer or editor that has a use for showing metadata to the user even without rendering;
  • because they will be loading only selected objects from the glTF file (e.g. the scene the user selects, or selected “layers” of a complex dataset visualization; or
  • to process a large buffer in a streaming fashion.

Therefore, it's not necessarily the case that loading a glTF containing buffers that won't fit into memory is useless. Note that if a user wants a usize they can call usize::try_from() on the u64 and the result is precisely either a usize or an error that they can .unwrap() or ? to fail on. (But perhaps there's an even better API possible; I haven't looked at the bigger picture.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aloucks Do you have an update on how you would prefer this to be handled? It sounds like you'd like to fail when loading a glTF file that has any buffer that is too large and then just hard cast to usize here later? The use case of working with larger files on 32 bit systems is not worth supporting?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm tempted to settle on usize as 32 bit systems are the exception rather than the norm nowadays. The WASM runtime however is a popular target for this library and 64 bit addressing is unstable. For this reason, I think it's important the user can handle this scenario gracefully somehow.

For the majority of users running on 64 bit systems, the cast from u32 to usize is already a bit annoying, so I suggest few things:

  1. Create a new type struct Address(u64) (the name is debatable) in gltf-json.
  2. Implement Validate for Address and report an error on 32 bit systems if it exceeds the usize range.
  3. In the main/wrapper crate, change the return type for existing u32 offsets/sizes to usize, using as to cast internally.

This should cover all of the points mentioned previously. 32 bit systems will still be able to process large glTF (and the wrapper via. from_json_without_validation) or report to the user the out of range size otherwise.

@ShaddyDC, would you like to implement this yourself? I can chip in and push to this branch if you like otherwise.

Copy link
Contributor Author

@ShaddyDC ShaddyDC Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a good way to go about it. I can take a shot at an implementation, but if you've got the time to do it, I'd appreciate if you could take care of it. I'm not yet familiar with the validation implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave it with me 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened a PR on your fork: ShaddyDC#1

Copy link
Contributor Author

@ShaddyDC ShaddyDC Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, merged! Though I think you should also have commit access

@alteous alteous merged commit ec6fc16 into gltf-rs:main Dec 13, 2023
1 check passed
@ShaddyDC ShaddyDC deleted the fix-large-buffers branch December 13, 2023 15:30
@ShaddyDC
Copy link
Contributor Author

@alteous thanks for merging this! When are you planning to do the next release?

@alteous
Copy link
Member

alteous commented Dec 15, 2023

Likely today or this weekend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panics on reading a large GLTF due to the .bin being more than ~4GB and byteLength using a u32
4 participants