Thread Safety #72

Open
pascalreinhold opened this issue Aug 3, 2023 · 3 comments

@pascalreinhold

Hello there,

Is it possible to read multiple large structs (1-3 GB) from a MAT-file concurrently?
I found nothing on this GitHub page regarding thread safety.

If it is not supported out of the box then how would one go about it?

@S-Dafarra
Member

Hi @pascalreinhold, thanks for opening the issue!

matio-cpp is a C++ interface to the matio library, which takes care of dealing with the mat file. When opening a mat file, matio loads its entire content in memory. Hence, when reading and writing variables, it always accesses the same portion of memory. Thus, there are possible concurrency issues and, by extension, matio-cpp is not thread-safe either.

If your goal is to speed up the reading of the file, I would suggest splitting it into separate files, or using a format that supports reading in chunks, like HDF5 (see for example https://docs.hdfgroup.org/hdf5/v1_12/group___h5_d.html#gac1092a63b718ec949d6539590a914b60). Recent mat files are compatible with HDF5, but mat files on their own unfortunately do not support this option.
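
For the split-file route, one way to read the pieces in parallel is to give every file its own thread and its own handle, so that no handle is ever shared between threads. Below is a minimal sketch of that pattern, not something matio-cpp provides: `load_split_files` and the `load` callback are hypothetical names. It also assumes that handles opened on different files share no hidden state, which this thread does not confirm and which would need checking for the matio backend in use.

```cpp
#include <functional>
#include <future>
#include <string>
#include <vector>

// Runs `load` once per path, each call on its own thread, and returns the
// results in the same order as `paths`.
// `load` is a caller-supplied placeholder (hypothetical): in real use it would
// open the file at `path` with its own handle, read what it needs, and return it.
template <typename Result>
std::vector<Result> load_split_files(
    const std::vector<std::string>& paths,
    const std::function<Result(const std::string&)>& load)
{
    std::vector<std::future<Result>> pending;
    pending.reserve(paths.size());
    for (const std::string& path : paths) {
        // std::launch::async requests a separate thread for every file.
        pending.push_back(std::async(std::launch::async, load, path));
    }

    std::vector<Result> results;
    results.reserve(paths.size());
    for (std::future<Result>& f : pending) {
        // get() blocks until that file is done and rethrows any loader exception.
        results.push_back(f.get());
    }
    return results;
}
```

In real use, `load` would be a lambda that opens `path`, reads the variable it needs, and returns it, and the paths would be whatever files the split produced.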

@pascalreinhold
Author

Hey, thank you for the fast reply.

Does this also apply to me, because I'm just reading the file and not writing?

> Hence, when reading and writing variables [...] there are possible concurrency issues, and by extension, also matio-cpp is not thread-safe.

Not sure, but I think you are mistaken. In matio there are the Mat_VarReadInfo() and Mat_VarRead() functions to avoid loading a variable into memory until you need it.

> When opening a 'mat' file, 'matio' loads its entire content in memory

@S-Dafarra
Member

> Not sure, but I think you are mistaken. In matio there are the Mat_VarReadInfo() and Mat_VarRead() functions to avoid loading a variable into memory until you need it.
>
> > When opening a 'mat' file, 'matio' loads its entire content in memory

Both of those functions require opening the mat file first, i.e. loading it into memory. See:

Btw, Mat_VarRead is the exact function that matio-cpp uses to read a variable:

matvar_t *matVar = Mat_VarRead(m_pimpl->mat_ptr, name.c_str());

Note that Mat_VarRead requires a non-const pointer to a mat_t object. This means that even a read can potentially modify this object. Hence, concurrent reads could in practice turn into concurrent reads and writes. So to answer your question,

> Does this also apply to me, because I'm just reading the file and not writing?

unfortunately, yes.
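
If the data has to stay in one file, a fallback is to serialize every read on the shared handle with a mutex. The sketch below illustrates that idea under assumptions: `SerializedReader` and `read_all` are hypothetical names, and the wrapped type is only assumed to expose a `read(name)` method. It protects the handle only if every access goes through the wrapper.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Wraps a reader that is not thread-safe and lets only one thread at a time
// call into it. `Reader` is assumed (hypothetically) to offer `read(name)`.
template <typename Reader>
class SerializedReader {
public:
    explicit SerializedReader(Reader reader) : reader_(std::move(reader)) {}

    // Forwards to reader.read(name) while holding the lock, so calls from
    // different threads never overlap on the underlying handle.
    auto read(const std::string& name) {
        std::lock_guard<std::mutex> lock(mutex_);
        return reader_.read(name);
    }

private:
    std::mutex mutex_;
    Reader reader_;
};

// Reads every name on its own thread through the shared wrapper and returns
// the values in the same order as `names`.
// The value type must be default-constructible for this sketch to compile.
template <typename Reader>
auto read_all(SerializedReader<Reader>& shared,
              const std::vector<std::string>& names) {
    using Value = decltype(shared.read(names.front()));
    std::vector<Value> results(names.size());
    std::vector<std::thread> workers;
    workers.reserve(names.size());
    for (std::size_t i = 0; i < names.size(); ++i) {
        workers.emplace_back([&shared, &names, &results, i] {
            results[i] = shared.read(names[i]);  // each thread fills its own slot
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    return results;
}
```

Because the reads themselves no longer overlap, any speed-up from this approach can only come from work the threads do after `read` returns, such as converting or processing the loaded data.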
