Add methods to allow binary serialization (built-in pickle) #531

Merged 16 commits on Sep 15, 2021

@hunterhector hunterhector commented Sep 15, 2021

This PR fixes #530.

Description of changes

At a high level:

  1. Change the interface to add an option for serializing with Python's built-in pickle.
  2. Fix the readers and writers, and add options to their implementations so users can choose the serialization method.
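
To illustrate what a selectable serialization method can look like, here is a minimal sketch. It is not Forte's code: `dump_pack`, `load_pack`, and the plain-dict "pack" are hypothetical stand-ins, and the standard-library `json` module stands in for the string-based method. Only the idea of a `serialize_method` option comes from this PR.

```python
import json
import pickle
from typing import Any, Dict


def dump_pack(pack: Dict[str, Any], serialize_method: str = "json") -> bytes:
    """Serialize a toy pack with the method chosen by the caller (hypothetical helper)."""
    if serialize_method == "pickle":
        # Binary serialization with Python's built-in pickle.
        return pickle.dumps(pack)
    if serialize_method == "json":
        # String-based serialization, encoded to bytes for a uniform return type.
        return json.dumps(pack).encode("utf-8")
    raise ValueError(f"Unknown serialize_method: {serialize_method!r}")


def load_pack(data: bytes, serialize_method: str = "json") -> Dict[str, Any]:
    """Inverse of dump_pack; the caller must pass the same method used for writing."""
    if serialize_method == "pickle":
        return pickle.loads(data)
    if serialize_method == "json":
        return json.loads(data.decode("utf-8"))
    raise ValueError(f"Unknown serialize_method: {serialize_method!r}")


# A toy pack holding a tuple, which pickle preserves and JSON turns into a list.
toy_pack = {"pack_name": "doc1", "text": "Hello world", "span": (0, 5)}
via_pickle = load_pack(dump_pack(toy_pack, "pickle"), "pickle")
via_json = load_pack(dump_pack(toy_pack, "json"), "json")
```

One visible difference between the two methods in this sketch is type fidelity: the tuple survives the pickle round trip but comes back as a list from JSON. Pickle data should only be loaded from trusted sources.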

A few details:

  1. Add from_string and to_string methods to the data pack to handle string-based serialization.
  2. Serialization and deserialization functions now take file paths instead of raw string content, and accept new options such as zip_pack and serialize_method.
  3. Update the writers that use these functions.
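
The details above can be sketched roughly as follows. This is an assumption-laden illustration, not the actual Forte implementation: the module-level functions, the dict-based pack, the use of `gzip` for `zip_pack`, and `json` as the string format are all hypothetical choices. Only the names `to_string`, `from_string`, `zip_pack`, and `serialize_method` are taken from the PR description.

```python
import gzip
import json
import os
import pickle
import tempfile
from typing import Any, Dict


def to_string(pack: Dict[str, Any]) -> str:
    """String-based serialization of a toy pack (JSON is an assumed format)."""
    return json.dumps(pack)


def from_string(data: str) -> Dict[str, Any]:
    """Rebuild a toy pack from its string form."""
    return json.loads(data)


def serialize(pack: Dict[str, Any], output_path: str,
              zip_pack: bool = False, serialize_method: str = "json") -> None:
    """Write the pack to a file path, optionally compressed."""
    opener = gzip.open if zip_pack else open
    if serialize_method == "pickle":
        with opener(output_path, "wb") as f:
            pickle.dump(pack, f)
    elif serialize_method == "json":
        with opener(output_path, "wt", encoding="utf-8") as f:
            f.write(to_string(pack))
    else:
        raise ValueError(f"Unknown serialize_method: {serialize_method!r}")


def deserialize(input_path: str, zip_pack: bool = False,
                serialize_method: str = "json") -> Dict[str, Any]:
    """Read a pack back from a file path written by serialize."""
    opener = gzip.open if zip_pack else open
    if serialize_method == "pickle":
        with opener(input_path, "rb") as f:
            return pickle.load(f)
    if serialize_method == "json":
        with opener(input_path, "rt", encoding="utf-8") as f:
            return from_string(f.read())
    raise ValueError(f"Unknown serialize_method: {serialize_method!r}")


# Usage: round-trip a toy pack through a compressed pickle file.
toy_pack = {"pack_name": "doc1", "text": "Hello world"}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "pack.pickle.gz")
    serialize(toy_pack, path, zip_pack=True, serialize_method="pickle")
    restored = deserialize(path, zip_pack=True, serialize_method="pickle")
```

Taking a file path rather than raw string content lets the function decide whether to open the file in text or binary mode, which a binary method like pickle needs.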

Possible impact of this PR

  1. Some writer and reader implementations may need to be updated.

Test Conducted

Added new tests that exercise a matrix of configurations: with and without zipping, combined with each serialization method.
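
A configuration matrix of this kind might be driven as in the sketch below. It does not reproduce the tests in this PR: `write_pack`, `read_pack`, and `run_matrix` are hypothetical helpers, and `json` and `gzip` are assumed stand-ins for the string method and the zipping option.

```python
import gzip
import itertools
import json
import os
import pickle
import tempfile
from typing import Any, Dict, Tuple


def write_pack(pack: Dict[str, Any], path: str,
               zip_pack: bool, serialize_method: str) -> None:
    """Hypothetical writer: encode the pack, then write it plain or gzipped."""
    if serialize_method == "pickle":
        payload = pickle.dumps(pack)
    elif serialize_method == "json":
        payload = json.dumps(pack).encode("utf-8")
    else:
        raise ValueError(f"Unknown serialize_method: {serialize_method!r}")
    opener = gzip.open if zip_pack else open
    with opener(path, "wb") as f:
        f.write(payload)


def read_pack(path: str, zip_pack: bool, serialize_method: str) -> Dict[str, Any]:
    """Hypothetical reader: must be given the same options used for writing."""
    opener = gzip.open if zip_pack else open
    with opener(path, "rb") as f:
        payload = f.read()
    if serialize_method == "pickle":
        return pickle.loads(payload)
    if serialize_method == "json":
        return json.loads(payload.decode("utf-8"))
    raise ValueError(f"Unknown serialize_method: {serialize_method!r}")


def run_matrix() -> Dict[Tuple[bool, str], bool]:
    """Round-trip one pack under every (zip_pack, serialize_method) combination."""
    pack = {"pack_name": "doc1", "text": "Hello world"}
    results = {}
    with tempfile.TemporaryDirectory() as tmp_dir:
        for zip_pack, method in itertools.product([False, True], ["json", "pickle"]):
            path = os.path.join(tmp_dir, f"pack_{method}_{zip_pack}")
            write_pack(pack, path, zip_pack, method)
            results[(zip_pack, method)] = read_pack(path, zip_pack, method) == pack
    return results


results = run_matrix()
```

Iterating over the Cartesian product keeps the test body identical for every combination, so a newly added method or option only needs a new entry in one of the two lists.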

codecov bot commented Sep 15, 2021

Codecov Report

Merging #531 (3d29e1d) into master (7a6e6a3) will decrease coverage by 0.00%.
The diff coverage is 72.51%.

@@            Coverage Diff             @@
##           master     #531      +/-   ##
==========================================
- Coverage   79.22%   79.22%   -0.01%     
==========================================
  Files         220      220              
  Lines       15530    15574      +44     
==========================================
+ Hits        12304    12338      +34     
- Misses       3226     3236      +10     

| Impacted Files | Coverage Δ |
|---|---|
| forte/data/readers/html_reader.py | 65.88% <ø> (ø) |
| forte/processors/base/data_selector_for_da.py | 45.76% <0.00%> (ø) |
| forte/processors/base/index_processor.py | 47.16% <0.00%> (ø) |
| forte/processors/stave/stave_processor.py | 100.00% <ø> (ø) |
| ...ests/forte/data/readers/deserialize_reader_test.py | 95.00% <ø> (ø) |
| forte/data/readers/deserialize_reader.py | 65.55% <58.53%> (+6.29%) ⬆️ |
| forte/processors/base/writers.py | 69.87% <62.50%> (-3.16%) ⬇️ |
| forte/data/base_reader.py | 87.40% <66.66%> (ø) |
| forte/data/multi_pack.py | 72.12% <66.66%> (+0.12%) ⬆️ |
| forte/processors/writers.py | 62.50% <66.66%> (+4.43%) ⬆️ |
| ... and 10 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7a6e6a3...3d29e1d.

@hunterhector hunterhector merged commit 8ee2c6c into asyml:master Sep 15, 2021

Successfully merging this pull request may close these issues.

Add binary serialization method