Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

After deserializing from serialized complex objects containing a dictionary (such as MultiPack that uses '_inverse_pack_ref') the KEY of the dictionary becomes string (from previous int for example) #595

Closed
J007X opened this issue Jan 17, 2022 · 0 comments · Fixed by #576
Assignees

Comments

@J007X
Copy link
Collaborator

J007X commented Jan 17, 2022

Describe the bug
After deserializing from serialized complex objects containing a dictionary (such as MultiPack that uses '_inverse_pack_ref') the KEY of the dictionary becomes string (from previous int for example). For example in MultiPack. the dict '_inverse_pack_ref'
should be [int, int] but after deserialization (by JSONPickle) it becomes [str, int] and any attempt to using an int (such as pack_id) as key to lookup in such dictionary will generate an KeyError exception instead of the correct result, so currently this can only be workaround as the following addtional string conversion and catch additional errors (which is not ideal):

def get_pack_index(self, pack_id: int) -> int:
    try:
        return self._inverse_pack_ref[pack_id]
    except KeyError:  # sometimes after deserialization the dict have string as key
        s_dict: dict = self._inverse_pack_ref
        s_key = str(pack_id)
        try:
            return s_dict[s_key]
        except KeyError as ke:  # really not found
            raise ProcessExecutionException(
                f"Pack {pack_id} is not in this multi-pack."
            ) from ke

To Reproduce
the 2 test cases of MultiEntryStructure in entry_data_structures_test.py can be used.
Basically serialize a MultiPack with at least 2 data_packs in it to a string , then deserialize from the string, and re-link the data_packs. This will generate an exception with the new changes that refer/accessing the data_packs in a multi-pack by pack_id instead of by internal array index (#559 related enhancements). So an old version of "get_pack_index" in multi-pack not handling the int -> str change of the dictionary will fail to lookup for the pack correctly.

Expected behavior
Such dictionary will be unchanged after deserialization

Screenshots

Environment (please complete the following information):
(non-specific)

Additional context
Per discussion with Hector, this seems to relate to the following discussion (when serialized to JSON):
https://stackoverflow.com/questions/1450957/pythons-json-module-converts-int-dictionary-keys-to-strings

@J007X J007X self-assigned this Jan 28, 2022
J007X added a commit to J007X/forte that referenced this issue Jan 28, 2022
…fixing , together with asyml#559 related version (DEFAULT to default_PACK_VERSION)
J007X added a commit to J007X/forte that referenced this issue Jan 28, 2022
…fixing , adding # pylint: disable=W0212 for changing _inverse_pack_ref dictionary
J007X added a commit to J007X/forte that referenced this issue Jan 28, 2022
J007X added a commit to J007X/forte that referenced this issue Feb 1, 2022
hunterhector added a commit that referenced this issue Mar 4, 2022
…om (array) index based to pack_id based (to facilitate delete, insert), with version-enabled backward compatibility (#576)

* Inital fix for bug/enhancement 559

* Temp merge of version from 547 -- backward compatibility version checked added (new version as [0,0,1])

* Temp merge of version from 547 -- backward compatibility version checked added (new version as [0,0,1])

* Version comparison method added /fixed the problematic inline comparison logic of yesterday.

* fixed black format issue

* fixed lint comment format issue

* fixed lint return issue

* Inital checkin for pack version enabled backward-compatible arrary index to pack_id change for multiPack, with new unit test case

* fixed multi_pack pack_id related issue and add code for handling deserialization dictionary using string as key issue.

* fixed black format issue

* fixed unused exception var issue by lint

* #595 related (dictionary key type change after deserialization) fixing , together with #559 related version (DEFAULT to default_PACK_VERSION)

* #595 related (dictionary key type change after deserialization) fixing , adding # pylint: disable=W0212  for changing _inverse_pack_ref  dictionary

* #595 related -- fix black format issue

* Fixed type issue reported by pylint

* fix pylint incompatible type (in line688) and also change the pylint directive to explict (from W0212)

* fixed black format issue

* type conversion related changes for from_string (as type conversion limited by pylint)

* remove Unused import DEFAULT_PACK_VERSION reported by lint

* fix comment line too long issue (689).

* added pylint: disable=no-member  for fixing of #595 for line 694 and 695.

* fix attribute issue (adding assert)

* change previous type conversion code to be the same implementation as base_pack (plus additional fix code) without calling super().from_string, as type conversion from base_pack to multi_pack is indeed problematic and not strict (and thus not recommended)

* fixed black format issue

* pylint does not allow local import so change to the top.

* fix location of import required by lint

* fix import order complaint by pylint

* change version to standard string format and related exception handling fixings

* fix import sequence

* fix import sequence by lint

* #559 related test case and comments changes, and backward compatible version variable name change.

* comments are modified according to comments/requests in PR, also the requirements.txt is modified to include packaging as required.

* Fix spelling issue in comments

* Update docstring of get_subentry() in multi_pack.py

* Update docstring of get_subentry

Co-authored-by: Hector <[email protected]>
Co-authored-by: mylibrar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant