Update data pack version documentation and a multipack check bug #854

Merged
merged 2 commits into from
Jul 5, 2022
7 changes: 0 additions & 7 deletions CHANGELOG.md

This file was deleted.

33 changes: 33 additions & 0 deletions HISTORY.rst
@@ -1,6 +1,39 @@
=======
History
=======

0.3.0.dev2
----------
* DataPack version is still 0.0.2 (unstable)
* Add a new tutorial for building a machine translation system: #818, #826
* Fix issues in documentation and tutorials: #825, #799, #830
* Improve data augmentation: #784
* Data-efficiency improvement: #834, #839, #692, #842

0.3.0.dev1
----------
* Unstable development version
* DataPack version is updated to 0.0.2 (unstable); old data pack versions are not supported.
* Data-efficiency improvement
- Use new data structures such as lists/tuples to store the data, in order to optimize the speed of operations such as add, query, get (type, range, attribute), delete, etc.
#782, #796, #779, #801, #769, #771, #800, #680, #814
* A prototype Computer Vision design and example: #795, #813
* Regular bug fixes

0.2.0
----------
* DataPack is newly versioned as 0.0.1, also supporting old (un-versioned) data pack versions
* Add functionalities to data augmentation (#532, #536, #543, #554, #619, #685, #717)
* Fix issues in examples and create some new ones (#545, #624, #529, #632, #708, #711)
* Improve docstrings and refactor documentation (#611, #633, #636, #642, #652, #653, #657, #668, #674, #686, #682, #723, #730, #724)
* Add audio support to DataPack (#585, #592, #600, #603, #609)
* Improve and fix issues in ontology system (#568, #575, #577, #521)
* Relax package requirements and move out dependencies (#705, #706, #707, #720, #760)
* Add readers and selectors (#535, #516, #539)
* Create some utilities for pipeline (#499, #690, #562)
* Provide more operations for DataPack and MultiPack (#531, #534, #555, #564, #553, #576)
* Several cleanups and bug fixes (#541, #693, #695)

0.1.2
----------
* Simplify the Configurable interface (#517)
23 changes: 13 additions & 10 deletions forte/data/multi_pack.py
@@ -41,7 +41,7 @@
)
from forte.data.types import DataRequest
from forte.utils import get_class, get_full_module_name
-from forte.version import DEFAULT_PACK_VERSION, PACK_ID_COMPATIBLE_VERSION
+from forte.version import DEFAULT_PACK_VERSION


logger = logging.getLogger(__name__)
@@ -55,6 +55,10 @@

MdRequest = Dict[Type[Union[MultiPackLink, MultiPackGroup]], Union[Dict, List]]

+# Before this version, data packs were indexed in the multipack by list index;
+# from this version onwards, they are indexed by pack id.
+version_indexed_by_pack_id = "0.0.1"


class MultiPackMeta(BaseMeta):
r"""Meta information of a MultiPack."""
@@ -149,19 +153,18 @@ def _init_meta(self, pack_name: Optional[str] = None) -> MultiPackMeta:
def _validate(self, entry: EntryType) -> bool:
return isinstance(entry, MultiPackEntries)

# TODO: get_subentry may be useless
def get_subentry(self, pack_idx: int, entry_id: int):
r"""
-Get sub_entry from multi pack. This method uses `pack_id` (a unique
-identifier assigned to datapack) to get a pack from multi pack,
-and then return its sub_entry with entry_id. Noted this is changed from
-the way of accessing such pack before the PACK_ID_COMPATIBLE_VERSION,
+Get `sub_entry` from `multi pack`. This method uses `pack_id` (a unique
+identifier assigned to datapack) to get a pack from `multi pack`,
+and then returns its `sub_entry` with `entry_id`.
+
+Note that this differs from how such a pack was accessed before v0.0.1,
in which the `pack_idx` was used as list index number to access/reference
-a pack within the multi pack (and in this case then get the sub_entry).
+a pack within the `multi pack` (and in this case then get the `sub_entry`).

Args:
-pack_idx: The pack_id for the data_pack in the
-multi pack.
+pack_idx: The pack_id for the data_pack in the multi pack.
entry_id: the id for the entry from the pack with pack_id

Returns:
@@ -171,7 +174,7 @@ def get_subentry(self, pack_idx: int, entry_id: int):
pack_array_index: int = pack_idx # the old way
# the following checks whether the pack version is at least the (backward)
# compatible version, in which pack_idx is the pack_id, not a list index
-if Version(self.pack_version) >= Version(PACK_ID_COMPATIBLE_VERSION):
+if Version(self.pack_version) >= Version(version_indexed_by_pack_id):
pack_array_index = self.get_pack_index(
pack_idx
) # the new way: using pack_id instead of array index
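
The effect of moving the threshold from 0.0.2 to 0.0.1 can be illustrated with a small standalone sketch. This is not Forte's code: `ToyMultiPack`, `parse_version`, and `resolve_pack_index` are hypothetical names, the pack ids are made up, and a plain tuple comparison stands in for the `Version` class used in the hunk above. Under those assumptions, a pack serialized at 0.0.1 is resolved by pack id with the new threshold, whereas the old threshold would have treated the same argument as a list index.

```python
from typing import List, Tuple

# Mirrors the constant added in this change.
VERSION_INDEXED_BY_PACK_ID = "0.0.1"


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "0.0.2" into (0, 0, 2); a simple stand-in for a Version class."""
    return tuple(int(part) for part in text.split("."))


class ToyMultiPack:
    """Hypothetical stand-in for a multi pack: an ordered list of pack ids."""

    def __init__(self, pack_version: str, pack_ids: List[int]):
        self.pack_version = pack_version
        self.pack_ids = pack_ids

    def resolve_pack_index(
        self, pack_idx: int, threshold: str = VERSION_INDEXED_BY_PACK_ID
    ) -> int:
        """Return the list position of the pack referred to by `pack_idx`."""
        if parse_version(self.pack_version) >= parse_version(threshold):
            # New behaviour: `pack_idx` is a pack id, so look up its position.
            return self.pack_ids.index(pack_idx)
        # Old behaviour: `pack_idx` already is the list position.
        return pack_idx


packs = ToyMultiPack(pack_version="0.0.1", pack_ids=[9001, 9002])
print(packs.resolve_pack_index(9002))  # → 1 (looked up by pack id)

# With the previous threshold, the same call falls through to the old path
# and returns the pack id unchanged, which is not a valid list position.
print(packs.resolve_pack_index(9002, threshold="0.0.2"))  # → 9002

legacy = ToyMultiPack(pack_version="0.0.0", pack_ids=[9001, 9002])
print(legacy.resolve_pack_index(1))  # → 1 (already a list index)
```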
2 changes: 2 additions & 0 deletions forte/version.py
Original file line number Diff line number Diff line change
@@ -20,5 +20,7 @@
VERSION = "{0}.{1}.{2}".format(_MAJOR, _MINOR, _REVISION)
FORTE_IR_VERSION = "0.0.1"
PACK_VERSION = "0.0.2"
+# The version before the formal release; data packs without a
+# version annotation are assigned this.
DEFAULT_PACK_VERSION = "0.0.0"
PACK_ID_COMPATIBLE_VERSION = "0.0.2"
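
As a rough illustration of the comment on `DEFAULT_PACK_VERSION`, the sketch below falls back to the default when a serialized pack carries no version field. The dict-based pack representation, the `"pack_version"` key, and `read_pack_version` are assumptions made for this example; they are not taken from Forte's serialization format.

```python
from typing import Any, Dict

# Value taken from forte/version.py above.
DEFAULT_PACK_VERSION = "0.0.0"


def read_pack_version(serialized: Dict[str, Any]) -> str:
    """Return the stored version, or the default for un-versioned packs.

    "pack_version" is a hypothetical key name used only for illustration.
    """
    return serialized.get("pack_version", DEFAULT_PACK_VERSION)


print(read_pack_version({"pack_version": "0.0.2"}))  # → 0.0.2
print(read_pack_version({"text": "an old pack"}))    # → 0.0.0
```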