Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ontologies for audio data #592

Merged
merged 8 commits into from
Jan 20, 2022
Merged

Conversation

mylibrar
Copy link
Collaborator

This PR fixes #591 .

Description of changes

New top level entry: AudioAnnotation

  • It's identical to the implementation of Annotation except that it replaces text property with audio.
  • DataPack is updated accordingly to support add/update/query/delete operations on audio annotations.

New ontologies in base_ontology: Recording, AudioAnnotation

  • Update the base_ontology.json and regenerate base_ontology.py

Test Conducted

A unit test audio_annotation_test is added to test all the new ontologies added.

Suqi Sun and others added 3 commits January 14, 2022 14:24
* added static methods to infer pack type

* added unit tests for datapack type inference.

* fixed issue#558

* fixed black: forte/data/selector.py

* fixed import-outside-toplevel

* WIP: waiting for ndarray supported

* Ndarray -> ndarray

* values -> value

* added ndarray to SUPPORTED_PRIMITIVES

* WIP: generator accepts NdArray attribute

* remove breakpoints

* fixed None -> "None"

* WIP: designing NdArrayProperty

* WIP: ndarray_size -> ndarray_shape

* fixed lint

* fixed lint

* added ndarray property test cases

* removed irrelevant onto spec

* fixed lint

* fixed lint

* fixed black

* fixed description

* added FNdArray, a wrapper class for NdArray metric

* handle None dtype

* removed Optional typing

* fixed description

* added unit tests for ndarray attribute

* fixed black

* added doc string and more tests

* fixed type

* added reference to np.ndarray

* added unit tests for ndarray attribute against dtype, shape, and warning

Co-authored-by: Zhanyuan Zhang <[email protected]>
@codecov
Copy link

codecov bot commented Jan 14, 2022

Codecov Report

Merging #592 (4849a6b) into master (23bc505) will increase coverage by 0.24%.
The diff coverage is 92.41%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #592      +/-   ##
==========================================
+ Coverage   80.07%   80.32%   +0.24%     
==========================================
  Files         234      235       +1     
  Lines       16538    16748     +210     
==========================================
+ Hits        13243    13452     +209     
- Misses       3295     3296       +1     
Impacted Files Coverage Δ
forte/data/readers/audio_reader.py 89.28% <71.42%> (+1.28%) ⬆️
forte/data/ontology/top.py 84.23% <85.71%> (+0.81%) ⬆️
forte/data/data_pack.py 81.66% <90.54%> (+3.38%) ⬆️
tests/forte/data/audio_annotation_test.py 98.75% <98.75%> (ø)
ft/onto/base_ontology.py 95.51% <100.00%> (+0.25%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 23bc505...4849a6b. Read the comment docs.

forte/data/data_pack.py Show resolved Hide resolved
forte/data/data_pack.py Show resolved Hide resolved
forte/data/ontology/top.py Outdated Show resolved Hide resolved
tests/forte/data/audio_annotation_test.py Show resolved Hide resolved
tests/forte/data/audio_annotation_test.py Show resolved Hide resolved
forte/data/data_pack.py Show resolved Hide resolved
forte/data/data_pack.py Show resolved Hide resolved
Copy link
Member

@hunterhector hunterhector left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few things we could follow up with this:

  1. Add a few test cases related to the newly updated condition branches, notice that test cases should cover most of these branches since bugs often happen in these less used branches. Check the codecov report if needed. Examples of not tested areas include the warning message, deleting AudioAnnotation, and other similar new conditional branches.

The other two items should have their own issues and tickets, could you create two issues based on them?

  1. We have generally covered the get function but haven't covered the get_data function. The differences between these two functions: get obtain the classes while get_data return them as primitive types and nested dict. get_data also can handle m more complex requests. It is worth creating an issue regarding this. Basically, we should have the get_data function also work but not simply raising NotImplementedError as Line935 in data_pack.py.
  2. The index and build_coverage_index functions should be polished to include AudioAnnotation as well. Related line is here:
    def build_coverage_for(

require_annotations(AudioAnnotation)
and isinstance(range_annotation, Annotation)
):
logger.warning(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this conditional branch (i.e. the warning branch) tested?

forte/data/data_pack.py Show resolved Hide resolved
@hunterhector hunterhector merged commit 54e2e93 into asyml:master Jan 20, 2022
@mylibrar mylibrar deleted the localtest branch January 20, 2022 23:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add ontologies for audio data
3 participants