This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Flickr30k #285

Merged
merged 325 commits into from
Jun 25, 2021
Commits
08f97bb
max instances for debugging
jacob-morrison May 15, 2021
0081ed6
b
jacob-morrison May 15, 2021
44b2bc0
printing devices
jacob-morrison May 15, 2021
0a6381d
moving tensors?
jacob-morrison May 15, 2021
40dbbec
self
jacob-morrison May 15, 2021
f19da07
p
jacob-morrison May 15, 2021
d3f6bce
p
jacob-morrison May 15, 2021
8c02327
l
jacob-morrison May 15, 2021
103f84d
l
jacob-morrison May 15, 2021
cfcf9b5
fixing heap?
jacob-morrison May 15, 2021
060f486
stop logging and printing
jacob-morrison May 15, 2021
f747438
less prints
jacob-morrison May 15, 2021
f6a9e6e
printing devices
jacob-morrison May 15, 2021
f058a67
p
jacob-morrison May 15, 2021
d175076
.
jacob-morrison May 15, 2021
5191663
devices
jacob-morrison May 15, 2021
6124354
device
jacob-morrison May 15, 2021
fa4806a
test
jacob-morrison May 15, 2021
513da78
testing not sampling
jacob-morrison May 15, 2021
1b360fc
testing not using model again
jacob-morrison May 15, 2021
9c01d6b
test not moving tensors
jacob-morrison May 15, 2021
457ee58
not printing
jacob-morrison May 15, 2021
d1c9399
trying image subset
jacob-morrison May 15, 2021
9addf02
debugging model
jacob-morrison May 15, 2021
6b9ee65
going back to full (model is slow?)
jacob-morrison May 15, 2021
291fc55
right number of instances
jacob-morrison May 16, 2021
8f5e898
distribut
jacob-morrison May 16, 2021
84347ef
more potential hard negatives
jacob-morrison May 16, 2021
136232d
non-distributed
jacob-morrison May 16, 2021
5b9e5f8
distributed + adding another seen set
jacob-morrison May 16, 2021
b335f4f
fixing evaluation method
jacob-morrison May 18, 2021
ac0541f
format
jacob-morrison May 18, 2021
d4aa387
testing fixed eval
jacob-morrison May 18, 2021
d074413
fixing variable
jacob-morrison May 18, 2021
2fb3cf3
fixing training var
jacob-morrison May 18, 2021
bcfba03
testing new eval again
jacob-morrison May 18, 2021
efe1c34
fix
jacob-morrison May 18, 2021
db2636b
fix
jacob-morrison May 18, 2021
c22b534
fix?
jacob-morrison May 18, 2021
e475c1e
changing k to 5
jacob-morrison May 18, 2021
16ef45b
float
jacob-morrison May 18, 2021
3524f66
moving labels to gpu
jacob-morrison May 18, 2021
e34a9aa
long
jacob-morrison May 18, 2021
567d037
trying hopefully fixed loss function
jacob-morrison May 18, 2021
f26064d
fix
jacob-morrison May 18, 2021
08a9c21
testing out the whole thing
jacob-morrison May 18, 2021
aa034a5
setting max instances to debug in distributed
jacob-morrison May 18, 2021
eaa7dee
debug stuff
jacob-morrison May 18, 2021
5b8454f
fixing num images
jacob-morrison May 18, 2021
8f603fe
hopefully fixing the dataset reader in dist
jacob-morrison May 19, 2021
e107c8d
full data
jacob-morrison May 19, 2021
e384f98
testing out brand new changes
jacob-morrison May 20, 2021
c34aab3
deleting some old comments
jacob-morrison May 20, 2021
d1a490d
fixing validation bug
jacob-morrison May 20, 2021
cc63b0e
testing on 1 gpu for now
jacob-morrison May 20, 2021
f8481f7
feature cache broken?
jacob-morrison May 20, 2021
73622b9
switching to tensor fields and stuff
jacob-morrison May 20, 2021
0a656c3
fix
jacob-morrison May 20, 2021
fc4b79e
print device
jacob-morrison May 20, 2021
2433c2d
trying to not move the batch?
jacob-morrison May 20, 2021
ce0c7e4
moving small batches to cpu
jacob-morrison May 20, 2021
3102341
not printing device
jacob-morrison May 20, 2021
c975cb4
deleting old tensor?
jacob-morrison May 20, 2021
b0af06c
debug
jacob-morrison May 20, 2021
6dca963
printing memory allocation
jacob-morrison May 20, 2021
23c1816
moving tensor to cpu immediately?
jacob-morrison May 20, 2021
6ec1673
deleting batch?
jacob-morrison May 20, 2021
4710049
debug
jacob-morrison May 20, 2021
3aecf2f
debug
jacob-morrison May 20, 2021
bd212be
debug
jacob-morrison May 20, 2021
a0582f1
does this work?
jacob-morrison May 20, 2021
f19ccfb
switching to eval and no grad
jacob-morrison May 21, 2021
d58b790
fix
jacob-morrison May 21, 2021
02a0e6b
mask list
jacob-morrison May 21, 2021
069aa05
backbone roll
jacob-morrison May 22, 2021
bcf1e07
typo
jacob-morrison May 22, 2021
8e8e2d5
log
jacob-morrison May 22, 2021
1c7fd69
debug
jacob-morrison May 22, 2021
450f490
notes
jacob-morrison May 25, 2021
1d183c4
testing no grad
jacob-morrison May 26, 2021
9c1cf49
testing validation batch size of 1
jacob-morrison May 26, 2021
bcc51d0
bug
jacob-morrison May 26, 2021
59ce103
didn't have the right variable?
jacob-morrison May 26, 2021
b44136c
don't need to softmax?
jacob-morrison May 26, 2021
ccdd77c
trying flickr30k with 8 batch and dummy captions
jacob-morrison May 26, 2021
6db610e
full flickr?
jacob-morrison May 27, 2021
0dc06bf
batch size 1
jacob-morrison May 27, 2021
b5095ba
testing training batches for validation
jacob-morrison Jun 1, 2021
dc4521c
Testing out val stuff
jacob-morrison Jun 2, 2021
2779985
updating reader (test will fail for now)
jacob-morrison Jun 2, 2021
e22db05
debug statements to figure out why val isn't worki
jacob-morrison Jun 4, 2021
0070fba
testing if top images always have same scores
jacob-morrison Jun 8, 2021
d4883ff
getting rid of caption debugging step
jacob-morrison Jun 8, 2021
e6a4525
using the right caption var
jacob-morrison Jun 8, 2021
43c596b
updating reader to mirror vilbert training setup
jacob-morrison Jun 8, 2021
b6594c4
full dataset (dummy caption embeddings)
jacob-morrison Jun 8, 2021
23af680
switching to real caption embeddings
jacob-morrison Jun 8, 2021
19aec45
testing caching hard negatives
jacob-morrison Jun 9, 2021
6eb1a3c
log
jacob-morrison Jun 9, 2021
15c0834
limit instances to test caching
jacob-morrison Jun 9, 2021
cf9a813
delete faiss
jacob-morrison Jun 9, 2021
fc5afe3
more cache tests
jacob-morrison Jun 10, 2021
9a5938b
one more log statement
jacob-morrison Jun 10, 2021
8e022da
single epoch to calculate hard negatives
jacob-morrison Jun 10, 2021
31c1897
need to import logging
jacob-morrison Jun 10, 2021
f6b072d
don't log misses anymore (too slow)
jacob-morrison Jun 10, 2021
208ada0
using consistent hash function (test # instances)
jacob-morrison Jun 10, 2021
3a0577e
Flickr30k batching (#277)
dirkgr Jun 10, 2021
a601b63
test caching captions and hard negatives on full
jacob-morrison Jun 10, 2021
59a080c
don't log cache hits
jacob-morrison Jun 11, 2021
ff8900b
logging training labels to debug
jacob-morrison Jun 11, 2021
597536c
switching val to 4 way mc
jacob-morrison Jun 11, 2021
990ba88
can we overfit
jacob-morrison Jun 11, 2021
935524c
not 1k instances
jacob-morrison Jun 11, 2021
c172104
not logging + overfit
jacob-morrison Jun 11, 2021
47203f8
not overfit
jacob-morrison Jun 11, 2021
245fadc
even fewer instances
jacob-morrison Jun 11, 2021
706d059
all instances
jacob-morrison Jun 11, 2021
dfe93d1
even more overfitting
jacob-morrison Jun 11, 2021
045aa4c
back to normal
jacob-morrison Jun 11, 2021
4c6e85c
b
jacob-morrison Jun 11, 2021
0a2e85e
bkac to normal
jacob-morrison Jun 11, 2021
0cb3cc7
log loss and stuff again
jacob-morrison Jun 11, 2021
9e84845
reset
jacob-morrison Jun 11, 2021
6f78b7f
don't include hard negatives in case there's a bug
jacob-morrison Jun 11, 2021
7798182
batch size of 1
jacob-morrison Jun 11, 2021
5effcc5
more epochs
jacob-morrison Jun 11, 2021
a7601a8
only correct answer and hard negatives
jacob-morrison Jun 11, 2021
c4031b6
Cleanup
dirkgr Jun 11, 2021
d4d1efd
Fix error in caption caching
dirkgr Jun 11, 2021
9dffc27
Find hard negatives even when we don't have enough instances
dirkgr Jun 11, 2021
688f71d
O(1) algorithm for finding a random number with one exception
dirkgr Jun 11, 2021
3529cba
Make sure the wrong caption comes from a different image
dirkgr Jun 11, 2021
3b7b390
Merge remote-tracking branch 'origin/flickr30k' into flickr30k
dirkgr Jun 11, 2021
f8758b5
Cross entropy loss
dirkgr Jun 12, 2021
36fefa6
trying overfitting with full instances
jacob-morrison Jun 12, 2021
dca1ae8
use full dataset without learning rate scheduler
jacob-morrison Jun 14, 2021
f33323b
don't limit instances and don't log
jacob-morrison Jun 14, 2021
0bcda7c
batch size, scheduler, wandb
jacob-morrison Jun 14, 2021
fe32457
comment out wandb
jacob-morrison Jun 14, 2021
53266f3
full dataset no hard negatives
jacob-morrison Jun 14, 2021
41ea9a4
don't log loss
jacob-morrison Jun 14, 2021
be637ed
giving the correct answer a cheat word
jacob-morrison Jun 14, 2021
b536468
use local feature cache
jacob-morrison Jun 14, 2021
c4298b1
logging cache stuff
jacob-morrison Jun 14, 2021
934e3c0
different local feature cache dir
jacob-morrison Jun 14, 2021
c69688e
switching to cheat box
jacob-morrison Jun 14, 2021
272bece
bug
jacob-morrison Jun 15, 2021
e6deec5
something up with some boxes
jacob-morrison Jun 15, 2021
aa49f73
no cheating and no hard negatives
jacob-morrison Jun 15, 2021
21e88bb
seeing is a really big batch size works
jacob-morrison Jun 15, 2021
0117436
bug
jacob-morrison Jun 15, 2021
f8efae6
testing 64 bs
jacob-morrison Jun 15, 2021
6f5928d
batch size 32
jacob-morrison Jun 16, 2021
3e55f5f
batch size 48
jacob-morrison Jun 16, 2021
463bf76
full training with 32 batch size no hard negatives
jacob-morrison Jun 16, 2021
672a17a
more gradient accumulation steps
jacob-morrison Jun 16, 2021
dfa2de2
trying to train with 10% of the data
jacob-morrison Jun 16, 2021
7a23436
fix
jacob-morrison Jun 16, 2021
b561eb4
bumping up the learning rate, don't correct bias
jacob-morrison Jun 16, 2021
87c89ec
gradient accumulation + hard negatives
jacob-morrison Jun 16, 2021
0fd4516
use local feature cache
jacob-morrison Jun 16, 2021
53986a9
changing params back
jacob-morrison Jun 16, 2021
0b0f981
trying real validation
jacob-morrison Jun 16, 2021
613301a
no hard negatives
jacob-morrison Jun 16, 2021
70499f1
hard negatives and not real validation
jacob-morrison Jun 16, 2021
0f0265d
no hard negatives + real validation
jacob-morrison Jun 16, 2021
444d720
calc hn
jacob-morrison Jun 17, 2021
46c463c
Merge branch 'main' into flickr30k
jacob-morrison Jun 17, 2021
48a0733
fixing predictors
jacob-morrison Jun 17, 2021
23226bc
fix
jacob-morrison Jun 17, 2021
22b1b21
fix
jacob-morrison Jun 17, 2021
4b37a5a
fix
jacob-morrison Jun 17, 2021
cfd49ef
fix
jacob-morrison Jun 17, 2021
6384cc0
cleaning up PR (in progress)
jacob-morrison Jun 17, 2021
3cee710
cleaning things up
jacob-morrison Jun 17, 2021
e9f8fc3
more cleanup
jacob-morrison Jun 17, 2021
a80c3b5
change warmup steps
jacob-morrison Jun 17, 2021
477c08a
only validate every ~5 epochs
jacob-morrison Jun 18, 2021
98621ce
printing shapes
jacob-morrison Jun 18, 2021
aa1d898
more logging
jacob-morrison Jun 18, 2021
0e6d474
fix log
jacob-morrison Jun 18, 2021
9acfbca
try cat instead of stack
jacob-morrison Jun 18, 2021
a479ac6
different logging
jacob-morrison Jun 18, 2021
9a49477
test
jacob-morrison Jun 18, 2021
60f0b40
fix
jacob-morrison Jun 18, 2021
828941c
try batches per epoch
jacob-morrison Jun 18, 2021
0ed2543
bug
jacob-morrison Jun 18, 2021
5188f62
get rid of log statement
jacob-morrison Jun 18, 2021
dae7a6d
use local feature cache
jacob-morrison Jun 18, 2021
2149c9e
log
jacob-morrison Jun 18, 2021
5fbf701
logging cache miss
jacob-morrison Jun 18, 2021
acee3e0
switching back to old captions to use cache
jacob-morrison Jun 18, 2021
2111b23
switching back to preprocesing captions
jacob-morrison Jun 18, 2021
33be300
using nfs
jacob-morrison Jun 18, 2021
50ca0f1
Disabling hard negatives to test epoch strat
jacob-morrison Jun 18, 2021
e89fc12
not logging cache misses
jacob-morrison Jun 18, 2021
d8df10a
write to local cache (faster)
jacob-morrison Jun 18, 2021
4ac2969
epoch multiplier
jacob-morrison Jun 18, 2021
08e8ec5
no hard negatives
jacob-morrison Jun 18, 2021
4964b5a
hard negatives
jacob-morrison Jun 18, 2021
3f15731
lowering number of warmup steps
jacob-morrison Jun 21, 2021
a633273
no hard negatives
jacob-morrison Jun 21, 2021
4e6f908
hard negatives
jacob-morrison Jun 21, 2021
910363d
no hard negatives
jacob-morrison Jun 21, 2021
ffda296
hard negatives
jacob-morrison Jun 21, 2021
d78db79
Trying Jiasen's featurizer (1x epoch mult)
jacob-morrison Jun 22, 2021
02c12ae
null image stuff
jacob-morrison Jun 22, 2021
9e2eb35
null image
jacob-morrison Jun 22, 2021
61c4e8e
don't featurize captions (no hn)
jacob-morrison Jun 22, 2021
8a6bb98
adding vilbert ir model tests
jacob-morrison Jun 22, 2021
1d09d3e
cleanup + test distributed
jacob-morrison Jun 23, 2021
4fe0504
cleanup + dist
jacob-morrison Jun 23, 2021
4e97d15
test distributed
jacob-morrison Jun 23, 2021
906abbb
don't use shard_iterable
jacob-morrison Jun 23, 2021
95dc84d
fix feature dir
jacob-morrison Jun 23, 2021
28aa075
changelog
jacob-morrison Jun 23, 2021
b051df1
reformat
jacob-morrison Jun 23, 2021
811ece9
log shapes
jacob-morrison Jun 23, 2021
9abf2d8
Merge branch 'main' into flickr30k
jacob-morrison Jun 23, 2021
9fbf419
removing unused vars
jacob-morrison Jun 23, 2021
fe015e9
Merge branch 'flickr30k' of https:/allenai/allennlp-model…
jacob-morrison Jun 23, 2021
3e7d5f9
using old features
jacob-morrison Jun 23, 2021
9d80319
style
jacob-morrison Jun 23, 2021
6a9687f
lint
jacob-morrison Jun 23, 2021
2e39017
lint
jacob-morrison Jun 23, 2021
0b30dc7
don't log shapes
jacob-morrison Jun 23, 2021
5fb3cd3
lint
jacob-morrison Jun 23, 2021
e6a82a5
fixing type
jacob-morrison Jun 23, 2021
7e7e594
debug
jacob-morrison Jun 23, 2021
0e6f6bd
changing test files to hopefully fix test
jacob-morrison Jun 23, 2021
2064842
using cloud link for data dir
jacob-morrison Jun 23, 2021
7aa4cd8
cleanup
jacob-morrison Jun 23, 2021
6ad1608
delete print
jacob-morrison Jun 24, 2021
c7750fc
comment
jacob-morrison Jun 24, 2021
e3e7dd9
cleanup
jacob-morrison Jun 24, 2021
08d728c
fixing test assert
jacob-morrison Jun 24, 2021
543c9cf
committing a bunch of fixes
jacob-morrison Jun 24, 2021
3e13848
not distributed
jacob-morrison Jun 24, 2021
a6e441f
fixing metrics
jacob-morrison Jun 24, 2021
484249e
Adding test files + upping max instances
jacob-morrison Jun 25, 2021
2abc947
fixes
jacob-morrison Jun 25, 2021
74a31d2
Switching back to nfs cache
jacob-morrison Jun 25, 2021
4fab969
renaming n
jacob-morrison Jun 25, 2021
0d96ecf
update comment
jacob-morrison Jun 25, 2021
3dcc086
Merge branch 'main' into flickr30k
jacob-morrison Jun 25, 2021
83b7f28
fix
jacob-morrison Jun 25, 2021
7b9f9a6
Merge branch 'flickr30k' of https:/allenai/allennlp-model…
jacob-morrison Jun 25, 2021
6d27686
making test deterministic?
jacob-morrison Jun 25, 2021
a248158
sorting files to hopefully achieve consistency
jacob-morrison Jun 25, 2021
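
The commit "O(1) algorithm for finding a random number with one exception" appears in the list above, but its code is not visible on this page because the `flickr30k.py` diff is not rendered. The following is a minimal sketch of one common way to do this, assuming the goal is to draw an index uniformly from `range(n)` while skipping a single excluded index (for example, the index of the correct caption). The function name `random_index_except` is hypothetical and is not taken from the PR.

```python
import random


def random_index_except(n: int, excluded: int, rng: random.Random) -> int:
    """Draw uniformly from range(n) minus {excluded} with a single random call."""
    if n < 2:
        raise ValueError("need at least two candidates to exclude one")
    if not 0 <= excluded < n:
        raise ValueError("excluded index must lie in range(n)")
    # Sample from the n - 1 allowed slots, then shift past the excluded index.
    candidate = rng.randrange(n - 1)
    return candidate + 1 if candidate >= excluded else candidate


rng = random.Random(0)
samples = [random_index_except(5, 2, rng) for _ in range(1000)]
```

The shift avoids a rejection loop, so the cost is one random draw regardless of `n`.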
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -13,6 +13,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Added `StanfordSentimentTreeBankDatasetReader.apply_token_indexers()` to add token_indexers rather than in `text_to_instance`
- Added `AdversarialBiasMitigator` tests.
- Added `adversarial-binary-gender-bias-mitigated-roberta-snli` model.
- Added support for Flickr30k image retrieval, including a dataset reader, a model, and a training config.

### Fixed

1 change: 1 addition & 0 deletions allennlp_models/vision/dataset_readers/__init__.py
@@ -4,3 +4,4 @@
from allennlp_models.vision.dataset_readers.vgqa import VGQAReader
from allennlp_models.vision.dataset_readers.vqav2 import VQAv2Reader
from allennlp_models.vision.dataset_readers.visual_entailment import VisualEntailmentReader
from allennlp_models.vision.dataset_readers.flickr30k import Flickr30kReader
480 changes: 480 additions & 0 deletions allennlp_models/vision/dataset_readers/flickr30k.py

Large diffs are not rendered by default.

6 changes: 4 additions & 2 deletions allennlp_models/vision/dataset_readers/vision_reader.py
@@ -96,11 +96,13 @@ def __init__(
         max_instances: Optional[int] = None,
         image_processing_batch_size: int = 8,
         write_to_cache: bool = True,
+        manual_distributed_sharding: bool = True,
+        manual_multiprocess_sharding: bool = True,
     ) -> None:
         super().__init__(
             max_instances=max_instances,
-            manual_distributed_sharding=True,
-            manual_multiprocess_sharding=True,
+            manual_distributed_sharding=manual_distributed_sharding,
+            manual_multiprocess_sharding=manual_multiprocess_sharding,
         )

# tokenizers and indexers
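
The hunk above turns `manual_distributed_sharding` and `manual_multiprocess_sharding` into constructor arguments instead of hard-coding them to `True`. When such a flag is on, the reader itself is responsible for giving each worker a disjoint slice of the data. Below is a stdlib-only sketch of round-robin sharding under that assumption. The helper name `shard` is hypothetical, and AllenNLP's own helper may differ in detail.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def shard(items: Iterable[T], worker_rank: int, num_workers: int) -> Iterator[T]:
    """Yield every num_workers-th item, starting at worker_rank (round-robin)."""
    if not 0 <= worker_rank < num_workers:
        raise ValueError("worker_rank must be in range(num_workers)")
    return islice(items, worker_rank, None, num_workers)


# Image ids taken from the test fixture split file in this PR.
image_ids = ["6338542128", "4945942737", "1", "2", "3"]
shards = [list(shard(image_ids, rank, 2)) for rank in range(2)]
```

With two workers, the first receives items 0, 2 and 4 and the second receives items 1 and 3, so no instance is read twice.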
1 change: 1 addition & 0 deletions allennlp_models/vision/models/__init__.py
@@ -1,6 +1,7 @@
from allennlp_models.vision.models.nlvr2 import Nlvr2Model
from allennlp_models.vision.models.vision_text_model import VisionTextModel
from allennlp_models.vision.models.visual_entailment import VisualEntailmentModel
from allennlp_models.vision.models.vilbert_image_retrieval import ImageRetrievalVilbert
from allennlp_models.vision.models.vilbert_vqa import VqaVilbert
from allennlp_models.vision.models.heads.vqa_head import VqaHead
from allennlp_models.vision.models.heads.visual_entailment_head import VisualEntailmentHead
138 changes: 138 additions & 0 deletions allennlp_models/vision/models/vilbert_image_retrieval.py
@@ -0,0 +1,138 @@
import logging
from typing import Dict

from overrides import overrides
import torch

from allennlp.data import TextFieldTensors, Vocabulary
from allennlp.models.model import Model
from allennlp.modules.transformer import (
    TransformerEmbeddings,
    ImageFeatureEmbeddings,
    BiModalEncoder,
)
from allennlp.training.metrics import CategoricalAccuracy
from torch.nn import CrossEntropyLoss

from allennlp_models.vision.models.vision_text_model import VisionTextModel

logger = logging.getLogger(__name__)


@Model.register("vilbert_ir")
@Model.register("vilbert_ir_from_huggingface", constructor="from_huggingface_model_name")
class ImageRetrievalVilbert(VisionTextModel):
    """
    Model for the image retrieval task, based on the ViLBERT paper.

    # Parameters

    vocab : `Vocabulary`
    text_embeddings : `TransformerEmbeddings`
    image_embeddings : `ImageFeatureEmbeddings`
    encoder : `BiModalEncoder`
    pooled_output_dim : `int`
    fusion_method : `str`, optional (default = `"mul"`)
    dropout : `float`, optional (default = `0.1`)
    k : `int`, optional (default = `1`)
    """

    def __init__(
        self,
        vocab: Vocabulary,
        text_embeddings: TransformerEmbeddings,
        image_embeddings: ImageFeatureEmbeddings,
        encoder: BiModalEncoder,
        pooled_output_dim: int,
        fusion_method: str = "mul",
        dropout: float = 0.1,
        k: int = 1,
        *,
        ignore_text: bool = False,
        ignore_image: bool = False,
    ) -> None:
        super().__init__(
            vocab,
            text_embeddings,
            image_embeddings,
            encoder,
            pooled_output_dim,
            fusion_method,
            dropout,
            is_multilabel=False,
            ignore_text=ignore_text,
            ignore_image=ignore_image,
        )
        # One score per (caption, image) pair.
        self.classifier = torch.nn.Linear(pooled_output_dim, 1)

        self.top_1_acc = CategoricalAccuracy()
        self.top_5_acc = CategoricalAccuracy(top_k=5)
        self.top_10_acc = CategoricalAccuracy(top_k=10)
        self.loss = CrossEntropyLoss()

        self.k = k

    @overrides
    def forward(
        self,  # type: ignore
        box_features: torch.Tensor,
        box_coordinates: torch.Tensor,
        box_mask: torch.Tensor,
        caption: TextFieldTensors,
        label: torch.Tensor,
    ) -> Dict[str, torch.Tensor]:
        batch_size = box_features.shape[0]

        if self.training:
            # Shape: (batch_size, num_images, pooled_output_dim)
            pooled_output = self.backbone(box_features, box_coordinates, box_mask, caption)[
                "pooled_boxes_and_text"
            ]

            # Shape: (batch_size, num_images)
            logits = self.classifier(pooled_output).squeeze(-1)
            probs = torch.softmax(logits, dim=-1)
        else:
            # Same computation as above, without tracking gradients.
            with torch.no_grad():
                # Shape: (batch_size, num_images, pooled_output_dim)
                pooled_output = self.backbone(box_features, box_coordinates, box_mask, caption)[
                    "pooled_boxes_and_text"
                ]

                # Shape: (batch_size, num_images)
                logits = self.classifier(pooled_output).squeeze(-1)
                probs = torch.softmax(logits, dim=-1)

        outputs = {"logits": logits, "probs": probs}
        outputs = self._compute_loss_and_metrics(batch_size, outputs, label)
        return outputs

    @overrides
    def _compute_loss_and_metrics(
        self,
        batch_size: int,
        outputs: Dict[str, torch.Tensor],
        labels: torch.Tensor,
    ) -> Dict[str, torch.Tensor]:
        # Note: `CrossEntropyLoss` already averages over the batch by default
        # (`reduction="mean"`), so this division rescales the mean loss again.
        outputs["loss"] = self.loss(outputs["logits"], labels) / batch_size
        self.top_1_acc(outputs["logits"], labels)
        self.top_5_acc(outputs["logits"], labels)
        self.top_10_acc(outputs["logits"], labels)
        return outputs

    @overrides
    def get_metrics(self, reset: bool = False) -> Dict[str, float]:
        return {
            "top_1_acc": self.top_1_acc.get_metric(reset),
            "top_5_acc": self.top_5_acc.get_metric(reset),
            "top_10_acc": self.top_10_acc.get_metric(reset),
        }

    @overrides
    def make_output_human_readable(
        self, output_dict: Dict[str, torch.Tensor]
    ) -> Dict[str, torch.Tensor]:
        return output_dict

    default_predictor = "vilbert_ir"
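
The model above produces one logit per candidate image, so `logits` has shape `(batch_size, num_images)` and `label` holds the index of the correct image. The sketch below works through the same idea for a single caption using only the standard library. The helper names `cross_entropy` and `in_top_k` are hypothetical. They illustrate what a softmax cross-entropy loss and a top-k accuracy check compute conceptually, and are not the PyTorch or AllenNLP implementations.

```python
import math
from typing import List


def cross_entropy(logits: List[float], label: int) -> float:
    """Negative log-softmax probability of the correct candidate."""
    # Subtract the max logit for numerical stability before exponentiating.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[label]


def in_top_k(logits: List[float], label: int, k: int) -> bool:
    """True if the correct candidate is among the k highest-scoring ones."""
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return label in ranked[:k]


# One caption scored against four candidate images; index 2 is the true image.
scores = [0.5, 2.0, 1.5, -1.0]
loss = cross_entropy(scores, label=2)
hit_at_1 = in_top_k(scores, label=2, k=1)
hit_at_2 = in_top_k(scores, label=2, k=2)
```

Here the true image has the second-highest score, so it counts as a miss for top-1 accuracy and a hit for top-2.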
1 change: 1 addition & 0 deletions allennlp_models/vision/predictors/__init__.py
@@ -1,2 +1,3 @@
from allennlp_models.vision.predictors.vilbert_ir import VilbertImageRetrievalPredictor
from allennlp_models.vision.predictors.vilbert_vqa import VilbertVqaPredictor
from allennlp_models.vision.predictors.visual_entailment import VisualEntailmentPredictor
40 changes: 40 additions & 0 deletions allennlp_models/vision/predictors/vilbert_ir.py
@@ -0,0 +1,40 @@
from typing import List, Dict

from overrides import overrides
import numpy

from allennlp.common.file_utils import cached_path
from allennlp.common.util import JsonDict
from allennlp.data import Instance
from allennlp.data.fields import LabelField
from allennlp.predictors.predictor import Predictor


@Predictor.register("vilbert_ir")
class VilbertImageRetrievalPredictor(Predictor):
    def predict(self, image: str, caption: str) -> JsonDict:
        image = cached_path(image)
        return self.predict_json({"caption": caption, "image": image})

    @overrides
    def _json_to_instance(self, json_dict: JsonDict) -> Instance:
        # Imported here rather than at module level.
        from allennlp_models.vision.dataset_readers.flickr30k import Flickr30kReader

        caption = json_dict["caption"]
        image = cached_path(json_dict["image"])
        if isinstance(self._dataset_reader, Flickr30kReader):
            return self._dataset_reader.text_to_instance(caption, image, use_cache=False)
        else:
            raise ValueError(
                f"Dataset reader is of type {self._dataset_reader.__class__.__name__}. "
                f"Expected {Flickr30kReader.__name__}."
            )

    @overrides
    def predictions_to_labeled_instances(
        self, instance: Instance, outputs: Dict[str, numpy.ndarray]
    ) -> List[Instance]:
        new_instance = instance.duplicate()
        # The predicted label is the index of the highest-probability image.
        label = numpy.argmax(outputs["probs"])
        new_instance.add_field("label", LabelField(int(label), skip_indexing=True))
        return [new_instance]
80 changes: 80 additions & 0 deletions test_fixtures/vision/flickr30k/experiment.jsonnet
@@ -0,0 +1,80 @@
local model_name = "epwalsh/bert-xsmall-dummy";

{
  "dataset_reader": {
    "type": "flickr30k",
    "image_dir": "test_fixtures/vision/images/flickr30k",
    "data_dir": "test_fixtures/vision/flickr30k/sentences",
    "image_loader": "torch",
    "image_featurizer": "null",
    "featurize_captions": false,
    "region_detector": {
      "type": "random",
      "seed": 322
    },
    "tokenizer": {
      "type": "pretrained_transformer",
      "model_name": model_name
    },
    "token_indexers": {
      "tokens": {
        "type": "pretrained_transformer",
        "model_name": model_name
      }
    }
  },
  "train_data_path": "test_fixtures/vision/flickr30k/tiny-dev.txt",
  "validation_data_path": "test_fixtures/vision/flickr30k/tiny-dev.txt",
  "model": {
    "type": "vilbert_ir",
    "text_embeddings": {
      "vocab_size": 250,
      "embedding_size": 20,
      "pad_token_id": 0,
      "max_position_embeddings": 512,
      "type_vocab_size": 2,
      "dropout": 0.0
    },
    "image_embeddings": {
      "feature_size": 10,
      "embedding_size": 200
    },
    "encoder": {
      # text
      "hidden_size1": 20,
      "num_hidden_layers1": 1,
      "intermediate_size1": 40,
      "num_attention_heads1": 1,
      "attention_dropout1": 0.1,
      "hidden_dropout1": 0.1,
      "biattention_id1": [0, 1],
      "fixed_layer1": 0,

      # vision
      "hidden_size2": 200,
      "num_hidden_layers2": 1,
      "intermediate_size2": 50,
      "num_attention_heads2": 1,
      "attention_dropout2": 0.0,
      "hidden_dropout2": 0.0,
      "biattention_id2": [0, 1],
      "fixed_layer2": 0,

      "combined_num_attention_heads": 2,
      "combined_hidden_size": 200,
      "activation": "gelu",
    },
    "pooled_output_dim": 100,
    "fusion_method": "sum",
  },
  "data_loader": {
    "batch_size": 4
  },
  "trainer": {
    "optimizer": {
      "type": "huggingface_adamw",
      "lr": 0.00005
    },
    "num_epochs": 1,
  }
}
60 changes: 60 additions & 0 deletions test_fixtures/vision/flickr30k/experiment_from_huggingface.jsonnet
@@ -0,0 +1,60 @@
local model_name = "epwalsh/bert-xsmall-dummy";
{
  "dataset_reader": {
    "type": "flickr30k",
    "image_dir": "test_fixtures/vision/images/flickr30k",
    "data_dir": "test_fixtures/vision/flickr30k/sentences",
    "image_loader": "torch",
    "image_featurizer": "null",
    "featurize_captions": false,
    "region_detector": {
      "type": "random",
      "seed": 322
    },
    "tokenizer": {
      "type": "pretrained_transformer",
      "model_name": model_name
    },
    "token_indexers": {
      "tokens": {
        "type": "pretrained_transformer",
        "model_name": model_name
      }
    }
  },
  "train_data_path": "test_fixtures/vision/flickr30k/tiny-dev.txt",
  "validation_data_path": "test_fixtures/vision/flickr30k/tiny-dev.txt",
  "model": {
    "type": "vilbert_ir_from_huggingface",
    "model_name": model_name,
    "image_feature_dim": 10,
    "image_num_hidden_layers": 1,
    "image_hidden_size": 200,
    "image_num_attention_heads": 1,
    "image_intermediate_size": 50,
    "image_attention_dropout": 0.0,
    "image_hidden_dropout": 0.0,
    "image_biattention_id": [0, 1],
    "image_fixed_layer": 0,

    "text_biattention_id": [0, 1],
    "text_fixed_layer": 0,

    "combined_hidden_size": 200,
    "combined_num_attention_heads": 4,

    "pooled_output_dim": 100,
    "fusion_method": "sum",
    "pooled_dropout": 0.0,
  },
  "data_loader": {
    "batch_size": 32
  },
  "trainer": {
    "optimizer": {
      "type": "huggingface_adamw",
      "lr": 0.00005
    },
    "num_epochs": 1,
  }
}
5 changes: 5 additions & 0 deletions test_fixtures/vision/flickr30k/sentences/1.txt
@@ -0,0 +1,5 @@
[/EN#221796/people A girl] with [/EN#221804/bodyparts brown hair] sits on [/EN#221799/scene the edge of a cement area] [/EN#221798/scene overlooking water] .
[/EN#221796/people A woman] in [/EN#221797/clothing black] , seen from [/EN#221800/other behind] , sits next to [/EN#221798/scene a body of water] .
[/EN#221796/people A girl] sitting outside on [/EN#221799/other concrete] near [/EN#221798/scene water] in [/EN#221797/clothing a black dress] .
[/EN#221796/people A small girl] sits on [/EN#221799/other a ledge] by [/EN#221798/scene the water] contemplating [/EN#221802/other life] .
[/EN#221796/people A dark-haired girl] is sitting on [/EN#221798/scene the waters edge] .
5 changes: 5 additions & 0 deletions test_fixtures/vision/flickr30k/sentences/2.txt
@@ -0,0 +1,5 @@
[/EN#221796/people A girl] with [/EN#221804/bodyparts brown hair] sits on [/EN#221799/scene the edge of a concrete area] [/EN#221798/scene overlooking water] .
[/EN#221796/people A woman] in [/EN#221797/clothing black] , seen from [/EN#221800/other behind] , sits by [/EN#221798/scene a body of water] .
[/EN#221796/people A girl] sitting outside on [/EN#221799/other cement] near [/EN#221798/scene water] in [/EN#221797/clothing a black dress] .
[/EN#221796/people A small girl] sits on [/EN#221799/other an edge] by [/EN#221798/scene the water] contemplating [/EN#221802/other life] .
[/EN#221796/people A dark-haired girl] is sitting next to [/EN#221798/scene the waters edge] .
5 changes: 5 additions & 0 deletions test_fixtures/vision/flickr30k/sentences/3.txt
@@ -0,0 +1,5 @@
[/EN#221796/people A girl] without [/EN#221804/bodyparts brown hair] sits on [/EN#221799/scene the edge of a cement area] [/EN#221798/scene overlooking water] .
[/EN#221796/people A woman] wearing [/EN#221797/clothing black] , seen from [/EN#221800/other behind] , sits next to [/EN#221798/scene a body of water] .
[/EN#221796/people A girl] sitting inside on [/EN#221799/other concrete] near [/EN#221798/scene water] in [/EN#221797/clothing a black dress] .
[/EN#221796/people A small girl] sits on top of [/EN#221799/other a ledge] by [/EN#221798/scene the water] contemplating [/EN#221802/other life] .
[/EN#221796/people A dark-haired girl] is sitting by [/EN#221798/scene the waters edge] .
5 changes: 5 additions & 0 deletions test_fixtures/vision/flickr30k/sentences/4945942737.txt
@@ -0,0 +1,5 @@
[/EN#221796/people A girl] with [/EN#221804/bodyparts brown hair] sits on [/EN#221799/scene the edge of a cement area] [/EN#221798/scene overlooking water] .
[/EN#221796/people A woman] in [/EN#221797/clothing black] , seen from [/EN#221800/other behind] , sits next to [/EN#221798/scene a body of water] .
[/EN#221796/people A girl] sitting outside on [/EN#221799/other concrete] near [/EN#221798/scene water] in [/EN#221797/clothing a black dress] .
[/EN#221796/people A small girl] sits on [/EN#221799/other a ledge] by [/EN#221798/scene the water] contemplating [/EN#221802/other life] .
[/EN#221796/people A dark-haired girl] is sitting on [/EN#221798/scene the waters edge] .
5 changes: 5 additions & 0 deletions test_fixtures/vision/flickr30k/sentences/6338542128.txt
@@ -0,0 +1,5 @@
On [/EN#253080/scene a sunny , dry day] , wearing [/EN#253081/other full football gear] , [/EN#253069/people a Texas A&M football player] tries to reach [/EN#253070/people an Iowa State football player] , for [/EN#253072/other the football] during [/EN#253078/other the game] .
[/EN#253070/people An offensive player] running with [/EN#253077/other a football] while [/EN#253069/people a football player] tries to stop [/EN#0/notvisual him] during [/EN#253071/other a football game] .
[/EN#253069/people A football player] from [/EN#253074/scene Iowa State blocks] [/EN#253069/people a player] from [/EN#253075/other Texas A&M] from taking [/EN#253072/other the football] from [/EN#0/notvisual him] .
[/EN#253070/scene The Iowa State football player blocks] [/EN#253068/people a Texas A&M defenseman] while running with [/EN#253072/other the ball] .
[/EN#253073/other # 8] for [/EN#253083/bodyparts Iowa State stiff arms] [/EN#253069/people a Texas AM player] attempting to tackle [/EN#0/notvisual him] .
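
The fixture sentence files above mark each entity as `[/EN#<id>/<type> <phrase>]`. Since the `flickr30k.py` reader diff is not rendered on this page, the following is only a sketch of how such a line could be turned into a plain caption with a regular expression. The names `ENTITY`, `strip_annotations` and `entities` are hypothetical, and the reader in the PR may parse these files differently.

```python
import re
from typing import List, Tuple

# Matches "[/EN#<entity id>/<entity type> <phrase>]" as seen in the fixture files.
ENTITY = re.compile(r"\[/EN#(\d+)/(\S+) ([^\]]+)\]")


def strip_annotations(line: str) -> str:
    """Replace each bracketed entity with its phrase, leaving a plain caption."""
    return ENTITY.sub(lambda match: match.group(3), line).strip()


def entities(line: str) -> List[Tuple[str, str, str]]:
    """Return (entity_id, entity_type, phrase) for every annotation in the line."""
    return ENTITY.findall(line)


line = "[/EN#221796/people A dark-haired girl] is sitting on [/EN#221798/scene the waters edge] ."
caption = strip_annotations(line)
found = entities(line)
```

Keeping the entity tuples alongside the stripped caption preserves the ids and types in case a later step needs them.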
5 changes: 5 additions & 0 deletions test_fixtures/vision/flickr30k/test.txt
@@ -0,0 +1,5 @@
6338542128
4945942737
1
2
3