Biaffine parser does not work with tok2vec backend #10527

Closed
zsozso21 opened this issue Mar 19, 2022 · 3 comments
Labels
experimental Experimental components and features

Comments

@zsozso21

I set up the experimental biaffine parser based on this example, and it works well with the transformer-based architecture. However, I got the following error when I tried to use it with a tok2vec model on CPU:

Traceback (most recent call last):
  File "/home/a100/zsozso/deploy/.venv/bin/spacy", line 8, in <module>
    sys.exit(setup_cli())
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/cli/_util.py", line 71, in setup_cli
    command(prog_name=COMMAND)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/typer/main.py", line 500, in wrapper
    return callback(**use_params)  # type: ignore
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/cli/train.py", line 45, in train_cli
    train(config_path, output_path, use_gpu=use_gpu, overrides=overrides)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/cli/train.py", line 75, in train
    train_nlp(nlp, output_path, use_gpu=use_gpu, stdout=sys.stdout, stderr=sys.stderr)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/training/loop.py", line 122, in train
    raise e
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/training/loop.py", line 105, in train
    for batch, info, is_best_checkpoint in training_step_iterator:
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/training/loop.py", line 203, in train_while_improving
    nlp.update(
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/language.py", line 1156, in update
    proc.update(examples, sgd=None, losses=losses, **component_cfg[name])  # type: ignore
  File "spacy_experimental/biaffine_parser/arc_predicter.pyx", line 220, in spacy_experimental.biaffine_parser.arc_predicter.ArcPredicter.update
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/thinc/layers/chain.py", line 60, in backprop
    dX = callback(dY)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/thinc/layers/pytorchwrapper.py", line 139, in backprop
    dXtorch = torch_backprop(dYtorch)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/thinc/shims/pytorch.py", line 105, in backprop
    grads.kwargs["grad_tensors"] = self._grad_scaler.scale(
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/thinc/shims/pytorch_grad_scaler.py", line 97, in scale
    self._scale_tensor(tensor, scale_per_device, inplace)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/thinc/shims/pytorch_grad_scaler.py", line 110, in _scale_tensor
    assert tensor.is_cuda, "Gradient scaling is only supported for CUDA tensors"
AssertionError: Gradient scaling is only supported for CUDA tensors

And I got this error with GPU:

Traceback (most recent call last):
  File "/home/a100/zsozso/deploy/.venv/bin/spacy", line 8, in <module>
    sys.exit(setup_cli())
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/cli/_util.py", line 71, in setup_cli
    command(prog_name=COMMAND)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/typer/main.py", line 500, in wrapper
    return callback(**use_params)  # type: ignore
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/cli/train.py", line 45, in train_cli
    train(config_path, output_path, use_gpu=use_gpu, overrides=overrides)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/cli/train.py", line 72, in train
    nlp = init_nlp(config, use_gpu=use_gpu)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/training/initialize.py", line 84, in init_nlp
    nlp.initialize(lambda: train_corpus(nlp), sgd=optimizer)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/language.py", line 1286, in initialize
    init_vocab(
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/training/initialize.py", line 131, in init_vocab
    load_vectors_into_model(nlp, vectors)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/training/initialize.py", line 152, in load_vectors_into_model
    vectors_nlp = load_model(name, vocab=nlp.vocab, exclude=exclude)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/util.py", line 422, in load_model
    return load_model_from_path(Path(name), **kwargs)  # type: ignore[arg-type]
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/util.py", line 489, in load_model_from_path
    return nlp.from_disk(model_path, exclude=exclude, overrides=overrides)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/language.py", line 2042, in from_disk
    util.from_disk(path, deserializers, exclude)  # type: ignore[arg-type]
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/util.py", line 1299, in from_disk
    reader(path / key)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/language.py", line 2018, in deserialize_vocab
    self.vocab.from_disk(path, exclude=exclude)
  File "spacy/vocab.pyx", line 460, in spacy.vocab.Vocab.from_disk
  File "spacy/vectors.pyx", line 616, in spacy.vectors.Vectors.from_disk
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/spacy/util.py", line 1299, in from_disk
    reader(path / key)
  File "spacy/vectors.pyx", line 602, in spacy.vectors.Vectors.from_disk.load_vectors
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/cupy/_io/npz.py", line 71, in load
    return cupy.array(obj)
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/cupy/_creation/from_data.py", line 41, in array
    return _core.array(obj, dtype, copy, order, subok, ndmin)
  File "cupy/_core/core.pyx", line 2165, in cupy._core.core.array
  File "cupy/_core/core.pyx", line 2244, in cupy._core.core.array
  File "cupy/_core/core.pyx", line 2318, in cupy._core.core._send_object_to_gpu
  File "cupy/_core/core.pyx", line 167, in cupy._core.core.ndarray.__init__
  File "cupy/cuda/memory.pyx", line 718, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1395, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1416, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1096, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1117, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
  File "cupy/cuda/memory.pyx", line 1332, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
  File "cupy/cuda/memory.pyx", line 1067, in cupy.cuda.memory.SingleDeviceMemoryPool._alloc
  File "/home/a100/zsozso/deploy/.venv/lib/python3.8/site-packages/thinc/backends/_cupy_allocators.py", line 52, in cupy_pytorch_allocator
    torch_tensor = torch.zeros((size_in_bytes // 4,), requires_grad=False)

How to reproduce the behaviour

I used the following config:

[paths]
# We need to define these variables in order to override them through `spacy train`
init_tok2vec = null
vectors = null
train = null
dev = null

[system]
gpu_allocator = "pytorch"
seed = 0

[nlp]
lang = "hu"
pipeline = ["tok2vec","tagger","morphologizer","senter","experimental_arc_predicter","experimental_arc_labeler"]
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
batch_size = 256

[components]

[components.senter]
factory = "senter"

[components.senter.model]
@architectures = "spacy.Tagger.v1"
nO = null

[components.senter.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"

[components.morphologizer]
factory = "morphologizer"

[components.morphologizer.model]
@architectures = "spacy.Tagger.v1"
nO = null

[components.morphologizer.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"

[components.experimental_arc_labeler]
factory = "experimental_arc_labeler"

[components.experimental_arc_labeler.model]
@architectures = "spacy-experimental.Bilinear.v1"
hidden_width = 128
mixed_precision = true

[components.experimental_arc_labeler.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"

[components.experimental_arc_predicter]
factory = "experimental_arc_predicter"

[components.experimental_arc_predicter.model]
@architectures = "spacy-experimental.PairwiseBilinear.v1"
hidden_width = 256
nO = 1
mixed_precision = true

[components.experimental_arc_predicter.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"

[components.tagger]
factory = "tagger"

[components.tagger.model]
@architectures = "spacy.Tagger.v1"
nO = null

[components.tagger.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"

[components.tok2vec]
factory = "tok2vec"

[components.tok2vec.model]
@architectures = "spacy.Tok2Vec.v2"

[components.tok2vec.model.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = ${components.tok2vec.model.encode.width}
attrs = ["LOWER","PREFIX","SUFFIX","SHAPE"]
rows = [5000,2500,2500,2500]
include_static_vectors = true

[components.tok2vec.model.encode]
@architectures = "spacy.MaxoutWindowEncoder.v2"
width = 300
depth = 4
window_size = 2
maxout_pieces = 5

[corpora]

[corpora.dev]
@readers = "spacy.Corpus.v1"
path = ${paths.dev}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}
max_length = 2000
gold_preproc = false
limit = 0
augmenter = null

[training]
dev_corpus = "corpora.dev"
train_corpus = "corpora.train"
seed = ${system.seed}
gpu_allocator = ${system.gpu_allocator}
dropout = 0.1
accumulate_gradient = 1
patience = 1600
max_epochs = 0
max_steps = 20000
eval_frequency = 200
frozen_components = []
before_to_disk = null
annotating_components = ["senter"]

[training.batcher]
@batchers = "spacy.batch_by_words.v1"
discard_oversize = false
tolerance = 0.2
get_length = null

[training.batcher.size]
@schedules = "compounding.v1"
start = 100
stop = 1000
compound = 1.001
t = 0.0

[training.logger]
@loggers = "spacy.WandbLogger.v2"
project_name = "Exp-parser"

[training.optimizer]
@optimizers = "Adam.v1"
beta1 = 0.9
beta2 = 0.999
L2_is_weight_decay = true
L2 = 0.01
grad_clip = 1.0
use_averages = true
eps = 0.00000001

[training.optimizer.learn_rate]
@schedules = "warmup_linear.v1"
warmup_steps = 250
total_steps = 20000
initial_rate = 0.00005


[training.score_weights]
tag_acc = 0.2
pos_acc = 0.2
morph_acc = 0.2
morph_per_feat = null
dep_uas = 0.0
dep_las = 0.2
dep_las_per_type = null
bound_dep_uas = 0.0
bound_dep_las = 0.0
sents_p = null
sents_r = null
sents_f = 0.2

[pretraining]

[initialize]
vectors = ${paths.vectors}
init_tok2vec = ${paths.init_tok2vec}
vocab_data = null
lookups = null
before_init = null
after_init = null

[initialize.components]

[initialize.tokenizer]

Your Environment

@danieldk
Contributor

Thanks for the report!

The first error occurs because mixed precision is not supported on CPU: as the assertion in the trace says, gradient scaling only works for CUDA tensors. You could change both instances of

mixed_precision = true

to

mixed_precision = false

in the configuration. Admittedly, this is not very convenient. I'll look into disabling mixed precision altogether when running on CPU, which would probably be nicer than the current assertion.
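
For anyone who wants to apply that change without editing the file by hand, here is a minimal sketch that works on the config as plain text. The helper name `disable_mixed_precision` is hypothetical and not part of spaCy or spacy-experimental; it only assumes that the setting appears on its own line as `mixed_precision = true`, as it does in the config above.

```python
import re

# Matches a `mixed_precision = true` assignment on a single line.
# Only spaces and tabs are allowed around the tokens so that the
# pattern never swallows a neighbouring newline or blank line.
_MIXED_PRECISION = re.compile(
    r"^([ \t]*mixed_precision[ \t]*=[ \t]*)true[ \t]*$", re.MULTILINE
)


def disable_mixed_precision(config_text: str) -> str:
    """Return config_text with every `mixed_precision = true` set to false.

    Hypothetical helper for illustration only; it rewrites text and does
    not validate the config in any way.
    """
    return _MIXED_PRECISION.sub(r"\g<1>false", config_text)


# The two model blocks from the config in this issue.
example = """\
[components.experimental_arc_labeler.model]
@architectures = "spacy-experimental.Bilinear.v1"
hidden_width = 128
mixed_precision = true

[components.experimental_arc_predicter.model]
@architectures = "spacy-experimental.PairwiseBilinear.v1"
hidden_width = 256
nO = 1
mixed_precision = true
"""

patched = disable_mixed_precision(example)
print(patched)
```

Both blocks come out with `mixed_precision = false` and everything else is left untouched, so the patched text can be written to a separate CPU-only config file.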

I'll have to look into the second error in more detail, though it seems that the trace was not pasted completely?

Fair warning: the accuracy of the biaffine parser is not great yet with a convolutional tok2vec layer. I am also working on a set of changes that improve accuracy quite a bit when training a transformer model.

@svlandeg
Member

Closing as the first issue was addressed with explosion/thinc#624. If you still run into issues with the second error, feel free to open a new issue with the full stack trace!

@github-actions
Contributor

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 29, 2022