MaskedLMHead should support dtype=bfloat16 #1195

g-dspencer · 2023-08-04T17:42:25Z

Describe the bug

I claim that MaskedLMHead should support a dtype argument of tf.bfloat16 (and tf.float16) so that users can look at the effect of reducing their memory usage. This matters more as the vocab gets larger.
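To put a rough number on the vocabulary point, here is a back-of-the-envelope sketch. It assumes the head owns an output kernel of shape (hidden_dim, vocabulary_size) plus a bias of length vocabulary_size; the hidden size of 768 and the vocabulary sizes are illustrative assumptions, not measurements from any model, and `projection_bytes` is a made-up helper name.

```python
# Rough weight memory of a vocabulary projection at different float widths.
# All sizes below are illustrative assumptions, not measured values.
BYTES_PER_ELEMENT = {"float32": 4, "bfloat16": 2, "float16": 2}


def projection_bytes(vocabulary_size, hidden_dim, dtype):
    """Bytes used by a (hidden_dim, vocabulary_size) kernel plus its bias."""
    n_params = hidden_dim * vocabulary_size + vocabulary_size
    return n_params * BYTES_PER_ELEMENT[dtype]


for vocab in (30_000, 250_000):
    for dtype in ("float32", "bfloat16"):
        mib = projection_bytes(vocab, 768, dtype) / 2**20
        print(f"vocab={vocab:>7} {dtype:>8}: {mib:8.1f} MiB")
```

Halving the element width halves the weight memory, and the absolute saving grows linearly with the vocabulary size.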

To Reproduce

In the Google corp Colab, "File -> Save a copy as GitHub Gist" asks for an OTP and then reports "github auth fails", so I'll just include the code inline:

!pip install keras-nlp --upgrade --quiet

import tensorflow as tf
import keras_nlp

# Based on test_valid_call()
# https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/layers/modeling/masked_lm_head_test.py#L25
def test_dtype(dtype):
  head = keras_nlp.layers.MaskedLMHead(
      vocabulary_size=100,
      activation="softmax",
      dtype=dtype, # this is the point
  )
  encoded_tokens = tf.keras.Input(shape=(10, 16))
  positions = tf.keras.Input(shape=(5,), dtype="int32")
  outputs = head(encoded_tokens, mask_positions=positions)
  model = tf.keras.Model((encoded_tokens, positions), outputs)

  token_data = tf.random.uniform(shape=(4, 10, 16))
  position_data = tf.random.uniform(minval=0, maxval=10, shape=(4, 5), dtype=tf.int32)
  model((token_data, position_data))

  for w in head.weights:
      assert w.dtype == dtype, ("Wrong type: " + w.name)
      # When it fails it fails with:
      # TypeError: Input 'y' of 'AddV2' Op has type float16 that does not match type float32 of argument 'x'.

print("float32")
test_dtype(tf.float32) # this works

print("bfloat16")
test_dtype(tf.bfloat16) # this fails

print("float64")
test_dtype(tf.float64)

Expected behavior

No crash.
The loop checking dtypes (assert w.dtype == dtype, ("Wrong type: " + w.name)) should arguably pass, unless we are hitting some subtle case of wanting mixed types.

Additional context

The error I get is:

TypeError: Exception encountered when calling layer "masked_lm_head_1" (type MaskedLMHead).

in user code:

    File "/usr/local/lib/python3.10/dist-packages/keras_nlp/src/layers/modeling/masked_lm_head.py", line 196, in call  *
        outputs = outputs + self._bias

    TypeError: Input 'y' of 'AddV2' Op has type bfloat16 that does not match type float32 of argument 'x'.


Call arguments received by layer "masked_lm_head_1" (type MaskedLMHead):
  • inputs=tf.Tensor(shape=(None, 10, 16), dtype=bfloat16)
  • mask_positions=tf.Tensor(shape=(None, 5), dtype=int32)

and I suspect we need to pass in a few dtype= parameters in the code.
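As a sketch of what passing dtype= through could look like (this is not the keras_nlp code and not the change that eventually landed): `TinyHead` below is a hypothetical layer that forwards its own dtype policy to its sublayer, so the sublayer's output and the bias created with add_weight end up in the same dtype. It assumes a tf.keras version where Layer.dtype_policy is available and where add_weight defaults to the layer's variable dtype.

```python
import tensorflow as tf


class TinyHead(tf.keras.layers.Layer):
    """Hypothetical stand-in for a head layer; not the keras_nlp code."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        # Forward this layer's dtype policy so the sublayer creates its
        # weights and computes in the same dtype as the bias added in build().
        self._dense = tf.keras.layers.Dense(units, dtype=self.dtype_policy)

    def build(self, input_shape):
        # No explicit dtype here: add_weight is assumed to default to the
        # layer's own variable dtype.
        self._bias = self.add_weight(
            name="output_bias",
            shape=(self.units,),
            initializer="zeros",
        )

    def call(self, inputs):
        # Both operands now share a dtype, so the add no longer mixes
        # float32 with bfloat16.
        return self._dense(inputs) + self._bias


head = TinyHead(8, dtype="bfloat16")
out = head(tf.random.uniform((4, 16)))
print(out.dtype, [w.dtype for w in head.weights])
```

If the Dense sublayer were created without dtype=, it would fall back to the global float32 policy and the final add would mix float32 with bfloat16, which is the shape of the error in the traceback above.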

Would you like to help us fix it?
yes


vulkomilev · 2023-08-09T18:58:30Z

Hi, I have developed a solution to your problem,
but it works only for the output kernel and bias:
<dtype: 'bfloat16'> <dtype: 'bfloat16'>
masked_lm_head/output_kernel:0
<dtype: 'bfloat16'> <dtype: 'bfloat16'>
masked_lm_head/output_bias:0
<dtype: 'float32'> <dtype: 'bfloat16'>
masked_lm_head/kernel:0
<dtype: 'float32'> <dtype: 'bfloat16'>
masked_lm_head/bias:0
<dtype: 'float32'> <dtype: 'bfloat16'>
masked_lm_head/gamma:0
<dtype: 'float32'> <dtype: 'bfloat16'>
Is this sufficient?

mattdangerw · 2023-09-21T17:53:04Z

This is fixed in #1242, and we have a test enforcing this for all layers.

g-dspencer added the type:Bug Something isn't working label Aug 4, 2023

This was referenced Aug 4, 2023

Update MaskedLMHead to support dtype=bfloat16/float16/float64. #1196

Closed

Update MaskedLMHead to support dtype=bfloat16/float16/float64 #1197

Draft

mattdangerw closed this as completed Sep 21, 2023
