RuntimeError: Expected attn_mask dtype to be bool or float or to match query dtype, but got attn_mask.dtype: c10::Half and query.dtype: float instead.

#3
by BiXie - opened

When I run the code `candi_emb_3 = model.encode(text="The Mid-Hudson Bridge was designated as a New York State Historic Civil Engineering Landmark by the American Society of Civil Engineers in 1983. The bridge was renamed the \"Franklin Delano Roosevelt Mid-Hudson Bridge\" in 1994.")`,
I get the error "RuntimeError: Expected attn_mask dtype to be bool or float or to match query dtype, but got attn_mask.dtype: c10::Half and query.dtype: float instead."
But when I run the code with an image, there is no problem. Why is that?
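
For context, the text-only call follows the multi-modal knowledge retrieval example in the FlagEmbedding README. Roughly (the model name and file paths below are placeholders, not my exact setup):

    import torch
    from FlagEmbedding.visual.modeling import Visualized_BGE

    # Load Visualized BGE (the weight path is a local placeholder).
    model = Visualized_BGE(model_name_bge="BAAI/bge-m3",
                           model_weight="path/to/Visualized_m3.pth")
    model.eval()

    with torch.no_grad():
        # Image or image+text inputs encode without error:
        candi_emb_1 = model.encode(image="./imgs/wiki_candi_1.jpg",
                                   text="The Mid-Hudson Bridge, spanning the Hudson River.")
        # A text-only input raises the attn_mask dtype RuntimeError:
        candi_emb_3 = model.encode(text="The Mid-Hudson Bridge was designated as a New York State "
                                        "Historic Civil Engineering Landmark by the American Society "
                                        "of Civil Engineers in 1983.")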

Beijing Academy of Artificial Intelligence org

Hello.
It seems that this issue was fixed before. Are you using the latest code? https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/visual
If the problem still exists, could you provide detailed error information, including the specific line of code where the error occurs?
Thank you.
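
If it helps, a quick way to check which copy of the code Python is actually importing (for example, a pip-installed release versus a local clone of the repository):

    import FlagEmbedding
    # Prints the path of the imported package, so you can confirm whether it is
    # the up-to-date local clone or an older installed release.
    print(FlagEmbedding.__file__)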

Hi @JUNJIE99, I'm also experiencing the same error. Could you please take a look?

I've also opened an issue on GitHub: https://github.com/FlagOpen/FlagEmbedding/issues/1121

@BiXie have you solved it? If you have, could you please share the steps you took to solve it?

@BiXie @JUNJIE99

I've resolved it by setting the dtype to torch.float32 instead of torch.float16 in FlagEmbedding/FlagEmbedding/visual/modeling.py.

Modified function:

    
    def get_extended_attention_mask(
        self, attention_mask: Tensor, input_shape: Tuple[int], device: torch.device = None, dtype: torch.float = torch.float32
    ) -> Tensor:
        """
        Makes broadcastable attention and causal masks so that future and masked tokens are ignored.

        Arguments:
            attention_mask (`torch.Tensor`):
                Mask with ones indicating tokens to attend to, zeros for tokens to ignore.
            input_shape (`Tuple[int]`):
                The shape of the input to the model.

        Returns:
            `torch.Tensor` The extended attention mask, with the same dtype as `attention_mask.dtype`.
        """
        
        # We can provide a self-attention mask of dimensions [batch_size, from_seq_length, to_seq_length]
        # ourselves in which case we just need to make it broadcastable to all heads.
        if attention_mask.dim() == 3:
            extended_attention_mask = attention_mask[:, None, :, :]
        elif attention_mask.dim() == 2:
            # Provided a padding mask of dimensions [batch_size, seq_length]
            # - if the model is a decoder, apply a causal mask in addition to the padding mask
            # - if the model is an encoder, make the mask broadcastable to [batch_size, num_heads, seq_length, seq_length]
            
            extended_attention_mask = attention_mask[:, None, None, :]
        else:
            raise ValueError(
                f"Wrong shape for input_ids (shape {input_shape}) or attention_mask (shape {attention_mask.shape})"
            )

        # Since attention_mask is 1.0 for positions we want to attend and 0.0 for
        # masked positions, this operation will create a tensor which is 0.0 for
        # positions we want to attend and the dtype's smallest value for masked positions.
        # Since we are adding it to the raw scores before the softmax, this is
        # effectively the same as removing these entirely.
        extended_attention_mask = extended_attention_mask.to(dtype=dtype)  # fp16 compatibility
        extended_attention_mask = (1.0 - extended_attention_mask) * torch.finfo(dtype).min
        
        return extended_attention_mask
Beijing Academy of Artificial Intelligence org

Hello. I have rechecked the code, and the issue is caused by the data type used during inference. Your solution is a temporary workaround when using FP32 for inference. However, it may not work if FP16 is used for inference.

I will update the code soon to ensure that inference with any data type does not result in errors.
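
For reference, one dtype-agnostic option (just a sketch of the idea, not the committed fix) would be to stop hard-coding the mask dtype in get_extended_attention_mask and instead derive it from the model's own parameters, e.g. with a dtype=None default:

    # Sketch: take the mask dtype from the model's parameters so the extended
    # attention mask always matches the precision used for inference
    # (FP16 or FP32), instead of a hard-coded torch.float16 / torch.float32.
    if dtype is None:
        dtype = next(self.parameters()).dtype
    extended_attention_mask = extended_attention_mask.to(dtype=dtype)
    extended_attention_mask = (1.0 - extended_attention_mask) * torch.finfo(dtype).min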

Sorry for the inconvenience.

Beijing Academy of Artificial Intelligence org

I have updated the code; it should be fine now.

Sorry, I cannot find where you updated it. Is it in the JUNJIE99-patch-1 branch? The latest commit in that branch was in June.
