Multi-Head Attention
MultiHeadAttention
Bases: Module
A class that implements a multi-head attention mechanism. Multi-head attention allows the model to focus on different positions simultaneously, capturing various aspects of the input.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`query_dim` | `int` | The dimensionality of the query. | *required* |
`key_dim` | `int` | The dimensionality of the key. | *required* |
`num_units` | `int` | The total number of dimensions of the output. | *required* |
`num_heads` | `int` | The number of parallel attention layers (multi-heads). | *required* |
Inputs: query and key

- query: Tensor of shape `[N, T_q, query_dim]`
- key: Tensor of shape `[N, T_k, key_dim]`

Outputs

- An output tensor of shape `[N, T_q, num_units]`
Source code in models/tts/delightful_tts/attention/multi_head_attention.py
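A minimal usage sketch, assuming the module is importable from the source path shown above and that the constructor takes exactly the four parameters listed; the import path and tensor sizes below are illustrative:

```python
import torch

# Hypothetical import, inferred from the source path above.
from models.tts.delightful_tts.attention.multi_head_attention import MultiHeadAttention

# Illustrative sizes: batch N=2, query length T_q=10, key length T_k=16.
attention = MultiHeadAttention(query_dim=80, key_dim=80, num_units=128, num_heads=8)

query = torch.randn(2, 10, 80)  # [N, T_q, query_dim]
key = torch.randn(2, 16, 80)    # [N, T_k, key_dim]

output = attention(query, key)  # expected shape: [N, T_q, num_units] == [2, 10, 128]
print(output.shape)
```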
forward(query, key)
Performs the forward pass over input tensors.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`query` | `Tensor` | The input tensor containing query vectors. It is expected to have the dimensions `[N, T_q, query_dim]`, where `N` is the batch size, `T_q` is the sequence length of the queries, and `query_dim` is the dimensionality of a single query vector. | *required* |
`key` | `Tensor` | The input tensor containing key vectors. It is expected to have the dimensions `[N, T_k, key_dim]`, where `N` is the batch size, `T_k` is the sequence length of the keys, and `key_dim` is the dimensionality of a single key vector. | *required* |
Returns:

Type | Description |
---|---|
`Tensor` | `torch.Tensor`: The output tensor of shape `[N, T_q, num_units]`, representing the result of the multi-head attention mechanism applied to the provided queries and keys. |
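The exact computation lives in the source file referenced above. As a rough, shape-compatible sketch of standard multi-head attention (project queries and keys, split into heads, apply scaled dot-product attention, re-assemble the heads), one possible implementation is shown below; the projection layers, the use of `key` for the values, and the scaling factor are assumptions, not necessarily what this class does internally:

```python
import torch
import torch.nn.functional as F
from torch import nn


def multi_head_attention_sketch(query, key, w_q, w_k, w_v, num_heads):
    """Sketch over query [N, T_q, query_dim] and key [N, T_k, key_dim].

    w_q, w_k, w_v are assumed nn.Linear projections onto num_units dimensions;
    the values are derived from `key`, a common choice when no separate value
    tensor is passed to forward().
    """
    q = w_q(query)  # [N, T_q, num_units]
    k = w_k(key)    # [N, T_k, num_units]
    v = w_v(key)    # [N, T_k, num_units]

    # Split num_units across heads: [num_heads * N, T, num_units / num_heads].
    q = torch.cat(torch.chunk(q, num_heads, dim=2), dim=0)
    k = torch.cat(torch.chunk(k, num_heads, dim=2), dim=0)
    v = torch.cat(torch.chunk(v, num_heads, dim=2), dim=0)

    # Scaled dot-product attention, computed independently per head.
    scores = torch.bmm(q, k.transpose(1, 2)) / (k.size(-1) ** 0.5)  # [h*N, T_q, T_k]
    weights = F.softmax(scores, dim=-1)
    out = torch.bmm(weights, v)  # [h*N, T_q, num_units / num_heads]

    # Re-assemble the heads along the feature dimension: [N, T_q, num_units].
    return torch.cat(torch.chunk(out, num_heads, dim=0), dim=2)


# Illustrative check of the output shape documented above.
query_dim, key_dim, num_units, num_heads = 80, 80, 128, 8
w_q = nn.Linear(query_dim, num_units)
w_k = nn.Linear(key_dim, num_units)
w_v = nn.Linear(key_dim, num_units)
out = multi_head_attention_sketch(
    torch.randn(2, 10, query_dim), torch.randn(2, 16, key_dim), w_q, w_k, w_v, num_heads
)
print(out.shape)  # torch.Size([2, 10, 128])
```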