ML VISUAL LAB / ML Methods
Transformer Attention
Intuition
Enter your own sentence and inspect Q, K, V, scaled dot-product, softmax, and the attention matrix.
Core Formula
Attention(Q,K,V) = softmax(QK^T / sqrt(d))V
When to use it?
Language processing and modern generative AI models.