Attention Softmax
$\mathrm{Softmax}(z_i) = \dfrac{e^{z_i/T}}{\sum_j e^{z_j/T}}$
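The formula can be sketched as a small function. This is a minimal illustration rather than any particular library's implementation: the name `softmax` and the `temperature` parameter are assumptions made here. The max-subtraction step is a common numerical safeguard that is not part of the formula itself.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats (illustrative sketch)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtracting the max does not change the result (the common factor
    # cancels in the ratio) but keeps exp() from overflowing on large logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

Calling `softmax([3.5, 1.0, -2.0, 0.5])` returns four weights that sum to 1, with the largest weight on the first logit.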
Controls
Temperature T = 1.00
A large T flattens the weights toward uniform; a small T sharpens them toward the largest logit.
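The effect of the temperature can be checked with a short sweep. The sketch below assumes the four example logits 3.5, 1.0, -2.0 and 0.5; the helper name `softmax_t` and the chosen values of T are illustrative only.

```python
import math

def softmax_t(logits, T):
    """Softmax with temperature T, written directly from the formula."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [3.5, 1.0, -2.0, 0.5]  # example logits z1..z4

# Small T concentrates weight on the largest logit; large T approaches 1/4 each.
for T in (0.25, 1.0, 4.0, 100.0):
    weights = softmax_t(logits, T)
    print(f"T={T:>6}: " + "  ".join(f"{w:.3f}" for w in weights))
```

As T grows, the printed rows move toward four equal weights; as T shrinks, nearly all of the weight lands on z1.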
z1 = 3.5
z2 = 1.0
z3 = -2.0
z4 = 0.5
Transformation Pipeline
1. Scaled Logits
$z_i / T$
z1/T = 3.5
z2/T = 1.0
z3/T = -2.0
z4/T = 0.5
→
2. Exponentiate
$e^{z_i/T}$
e1 = 33.12
e2 = 2.72
e3 = 0.14
e4 = 1.65
→
3. Softmax
Normalized Weights
w1 = 88%
w2 = 7%
w3 = 0%
w4 = 4%
Σ = 1.0
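The three pipeline stages can be reproduced step by step. This is a sketch assuming T = 1.0 and the four example logits; the variable names are chosen here for illustration.

```python
import math

logits = [3.5, 1.0, -2.0, 0.5]  # z1..z4 from the example
T = 1.0                         # temperature (assumed to be 1.0)

# Stage 1: scale each logit by the temperature.
scaled = [z / T for z in logits]

# Stage 2: exponentiate the scaled logits.
exps = [math.exp(s) for s in scaled]

# Stage 3: normalize so the weights sum to 1.
total = sum(exps)
weights = [e / total for e in exps]

for i, (s, e, w) in enumerate(zip(scaled, exps, weights), start=1):
    print(f"z{i}/T = {s:5.2f}   e{i} = {e:6.2f}   w{i} = {w:4.0%}")
print(f"sum of weights = {sum(weights):.1f}")
```

The whole-number percentages add up to 99% because each is rounded individually (w3 is about 0.4% before rounding); the unrounded weights sum to 1.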