Attention Softmax

\mathrm{Softmax}(z_i) = \frac{e^{z_i / T}}{\sum_j e^{z_j / T}}
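The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `softmax_with_temperature` is hypothetical, and subtracting the maximum before exponentiating is an added numerical-stability step that the formula itself does not mention (it cancels in the ratio, so the result is unchanged).

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Return the softmax of logits / temperature as a list of floats."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Assumption: shift by the max for numerical stability. This is not part
    # of the formula; the shift cancels between numerator and denominator.
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# The four example logits used in this section, at T = 1.
weights = softmax_with_temperature([3.5, 1.0, -2.0, 0.5], temperature=1.0)
print([round(w, 4) for w in weights])
```

The shift matters mainly for large logits or small temperatures, where `math.exp` would otherwise overflow.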

Controls

Temperature T = 1.00

A large T flattens the weights toward a uniform distribution; a small T sharpens them toward the largest logit.

z1 = 3.5
z2 = 1.0
z3 = -2.0
z4 = 0.5
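To see how the temperature changes the outcome for these four logits, the sketch below sweeps a few values of T and records the largest weight at each. The temperatures chosen and the helper name `softmax_t` are illustrative assumptions, not values taken from the demo.

```python
import math

def softmax_t(logits, temperature):
    """Plain softmax of logits / temperature (no stability shift)."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [3.5, 1.0, -2.0, 0.5]
temperatures = [0.25, 1.0, 4.0, 100.0]  # illustrative choices only

top_weights = {}
for t in temperatures:
    weights = softmax_t(logits, t)
    top_weights[t] = max(weights)  # weight of the dominant logit (z1)
    print(f"T={t:>6}: " + "  ".join(f"{w:.3f}" for w in weights))
```

As T grows, the top weight falls toward 1/4 (uniform over four entries); as T shrinks, it approaches 1.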

Transformation Pipeline

1. Scaled Logits

z_i / T

z1: 3.5
z2: 1.0
z3: -2.0
z4: 0.5

2. Exponentiate

e^{z_i / T}

e1: 33.12
e2: 2.72
e3: 0.14
e4: 1.65

3. Softmax

Normalized Weights

w1: 88%
w2: 7%
w3: 0%
w4: 4%

Σ = 1.0
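The three pipeline stages can be reproduced step by step. The sketch below assumes T = 1.0 and the four logits shown earlier; the variable names are chosen for illustration.

```python
import math

T = 1.0
logits = {"z1": 3.5, "z2": 1.0, "z3": -2.0, "z4": 0.5}

# Step 1: scale each logit by the temperature.
scaled = {k: z / T for k, z in logits.items()}

# Step 2: exponentiate the scaled logits.
exps = {k: math.exp(s) for k, s in scaled.items()}

# Step 3: normalize so the weights sum to 1.
total = sum(exps.values())
weights = {k: e / total for k, e in exps.items()}

for k in logits:
    print(f"{k}: scaled={scaled[k]:.1f}  exp={exps[k]:.2f}  weight={weights[k]:.0%}")
```

The displayed percentages are rounded, so they add up to 99% even though the unrounded weights sum to 1.0; the weight for z3 is about 0.4%, which rounds to 0%.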