Commit 50d5f51

Merge pull request #10 from AbdelStark/claude/polish-frontend-ui-kAaW5
feat(web): polish frontend UI for visual excellence and accessibility
2 parents 1b1e7be + eddda63 commit 50d5f51

5 files changed

Lines changed: 792 additions & 143 deletions

web-demo/index.html

Lines changed: 73 additions & 52 deletions
@@ -3,26 +3,40 @@
 <head>
 <meta charset="UTF-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+<meta name="description" content="Interactive demo of Attention Residuals — replacing fixed residual connections with learned softmax attention over depth. Built with Rust + WASM." />
+<meta name="theme-color" content="#2563eb" media="(prefers-color-scheme: light)" />
+<meta name="theme-color" content="#60a5fa" media="(prefers-color-scheme: dark)" />
 <title>Attention Residuals — Interactive Demo</title>
 <link rel="preconnect" href="https://fonts.googleapis.com" />
 <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
 <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500&family=Source+Serif+4:ital,wght@0,400;0,600;0,700;1,400&display=swap" rel="stylesheet" />
 <link rel="stylesheet" href="/src/style.css" />
 </head>
 <body>
+<!-- Skip to content for keyboard users -->
+<a href="#demo" class="skip-link">Skip to interactive demo</a>
+
 <!-- ─── Navigation ──────────────────────────────────────────── -->
-<nav class="nav">
+<nav class="nav" role="navigation" aria-label="Main navigation">
 <div class="nav-inner">
-<a href="#" class="nav-logo">
-<span class="nav-logo-symbol">&#x03B1;</span>
+<a href="#top" class="nav-logo" aria-label="AttnRes — back to top">
+<span class="nav-logo-symbol" aria-hidden="true">&#x03B1;</span>
 <span>AttnRes</span>
 </a>
-<div class="nav-links">
-<a href="#problem">Problem</a>
-<a href="#algorithm">Algorithm</a>
-<a href="#demo">Live Demo</a>
-<a href="#training">Training</a>
-<a href="https://github.com/AbdelStark/attnres-rs" target="_blank" rel="noopener">GitHub</a>
+<button class="nav-toggle" aria-expanded="false" aria-controls="nav-links" aria-label="Toggle navigation menu">
+<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" aria-hidden="true">
+<line x1="3" y1="6" x2="21" y2="6" />
+<line x1="3" y1="12" x2="21" y2="12" />
+<line x1="3" y1="18" x2="21" y2="18" />
+</svg>
+</button>
+<div class="nav-links" id="nav-links" role="list">
+<a href="#problem" role="listitem">Problem</a>
+<a href="#algorithm" role="listitem">Algorithm</a>
+<a href="#demo" role="listitem">Live Demo</a>
+<a href="#training" role="listitem">Training</a>
+<a href="#comparison" role="listitem">Comparison</a>
+<a href="https://github.com/AbdelStark/attnres-rs" target="_blank" rel="noopener" role="listitem">GitHub</a>
 </div>
 </div>
 </nav>
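The commit adds the hamburger `nav-toggle` button with `aria-expanded`/`aria-controls` wiring, but the matching click handler lives in `/src/main.ts`, which is not part of this diff. A minimal sketch of what that handler presumably does, with the state transition factored into a pure helper (the name `nextNavState` and the `open` class are hypothetical, not from the commit):

```typescript
// Hypothetical toggle logic implied by the new nav-toggle button.
// Pure helper so the ARIA state and the menu visibility stay in sync.
function nextNavState(currentExpanded: string | null): {
  expanded: "true" | "false";
  menuOpen: boolean;
} {
  // A missing or non-"true" attribute counts as closed, so the first
  // click opens the menu.
  const open = currentExpanded !== "true";
  return { expanded: open ? "true" : "false", menuOpen: open };
}

// Browser-only wiring (sketch):
// const btn = document.querySelector<HTMLButtonElement>(".nav-toggle")!;
// btn.addEventListener("click", () => {
//   const s = nextNavState(btn.getAttribute("aria-expanded"));
//   btn.setAttribute("aria-expanded", s.expanded);
//   document.getElementById("nav-links")!.classList.toggle("open", s.menuOpen);
// });
```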
@@ -40,8 +54,8 @@ <h1 class="hero-title">Attention Residuals</h1>
 Paper: <em>Attention as a Hypernetwork</em> (MoonshotAI / Kimi) &middot;
 Implementation: <code>attnres-rs</code> (burn framework)
 </p>
-<div class="hero-status" id="wasm-status">
-<span class="status-dot loading"></span>
+<div class="hero-status" id="wasm-status" role="status" aria-live="polite">
+<span class="status-dot loading" aria-hidden="true"></span>
 <span>Loading WASM engine&hellip;</span>
 </div>
 </div>
@@ -57,7 +71,7 @@ <h2>The Problem with Standard Residuals</h2>
 <p>
 In standard Transformers, the residual connection is a simple addition:
 </p>
-<div class="equation">
+<div class="equation" role="math" aria-label="h sub l plus 1 equals h sub l plus F sub l of h sub l">
 h<sub>l+1</sub> = h<sub>l</sub> + F<sub>l</sub>(h<sub>l</sub>)
 </div>
 <p>
@@ -84,7 +98,7 @@ <h2>The Problem with Standard Residuals</h2>
 <div class="col-viz">
 <div class="diagram" id="standard-residual-diagram">
 <div class="diagram-title">Standard Residual</div>
-<canvas id="canvas-standard" width="320" height="400"></canvas>
+<canvas id="canvas-standard" width="320" height="400" aria-label="Diagram showing standard residual connections with equal +1 weights between layers"></canvas>
 <div class="diagram-caption">
 All layers contribute equally (weight = 1).
 <br />No selectivity over depth.
@@ -101,74 +115,74 @@ <h2>The Problem with Standard Residuals</h2>
 <div class="section-label">02</div>
 <h2>Attention Residuals: The Algorithm</h2>
 
-<div class="algo-steps">
-<div class="algo-step">
-<div class="algo-step-num">1</div>
+<div class="algo-steps" role="list">
+<div class="algo-step" role="listitem">
+<div class="algo-step-num" aria-hidden="true">1</div>
 <div class="algo-step-content">
 <h3>Stack block representations</h3>
 <p>
 Collect all completed block sums <strong>b<sub>0</sub>, &hellip;, b<sub>n-1</sub></strong>
 plus the current partial block into a value matrix.
 </p>
-<div class="equation">
+<div class="equation" role="math">
 V = [b<sub>0</sub>; b<sub>1</sub>; &hellip;; b<sub>n</sub><sup>(partial)</sup>] &ensp;&isin;&ensp; &Ropf;<sup>(N+1) &times; D</sup>
 </div>
 </div>
 </div>
 
-<div class="algo-step">
-<div class="algo-step-num">2</div>
+<div class="algo-step" role="listitem">
+<div class="algo-step-num" aria-hidden="true">2</div>
 <div class="algo-step-content">
 <h3>Normalize keys with RMSNorm</h3>
 <p>
 Prevent large-magnitude blocks from dominating attention logits.
 Without this, deeper blocks (which accumulate more layer outputs)
 would receive disproportionate weight.
 </p>
-<div class="equation">
+<div class="equation" role="math">
 K = RMSNorm(V) = (V / &radic;mean(V&sup2;)) &middot; &gamma;
 </div>
 </div>
 </div>
 
-<div class="algo-step">
-<div class="algo-step-num">3</div>
+<div class="algo-step" role="listitem">
+<div class="algo-step-num" aria-hidden="true">3</div>
 <div class="algo-step-content">
 <h3>Compute depth attention logits</h3>
 <p>
 A learned pseudo-query <strong>w<sub>l</sub></strong> &isin; &Ropf;<sup>D</sup>
 scores each block. Crucially, w is <strong>initialized to zero</strong> &mdash;
 ensuring the model starts as a standard residual and smoothly transitions.
 </p>
-<div class="equation">
+<div class="equation" role="math">
 logits<sub>i</sub> = K<sub>i</sub> &middot; w<sub>l</sub> &ensp;&ensp; &forall; i &isin; {0, &hellip;, N}
 </div>
 </div>
 </div>
 
-<div class="algo-step">
-<div class="algo-step-num">4</div>
+<div class="algo-step" role="listitem">
+<div class="algo-step-num" aria-hidden="true">4</div>
 <div class="algo-step-content">
 <h3>Softmax over <em>depth</em></h3>
 <p>
 The softmax is taken <strong>over the block/depth dimension</strong>, not the
 sequence dimension. This is attention over <em>layers</em>, not over <em>tokens</em>.
 </p>
-<div class="equation">
+<div class="equation" role="math">
 &alpha;<sub>i</sub> = softmax(logits)<sub>i</sub> = exp(logits<sub>i</sub>) / &sum;<sub>j</sub> exp(logits<sub>j</sub>)
 </div>
 </div>
 </div>
 
-<div class="algo-step">
-<div class="algo-step-num">5</div>
+<div class="algo-step" role="listitem">
+<div class="algo-step-num" aria-hidden="true">5</div>
 <div class="algo-step-content">
 <h3>Weighted combination</h3>
 <p>
 The output is a learned convex combination of all block representations.
 Each layer can choose exactly how much information to draw from each depth.
 </p>
-<div class="equation">
+<div class="equation" role="math">
 h = &sum;<sub>i</sub> &alpha;<sub>i</sub> &middot; V<sub>i</sub>
 </div>
 </div>
 </div>
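The five steps described in this hunk can be checked outside the WASM engine. A minimal TypeScript sketch (plain arrays; helper names are mine, γ fixed to 1, with a small epsilon added for numerical stability) that also demonstrates the zero-init property from step 3 — a zero pseudo-query gives all-zero logits, so the softmax is uniform and the output is the plain average of the blocks:

```typescript
// Sketch of the AttnRes combine step for one sublayer (assumes gamma = 1).
// V: (N+1) block representations, each of dimension D.
// w: learned pseudo-query; zero-initialized => uniform depth weights.

function rmsNorm(v: number[]): number[] {
  const meanSq = v.reduce((s, x) => s + x * x, 0) / v.length;
  const scale = 1 / Math.sqrt(meanSq + 1e-8); // eps for stability
  return v.map((x) => x * scale);
}

function softmax(logits: number[]): number[] {
  const m = Math.max(...logits); // subtract max for stability
  const exps = logits.map((x) => Math.exp(x - m));
  const z = exps.reduce((s, x) => s + x, 0);
  return exps.map((x) => x / z);
}

function attnResCombine(V: number[][], w: number[]): number[] {
  // Step 2: normalize keys so large-magnitude blocks don't dominate.
  const K = V.map(rmsNorm);
  // Step 3: depth logits via the pseudo-query w.
  const logits = K.map((k) => k.reduce((s, x, i) => s + x * w[i], 0));
  // Step 4: softmax over the depth dimension (one weight per block).
  const alpha = softmax(logits);
  // Step 5: convex combination of the block representations.
  const D = V[0].length;
  const h = new Array<number>(D).fill(0);
  alpha.forEach((a, i) => {
    for (let d = 0; d < D; d++) h[d] += a * V[i][d];
  });
  return h;
}

// Zero-init pseudo-query: every block gets weight 1/(N+1),
// so the output is the average of the blocks.
const V = [[1, 0], [0, 1], [3, 3]];
const h0 = attnResCombine(V, [0, 0]); // both components ≈ 4/3
```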
@@ -198,18 +212,18 @@ <h2>Interactive: Core AttnRes Operation</h2>
 <div class="demo-panel">
 <div class="demo-controls">
 <div class="control-group">
-<label>Model Configuration</label>
-<div class="control-row">
+<label id="config-label">Model Configuration</label>
+<div class="control-row" role="group" aria-labelledby="config-label">
 <div class="control">
-<span class="control-label">d_model</span>
+<label class="control-label" for="cfg-d-model">d_model</label>
 <select id="cfg-d-model">
 <option value="16">16</option>
 <option value="32" selected>32</option>
 <option value="64">64</option>
 </select>
 </div>
 <div class="control">
-<span class="control-label">Layers (sublayers)</span>
+<label class="control-label" for="cfg-layers">Layers (sublayers)</label>
 <select id="cfg-layers">
 <option value="4">4</option>
 <option value="8" selected>8</option>
@@ -218,14 +232,14 @@ <h2>Interactive: Core AttnRes Operation</h2>
 </select>
 </div>
 <div class="control">
-<span class="control-label">Blocks</span>
+<label class="control-label" for="cfg-blocks">Blocks</label>
 <select id="cfg-blocks">
 <option value="2" selected>2</option>
 <option value="4">4</option>
 </select>
 </div>
 <div class="control">
-<span class="control-label">Heads</span>
+<label class="control-label" for="cfg-heads">Heads</label>
 <select id="cfg-heads">
 <option value="2">2</option>
 <option value="4" selected>4</option>
@@ -237,12 +251,13 @@ <h2>Interactive: Core AttnRes Operation</h2>
 </div>
 
 <div class="control-group" id="query-controls" style="display:none">
-<label>Pseudo-Query Magnitude</label>
+<label for="query-magnitude">Pseudo-Query Magnitude</label>
 <p class="control-hint">
 Drag the slider to simulate w<sub>l</sub> evolving away from zero during training.
 </p>
-<input type="range" id="query-magnitude" min="0" max="100" value="0" class="slider" />
-<div class="slider-labels">
+<input type="range" id="query-magnitude" min="0" max="100" value="0" class="slider"
+aria-valuemin="0" aria-valuemax="1" aria-valuenow="0" aria-valuetext="0.00 (uniform)" />
+<div class="slider-labels" aria-hidden="true">
 <span>0.0 (uniform)</span>
 <span id="query-mag-display">0.00</span>
 <span>1.0 (selective)</span>
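The reworked slider keeps a raw `0–100` input value while announcing a `0.00–1.00` magnitude through `aria-valuetext` and mirroring it into `#query-mag-display`. A sketch of that mapping (the function name is hypothetical; the real logic lives in `/src/main.ts`, not shown in this diff):

```typescript
// Hypothetical helper: map the range input's raw value (0-100) to the
// pseudo-query magnitude (0.00-1.00) plus an aria-valuetext string that
// matches the "0.0 (uniform)" / "1.0 (selective)" slider labels.
interface SliderState {
  magnitude: number;     // value handed to the engine
  display: string;       // text for #query-mag-display
  ariaValueText: string; // mirrored into aria-valuetext
}

function sliderToMagnitude(raw: number): SliderState {
  const clamped = Math.min(100, Math.max(0, raw));
  const magnitude = clamped / 100;
  const display = magnitude.toFixed(2);
  const hint =
    magnitude === 0 ? " (uniform)" : magnitude === 1 ? " (selective)" : "";
  return { magnitude, display, ariaValueText: display + hint };
}

console.log(sliderToMagnitude(0).ariaValueText); // "0.00 (uniform)"
console.log(sliderToMagnitude(37).display);      // "0.37"
```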
@@ -258,7 +273,7 @@ <h2>Interactive: Core AttnRes Operation</h2>
 <div class="result-card result-card-wide">
 <div class="result-card-header">Depth Attention Weights</div>
 <div class="result-card-body">
-<canvas id="canvas-heatmap" width="800" height="300"></canvas>
+<canvas id="canvas-heatmap" width="800" height="300" aria-label="Heatmap showing depth attention weights across sublayers and source blocks"></canvas>
 </div>
 <div class="result-card-footer">
 Rows: sublayers (Attn/MLP at each transformer layer). Columns: source blocks.
@@ -268,7 +283,7 @@ <h2>Interactive: Core AttnRes Operation</h2>
 <div class="result-card">
 <div class="result-card-header">Attention Distribution</div>
 <div class="result-card-body">
-<canvas id="canvas-bar" width="400" height="250"></canvas>
+<canvas id="canvas-bar" width="400" height="250" aria-label="Bar chart of attention weight distribution for the deepest sublayer"></canvas>
 </div>
 <div class="result-card-footer">
 At zero init, all sources receive weight 1/N (uniform). Training breaks this symmetry.
@@ -292,16 +307,16 @@ <h2>Training: Watching Patterns Emerge</h2>
 
 <div class="training-panel">
 <div class="training-controls">
-<button class="btn btn-primary" id="btn-train-start" disabled>Start Training</button>
-<button class="btn" id="btn-train-reset" disabled>Reset</button>
-<div class="training-stats">
+<button class="btn btn-primary" id="btn-train-start" disabled aria-label="Start training simulation">Start Training</button>
+<button class="btn" id="btn-train-reset" disabled aria-label="Reset training to initial state">Reset</button>
+<div class="training-stats" role="group" aria-label="Training statistics">
 <div class="stat">
 <span class="stat-label">Step</span>
-<span class="stat-value" id="train-step">0</span>
+<span class="stat-value" id="train-step" aria-live="off">0</span>
 </div>
 <div class="stat">
 <span class="stat-label">Loss</span>
-<span class="stat-value" id="train-loss">&mdash;</span>
+<span class="stat-value" id="train-loss" aria-live="off">&mdash;</span>
 </div>
 </div>
 </div>
@@ -310,13 +325,15 @@ <h2>Training: Watching Patterns Emerge</h2>
 <div class="result-card result-card-wide">
 <div class="result-card-header">Loss Curve</div>
 <div class="result-card-body">
-<canvas id="canvas-loss" width="800" height="200"></canvas>
+<div class="canvas-empty-state" id="loss-empty">Initialize a model and start training to see the loss curve</div>
+<canvas id="canvas-loss" width="800" height="200" style="display:none" aria-label="Training loss curve over steps"></canvas>
 </div>
 </div>
 <div class="result-card result-card-wide">
 <div class="result-card-header">Depth Attention Heatmap (evolving)</div>
 <div class="result-card-body">
-<canvas id="canvas-train-heatmap" width="800" height="300"></canvas>
+<div class="canvas-empty-state" id="heatmap-empty">Depth attention patterns will appear here during training</div>
+<canvas id="canvas-train-heatmap" width="800" height="300" style="display:none" aria-label="Evolving depth attention heatmap during training"></canvas>
 </div>
 <div class="result-card-footer">
 Watch how later layers develop stronger selectivity over depth.
@@ -326,7 +343,8 @@ <h2>Training: Watching Patterns Emerge</h2>
 <div class="result-card result-card-wide">
 <div class="result-card-header">Pseudo-Query Norms ||w<sub>l</sub>||</div>
 <div class="result-card-body">
-<canvas id="canvas-norms" width="800" height="200"></canvas>
+<div class="canvas-empty-state" id="norms-empty">Pseudo-query norm evolution will appear here during training</div>
+<canvas id="canvas-norms" width="800" height="200" style="display:none" aria-label="Multi-line chart of pseudo-query norm evolution per sublayer"></canvas>
 </div>
 <div class="result-card-footer">
 The magnitude of each pseudo-query grows from zero during training.
@@ -347,8 +365,8 @@ <h2>Standard Residual vs. AttnRes</h2>
 <div class="comparison-grid">
 <div class="comparison-card">
 <h3>Standard Residual</h3>
-<div class="equation">h = h<sub>l</sub> + F(h<sub>l</sub>)</div>
-<canvas id="canvas-cmp-standard" width="300" height="200"></canvas>
+<div class="equation" role="math">h = h<sub>l</sub> + F(h<sub>l</sub>)</div>
+<canvas id="canvas-cmp-standard" width="300" height="200" aria-label="Bar chart showing uniform 0.25 weights for standard residual"></canvas>
 <ul>
 <li>Fixed weight = 1 per layer</li>
 <li>No selectivity over depth</li>
@@ -358,8 +376,8 @@ <h3>Standard Residual</h3>
 </div>
 <div class="comparison-card comparison-card-highlight">
 <h3>Attention Residual</h3>
-<div class="equation">h = &sum; &alpha;<sub>i</sub> &middot; b<sub>i</sub></div>
-<canvas id="canvas-cmp-attnres" width="300" height="200"></canvas>
+<div class="equation" role="math">h = &sum; &alpha;<sub>i</sub> &middot; b<sub>i</sub></div>
+<canvas id="canvas-cmp-attnres" width="300" height="200" aria-label="Bar chart showing learned non-uniform weights for attention residual"></canvas>
 <ul>
 <li>Learned weights via softmax</li>
 <li>Selective routing over depth</li>
@@ -393,6 +411,9 @@ <h3>Attention Residual</h3>
 </div>
 </footer>
 
+<!-- Toast notification container -->
+<div class="toast-container" id="toast-container" aria-live="polite" aria-atomic="true"></div>
+
 <script type="module" src="/src/main.ts"></script>
 </body>
 </html>
