-
Jun 6, 2026
[ICML 2026] Rule2DRC
We introduce Rule2DRC, a large open benchmark for translating natural-language chip design rules into executable DRC scripts, graded by execution on held-out layouts, and SplitTester, an execution-guided test generation agent that improves Best-of-N selection.
-
Jun 27, 2025
[Misc] Vibe coding Talk-to-President service with Lovable AI
A quick retrospective on building a small personal AI chatbot service with voice cloning in just a few hours using Lovable AI.
-
May 25, 2025
[ICML 2025] GuidedQuant
We propose GuidedQuant, a novel quantization approach that integrates gradient information from the end loss into the layer-wise quantization objective. Additionally, we introduce LNQ, a non-uniform scalar quantization algorithm which is guaranteed to monotonically decrease the quantization objective value.
-
Mar 11, 2025
[Review] Multi-head Latent Attention
This post reviews Multi-head Self-attention (MHA), Grouped-Query Attention (GQA), and Multi-head Latent Attention (MLA).
-
Jul 11, 2024
[ICML 2024] LayerMerge
We propose LayerMerge, a novel depth compression method that selects which activation and convolution layers to remove in order to achieve a desired inference speed-up while minimizing performance loss.
-
Sep 11, 2023
[ICML 2023] Efficient CNN Depth Compression
We propose a subset selection problem that replaces inefficient activation layers with identity functions and optimally merges consecutive convolution operations into shallow equivalent convolution operations to reduce inference latency.