Seungwoo Son

Email | Scholar | LinkedIn

Machine Learning Engineer @ Samsung Research

I am a Machine Learning Engineer at Samsung Research (AI System Team). Previously, I worked at Google (CoreML Team) as a Student Researcher Intern.

My research interests lie in Model Compression and On-Device Personalization. Recently, I have been exploring ways to internalize retrieval-augmented generation (e.g., GraphRAG) on edge devices for personalized AI. I have experience developing quantization methods that significantly reduce model size and latency while maintaining accuracy.

Work Experience

Machine Learning Engineer, Samsung Research (AI System Team) Oct. 2024 - Present
Developing quantized models for Galaxy edge devices. Reduced model size by 75% and latency by 30%.
Student Researcher Intern, Google (CoreML Team) Aug. 2023 - Jul. 2024
Implemented advanced quantization methods for LLMs, achieving a 50% improvement in zero-shot accuracy.
Graduate Research Assistant, POSTECH Mar. 2022 - Jul. 2024
Researched neural network compression techniques (KD, Quantization).

Education

Pohang University of Science and Technology (POSTECH) Mar. 2022 - Aug. 2024
M.S. in Electrical Engineering
Inha University Mar. 2016 - Feb. 2022
B.S. in Electronic Engineering (Total GPA: 4.33/4.5, Major GPA: 4.4/4.5, Summa Cum Laude)

Publications

TurboBoA: Faster and Exact Attention Aware Quantization without Backpropagation

Junhan Kim, Yeo Jeong Park, Seungwoo Son, Chungman Lee, Ho-young Kim, Joonyoung Kim, Yongkweon Jeon | ICLR 2026 (Submitted)

Proposed a backpropagation-free quantization algorithm that achieves a 4x speedup over state-of-the-art methods by jointly quantizing multiple output channels and correcting propagated distortions, delivering superior accuracy in low-bit regimes.

Work done at Samsung Research

On the Importance of a Multiscale Calibration for Quantization

Seungwoo Son, Junhan Kim, Ingyu Seong, Hyemi Jang, Yongkweon Jeon | ICASSP 2026 (Submitted)

Introduced MaCa, a length-aware calibration method that incorporates multiscale sequence-length information into Hessian estimation to improve quantization accuracy for variable-length inputs in LLMs.

Work done at Samsung Research

Two Stage Grid Optimization for Groupwise Quantization of LLMs

Junhan Kim, Seungwoo Son, Jeewook Kim, Gukryeol Lee, Yongkweon Jeon | ICASSP 2026 (Submitted)

Developed a two-stage optimization strategy for groupwise quantization that initializes group scales from input statistics and refines them via closed-form coordinate descent, efficiently minimizing the layerwise reconstruction loss.

Work done at Samsung Research

Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization

Seungwoo Son, Wonpyo Park, Woohyun Han, Kyuyeun Kim, Jaeho Lee | EMNLP 2024

Revealed that prepending attention sink tokens mitigates activation outliers in LLMs by absorbing massive attention scores, enabling effective activation quantization.

[Paper] | Work done at Google

The Role of Masking for Efficient Supervised Knowledge Distillation of Vision Transformers

Seungwoo Son, Jegwang Ryu, Namhoon Lee, Jaeho Lee | ECCV 2024, ICLR 2023 Workshop on Sparsity in Neural Networks

Developed a cost-efficient distillation framework for Vision Transformers by masking input tokens to the teacher.

[Paper] [Code] | Work done at POSTECH

DSP: Distill The Knowledge Only By A Subset of Patches

Seungwoo Son, Jaeho Lee | IPIU 2023 (Oral)

Investigated a method for efficiently extracting model knowledge using only a subset of image patches.

[Link] | Best Paper Award | Work done at POSTECH

Invited Talks

Naver-Intel Joint Lab Workshop: Lightweighting for Hyperscale AI, Jun. 2024
Conference Info

Academic Services

Reviewer: ACL 2026, EACL 2026, ACL 2025, EMNLP 2025

Honors & Awards

Best M.S. Dissertation Award, POSTECH (Feb. 2025)
IPIU Best Paper Award, Korea Computer Vision Society (Feb. 2023)
National Science and Engineering Undergraduate Scholarship, Ministry of Science and ICT (Mar. 2020)

Technical Skills

Languages: C/C++, Python
Frameworks & Tools: PyTorch, JAX, Git, Overleaf