李映萱 / リインシュアン

Yingxuan LI

Machine Learning Engineer · Ph.D.

I am a Machine Learning Engineer at CyberAgent in Tokyo, Japan. I received my Ph.D. from the University of Tokyo under the supervision of Prof. Yusuke Matsui. My work focuses on computer vision, vision-language models, and multimodal learning, with a particular interest in comic/manga understanding.

Research Interests

Languages

I am currently open to ML Engineer, Research Engineer, and Research Scientist opportunities in computer vision and multimodal AI.

Selected Publications

Region-Wise Correspondence Prediction between Manga Line Art Images

Yingxuan Li, Jiafeng Mao, Qianru Qiu, and Yusuke Matsui

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026

Paper Project

Noisy Label Refinement with Semantically Reliable Synthetic Images

Yingxuan Li, Jiafeng Mao, and Yusuke Matsui

IEEE International Conference on Image Processing (ICIP), 2025

Paper Project

Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion

Yingxuan Li, Ryota Hinami, Kiyoharu Aizawa, and Yusuke Matsui

ACM International Conference on Multimedia (ACM MM), Oral, 2024

Paper Project

Manga109Dialog: A Large-scale Dialogue Dataset for Comics Speaker Detection

Yingxuan Li, Kiyoharu Aizawa, and Yusuke Matsui

IEEE International Conference on Multimedia and Expo (ICME), Oral, 2024

Paper Project Dataset

Research & Industry Experience

CyberAgent, Inc.

Machine Learning Engineer · AI Localization Center

Apr 2026 - Present · Tokyo, Japan

Develop multimodal AI systems for manga understanding and localization
Design translation evaluation frameworks combining LLM-based evaluation with human review
Collaborate with researchers and engineers to bridge research prototypes and production systems

CyberAgent, Inc.

Research Intern · AI Lab

Feb 2025 - Jan 2026 · Tokyo, Japan

Conducted research on visual representation learning for structured image understanding
Designed a deep learning framework for region-wise correspondence prediction between manga line-art images
Published the resulting work at CVPR 2026

Mantra Inc.

Research Intern

Jun 2023 - Mar 2024 · Tokyo, Japan

Conducted research on multimodal reasoning and vision-language understanding
Developed a training-free method integrating visual, textual, and contextual information
Published the resulting work at ACM Multimedia 2024, where it was selected as an Oral paper

Education

The University of Tokyo

Ph.D. in Information Science and Technology

Advisor: Prof. Yusuke Matsui

Apr 2023 - Mar 2026 · Tokyo, Japan

The University of Tokyo

Master of Information Science and Technology

Advisor: Prof. Yusuke Matsui

Apr 2021 - Mar 2023 · Tokyo, Japan

Tianjin University

Bachelor of Engineering, Communication Engineering

Advisor: Prof. Yan Xu

Sep 2016 - Jun 2020 · Tianjin, China

Suzhou Yucai Scholarship, 2018
Tianjin People's Government Scholarship, 2017
Merit Student of Tianjin University, 2017, 2018
Excellent Communist Youth League Cadre of Tianjin University, 2017, 2018

Yokohama National University

Undergraduate Exchange Student

Advisor: Prof. Ryuji Kohno

Apr 2019 - Aug 2019 · Kanagawa, Japan

Other Publications

Region-Level Correspondence Prediction between Manga Line Art Images

Yingxuan Li, Qianru Qiu, Jiafeng Mao, and Yusuke Matsui

Meeting on Image Recognition and Understanding (MIRU), 2025

Noisy Label Refinement via Reference-Guided Synthetic Images

Yingxuan Li, Jiafeng Mao, and Yusuke Matsui

Meeting on Image Recognition and Understanding (MIRU), 2025

Zero-Shot Character Identification and Speaker Prediction in Comics Using Iterative Multimodal Fusion (in Japanese)

Yingxuan Li, Ryota Hinami, Kiyoharu Aizawa, and Yusuke Matsui

Symposium for Young Researchers in Natural Language Processing (YANS), 2024

Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion

Yingxuan Li, Ryota Hinami, Kiyoharu Aizawa, and Yusuke Matsui

Meeting on Image Recognition and Understanding (MIRU), 2024

Manga109Dialog: A Large-scale Dialogue Dataset for Comics Speaker Detection (in Japanese)

Yingxuan Li, Kiyoharu Aizawa, and Yusuke Matsui

Comic Computing Research Group Meeting, 2023

Comic Speaker Detection and Estimation using Scene Graph

Yingxuan Li, Kiyoharu Aizawa, and Yusuke Matsui

Technical Meeting on Pattern Recognition and Media Understanding (PRMU), 2023

Speaker Detection of Comics using Scene Graph

Yingxuan Li and Yusuke Matsui

Meeting on Image Recognition and Understanding (MIRU), 2022

Invited Talks

Speaker Identification for Manga Dialogue by Fusing Large Language Models and Multimodal Data (in Japanese)

Comic Computing Symposium 2024

Nov 2024

Manga109Dialog: A Large-scale Dialogue Dataset for Comics Speaker Detection

Huawei Tokyo Research Center Discussion Dinner at MIRU 2023

Jul 2023