Research
My current research focuses on multimodal large language models, with a particular emphasis on improving their perceptual and spatial reasoning capabilities.
I am also interested in developing efficient online video models and making vision-language models more applicable in real-world scenarios.
Previously, my research centered on geometric graph neural networks, especially for simulating complex physical systems.
Representative papers are highlighted.
Last updated: April 26, 2026
Template from Jon Barron. Big thanks!