Selected Publications


Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation
SIGGRAPH Asia, 2024.
VideoVista: A Versatile Benchmark for Video Understanding and Reasoning
arXiv, 2024.
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
arXiv, 2024.
Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment
ACL 2024 Main Conference.
VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context
ICML, 2024.
LMEye: An Interactive Perception Network for Large Language Models
IEEE Transactions on Multimedia (TMM), 2024.
A Multimodal In-Context Tuning Approach for E-Commerce Product Description Generation
LREC-COLING, 2024.
Towards Vision Enhancing LLMs: Empowering Multimodal Knowledge Storage and Sharing in LLMs
arXiv, 2023.
A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering
Technical Paper, 2023.
Training Multimedia Event Extraction With Generated Images and Captions
ACM International Conference on Multimedia (ACM MM), 2023.
A Neural Divide-and-Conquer Reasoning Framework for Image Retrieval from Linguistically Complex Text
ACL 2023 Main Conference.
A Multi-Modal Context Reasoning Approach for Conditional Inference on Joint Textual and Visual Clues
ACL 2023 Main Conference.
Chunk-aware Alignment and Lexical Constraint for Visual Entailment with Natural Language Explanations
ACM International Conference on Multimedia (ACM MM), 2022.
Medical Dialogue Response Generation with Pivotal Information Recalling
SIGKDD, 2022.
Fast and Robust Online Handwritten Chinese Character Recognition with Deep Spatial & Contextual Information Fusion Network
IEEE Transactions on Multimedia (TMM), 2022.

Research Blog


Training LLMs Towards Holistic Learning
GitHub, June 2023.
Training Language Models From Fragmentation Learning To Holistic Learning.

Service

Conference Reviewer: ACL ARR (2023-), ACM MM (2023-), ICLR (2023-), IJCAI (2023-), and NeurIPS (2024-).
Journal Reviewer: IEEE TMM, IEEE TNNLS, IEEE TCSVT, IEEE TAI, and Neural Networks.