Email: 1229323056@qq.com; yfcui@baai.ac.cn
Tel: +86 18800133341
I am a researcher at Beijing Academy of Artificial Intelligence (BAAI).
* Equal Contribution
Xinlong Wang*, Xiaosong Zhang*, Zhengxiong Luo*, Quan Sun*, Yufeng Cui*, Jinsheng Wang*, Fan Zhang*, Yueze Wang*, Zhen Li*, Qiying Yu, Yingli Zhao, Yulong Ao, Xuebin Min, Tao Li, Boya Wu, Bo Zhao, Bowen Zhang, Liangdong Wang, Guang Liu, Zheqi He, Xi Yang, Jingjing Liu, Yonghua Lin, Tiejun Huang, Zhongyuan Wang. Emu3: Next-token prediction is all you need [code] [project]
Haiwen Diao*, Yufeng Cui*, Xiaotong Li, Yueze Wang, Huchuan Lu, Xinlong Wang. EVE: Unveiling Encoder-Free Vision-Language Models. NeurIPS 2024. [code]
Quan Sun*, Yufeng Cui*, Xiaosong Zhang*, Fan Zhang*, Qiying Yu*, Zhengxiong Luo, Yueze Wang, Yongming Rao, Jingjing Liu, Tiejun Huang, Xinlong Wang. Emu2: Generative Multimodal Models are In-Context Learners. CVPR 2024. [code] [project] [demo]
Quan Sun*, Qiying Yu*, Yufeng Cui*, Fan Zhang*, Xiaosong Zhang*, Yueze Wang, Hongcheng Gao, Jingjing Liu, Tiejun Huang, Xinlong Wang. Emu: Generative Pretraining in Multimodality. ICLR 2024. [code]
Yufeng Cui*, Lichen Zhao*, Feng Liang*, Yangguang Li, Jing Shao. Democratizing contrastive language-image pre-training: A CLIP benchmark of data, model, and supervision. ICML Workshop. [code]
Yufeng Cui, Yimei Kang. Multi-modal gait recognition via effective spatial-temporal feature fusion. CVPR 2023.
Yufeng Cui, Yimei Kang. GaitTransformer: Multiple-temporal-scale transformer for cross-view gait recognition. ICME 2022.
Shuai Wang*, Yufeng Cui*, Yimei Kang. Learning Multiple Granularity Features for Unsupervised Person Re-Identification. ICME 2022.
Quan Sun*, Jinsheng Wang*, Qiying Yu*, Yufeng Cui, Fan Zhang, Xiaosong Zhang, Xinlong Wang. EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters. arXiv:2402.04252.
Qiying Yu*, Quan Sun*, Xiaosong Zhang, Yufeng Cui, Fan Zhang, Yue Cao, Xinlong Wang, Jingjing Liu. CapsFusion: Rethinking Image-text Data at Scale. CVPR 2024. [code&data]
Yangguang Li*, Feng Liang*, Lichen Zhao*, Yufeng Cui, Wanli Ouyang, Jing Shao, Fengwei Yu, Junjie Yan. Supervision exists everywhere: A data efficient contrastive language-image pre-training paradigm. ICLR 2022. [code]
Yangguang Li, Bin Huang, Zeren Chen, Yufeng Cui, Feng Liang, Mingzhu Shen, Fenggang Liu, Enze Xie, Lu Sheng, Wanli Ouyang, Jing Shao. Fast-BEV: A Fast and Strong Bird’s-Eye View Perception Baseline. TPAMI 2023. [code]