Viewing deep neural networks as manifold mappers, the pretrain-then-fine-tune paradigm can be interpreted as a two-stage process: pretraining establishes a broad knowledge base, and fine-tuning adjusts the model parameters, activating specific neural pathways to align with the target manifold. Although prior fine-tuning approaches demonstrate success, their rigid parameter space limits their ability to dynamically activate appropriate neural pathways, rendering them ill-equipped to adapt flexibly to diverse and evolving data distributions.
In light of this view, we propose a novel approach, Mixture of Expert Prompt Tuning (MEPT), as an effective and efficient manifold-mapping framework. MEPT leverages the Mixture of Experts architecture by integrating multiple prompt experts to adaptively learn diverse and non-stationary data distributions.
Empirical evaluations demonstrate that MEPT outperforms several state-of-the-art parameter-efficient baselines on SuperGLUE, achieving notable improvements in mean accuracy (e.g., 1.94%) while significantly reducing the number of activated prompts by 79.25%. The effectiveness of MEPT is further supported by theoretical insights from manifold learning and validated through neural activation pathway visualizations.
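To illustrate how a mixture-of-experts prompt layer of this kind might look in code, here is a minimal PyTorch sketch. It assumes a top-k routed design in which a lightweight router scores a pool of learnable prompt experts per input and prepends a weighted mix of the selected experts' prompt tokens to the sequence; names such as `MoEPromptLayer`, `num_experts`, `prompt_len`, and `top_k` are illustrative assumptions, not the released implementation.

```python
# Illustrative sketch of a mixture-of-experts soft-prompt layer.
# Module/parameter names here are assumptions for demonstration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEPromptLayer(nn.Module):
    def __init__(self, hidden_dim: int, num_experts: int = 8,
                 prompt_len: int = 10, top_k: int = 2):
        super().__init__()
        # Each expert owns its own pool of learnable soft-prompt tokens.
        self.prompt_experts = nn.Parameter(
            torch.randn(num_experts, prompt_len, hidden_dim) * 0.02)
        # Lightweight router scores experts from a pooled input representation.
        self.router = nn.Linear(hidden_dim, num_experts)
        self.top_k = top_k

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_dim)
        pooled = hidden_states.mean(dim=1)                   # (batch, hidden_dim)
        logits = self.router(pooled)                         # (batch, num_experts)
        top_vals, top_idx = logits.topk(self.top_k, dim=-1)  # select top-k experts
        weights = F.softmax(top_vals, dim=-1)                # (batch, top_k)
        # Only the top-k prompt experts are activated per example; their
        # prompt tokens are mixed according to the router weights.
        selected = self.prompt_experts[top_idx]              # (batch, top_k, prompt_len, hidden_dim)
        mixed = (weights[..., None, None] * selected).sum(dim=1)
        # Prepend the mixed prompt tokens to the input sequence.
        return torch.cat([mixed, hidden_states], dim=1)
```

In this sketch, sparse top-k routing is what keeps only a small fraction of the prompt experts active for any given input, which is the intuition behind reducing activated prompts while still covering diverse data distributions.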
P.S. The video and slides are on the way!!
@inproceedings{zeng2025mept,
  title={MEPT: Mixture of Expert Prompt Tuning as a Manifold Mapper},
  author={Zeng, Runjia and Sun, Guangyan and Wang, Qifan and Geng, Tong and Dianat, Sohail and Han, Xiaotian and Rao, Raghuveer and Zhang, Xueling and Han, Cheng and Huang, Lifu and others},
  booktitle={EMNLP},
  year={2025}
}