MPP-LLaVA
This project supports multimodal dialogue over both images and videos. It builds on QwenLM to handle multi-round conversations, and it uses pipeline parallelism and model parallelism to split the model across devices. Training and inference on multiple GPUs are implemented with DeepSpeed, and the project provides open-source pre-trained and SFT (supervised fine-tuning) weights.
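
To make the idea of pipeline parallelism concrete, here is a minimal sketch in plain Python of how a stack of layers might be divided into contiguous stages, one per GPU. It is an illustration of the general technique only, not the project's code: the function name `partition_layers` and the layer and stage counts are hypothetical, and it does not reflect how DeepSpeed or this project actually assigns layers.

```python
from typing import List


def partition_layers(num_layers: int, num_stages: int) -> List[range]:
    """Split layer indices 0..num_layers-1 into contiguous, near-equal stages.

    Hypothetical helper for illustration; each returned range is the set of
    layers that one pipeline stage (for example, one GPU) would hold.
    """
    if num_stages < 1 or num_layers < num_stages:
        raise ValueError("need at least one layer per stage")

    # Every stage gets `base` layers; the first `extra` stages get one more.
    base, extra = divmod(num_layers, num_stages)
    stages: List[range] = []
    start = 0
    for stage in range(num_stages):
        size = base + (1 if stage < extra else 0)
        stages.append(range(start, start + size))
        start += size
    return stages


if __name__ == "__main__":
    # Assumed example sizes: 32 layers spread over 3 stages.
    for stage_id, layers in enumerate(partition_layers(32, 3)):
        print(f"stage {stage_id}: layers {layers.start}..{layers.stop - 1}")
```

In a real pipeline, each stage would run its layers on a micro-batch and pass the activations to the next stage, so that different GPUs work on different micro-batches at the same time.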