# ModelScope
LangChain-ChatGLM-Webui
The LangChain-ChatGLM-Webui project provides a WebUI that combines LangChain with the ChatGLM-6B model series for applications grounded in local knowledge. It supports multiple text file formats, including txt, docx, md, and pdf, and offers a choice of large language models such as ChatGLM-6B and Belle alongside embedding models for retrieval. Aimed at practical AI deployment, the project can be tried online through HuggingFace, ModelScope, and AIStudio. It is compatible with Python 3.8.1+ and straightforward to deploy, with continuous updates and community engagement driving its development.
AdaSeq
AdaSeq is a library for developing high-performance sequence understanding models. Built on ModelScope, it supports tasks such as POS tagging, named entity recognition, and relation extraction. Its highlights include a collection of state-of-the-art models, ease of use, and extensibility. Recent top results in multilingual named entity recognition shared tasks underscore its credibility. Designed for both researchers and developers, AdaSeq is actively developed to keep its models at the state of the art.
llama3-chinese
This project fine-tunes Llama 3 on high-quality Chinese and English multi-turn SFT data using the DoRA and LoRA+ methods to improve its Chinese natural language processing capabilities. It provides downloadable models, instructions for merging LoRA weights into the base model, and support for deployment and inference. The focus is on flexible usage and robust performance in Chinese NLP, making it a useful resource for research and application development.
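As a rough illustration of the LoRA-merging step mentioned above, the sketch below uses the generic Hugging Face `transformers` and `peft` APIs rather than the project's own scripts; the model and adapter paths are placeholders, and the project's actual merge procedure may differ.

```python
# Generic LoRA-merge sketch using transformers + peft; paths are placeholders,
# not the project's actual model IDs, and its own merge script may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "path/to/llama3-base"        # placeholder: base Llama 3 weights
adapter_path = "path/to/lora-adapter"    # placeholder: trained LoRA adapter

base = AutoModelForCausalLM.from_pretrained(base_path, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_path)

# Fold the LoRA weights into the base model and save a standalone checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("llama3-chinese-merged")
AutoTokenizer.from_pretrained(base_path).save_pretrained("llama3-chinese-merged")
```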
Multi-LLM-Agent
α-UMi introduces an open-source approach to tool learning in which several small LLMs collaborate, matching or surpassing larger closed-source LLMs. The system decomposes the agent's capabilities into planner, caller, and summarizer roles, covering planning, tool interaction, and user response generation. With an adaptable prompt design, the multi-LLM agent is trained with a two-stage Global-to-Local Progressive Fine-tuning (GLPFT) strategy. α-UMi provides processed training data for easy adoption and reports strong performance in both static and real-time evaluations.
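The role decomposition can be pictured as a simple control loop; the sketch below is purely illustrative pseudocode of the planner → caller → summarizer flow and does not reproduce α-UMi's actual prompts, interfaces, or training setup.

```python
# Illustrative sketch of a planner/caller/summarizer decomposition.
# All function names and the control flow are hypothetical, not α-UMi's API.

def run_agent(user_query, planner, caller, summarizer, tools, max_steps=8):
    history = [("user", user_query)]
    for _ in range(max_steps):
        # The planner produces a rationale and decides whether to call a tool
        # or hand off to the summarizer.
        rationale, decision = planner(history)
        history.append(("planner", rationale))
        if decision == "summarize":
            # The summarizer composes the final user-facing answer.
            return summarizer(history)
        # The caller turns the rationale into a concrete tool invocation.
        tool_name, tool_args = caller(history, rationale)
        observation = tools[tool_name](**tool_args)
        history.append(("observation", observation))
    return summarizer(history)
```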
modelscope
ModelScope is a 'Model-as-a-Service' platform that simplifies the integration and deployment of machine learning models, including those in computer vision (CV), natural language processing (NLP), and multi-modality. It provides API abstractions for easy model inference, fine-tuning, and evaluation with minimal coding. The platform includes customization options for various model components and seamless interaction with Model-Hub and Dataset-Hub. ModelScope hosts a wide range of models, accessible online, and supports modular and distributed training strategies to enrich AI development.
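To illustrate the pipeline abstraction described above, here is a minimal inference sketch with the `modelscope` library; the Chinese word segmentation model ID is taken from the ModelScope quickstart and may change between releases.

```python
# Minimal ModelScope pipeline sketch: task name and model ID follow the
# official quickstart and may differ in newer releases.
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Build an inference pipeline for Chinese word segmentation.
word_segmentation = pipeline(
    Tasks.word_segmentation,
    model='damo/nlp_structbert_word-segmentation_chinese-base')

# Run inference on a single sentence; the result contains the segmented tokens.
print(word_segmentation('今天天气不错，适合出去游玩'))
```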
motionagent
MotionAgent is a deep learning application that converts user-generated scripts into videos, utilizing the open-source ModelScope model community. It features script creation with large language models like Qwen-7B-Chat, high-resolution video production from images, and personalized background music. Compatible with Python 3.8, torch 2.0.1, and CUDA 11.7 on Ubuntu 20.04, it requires 36GB GPU memory.
evalscope
EvalScope is a comprehensive framework for evaluating and benchmarking diverse AI models, including large language models and multimodal variants. It provides end-to-end evaluation capabilities, supports custom datasets through user-friendly interfaces, and integrates with the ms-swift training framework. Evaluation backends such as OpenCompass and VLMEvalKit are available for in-depth analysis and performance stress testing, producing detailed reports with visualization support.
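A minimal sketch of programmatic use is shown below; the `TaskConfig`/`run_task` entry points and the model and dataset names are assumptions based on the EvalScope README and may differ across versions, so consult the documentation for the exact API.

```python
# Hedged EvalScope sketch; entry points, model ID, and dataset names are
# assumptions from the project README and may change between versions.
from evalscope import TaskConfig, run_task

task_cfg = TaskConfig(
    model='Qwen/Qwen2.5-0.5B-Instruct',  # any model supported by the chosen backend
    datasets=['gsm8k', 'arc'],           # benchmark datasets to evaluate on
    limit=10,                            # evaluate a few samples as a smoke test
)

run_task(task_cfg=task_cfg)  # writes reports and metrics to the output directory
```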
ChatPLUG
ChatPLUG is a Chinese open-domain dialogue system project emphasizing knowledge integration and personalization. It enables customization of conversation style through bot profiles and role-play instructions, supports multi-tasking in NLP, and facilitates robust multi-turn conversations. Compatible with ModelScope, HuggingFace, and XDPX for deployment, it integrates external knowledge during inference for enhanced versatility.
AMchat
AMchat is built to tackle advanced mathematics problems, trained on a broad dataset of math problems and their solutions. Using InternLM2-Math-7B as its base model and fine-tuned with XTuner, it handles complex mathematical reasoning. Deployment is flexible, with options for Docker, OpenXLab, or local installation, and a Q8_0 quantized version is available to lower resource requirements and improve inference efficiency.
3D-Speaker
3D-Speaker is an open-source toolkit for single- and multi-modal speaker verification, speaker recognition, and speaker diarization. Pretrained models are released on ModelScope, and the large-scale 3D-Speaker speech corpus is available for research on speech representation. The toolkit includes training and inference recipes for datasets such as 3D-Speaker, VoxCeleb, and CN-Celeb, covering models including CAM++, ERes2Net, ERes2NetV2, and ECAPA-TDNN. Regular releases and comprehensive documentation make it a useful resource for researchers and developers in speech technology.
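Since the pretrained models are published on ModelScope, a verification call can be sketched with the standard pipeline API; the CAM++ model ID and the two-file input format below are assumptions based on the public model card and may differ from the current release.

```python
# Hedged sketch of speaker verification with a 3D-Speaker model on ModelScope.
# The model ID and input format are assumptions taken from the model card.
from modelscope.pipelines import pipeline

sv_pipeline = pipeline(
    task='speaker-verification',
    model='damo/speech_campplus_sv_zh-cn_16k-common')

# Compare two 16 kHz wav files; the result typically contains a similarity score.
result = sv_pipeline(['speaker_a_utt1.wav', 'speaker_a_utt2.wav'])
print(result)
```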
mPLUG-DocOwl
The mPLUG-DocOwl project from Alibaba offers multimodal large language models tailored for OCR-free document understanding. The suite includes components such as DocOwl2, TinyChart, and UReader, designed to improve processing of multi-page documents and charts, with a focus on high-resolution compression and unified structure learning for both scientific and general document analysis. Demos and resources on HuggingFace and ModelScope make the models straightforward to integrate into applications.
sd-webui-text2video
The sd-webui-text2video extension brings state-of-the-art text-to-video models such as ModelScope and VideoCrafter into AUTOMATIC1111's Stable Diffusion WebUI without requiring logins. It supports LoRA training and features such as in-painting and video looping, enabling efficient animation creation with modest VRAM usage. Recent updates add Torch2 optimizations for longer video output on constrained VRAM and introduce a WebAPI. Usage examples and fine-tuning options with leading models are documented for video synthesis work.
MimicBrush
MimicBrush is a zero-shot image editing project based on reference imitation, developed by researchers including Xi Chen and Yutong Feng. Unlike conventional approaches, it performs edits in a zero-shot manner guided by a reference image, without requiring per-task datasets. Comprehensive guides and downloadable checkpoints are available on HuggingFace and ModelScope, along with local and online Gradio demos. The technique enables texture transfer and other complex edits, pointing toward further advances in digital image processing.
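Since the checkpoints are distributed through HuggingFace and ModelScope, fetching them locally can be sketched with ModelScope's download helper; the repository ID and target directory below are placeholders, so check the project README for the actual names.

```python
# Hedged sketch for fetching MimicBrush checkpoints from ModelScope.
# The model repository ID and local directory are placeholders, not confirmed names.
from modelscope.hub.snapshot_download import snapshot_download

local_dir = snapshot_download(
    'xichenhku/MimicBrush',        # placeholder repo ID; see the project README
    cache_dir='./checkpoints')     # where the weights will be stored locally

print('Checkpoints downloaded to:', local_dir)
```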
modelscope-classroom
The modelscope-classroom repository provides detailed tutorials covering AI training, inference, deployment, and application development, including guides on dataset training, OpenAI-O1 inference, and more, as well as advanced resources on Modelscope-Agent, SD-AIGC, and diffusion technologies.
Feedback Email: [email protected]