Chatbot Project Overview
The popularity of chatbots has surged recently, driven largely by ChatGPT and the broader industry shift toward GPT-style models. In response, this chatbot project will soon be updated with a GPT-based version. It provides a platform for building a Chinese chatbot from customizable datasets, and community participation is encouraged: engage, share insights, and contribute by starring or forking the repository.
Seq2Seq Model Example
The project currently features a Seq2Seq-based model. Training is still in progress (roughly 50% complete); the sample images illustrate the model's current capabilities.
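The repository's actual model code may differ, but as a rough illustration of the architecture, a minimal GRU-based encoder-decoder in TensorFlow/Keras might look like the sketch below. The vocabulary size, embedding width, and hidden-unit count are placeholder assumptions, not values from the project:

```python
import tensorflow as tf

VOCAB_SIZE = 20000   # placeholder vocabulary size
EMBED_DIM = 256      # placeholder embedding dimension
UNITS = 512          # placeholder hidden size

# Encoder: embed the input sentence and compress it into a single state vector.
encoder_inputs = tf.keras.Input(shape=(None,), name="encoder_tokens")
enc_emb = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)(encoder_inputs)
_, encoder_state = tf.keras.layers.GRU(UNITS, return_state=True)(enc_emb)

# Decoder: generate the reply token by token, initialized from the encoder state.
decoder_inputs = tf.keras.Input(shape=(None,), name="decoder_tokens")
dec_emb = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)(decoder_inputs)
dec_out, _ = tf.keras.layers.GRU(UNITS, return_sequences=True, return_state=True)(
    dec_emb, initial_state=encoder_state)
logits = tf.keras.layers.Dense(VOCAB_SIZE)(dec_out)

model = tf.keras.Model([encoder_inputs, decoder_inputs], logits)
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
```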
Roadmap for Project Development
V1.1: Scheduled for Update on 2024-09-30
- MindSpore Integration: This update will prioritize bringing the GPT model and Reinforcement Learning from Human Feedback (RLHF) into the MindSpore version of the project.
- Architectural Expansion: The overall project architecture will be split into Seq2Seq and GPT branches, so the project can continue evolving across multiple AI framework versions.
V1.2: Tentative Update by 2024-12-30
- Mini-GPT4 Features: This version aims to add multi-modal dialogue, combining text and images to make the chatbot's interactions richer.
- Enhanced Training Capabilities: Distributed cluster training and the RLHF features will be improved to support more scalable and efficient training.
How to Execute the Seq2Seq Version
For those interested in working with the Seq2Seq version, a basic execution guide is as follows:
- Data Preparation: After downloading the code and sample data (such as the "Xiaohuangji" dataset available on GitHub), place the data file in the `train_data` directory. Hyperparameters can be set in the `config/seq2seq.ini` file (see the config-loading sketch after this list).
- Execution Sequence: Run the data preprocessor (`data_utls.py`), then the main execution script (`execute.py`), and finally the visualization module (`app.py`, sketched below).
- Distributed Training: For large-scale distributed training, refer to Horovod's launch command: `horovodrun -np n -H host1_ip:port,host2_ip:port,...,hostn_ip:port python3 execute.py` (see the Horovod sketch below).
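The exact contents of `config/seq2seq.ini` are project-specific; the section and key names below are hypothetical. A sketch of loading such a file with Python's standard `configparser` module:

```python
import configparser

config = configparser.ConfigParser()
config.read("config/seq2seq.ini")

# Section and key names here are hypothetical; check the shipped
# seq2seq.ini for the actual hyperparameter names.
batch_size = config.getint("train", "batch_size", fallback=64)
learning_rate = config.getfloat("train", "learning_rate", fallback=0.001)
epochs = config.getint("train", "epochs", fallback=10)
print(batch_size, learning_rate, epochs)
```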
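Since Flask appears in the environment list, `app.py` is presumably a Flask web front end. A minimal sketch of such a chat endpoint, with a hypothetical `predict()` helper standing in for the trained model and an arbitrary port number:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def predict(sentence):
    # Hypothetical stand-in for the trained Seq2Seq model's inference call.
    return "placeholder reply for: " + sentence

@app.route("/chat", methods=["POST"])
def chat():
    # Read the user's message from the JSON body and return the model's reply.
    user_text = request.get_json(force=True).get("message", "")
    return jsonify({"reply": predict(user_text)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8808)  # port chosen arbitrarily for this sketch
```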
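For `horovodrun` to distribute training, `execute.py` itself must perform the standard Horovod initialization. The calls below are Horovod's documented TensorFlow/Keras API; the surrounding model code is schematic and not taken from the project:

```python
import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()  # horovodrun launches one process per slot (-np n)

# Pin each process to a single GPU based on its local rank.
gpus = tf.config.experimental.list_physical_devices("GPU")
if gpus:
    tf.config.experimental.set_visible_devices(gpus[hvd.local_rank()], "GPU")

# Scale the learning rate by the number of workers and wrap the optimizer
# so gradients are averaged across all processes.
opt = hvd.DistributedOptimizer(tf.keras.optimizers.Adam(0.001 * hvd.size()))

callbacks = [
    # Broadcast initial variables from rank 0 so all workers start identically.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]

# model.compile(optimizer=opt, ...) and model.fit(..., callbacks=callbacks)
# would follow here, using the Seq2Seq model built in execute.py.
```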
Recommended Training Environment
To successfully train the model, the following environment setup is recommended:
- OS: Ubuntu 18.04
- Python: Version 3.6
For TensorFlow 2.x:
- TensorFlow: 2.6.0
- Flask: 0.11.1
- Horovod: 0.24 (for distributed training)
For PyTorch:
- Torch: 1.11.0
- Flask: 0.11.1
Community Engagement and Contact
This project thrives on open-source development and community interaction. For questions or collaboration, reach out via QQ: 934389697.