Introduction to ClassGPT
ClassGPT is a tool that turns lecture slides into interactive content, inspired by ChatGPT. It is built with Streamlit, LlamaIndex, and LangChain, together with OpenAI's ChatGPT API, so that users can explore and question lecture materials directly.
Project Overview
Inspired by the concept of AthensGPT, ClassGPT aims to enhance educational materials by converting lecture slides into a conversational interface. This allows users to interact with, question, and delve deeper into lecture content through the use of advanced AI models.
How ClassGPT Works
ClassGPT operates through a systematic process that begins with the parsing of PDF files. Here's a step-by-step breakdown of how it works:
- PDF Parsing: Using the pypdf library, ClassGPT extracts text content from PDF lecture slides.
- Index Construction: Using LlamaIndex's GPTSimpleVectorIndex, the extracted text is processed to create embeddings with the text-embedding-ada-002 model. This step is essential for enabling the efficient search and retrieval of relevant information.
- Storage: The system stores indexes and files on Amazon S3, a cloud storage service, ensuring easy access and management.
- Querying: Users interact with ClassGPT by querying the index, which leverages the latest gpt-3.5-turbo model to provide accurate and contextually relevant answers.
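The parse, index, and query steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not ClassGPT's actual code: `extract_pdf_text` uses pypdf's `PdfReader`, while `toy_embed` and `ToyVectorIndex` are hypothetical stand-ins that replace text-embedding-ada-002 and GPTSimpleVectorIndex with a bag-of-words vector so the example runs without an API key.

```python
import math
from collections import Counter


def extract_pdf_text(path):
    """Return the text of every page in a PDF, joined by newlines."""
    # Imported lazily so the rest of the sketch runs without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # extract_text() may yield nothing for image-only slides.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorIndex:
    """Hypothetical in-memory index holding (chunk, vector) pairs."""

    def __init__(self, chunks):
        self.entries = [(chunk, toy_embed(chunk)) for chunk in chunks]

    def query(self, question, top_k=1):
        """Return the top_k chunks most similar to the question."""
        q = toy_embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:top_k]]


# Two made-up slide chunks standing in for parsed lecture text.
slides = [
    "Gradient descent updates weights using the loss gradient",
    "A binary search tree keeps keys in sorted order",
]
index = ToyVectorIndex(slides)
best = index.query("how does gradient descent update weights")
```

In the real pipeline the retrieved chunks would presumably be handed to gpt-3.5-turbo as context for the answer; here `best` simply holds the most similar slide chunk.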
Usage Instructions
To get started with ClassGPT, the following steps are necessary:
Configuration and Secrets
- Set up AWS by following the quickstart guide.
- Create an S3 bucket with a custom name and modify the code to reference this bucket name.
- Update the .env file with your OpenAI credentials.
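As a rough illustration of how settings in a .env-style file can be read and validated at startup, here is a minimal standard-library sketch. The helper names `load_env_file` and `require` and the key name `OPENAI_API_KEY` are assumptions made for this example; ClassGPT may load its configuration differently.

```python
import os
import tempfile


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file into a dict."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blanks, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop optional surrounding quotes from the value.
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require(values, key):
    """Return a setting, failing early if it is missing or empty."""
    if not values.get(key):
        raise KeyError(f"missing required setting: {key}")
    return values[key]


# Demo with a throwaway file; the key name and value are placeholders.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write('# example only\nOPENAI_API_KEY="sk-placeholder"\n')
settings = load_env_file(tmp.name)
os.remove(tmp.name)

api_key = require(settings, "OPENAI_API_KEY")
```

Failing early on a missing key gives a clearer error than a rejected API call later in the session.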
Running Locally
- Set Up a Python Environment:
    conda create -n classgpt python=3.9
    conda activate classgpt
- Install the Necessary Dependencies:
    pip install -r requirements.txt
- Launch the Streamlit Application:
    cd app/
    streamlit run app/01_❓_Ask.py
Using Docker
Alternatively, you can deploy ClassGPT using Docker:
docker compose up
Access the application by navigating to http://localhost:8501/ in your web browser.
Future Enhancements
ClassGPT is an evolving project with plans for additional features including:
- Implementing a local mode that does not require S3 for storage.
- Deploying the app to Streamlit Cloud with enhanced settings for customization.
- Supporting multiple file queries and composing indices from different lectures.
FAQs
Tokens and Embeddings
Tokens are the units of text, such as words or pieces of words, that the AI model processes. Embeddings are numerical vectors that represent text in a way that captures its semantic meaning, which makes it possible to measure how closely related two pieces of text are. Usage of both the text-embedding-ada-002 and gpt-3.5-turbo models is billed per 1,000 tokens processed.
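To make the per-1,000-token billing concrete, the arithmetic can be sketched as below. The four-characters-per-token figure is only a common rule of thumb for English text, and `EXAMPLE_PRICE_PER_1K` is a placeholder rather than a real price; check OpenAI's pricing page for current rates.

```python
import math


def rough_token_count(text):
    """Approximate token count, assuming about 4 characters per token."""
    return math.ceil(len(text) / 4)


def estimate_cost(token_count, price_per_1k_tokens):
    """Cost of processing token_count tokens at a per-1,000-token rate."""
    return token_count / 1000 * price_per_1k_tokens


# Placeholder rate for illustration only, NOT an actual OpenAI price.
EXAMPLE_PRICE_PER_1K = 0.01

# Pretend a slide deck yields 8,000 characters of extracted text.
slide_text = "x" * 8000
tokens = rough_token_count(slide_text)
cost = estimate_cost(tokens, EXAMPLE_PRICE_PER_1K)
```

An exact count would require the model's own tokenizer, so this estimate is best treated as a budgeting aid rather than a billing figure.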
Conclusion
ClassGPT converts static lecture slides into interactive, queryable content. By combining PDF parsing, vector indexing, and a chat model behind a Streamlit interface, it aims to be a valuable tool for educators and students alike.