content-chatbot - Convert Website Content into an AI-Powered Q&A Chatbot Using LangChain

Transforming Website Content into an Interactive Chatbot: A Project Overview

The "content-chatbot" project turns a website's content into a question-answering bot or an interactive chatbot. It is built on LangChain, which it uses to call the OpenAI API, and the answers it produces cite the documents they were drawn from.

Core Components

The project is organized into three scripts, each handling one stage of turning website data into a chatbot:

create_embeddings.py: This is the foundation script. It reads your website's sitemap.xml to find the pages to index and generates embeddings, which are vectors that capture the meaning of your website's content. These vectors are what later allow the bot to find the passages most relevant to a question.
ask_question.py: Armed with the embeddings created by the previous script, users can directly ask questions. This script not only provides relevant answers but also points to the specific URLs on your website that served as information sources.
start_chat_app.py: This script launches a chat interface where users can ask questions and receive answers in real time. The interface is designed to support follow-up questions and will notify users if the bot is unsure about the response. It can be tailored to focus on particular topics, such as machine learning or other technical subjects.

Setting Up and Using the Project

Creating Embeddings

The first step is to create embeddings. This requires an OpenAI API key, which you set as an environment variable in your terminal before running the script. You point the script at your website's sitemap and can filter which pages to include or exclude. The resulting embeddings are stored in a file named faiss_store.pkl.
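
The sitemap step described above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the function name extract_urls and its include and exclude parameters are assumptions made for this example, and the sample sitemap is invented.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml, include=None, exclude=None):
    """Return page URLs from a sitemap, optionally filtered by substring.

    `include` and `exclude` are hypothetical names used for illustration;
    the project's real options may be named and behave differently.
    """
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    if include:
        # Keep only URLs containing at least one of the include substrings.
        urls = [u for u in urls if any(s in u for s in include)]
    if exclude:
        # Drop URLs containing any of the exclude substrings.
        urls = [u for u in urls if not any(s in u for s in exclude)]
    return urls


# Invented sample data standing in for a downloaded sitemap.xml.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/blog/intro</loc></url>
  <url><loc>https://example.com/blog/tags/ml</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

selected = extract_urls(SAMPLE_SITEMAP, include=["/blog/"], exclude=["/tags/"])
print(selected)
```

In a real run, each selected page would then be downloaded, split into chunks, and embedded; those steps depend on the OpenAI API and are left out of this sketch.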

For full details on this process, see the project's blog post.

Answering Questions with Source Documentation

Once the embeddings are prepared, users can ask questions through the ask_question.py script. It finds the content whose embeddings are closest to the question and passes that content to GPT-3 to generate an answer. The script also lists the pages of your website it drew on, so you can see how the answer was derived.
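
The retrieval idea behind this step can be illustrated with a toy nearest-neighbor search. The sketch below is a stand-in, not the project's implementation: it uses plain cosine similarity over hand-written 3-dimensional vectors instead of a FAISS index over OpenAI embeddings, and the names top_matches and STORE, like the sample data, are invented for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vector, store, k=2):
    """Return the k stored chunks most similar to the query, best first."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item["vector"]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors for illustration; real embeddings have far more dimensions.
STORE = [
    {"vector": [0.9, 0.1, 0.0], "text": "How to install the tool",
     "source": "https://example.com/install"},
    {"vector": [0.1, 0.9, 0.0], "text": "Pricing and plans",
     "source": "https://example.com/pricing"},
    {"vector": [0.8, 0.2, 0.1], "text": "Troubleshooting installs",
     "source": "https://example.com/faq"},
]

matches = top_matches([1.0, 0.0, 0.0], STORE, k=2)
context = "\n".join(m["text"] for m in matches)  # text handed to the language model
sources = [m["source"] for m in matches]         # URLs reported alongside the answer
print(sources)
```

Keeping the source URL next to each stored chunk is what makes it possible to report where an answer came from.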

Launching the Chatbot

The final step is to run the start_chat_app.py script, which kicks off a conversational interface. Users can engage with the bot, asking questions and receiving answers derived from the website's content.
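
One common way to support follow-up questions is to pass recent conversation turns along with each new question. The sketch below shows that idea only; it is not taken from start_chat_app.py. The names build_prompt, chat_turn, and answer_fn are hypothetical, and answer_fn stands in for whatever retrieves content and calls the language model.

```python
def build_prompt(history, question, max_turns=3):
    """Combine recent turns with the new question so follow-ups keep their context.

    `history` is a list of (user_message, bot_message) pairs.
    """
    lines = []
    for user_msg, bot_msg in history[-max_turns:]:
        lines.append(f"User: {user_msg}")
        lines.append(f"Bot: {bot_msg}")
    lines.append(f"User: {question}")
    lines.append("Bot:")
    return "\n".join(lines)


def chat_turn(history, question, answer_fn):
    """Answer one question and record the exchange in `history`.

    `answer_fn` is a placeholder for the real retrieval and model call.
    """
    prompt = build_prompt(history, question)
    answer = answer_fn(prompt)
    history.append((question, answer))
    return answer


def stub_answer(prompt):
    """Stub used only for this example; a real app would query the model here."""
    return "I'm not sure."


history = []
chat_turn(history, "What does create_embeddings.py do?", stub_answer)
print(build_prompt(history, "Where are the embeddings stored?"))
```

Limiting the prompt to the last few turns keeps it short; the value of max_turns here is arbitrary.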

Enhancements for Zendesk Content

Building on the initial capabilities, the project incorporates a feature for creating embeddings for Zendesk content, improving chatbot responses through the use of the Zendesk API. This includes:

Integration with Zendesk API: Retrieving website contents.
Cleaning and Preparation: Extracting and organizing text for subsequent processing.
Embedding and Storage: Creating and storing text embeddings in a Faiss knowledge base.

This enhancement enables a more refined and efficient similarity search, offering more precise chatbot interactions.
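
The cleaning and preparation step can be sketched as follows, using only the standard library. This is an illustrative assumption about what the step involves, not the project's code: the helper names are invented, and the article fields title, body, and html_url are assumed to match what the Zendesk API returns for an article. The sample article is made up.

```python
from html.parser import HTMLParser

# Tags after which a space is inserted so adjacent blocks do not run together.
BLOCK_TAGS = {"p", "div", "li", "br", "h1", "h2", "h3", "tr"}


class _TextExtractor(HTMLParser):
    """Collects the text content of an HTML fragment, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Strip tags from an HTML fragment and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def prepare_article(article):
    """Turn one article dict into plain text plus the URL to cite as its source."""
    text = f'{article["title"]}. {html_to_text(article["body"])}'
    return {"text": text, "source": article["html_url"]}


# Invented sample; the field names are an assumption about the API response.
SAMPLE_ARTICLE = {
    "title": "Reset your password",
    "body": "<p>Open <b>Settings</b>, then choose <i>Security</i>.</p><p>Click Reset.</p>",
    "html_url": "https://example.zendesk.com/hc/articles/1",
}

print(prepare_article(SAMPLE_ARTICLE))
```

The prepared text would then be split and embedded in the same way as pages found through a sitemap.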

How to Start

To use the Zendesk content feature, users must first set their Zendesk API credentials. They can then run create_embeddings.py with the appropriate parameters to build a Faiss store from Zendesk data.
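
As a rough sketch of the credentials step, the code below reads credentials from the environment and builds an HTTP Basic authorization header in the email/token:api_token form that Zendesk documents for API token authentication. The environment variable names ZENDESK_EMAIL and ZENDESK_API_TOKEN are hypothetical; the project may expect different names or a different authentication method.

```python
import base64
import os

# Hypothetical variable names, chosen for this example only.
REQUIRED_VARS = ("ZENDESK_EMAIL", "ZENDESK_API_TOKEN")


def credentials_from_env(env=os.environ):
    """Return (email, api_token), failing early if either variable is unset."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["ZENDESK_EMAIL"], env["ZENDESK_API_TOKEN"]


def zendesk_auth_header(email, api_token):
    """Build a Basic auth header from 'email/token:api_token'."""
    raw = f"{email}/token:{api_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Example with placeholder values passed in directly instead of the real environment.
email, token = credentials_from_env(
    {"ZENDESK_EMAIL": "agent@example.com", "ZENDESK_API_TOKEN": "placeholder-token"}
)
headers = zendesk_auth_header(email, token)
```

Checking for missing variables before making any request gives a clearer error than a failed API call would.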

In summary, the "content-chatbot" project turns website content into source-backed conversations: it indexes your pages as embeddings, answers questions from them, and shows which pages each answer came from.