Project Introduction: Law-CN-AI
Overview
Law-CN-AI is an AI-powered legal assistant. It sources legal documents from LawRefBook/Laws and is built on the supabase-community/nextjs-openai-doc-search project template, which turns Markdown files into custom context for prompts sent to OpenAI's text completion API.
Fun Features
Beyond legal assistance, the project also links to several other interesting and useful tools:
- MagickPen: An intelligent writing assistant available at magickpen.com.
- TeachAnything: An AI encyclopedia accessible via teach-anything.com.
- BetterPrompt: A prompt generator found at better.avatarprompt.net.
- OpenL: An AI translation expert at openl.io.
Deployment
Deploying Law-CN-AI is streamlined with Vercel. The Supabase integration automatically sets up the required environment variables and configures your database schema; the only remaining step is to set your OPENAI_KEY, and you're ready to go.
A helpful tutorial on deployment and setup by GoJun can be accessed here.
Technical Details
Creating a customized ChatGPT using Law-CN-AI involves these steps:
- Preprocess the knowledge base (the Markdown files in your pages folder); a sketch of this step follows the list.
- Store the embedding vectors in PostgreSQL using the pgvector extension.
- Perform a vector similarity search to find relevant content.
- Inject the content into OpenAI GPT-3 for text completion, streaming the response back to the user.
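As a rough sketch of the first step, preprocessing could split each Markdown file into sections at its headings. The splitIntoSections helper below is hypothetical and only illustrates the idea, not the project's actual chunking logic.

```ts
// Hypothetical section splitter: chunk a Markdown document at heading levels 1-3.
export function splitIntoSections(markdown: string): { heading: string; content: string }[] {
  const sections: { heading: string; content: string }[] = []
  let current = { heading: '', content: '' }

  for (const line of markdown.split('\n')) {
    if (/^#{1,3}\s/.test(line)) {
      // A new heading starts a new section; keep the previous one if it has anything in it.
      if (current.heading || current.content.trim()) sections.push(current)
      current = { heading: line.replace(/^#{1,3}\s*/, ''), content: '' }
    } else {
      current.content += line + '\n'
    }
  }
  if (current.heading || current.content.trim()) sections.push(current)
  return sections
}
```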
Build Time
At build time, the system preprocesses pages and creates embeddings. These embeddings are stored in a database powered by pgvector. Here is the process visualized:
```mermaid
sequenceDiagram
    participant Vercel
    participant DB (pgvector)
    participant OpenAI (API)
    loop 1. Preprocess Knowledge Base
        Vercel->>Vercel: Split .mdx pages into sections
        loop 2. Create and Store Embeddings
            Vercel->>OpenAI (API): Create embeddings for page sections
            OpenAI (API)->>Vercel: Embedding vectors (1536)
            Vercel->>DB (pgvector): Store page section embeddings
        end
    end
```
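As a rough illustration of the embedding loop (step 2 above), the build script could embed each page section with the OpenAI API and store the vector through Supabase. This is a minimal sketch assuming the openai and @supabase/supabase-js packages; the page_section table and the environment variable names are assumptions, not necessarily the project's exact schema.

```ts
// Minimal sketch: embed one page section and store it (table/column names are assumptions).
import OpenAI from 'openai'
import { createClient } from '@supabase/supabase-js'

const openai = new OpenAI({ apiKey: process.env.OPENAI_KEY })
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
)

export async function storeSectionEmbedding(pagePath: string, content: string) {
  // Request an embedding vector (1536 dimensions for text-embedding-ada-002).
  const response = await openai.embeddings.create({
    model: 'text-embedding-ada-002',
    input: content.replace(/\n/g, ' '),
  })
  const embedding = response.data[0].embedding

  // Store the section text together with its vector; pgvector handles the vector column.
  const { error } = await supabase
    .from('page_section') // hypothetical table name
    .insert({ page_path: pagePath, content, embedding })
  if (error) throw error
}
```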
Additionally, a checksum is generated to ensure embeddings are updated only when files have changed.
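A minimal sketch of that change detection, assuming Node's built-in crypto module and a checksum stored alongside each page:

```ts
import { createHash } from 'crypto'

// Regenerate embeddings only when a page's content has actually changed.
export function hasPageChanged(content: string, storedChecksum: string | null): boolean {
  const checksum = createHash('sha256').update(content).digest('hex')
  return checksum !== storedChecksum
}
```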
Runtime
During runtime, when a user makes a query, the following sequence occurs:
```mermaid
sequenceDiagram
    participant Client
    participant Edge Function
    participant DB (pgvector)
    participant OpenAI (API)
    Client->>Edge Function: { query: lorem ipsum }
    critical 3. Perform Vector Similarity Search
        Edge Function->>OpenAI (API): Create embedding for query
        OpenAI (API)->>Edge Function: Embedding vector (1536)
        Edge Function->>DB (pgvector): Vector similarity search
        DB (pgvector)->>Edge Function: Relevant document content
    end
    critical 4. Inject Content into Prompt
        Edge Function->>OpenAI (API): Completion request with query + docs
        OpenAI (API)-->>Client: text/event-stream: Completion response
    end
```
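A condensed sketch of what the edge function might do for steps 3 and 4, assuming an openai client, a Supabase client, and a Postgres function named match_page_sections exposed through supabase.rpc. The function name, thresholds, and model choice here are assumptions for illustration, not the project's exact code.

```ts
import OpenAI from 'openai'
import { createClient } from '@supabase/supabase-js'

const openai = new OpenAI({ apiKey: process.env.OPENAI_KEY })
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
)

export async function answerQuery(query: string) {
  // 3. Embed the user query, then run a pgvector similarity search in Postgres.
  const embeddingResponse = await openai.embeddings.create({
    model: 'text-embedding-ada-002',
    input: query.replace(/\n/g, ' '),
  })
  const embedding = embeddingResponse.data[0].embedding

  // "match_page_sections" is an assumed Postgres function wrapping the pgvector search.
  const { data: sections, error } = await supabase.rpc('match_page_sections', {
    embedding,
    match_threshold: 0.78,
    match_count: 10,
  })
  if (error) throw error

  // 4. Inject the matched content into the prompt and stream the completion back.
  const context = sections.map((s: { content: string }) => s.content).join('\n---\n')
  const prompt = `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${query}\nAnswer:`

  return openai.completions.create({
    model: 'gpt-3.5-turbo-instruct', // model choice is an assumption; the docs only say "GPT-3 text completion"
    prompt,
    max_tokens: 512,
    stream: true,
  })
}
```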
These flows are handled by SearchDialog on the client side and by the vector-search edge function.
Local Development
To set up locally:
- Copy the environment file: cp .env.example .env
- Set OPENAI_KEY in the new .env file.
- Launch Supabase with Docker: npx supabase start
- Start the Next.js application: pnpm dev
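For reference, the resulting .env usually ends up looking roughly like the sketch below. The variable names are assumptions based on the upstream template, and the Supabase values are printed by npx supabase start.

```
# Sketch of a local .env; variable names are assumptions from the upstream template.
OPENAI_KEY=...                     # your OpenAI API key
NEXT_PUBLIC_SUPABASE_URL=http://localhost:54321
NEXT_PUBLIC_SUPABASE_ANON_KEY=...  # printed by `npx supabase start`
SUPABASE_SERVICE_ROLE_KEY=...      # printed by `npx supabase start`
```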
With this setup, developers can run Law-CN-AI locally and explore its capabilities as an AI-powered legal assistant.