redis-arXiv-search - AI-Powered Semantic Search for Scientific Documents Using Redis

Redis arXiv Search: A Comprehensive Introduction

Redis arXiv Search is a project that provides semantic search over academic papers, using Redis as a high-performance vector database. It showcases Redis's vector search capabilities for document retrieval, using a dataset from arXiv, a prominent repository for research papers.

Key Features and Components

Redis arXiv Search is structured as a Single Page Application (SPA) built from the following components.

Vector Database: At the heart of the application is Redis Stack, which stores the paper data and serves the vector search queries.
Backend and API: FastAPI supports the backend to handle requests, and RedisVL acts as the Python client for interfacing with the vector database. Pydantic ensures data is structured and validated accurately within this framework.
Frontend Development: Built with React and TypeScript, the frontend provides the user interface. Material UI and React-Bootstrap are used for UI components and styling.
Containerization and Deployment: Docker Compose simplifies development by managing dependencies and streamlining deployment. This ensures the application runs consistently across different environments.
Vector Embeddings: The project leverages embeddings generated using state-of-the-art models from HuggingFace, OpenAI, and Cohere. These embeddings capture the semantic essence of text, facilitating effective document searches.
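To make the idea of embedding-based search concrete, here is a minimal sketch in plain Python. It is not the project's code and does not call Redis or RedisVL: the paper titles, the three-dimensional vectors, and the helper names cosine_similarity and top_k are all made up for illustration. Real embeddings have hundreds or thousands of dimensions, and a vector database would normally answer this kind of query through an index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, papers, k=2):
    """Brute-force search: score every paper, return the k most similar."""
    scored = [(title, cosine_similarity(query, vec)) for title, vec in papers.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; real ones come from an embedding model.
PAPERS = {
    "Attention mechanisms for translation": [0.9, 0.1, 0.0],
    "Graph neural networks for chemistry": [0.1, 0.9, 0.2],
    "Transformers for language modeling": [0.8, 0.2, 0.1],
}

QUERY = [1.0, 0.0, 0.0]  # stands in for the embedded user query

results = top_k(QUERY, PAPERS, k=2)
for title, score in results:
    print(f"{score:.3f}  {title}")
```

The two papers whose vectors point in nearly the same direction as the query rank first, which is the sense in which embeddings let a search match on meaning rather than exact keywords.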

Project Structure

The project is organized into several directories tailored to handle distinct responsibilities.

Backend: This section incorporates logic for searching papers, managing database interactions, data serialization, and API testing.
Frontend: This segment hosts configurations, styles, and components that constitute the user interface, ensuring an engaging experience.
Data Management: The 'data' directory manages initial data loading to ensure the application is ready from the get-go.

How to Run the Redis arXiv Search Application

For those looking to operate this application, here’s a step-by-step guide:

Prepare Your Environment: Begin by installing Docker Desktop for environment consistency.
Get the Code: Clone the repository from GitHub to set up the codebase locally.
```
$ git clone https://github.com/RedisVentures/redis-arXiv-search.git
```
Configure Environment: Copy the .env.template file to .env and add API keys for OpenAI and Cohere as directed.
Redis Deployment Options: Choose between running Redis locally using Docker or leveraging Redis Cloud, which provides managed cloud services.
- For local deployment, utilize Docker:
```
$ docker compose -f docker-local-redis.yml up
```
- For cloud deployment, adjust your .env file with Redis Cloud credentials and run:
```
$ docker compose -f docker-cloud-redis.yml up
```
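Before starting the containers, it can help to confirm that the .env file actually contains the keys you meant to set. The following is a small sketch of such a check, not part of the project. The variable names OPENAI_API_KEY and COHERE_API_KEY are assumptions; use whatever names .env.template lists. The parser handles only simple KEY=VALUE lines.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Assumed key names; check .env.template for the real ones.
REQUIRED = ("OPENAI_API_KEY", "COHERE_API_KEY")

# Example contents with placeholder values, not real credentials.
SAMPLE = """
# API keys
OPENAI_API_KEY=sk-example
COHERE_API_KEY=
"""

print(missing_keys(parse_env(SAMPLE), REQUIRED))
```

To check a real file, pass its contents (for example, the text read from .env) to parse_env in place of SAMPLE. An empty list means every required key has a value.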

Customization and Development

Redis arXiv Search is designed to be flexible. Developers can run a local Redis instance, run the FastAPI backend locally with Poetry, and use React's development server to tweak the frontend.

Backend Customization: The FastAPI backend can be run locally, with Poetry managing its dependencies.
Frontend Development: Using npm, developers can serve the frontend and observe changes in real-time, enhancing the iterative development process.

Troubleshooting and Support

Occasionally, you may need to clear Docker's cached files. To do so, run docker system prune, then restart Docker Desktop.

The project is open source and maintained by Redis, and community involvement is encouraged. Issues and enhancement requests can be reported on GitHub.

Redis arXiv Search demonstrates the power of AI-driven document search using Redis’s vector capabilities, offering a platform to retrieve academic papers efficiently and effectively.