Introduction to Album AI
Album AI is a project that changes how we interact with photo albums. Rather than being just another photo management tool, it combines artificial intelligence with the traditional album so that users can converse with their image collections in natural language. It uses the gpt-4o-mini and Haiku vision models to extract metadata from images automatically, and retrieval-augmented generation (RAG) to support interactive conversations about them.
The Origin of Album AI
Album AI grew out of a challenge many photography enthusiasts face: managing terabytes of images. Traditional photo management software often demands significant effort to maintain, and the emergence of Haiku and gpt-4o-mini offered a promising alternative. The team, eager to simplify photo organization, developed the first version of Album AI in less than 24 hours.
Features of Album AI
Album AI provides the following features:
- Automatic Image Discovery: Seamlessly finds images stored in a PgSQL database.
- Metadata Generation: Uses gpt-4o-mini to generate descriptive metadata for images.
- Metadata Vectorization: Employs OpenAI's Embedding API for efficient metadata processing.
- APIs Available:
  - Search API: Delivers relevant images based on user queries.
  - Chat API: Leverages RAG to retrieve images and generate insightful responses.
- Simple Deployment: Facilitates one-click deployment on platforms that support Docker containers like Render.
- Open-Source Licensing: Offers a permissive license that allows for adaptation and integration, although commercial use requires permission.
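The metadata vectorization and search features above amount to embedding-based retrieval. The following is a minimal sketch of that idea in Python, not Album AI's actual implementation: the three-dimensional vectors are hand-written stand-ins for what an embedding API would return, and the names cosine_similarity and rank_images, along with the sample file names, are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def rank_images(query_vector, index, top_k=2):
    """Return the top_k file names whose metadata vectors best match the query."""
    scored = [(cosine_similarity(query_vector, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

# Toy index: file name -> embedding of its generated metadata (made-up values).
index = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "city.jpg": [0.0, 0.2, 0.9],
}

# Made-up embedding for a query such as "photos by the sea".
query_vector = [0.8, 0.2, 0.1]
results = rank_images(query_vector, index)
print(results)
```

In a RAG flow, the metadata of the top-ranked images would then be handed to the chat model as context for its answer.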
Getting Started with Album AI
To explore Album AI's capabilities, users are encouraged to start by running it locally. Here’s a quick guide on how to initiate the project:
- Clone the Project: Use Git to clone the Album AI repository and switch to the project directory.
  git clone git@github.com:gcui-art/album-ai.git
  cd album-ai
- Configure Environment Variables: Modify the .env.prod file to set up your local and proxy IP addresses along with your OpenAI and Anthropic API keys.
- Build and Run: Execute the following commands to build and run the project.
  chmod a+x ./build.sh
  ./build.sh
- Access the Demo: Launch your web browser and navigate to http://localhost:8080 to start using Album AI.
- Add Photos: Add images to the images directory, which will be automatically processed for metadata recognition and vectorization, enabling search and chat functionality.
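The final step relies on new files in the images directory being picked up automatically. How Album AI detects them is not described here, so the following is only a rough sketch of one way such discovery could work: the function name find_new_images, the extension list, and the already_indexed set are all assumptions made for illustration.

```python
from pathlib import Path
import tempfile

# Assumption: these extensions are illustrative, not Album AI's supported list.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}

def find_new_images(directory, already_indexed):
    """Return sorted image file names in `directory` not yet in `already_indexed`."""
    found = []
    for path in Path(directory).iterdir():
        # Match on the lowercased suffix so "b.PNG" counts as an image.
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            if path.name not in already_indexed:
                found.append(path.name)
    return sorted(found)

# Demo in a throwaway directory standing in for the images directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.jpg", "b.PNG", "notes.txt"):
        (Path(tmp) / name).touch()
    new_images = find_new_images(tmp, already_indexed={"a.jpg"})
    print(new_images)
```

Each newly found file would then go through metadata generation and vectorization before it becomes searchable.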
API Reference
Album AI supports two primary APIs:
- GET /api/v1/file/search: Allows image searching based on user input queries.
- POST /api/v1/chat: Facilitates interactive chat sessions with image data.
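As a rough illustration of calling these two endpoints from Python, the sketch below builds the requests without sending them. Only the paths, the HTTP methods, and the local address come from this article; the query parameter name (query) and the JSON body field (message) are assumptions, so check the project's repository for the real request format.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # local address from the setup steps

def build_search_request(query, base_url=BASE_URL):
    """Build a GET request for the search endpoint.

    Assumption: the query is passed as a `query` URL parameter.
    """
    params = urlencode({"query": query})
    return Request(f"{base_url}/api/v1/file/search?{params}", method="GET")

def build_chat_request(message, base_url=BASE_URL):
    """Build a POST request for the chat endpoint.

    Assumption: the body is JSON with a single `message` field.
    """
    body = json.dumps({"message": message}).encode("utf-8")
    return Request(
        f"{base_url}/api/v1/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

search_req = build_search_request("sunset at the beach")
chat_req = build_chat_request("Which photos show a sunset?")
print(search_req.get_method(), search_req.full_url)
print(chat_req.get_method(), chat_req.full_url)

# To send a request against a running instance:
#   from urllib.request import urlopen
#   with urlopen(search_req) as resp:
#       print(resp.read().decode())
```

Building the requests separately from sending them makes it easy to inspect the URL and body before pointing the code at a live instance.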
Contributing to Album AI
There are several ways to contribute to the development and improvement of Album AI:
- Contribute Code: Fork the project on GitHub and submit pull requests with enhancements or bug fixes.
- Report Issues: Share your feedback, suggestions, or any bugs you encounter by submitting issues on the GitHub repository.
- Spread the Word: Star the project on GitHub and recommend it to others to increase visibility.
License and Commercial Use
Album AI is released under the Apache 2.0 License, allowing for wide use and modification. For commercial applications, interested parties are requested to reach out to the developers for further discussion.
Engage with Album AI
For those intrigued by the potential of Album AI, visit the live demo or explore the project's GitHub repository to learn more and get involved.