WebcamGPT-Vision: Bringing AI-powered Image Processing to Your Webcam
WebcamGPT-Vision is an innovative web application that seamlessly integrates with your webcam to analyze images using OpenAI's cutting-edge GPT-4 Vision API. By capturing real-time images through your webcam, the application sends them to the GPT-4 Vision API, which processes the images and provides descriptive results. This project is available in three versions: PHP, Node.js, and Python/Flask, catering to different user preferences and technical setups.
Key Features
- Webcam Integration: The application connects directly to your webcam for live image capture, allowing a swift and interactive experience.
- AI Image Processing: Once an image is captured, it's processed using the sophisticated capabilities of the GPT-4 Vision API, which interprets and describes the image content.
- User-friendly Interface: Designed with simplicity in mind, the interface ensures ease of use even for individuals without a technical background.
Prerequisites
To get started with WebcamGPT-Vision, ensure you meet the following requirements:
- A modern web browser is necessary to run the application.
- Depending on the version you choose:
- For PHP: A server with PHP support and cURL enabled is required.
- For Node.js: You need to have Node.js and npm installed on your system.
- For Python/Flask: Python and Flask need to be installed.
- An API key from OpenAI is essential to access the GPT-4 Vision API features.
Installation Guide
Installation varies based on the version chosen. Here’s a quick overview:
PHP Version
- Clone the project using Git:
git clone https://github.com/bdekraker/webcamgpt-vision.git
- Access the
php-version
directory. - Insert your OpenAI API key into the
process_image.php
file. - Upload the project to a PHP-enabled server.
- Open
index.html
in your browser to begin.
Node.js Version
- Clone the project:
git clone https://github.com/bdekraker/webcamgpt-vision.git
- Navigate to the
js-version
directory and run:npm install
- Add your API key to a
.env
file within the same directory:OPENAI_API_KEY=YOUR_DEFAULT_API_KEY
- Launch the server with:
node server.js
- Use your browser to visit
http://localhost:3000
.
Python/Flask Version
- Clone the repository:
git clone https://github.com/bdekraker/webcamgpt-vision.git
- Enter the
python-version
directory. - Install dependencies with:
pip install -r requirements.txt
- Set the API key as an environment variable:
export YOUR_DEFAULT_API_KEY='your_actual_api_key_here'
- Start the server using:
python process_image.py
- Open
http://localhost:5000
in your browser.
How to Use
For all versions, the procedure is straightforward:
- Make sure your webcam is connected and accessible by your browser.
- Navigate to
index.html
using your web browser. - Capture an image by clicking the "Capture" button.
- The application processes the image and provides a description beneath the webcam feed.
Contributing to the Project
Contributions are encouraged and valued. If you're interested in contributing:
- Fork the repository and initiate a new branch for your changes.
- Write clear and informative commit messages.
- Follow the existing code style for consistency.
- Submit a pull request with a detailed explanation of your enhancements or fixes.
Seeking Support
For any questions or feedback, open an issue in the repository, and the team will respond accordingly.
Acknowledgements
Gratitude goes to OpenAI for the GPT-4 Vision API, which this project builds upon to offer advanced AI image processing functionalities. Inspired by the evolving capabilities of image understanding via AI, this project strives to make advanced technology accessible to more users.
Contact and Disclaimer
For inquiries, reach out to the project maintainer, Benjamin De Kraker, via Twitter. Note that WebcamGPT-Vision is not officially affiliated with OpenAI, and users must adhere to OpenAI's guidelines when using the GPT-4 Vision API.