Private Detector Project Introduction
Overview
Bumble has released the Private Detector as an open-source project. At its core is an image classification model trained to recognize lewd images. By publishing the model, Bumble lets the wider community use, analyze, and improve the technology for their own needs. The model can be downloaded from the repository in several forms, including a pretrained SavedModel and a frozen model.
Model Details
The Private Detector model is built on the EfficientNet-v2 architecture and was trained on Bumble's extensive internal dataset of inappropriate images. Bumble's whitepapers on the subject document the model's purpose, construction, and capabilities in more detail.
Running Inference
Running inference with the Private Detector takes only a few steps. The model is shared as a SavedModel, which supports multiple deployment methods, but the steps below are enough to get started even for those less familiar with Python or TensorFlow. First, install Python and Conda on your computer. Then use the provided environment.yaml file to create an environment with the necessary packages:
conda env create -f environment.yaml
conda activate private_detector
Once your environment is set up, run the inference script, replacing the sample image paths with the paths to your own images:
python3 inference.py \
    --model saved_model/ \
    --image_paths \
    Yes_samples/1.jpg \
    Yes_samples/2.jpg \
    Yes_samples/3.jpg \
    Yes_samples/4.jpg \
    Yes_samples/5.jpg \
    No_samples/1.jpg \
    No_samples/2.jpg \
    No_samples/3.jpg \
    No_samples/4.jpg \
    No_samples/5.jpg
The script prints a probability score for each image, indicating how likely the image is to be classified as lewd.
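When a folder holds many images, listing each path by hand gets tedious. The following is a minimal sketch, not part of the repository, that assembles the same inference.py command from one or more directories. The helper name build_inference_command and the set of accepted file extensions are assumptions made for illustration; only the --model and --image_paths flags come from the command shown above.

```python
from pathlib import Path

# Assumption: these are the image formats you want to pass to inference.py.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_inference_command(model_dir, image_dirs, script="inference.py"):
    """Return the argument list for inference.py over every image in image_dirs.

    Hypothetical helper: it only builds the command, it does not run the model.
    """
    image_paths = []
    for directory in image_dirs:
        # Sort so the order of the scores in the output is predictable.
        for path in sorted(Path(directory).iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                image_paths.append(str(path))
    if not image_paths:
        raise ValueError("no images found in the given directories")
    return ["python3", script, "--model", str(model_dir),
            "--image_paths", *image_paths]


# Example use (requires the repository and the conda environment):
#   import subprocess
#   cmd = build_inference_command("saved_model/", ["Yes_samples", "No_samples"])
#   subprocess.run(cmd, check=True)
```

Returning a list rather than a single string lets the command be passed straight to subprocess.run without shell quoting concerns.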
Deployment
For deployment, Bumble provides examples that use TensorFlow Serving. Serving the model this way makes it easier to integrate into existing systems across a variety of environments and platforms.
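As a rough illustration of what a client of such a deployment might look like, the sketch below builds a request for TensorFlow Serving's REST predict endpoint. The URL pattern and the "instances" and "b64" request fields follow the TensorFlow Serving REST API. Everything else is an assumption: the model name private_detector, the host and port, and in particular the idea that the exported signature accepts encoded image bytes. Check the serving examples in the repository for the input the model actually expects.

```python
import base64
import json
import urllib.request


def build_predict_request(image_bytes_list, host="localhost", port=8501,
                          model_name="private_detector"):
    """Build (but do not send) a TensorFlow Serving REST predict request.

    Hypothetical helper. Assumes the served model takes one binary string
    per instance; adjust the payload if the signature expects tensors.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # TensorFlow Serving expects binary values wrapped as {"b64": "<base64>"}.
    instances = [
        {"b64": base64.b64encode(data).decode("ascii")}
        for data in image_bytes_list
    ]
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example use against a running server:
#   with open("Yes_samples/1.jpg", "rb") as f:
#       request = build_predict_request([f.read()])
#   with urllib.request.urlopen(request) as response:
#       print(json.load(response))
```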
Enhancing the Model
Users who want to adapt the model to their own datasets can train the Private Detector further, starting from the checkpoint provided in saved_checkpoint/. Begin by creating JSON files that organize your image data into classes, giving each class a path and a numeric label. Here's an example structure:
{
    "Yes": {
        "path": "/home/sofarrell/private_detector/Yes.txt",
        "label": 0
    },
    "No": {
        "path": "/home/sofarrell/private_detector/No.txt",
        "label": 1
    }
}
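A class file like the one above can be generated rather than written by hand. The sketch below is one possible way to do it, under stated assumptions: each .txt file lists one image path per line (the example only shows that the "path" entries point to .txt files), images are .jpg files in one folder per class, and the helper name write_class_json and the output file naming are hypothetical.

```python
import json
from pathlib import Path


def write_class_json(class_dirs, labels, out_dir, name):
    """Write one image-list .txt per class plus a JSON file that indexes them.

    Hypothetical helper. Assumes one image path per line in each list file.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    classes = {}
    for class_name, image_dir in class_dirs.items():
        list_file = out_dir / f"{name}_{class_name}.txt"
        # Assumption: images are .jpg files directly inside image_dir.
        images = sorted(str(p.resolve()) for p in Path(image_dir).glob("*.jpg"))
        list_file.write_text("".join(f"{image}\n" for image in images))
        classes[class_name] = {
            "path": str(list_file.resolve()),
            "label": labels[class_name],
        }
    json_path = out_dir / f"{name}_classes.json"
    json_path.write_text(json.dumps(classes, indent=4))
    return json_path


# Example use, keeping the labels from the structure above:
#   write_class_json({"Yes": "Yes_samples", "No": "No_samples"},
#                    {"Yes": 0, "No": 1}, "lists", "train")
#   # writes lists/train_classes.json
```

Calling the helper once with name "train" and once with name "eval" on separate image folders would produce the two JSON files that a training run needs.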
To initiate retraining, set up your environment and execute the retraining command:
conda env create -f environment.yaml
conda activate private_detector
python3 ./train.py \
    --train_json /home/sofarrell/private_detector/train_classes.json \
    --eval_json /home/sofarrell/private_detector/eval_classes.json \
    --checkpoint_dir saved_checkpoint/ \
    --train_id retrained_private_detector
The training script exposes parameters for epoch count, batch size, learning rate, and more, so training can be tuned to specific needs.
This open-source initiative by Bumble helps with moderating inappropriate content and gives developers the means to contribute to a safer online environment.