private-detector - Open-Source EfficientNet Classifier for Identifying Lewd Images

Private Detector Project Introduction

Overview

Bumble has released an open-source project named the Private Detector. The project centers on an image classification model built to recognize lewd images. By publishing the model, Bumble lets the wider community use, analyze, and improve the technology to suit their own needs. The model can be downloaded in more than one form, including a pretrained SavedModel and a Frozen Model, both available directly from the repository.

Model Details

The Private Detector model is built on the EfficientNet-v2 architecture and was trained on Bumble's internal dataset of inappropriate images. Further details about the model's purpose and construction are documented in Bumble's whitepapers on the subject.

Running Inference

Running inference with the Private Detector is straightforward. The model is shared as a SavedModel, which supports multiple deployment methods, and for those less familiar with Python or TensorFlow a few steps are enough to get started. First, install Python and Conda on your computer. Then use the provided environment.yaml file to create an environment with the necessary packages and activate it:

conda env create -f environment.yaml
conda activate private_detector

Once your environment is set up, run the inference script, replacing the sample image paths with the paths to your own images:

python3 inference.py \
    --model saved_model/ \
    --image_paths \
        Yes_samples/1.jpg \
        Yes_samples/2.jpg \
        Yes_samples/3.jpg \
        Yes_samples/4.jpg \
        Yes_samples/5.jpg \
        No_samples/1.jpg \
        No_samples/2.jpg \
        No_samples/3.jpg \
        No_samples/4.jpg \
        No_samples/5.jpg

The output shows a probability score for each image, indicating how likely the image is to be lewd.
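How a score becomes a moderation decision is left to the user. The following is a minimal sketch, assuming the scores have already been collected into a mapping from image path to probability. The function name `flag_images`, the example scores, and the 0.5 threshold are illustrative assumptions and are not part of the repository.

```python
from typing import Dict, List


def flag_images(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return the image paths whose lewd probability meets the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    # Sort so the result is stable regardless of dictionary order.
    return sorted(path for path, prob in scores.items() if prob >= threshold)


# Hypothetical scores; real values would come from running inference.py.
example_scores = {
    "Yes_samples/1.jpg": 0.93,
    "No_samples/1.jpg": 0.04,
    "No_samples/2.jpg": 0.51,
}

flagged = flag_images(example_scores, threshold=0.5)
print(flagged)
```

A lower threshold flags more images at the cost of more false positives, so the right value depends on how the flagged images are handled afterwards.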

Deployment

For deployment, Bumble provides examples using TensorFlow Serving, which makes it easier to integrate the model into existing systems across a variety of environments and platforms.
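As a rough sketch of what a client for such a deployment could look like, the snippet below builds a request for TensorFlow Serving's REST predict endpoint, which has the form `/v1/models/<name>:predict` and takes a JSON body with an `instances` list. The host, port 8501, model name `private_detector`, and the choice to send the image as a base64-encoded string are all assumptions. The actual input signature depends on how the SavedModel was exported, so check it before relying on this.

```python
import base64
import json
import urllib.request


def build_predict_request(
    image_bytes: bytes,
    host: str = "localhost",          # assumed host
    port: int = 8501,                 # assumed REST port
    model_name: str = "private_detector",  # assumed model name
) -> urllib.request.Request:
    """Build (but do not send) a TensorFlow Serving REST predict request."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # TensorFlow Serving's REST API represents binary values as {"b64": ...}.
    # Whether this model accepts encoded image bytes is an assumption.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    payload = {"instances": [{"b64": encoded}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(b"fake image bytes")
print(request.full_url)

# Sending the request needs a running server, for example:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the payload easy to inspect while the serving setup is still being worked out.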

Enhancing the Model

Users who want to adapt the model to their own datasets can train the Private Detector further, starting from the checkpoint provided in saved_checkpoint/. Begin by creating JSON files that organize your image data into classes. Each class entry gives the path to a text file listing that class's image paths, along with an integer label. Here's an example structure:

{
    "Yes": {
        "path": "/home/sofarrell/private_detector/Yes.txt",
        "label": 0
    },
    "No": {
        "path": "/home/sofarrell/private_detector/No.txt",
        "label": 1
    }
}
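Files with this structure can be generated rather than written by hand. The sketch below assumes that each `.txt` file holds one image path per line and that the images are `.jpg` files grouped into one directory per class. The helper `write_class_json` and the directory layout are hypothetical and are not part of the repository.

```python
import json
import tempfile
from pathlib import Path


def write_class_json(class_dirs, out_dir, json_name="train_classes.json"):
    """Write one path-list file per class plus a JSON file describing them.

    class_dirs maps a class name to (image directory, integer label).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    classes = {}
    for name, (image_dir, label) in class_dirs.items():
        images = sorted(str(p.resolve()) for p in Path(image_dir).glob("*.jpg"))
        list_file = out_dir / f"{name}.txt"
        # Assumption: one image path per line.
        list_file.write_text("".join(f"{path}\n" for path in images))
        classes[name] = {"path": str(list_file.resolve()), "label": label}
    json_path = out_dir / json_name
    json_path.write_text(json.dumps(classes, indent=4))
    return json_path


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for cls in ("Yes", "No"):
        (root / cls).mkdir()
        (root / cls / "1.jpg").touch()
    json_path = write_class_json(
        {"Yes": (root / "Yes", 0), "No": (root / "No", 1)},
        root / "lists",
    )
    config = json.loads(json_path.read_text())
    print(config["Yes"]["label"], config["No"]["label"])
```

Calling the helper once with `json_name="eval_classes.json"` and a separate set of directories would produce the evaluation file in the same way.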

To start retraining, set up your environment (if you have not already done so) and run the training script:

conda env create -f environment.yaml
conda activate private_detector

python3 ./train.py \
    --train_json /home/sofarrell/private_detector/train_classes.json \
    --eval_json /home/sofarrell/private_detector/eval_classes.json \
    --checkpoint_dir saved_checkpoint/ \
    --train_id retrained_private_detector

The training script also accepts parameters for tuning a run to specific needs, including epoch count, batch size, learning rate, and more.
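Because a retraining run can take a long time, it may be worth sanity-checking the class JSON files before starting one. The sketch below assumes the structure shown in the example above, where each class has a `path` string and an integer `label`. The helper `validate_classes` is hypothetical and is not part of the repository.

```python
import json
from pathlib import Path


def validate_classes(classes: dict, check_files: bool = True) -> list:
    """Return a list of problems found in a class-definition mapping."""
    problems = []
    seen_labels = {}  # label -> first class name that used it
    for name, entry in classes.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: entry is not an object")
            continue
        label = entry.get("label")
        path = entry.get("path")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(label, int) or isinstance(label, bool):
            problems.append(f"{name}: 'label' must be an integer")
        elif label in seen_labels:
            problems.append(
                f"{name}: label {label} already used by {seen_labels[label]}"
            )
        else:
            seen_labels[label] = name
        if not isinstance(path, str):
            problems.append(f"{name}: 'path' must be a string")
        elif check_files and not Path(path).is_file():
            problems.append(f"{name}: list file not found: {path}")
    return problems


# A deliberately broken example: both classes share the same label.
example = json.loads(
    '{"Yes": {"path": "Yes.txt", "label": 0},'
    ' "No": {"path": "No.txt", "label": 0}}'
)
issues = validate_classes(example, check_files=False)
print(issues)
```

In practice the mapping would be loaded from the files passed to `--train_json` and `--eval_json`, with `check_files` left at its default so that missing list files are reported too.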

This open-source initiative by Bumble helps with moderating inappropriate content and gives developers a starting point for contributing to a safer online environment.