GPTEval3D: A Comprehensive Overview
GPTEval3D is a tool for evaluating text-to-3D generative models, based on the research paper "GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation". It provides a reliable, human-aligned evaluation metric for models that generate 3D shapes from text prompts.
Latest Updates
The creators of GPTEval3D have released 110 curated image prompts, one paired with each text prompt. The image backgrounds were removed with tools such as rembg and Clipdrop. The image gallery can be accessed via this Google Drive link.
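For reference, here is a minimal sketch of the rembg step (the filenames below are hypothetical; the released images have already been processed this way):

# Remove an image background with rembg (hypothetical filenames).
from rembg import remove
from PIL import Image

image = Image.open("prompt_image.png")
result = remove(image)  # returns an RGBA image with the background removed
result.save("prompt_image_nobg.png")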
Setting Up the Environment
The primary dependencies for running GPTEval3D are the OpenAI API and PyTorch. PyTorch installation instructions depend on your environment. The OpenAI client and the other required packages can be installed as follows:
# Install OpenAI API
pip install --upgrade openai
# Additional packages
pip install --upgrade tqdm numpy Pillow gdown
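Before running an evaluation, a quick sanity check can confirm that the installed client and your API key work. This sketch assumes the openai v1 client interface installed by the command above; the environment-variable name is the conventional one:

# Sanity-check the OpenAI client and API key (openai>=1.0 interface).
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])  # or pass your key directly
print(len(client.models.list().data), "models visible to this key")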
Evaluating a Text-to-3D Model
Step 1: Data Acquisition
The initial step is downloading the tournament data. For details on the data formats, consult this document. The tournament dataset covers 13 methods and 110 prompts, with 120 RGB and 120 surface-normal renderings per shape. The data can be downloaded from this Google Drive link.
cd data/tournament-v0
gdown "https://drive.google.com/uc?id=1pYmSRu_oMy_v6f7ngnkFER6PNWmJAe52"
unzip methods
Step 2: Data Preparation
Within the data/tournament-v0 folder, locate the prompts.json file. For each prompt it contains, use your text-to-3D generative model to generate one or more corresponding 3D shapes. Render 120 views of each shape at a resolution of 512x512, following the camera poses used in the threestudio codebase. In addition to the RGB renderings, produce matching surface-normal renderings. Organize these outputs according to the folder structure below (a layout-check sketch follows the structure):
- data/<your_method_name>/
    - <prompt-id-1>/          # one folder per prompt id in prompts.json
        - <seed-1>/
            rgb_0.png
            ...
            rgb_119.png
            normal_0.png
            ...
            normal_119.png
    - ...
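As mentioned above, it can help to verify the folder layout before spending API budget on an evaluation. Below is a minimal check; the 120-view count and 512x512 resolution come from the instructions above, while the placeholder paths and the assumption that prompt ids can be iterated directly from prompts.json are hypothetical and may need adjusting to the actual schema:

# Minimal layout check for a method folder (sketch; paths are placeholders).
import json
from pathlib import Path
from PIL import Image

method_dir = Path("data/<your_method_name>")  # replace with your method folder
prompts = json.loads((Path("data/tournament-v0") / "prompts.json").read_text())

for prompt_id in prompts:  # assumes prompt ids are the top-level entries; adjust to the real schema
    for seed_dir in sorted((method_dir / str(prompt_id)).iterdir()):
        for kind in ("rgb", "normal"):
            for i in range(120):  # 120 views per shape
                path = seed_dir / f"{kind}_{i}.png"
                assert path.exists(), f"missing {path}"
                assert Image.open(path).size == (512, 512), f"wrong resolution: {path}"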
Step 3: Executing the Evaluation
With the data formatted correctly, run the evaluation with the following command. It yields an Elo score that places your method among the methods already in the tournament:
# --eval new_method : evaluate a new method against the existing tournament
# -t : path to the tournament data
# -m : path to your method's renderings
# -o : (optional) output directory
python gpt_eval_alpha.py \
    --apikey <your_openai_api_key> \
    --eval new_method \
    -t data/tournament-v0 \
    -m data/<your_method_name> \
    -o results/<your_method_name>
Tournament Score Computation
Step 1: Data Organization
Ensure that the outputs of a set of text-to-3D models are structured as follows:
<root>
    config.json
    prompts.json
    methods/
        <method-name-1>/
            <prompt-id-1>/
                <seed-1>/
                    rgb_0.png ...
                    normal_0.png ...
                ...
                <seed-k>/
                    ...
            <prompt-id-m>/
            ...
        <method-name-n>/
For details on the contents of config.json and prompts.json, refer to this link.
Step 2: Running the Evaluation
To evaluate within a tournament context, the following command is used:
# --eval tournament : compute tournament scores for all methods in the data folder
# -t : path to the tournament data
# -b : budget (number of GPT-4V comparison requests)
# -o : (optional) output directory
python gpt_eval_alpha.py \
    --apikey <your_openai_api_key> \
    --eval tournament \
    -t <path-to-tournament-data> \
    -b 200 \
    -o results/<tournament-name>
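Both modes rank methods by fitting Elo scores to GPT-4V's pairwise comparison results, as described in the paper. Purely as an illustration of the idea (the repository's actual fitting procedure may differ), a standard sequential Elo update looks like this:

# Illustrative Elo update from one pairwise comparison (not the repo's exact procedure).
def elo_update(rating_a, rating_b, score_a, k=32.0):
    # score_a is 1.0 if A wins, 0.0 if B wins, 0.5 for a tie.
    expected_a = 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Example: method A (rated 1000) beats method B (rated 1100) in one comparison.
a, b = elo_update(1000.0, 1100.0, score_a=1.0)
print(round(a), round(b))  # A gains rating, B loses the same amount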
Upcoming Features
Stay tuned for additional visualization tools and utility features, alongside a Text-to-3D leaderboard for tracking performance.
Acknowledgements
The success of GPTEval3D is underpinned by the contributions of several noteworthy projects, including GPT-4V, threestudio, mvdream, and others. Their groundwork and code have greatly enhanced this project's capability and reliability.