urban_seg - Beginner-Friendly Remote Sensing Segmentation Using Unicom

Introduction to the Urban_Seg Project

Urban_Seg is a semantic segmentation project for remote sensing images, aimed at beginners in the field. It builds on the pre-trained model Unicom, which was trained on 400 million images and which the project describes as efficient and strong on remote sensing segmentation tasks. According to the project, Urban_Seg achieves good results even when trained on only 4 remote sensing images.

Getting Started with Urban_Seg

To get started quickly, use train_one_gpu.py, a script of just 200 lines that runs training on a single GPU. Users with more hardware can use train_multi_gpus.py, which supports multi-GPU training as long as the distributed environment is configured correctly. Follow the project documentation for the setup details.

Installation

To install Urban_Seg, clone the repository, enter it, and install the dependencies:

git clone https://github.com/anxiangsir/urban_seg.git
cd urban_seg
pip install -r requirements.txt

Data and Pretrained Models

The dataset used in Urban_Seg comes from the CCF AI classification and recognition challenge and consists of five labeled satellite remote sensing images. It is distributed through a Baidu Cloud link and can be downloaded with the access credentials provided alongside that link.

The project's directory structure includes:

dataset
├── origin // 5 remote sensing images with labels
├── test   // 3 remote sensing images without labels, not used in this project
└── train  // initially empty, filled through data preprocessing
    ├── images
    └── labels
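
Before preprocessing or training, it can help to confirm that the layout above is in place. The following is a minimal sketch using only the Python standard library. The helper name `missing_dataset_dirs` is hypothetical and not part of Urban_Seg, and it assumes the dataset root is a folder named `dataset` in the current directory. Since `train` starts out empty, `train/images` and `train/labels` may be reported as missing until preprocessing has run.

```python
from pathlib import Path

# Sub-directories taken from the tree above.
# The helper itself is hypothetical and not part of the Urban_Seg code base.
EXPECTED_DIRS = ("origin", "test", "train/images", "train/labels")


def missing_dataset_dirs(root="dataset"):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

Returning the list of missing folders rather than raising an error lets the caller decide whether a missing `train` sub-folder is a problem or simply means preprocessing has not been run yet.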

The project also provides several pre-trained models, including FP16-ViT-B-32.pt and FP16-ViT-L-14.pt, for use in experiments.

Training on 1 GPU

The steps to train the model using a single GPU are as follows:

1. Download the dataset to the current directory.
2. Preprocess the data:

python preprocess.py

3. Train the model:

python train_one_gpu.py

Training on 8 GPUs

With multiple GPUs available, the steps are the same except that training is launched through torchrun with the multi-GPU script:

1. Download the dataset to the current directory.
2. Preprocess the data:

python preprocess.py

3. Train the model across 8 GPUs:

torchrun --nproc_per_node 8 train_multi_gpus.py
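
The single-GPU and multi-GPU launch commands differ only in the launcher and the script name. As an illustration, the sketch below picks the command from a GPU count. The function `build_train_command` is hypothetical and not part of Urban_Seg. It also assumes that train_multi_gpus.py works with GPU counts other than 8, which this README does not confirm.

```python
import shlex


def build_train_command(num_gpus):
    """Return the training command from this README for the given GPU count.

    Hypothetical helper: only the 1-GPU and 8-GPU commands are documented;
    other counts are an assumption about train_multi_gpus.py.
    """
    if num_gpus < 1:
        raise ValueError("at least one GPU is required")
    if num_gpus == 1:
        return ["python", "train_one_gpu.py"]
    # torchrun starts one worker process per GPU on this machine.
    return ["torchrun", "--nproc_per_node", str(num_gpus), "train_multi_gpus.py"]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_train_command(8)))
```

The command is returned as a list of arguments, so it can be passed directly to `subprocess.run` without shell quoting concerns.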

Community and Feedback

The Urban_Seg project welcomes feedback and discussion. To take part, join the QQ Group: 679897018.

Citing Urban_Seg

If Urban_Seg is useful in your work, the team asks that you cite the Unicom paper it builds on:

@inproceedings{anxiang_2023_unicom,
  title={Unicom: Universal and Compact Representation Learning for Image Retrieval},
  author={An, Xiang and Deng, Jiankang and Yang, Kaicheng and Li, Jiawei and Feng, Ziyong and Guo, Jia and Yang, Jing and Liu, Tongliang},
  booktitle={ICLR},
  year={2023}
}

Urban_Seg makes remote sensing segmentation more approachable for beginners by combining a strong pre-trained model, short training scripts, and a community for questions.