This post demonstrates how we trained a U-Net (with a ResNet-50 encoder pretrained on ImageNet) to extract building footprints from high-resolution NAIP imagery using OpenStreetMap polygons as labels.
Data
- Input imagery: NAIP 1 m RGB tiles over Orange County, CA
- Labels: Rasterized building footprints from OpenStreetMap
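The label masks can be produced by burning the OSM polygons into each tile's pixel grid. Below is a minimal sketch of that step using GeoPandas and rasterio; the file names are hypothetical, and it assumes the footprints have already been exported locally and are reprojected to the tile's CRS.

```python
# Sketch of label rasterization: burn OSM building polygons into a binary
# mask aligned with a NAIP tile. File paths are hypothetical placeholders.
import geopandas as gpd
import rasterio
from rasterio.features import rasterize

with rasterio.open("naip_tile.tif") as src:   # hypothetical NAIP tile
    transform = src.transform
    out_shape = (src.height, src.width)
    crs = src.crs

# Reproject the footprints to match the imagery before rasterizing.
buildings = gpd.read_file("buildings.gpkg").to_crs(crs)

# Burn value 1 inside every footprint, 0 elsewhere -> binary label mask.
mask = rasterize(
    [(geom, 1) for geom in buildings.geometry],
    out_shape=out_shape,
    transform=transform,
    fill=0,
    dtype="uint8",
)
```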
Model & Training
- Architecture: U-Net with a ResNet-50 encoder, skip-connections, and sigmoid output for binary masks
- Loss: Dice loss to handle class imbalance
- Optimizer: Adam (lr = 1e-4) with CosineAnnealingWarmRestarts
- Augmentations: Random flips, 90° rotations, affine jitter
- Training: 20 epochs on 512×512 patches (10 epochs initially, then 10 more after resuming from a checkpoint); a configuration sketch follows this list
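Putting the bullets above together, the setup looks roughly like the following. This is a sketch rather than our exact training script: it uses Segmentation Models PyTorch (linked in the references) and albumentations, and the scheduler's restart period `T_0=10` and the affine-jitter ranges are illustrative assumptions.

```python
# Sketch of the model, loss, optimizer, and augmentation configuration.
import torch
import segmentation_models_pytorch as smp
import albumentations as A

# U-Net with an ImageNet-pretrained ResNet-50 encoder; one output channel
# for the binary building mask (sigmoid is applied inside the loss).
model = smp.Unet(
    encoder_name="resnet50",
    encoder_weights="imagenet",
    in_channels=3,   # NAIP RGB
    classes=1,
)

# Dice loss handles the foreground/background imbalance of sparse footprints.
loss_fn = smp.losses.DiceLoss(mode="binary")

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer,
    T_0=10,  # restart period in epochs; T_0=10 is an assumption
)

# Augmentations applied to the 512x512 training patches; the affine-jitter
# magnitudes here are illustrative values.
augment = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.RandomRotate90(p=0.5),
    A.Affine(scale=(0.9, 1.1), translate_percent=0.05, rotate=(-10, 10), p=0.5),
])
```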
Convergence Curves
- Left: Dice loss decreases on both the training (blue) and validation (orange) sets.
- Right: Intersection-over-Union (IoU) rises from ~0.52 to ~0.74 over the 20 epochs.
Sample Predictions
Sample 1
Sample 2
Sample 3
Sample 4
Vectorized Footprints
After cleaning and simplifying the predicted masks, we convert them to vector polygons and overlay them on the original imagery:
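A minimal sketch of that conversion, assuming `mask` holds a thresholded uint8 prediction and `transform`/`crs` come from the source tile (as in the rasterization snippet above); the 1.0 m simplification tolerance and output path are illustrative values, not the ones used in the post.

```python
# Sketch of mask-to-polygon conversion with rasterio, shapely, and geopandas.
import geopandas as gpd
from rasterio.features import shapes
from shapely.geometry import shape as to_shape

# Extract polygons for connected regions of the mask; keep only the
# building class (value 1) and drop the background.
polygons = [
    to_shape(geom)
    for geom, value in shapes(mask, transform=transform)
    if value == 1
]

footprints = gpd.GeoDataFrame(geometry=polygons, crs=crs)
footprints["geometry"] = footprints.simplify(1.0)  # tolerance in map units
footprints.to_file("footprints.gpkg", driver="GPKG")
```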
References
- NAIP imagery: https://www.fsa.usda.gov/programs-and-services/aerial-photography/imagery-programs/naip-imagery
- OpenStreetMap building footprints: https://download.geofabrik.de/
- Segmentation Models PyTorch: https://github.com/qubvel/segmentation_models.pytorch