! WORK IN PROGRESS !
Codebase for training models to detect structures from very high resolution satellite images (~0.4m).
Scripts to preprocess the data into images and labels are in scripts/. Each set of images and labels used for training comes in a different format, so each requires custom preprocessing. All data sources are converted to 256x256 RGB images (see the tiling sketch after the list below). Four scripts handle the four data streams:
- giveDirectly_dataprep.py, to prepare images and labels provided by the people behind this paper: 1468 Google Maps images, 400 by 400 pixels, zoom level 16, with between 0 and 24 huts per image (on average between 5 and 6).
- rms_dataprep.py, to prepare proprietary satellite imagery and labels generated by VAM's geospatial team. These are very high resolution RGB images labelled by the remote monitoring team at WFP, with one channel containing the mask (1 for roof, 0 for no roof).
- dstl_dataprep.py to prepare images and labels from the DSTL Kaggle competition.
- spacenet_dataprep.py, to prepare images and labels for the RGB Khartoum image set from SpaceNet.
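
The reading and reprojection steps differ per source, but the common endpoint is a stack of 256x256 RGB chips. Below is a minimal sketch of that tiling step, assuming a plain RGB scene on disk; the paths and function name are illustrative rather than the repo's actual API.

```python
# Sketch of the tiling step the *_dataprep.py scripts converge on: cut a large
# RGB scene into non-overlapping 256x256 chips. Names are illustrative.
import numpy as np
from PIL import Image

TILE = 256

def tile_scene(image_path, out_dir, prefix):
    """Split a large RGB image into 256x256 tiles and save them as PNGs."""
    scene = np.asarray(Image.open(image_path).convert("RGB"))
    height, width, _ = scene.shape
    count = 0
    for top in range(0, height - TILE + 1, TILE):
        for left in range(0, width - TILE + 1, TILE):
            chip = scene[top:top + TILE, left:left + TILE]
            Image.fromarray(chip).save(f"{out_dir}/{prefix}_{count:05d}.png")
            count += 1
    return count
```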
The labels differ slightly between datasets: in the VAM data each hut is identified by a single pixel marking the roof, i.e. one pixel per roof. To reduce the class imbalance between roof and non-roof pixels, a buffer is added around each single pixel, from 3x3 up to 9x9 depending on the image, so the final mask is a square that ideally overlaps with the roof. This is one of the main areas for improvement.
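
A minimal sketch of that buffering idea, assuming the mask arrives as a 2D array with one positive pixel per roof; the kernel size and function name are illustrative.

```python
# Expand single-pixel roof labels into k x k squares (k between 3 and 9,
# chosen per image) to reduce the roof / non-roof class imbalance.
import numpy as np
from scipy.ndimage import binary_dilation

def buffer_point_labels(mask, kernel_size=5):
    """Dilate each positive pixel into a kernel_size x kernel_size square."""
    structure = np.ones((kernel_size, kernel_size), dtype=bool)
    return binary_dilation(mask.astype(bool), structure=structure).astype(np.uint8)
```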
batch_training.py trains the network in batches, using a generator to load images into memory at run-time. Run python batch_training.py --help to see the available parameters, including the training data directory, which weights to use and which image names to use.
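
As a rough illustration of the run-time loading, here is a minimal generator of the kind batch_training.py could use; the file layout, batch size and normalisation are assumptions, not the script's actual interface.

```python
# Yield (images, masks) batches indefinitely, reading files from disk on the
# fly so the whole dataset never has to sit in memory.
import numpy as np
from PIL import Image

def image_mask_generator(image_paths, mask_paths, batch_size=16):
    indices = np.arange(len(image_paths))
    while True:
        np.random.shuffle(indices)
        for start in range(0, len(indices) - batch_size + 1, batch_size):
            batch = indices[start:start + batch_size]
            images = np.stack([
                np.asarray(Image.open(image_paths[i]), dtype=np.float32) / 255.0
                for i in batch
            ])
            masks = np.stack([
                np.asarray(Image.open(mask_paths[i]), dtype=np.float32)[..., None]
                for i in batch
            ])
            yield images, masks
```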
number_of_islands.py, a class to count islands (connected components) in a boolean 2D matrix.
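
The idea, sketched below under assumed names, is a flood fill over 4-connected True cells, e.g. to turn a predicted mask into an approximate hut count.

```python
# Count 4-connected groups of True cells in a boolean 2D matrix using an
# iterative depth-first flood fill. Interface is illustrative.
def count_islands(grid):
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and not seen[r][c]:
                count += 1
                stack = [(r, c)]  # start a new flood fill
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < rows and 0 <= x < cols and grid[y][x] and not seen[y][x]:
                        seen[y][x] = True
                        stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
    return count
```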
unet.py, the network architecture (UNet).
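
For orientation, here is a compact Keras sketch of the encoder-decoder pattern UNet follows; the actual unet.py almost certainly uses more depth and filters, so treat the layer sizes as placeholders.

```python
# A small UNet-style encoder-decoder with skip connections, assuming
# 256x256 RGB input and a single-channel roof-probability output.
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate

def build_small_unet(input_shape=(256, 256, 3)):
    inputs = Input(input_shape)
    # encoder: convolutions + downsampling, keeping features for skip connections
    c1 = Conv2D(32, 3, activation="relu", padding="same")(inputs)
    c1 = Conv2D(32, 3, activation="relu", padding="same")(c1)
    p1 = MaxPooling2D(2)(c1)
    c2 = Conv2D(64, 3, activation="relu", padding="same")(p1)
    c2 = Conv2D(64, 3, activation="relu", padding="same")(c2)
    p2 = MaxPooling2D(2)(c2)
    # bottleneck
    c3 = Conv2D(128, 3, activation="relu", padding="same")(p2)
    # decoder: upsample and concatenate the matching encoder features
    u2 = concatenate([UpSampling2D(2)(c3), c2])
    c4 = Conv2D(64, 3, activation="relu", padding="same")(u2)
    u1 = concatenate([UpSampling2D(2)(c4), c1])
    c5 = Conv2D(32, 3, activation="relu", padding="same")(u1)
    outputs = Conv2D(1, 1, activation="sigmoid")(c5)  # per-pixel roof probability
    return Model(inputs, outputs)
```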
utils contains all the handy stuff, including CV and IO routines. The loss function, based on the Dice coefficient, is defined here too.
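
A typical Keras-backend formulation of the Dice coefficient and the corresponding loss is sketched below; the smoothing constant and function names are assumptions, not necessarily what utils defines.

```python
# Dice coefficient between a binary ground-truth mask and a predicted
# probability mask, plus the loss (1 - Dice) used for training.
from keras import backend as K

def dice_coefficient(y_true, y_pred, smooth=1.0):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_loss(y_true, y_pred):
    return 1.0 - dice_coefficient(y_true, y_pred)
```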
Training was run on an AWS g2.2xlarge instance. On VAM's data, the Dice coefficient on the validation set saturates around 0.35 (the labels are noisy, so this is not a bad score if you look at the predictions).