Visloc-Retrieval: Library for Image Retrieval and Place Recognition

Content

Introduction
Welcome
Datasets
Training
Roadmap
Results
Licenses
Citing

Introduction:

The Visloc-Retrieval library is a collection of image retrieval and place recognition models and algorithms for robotics and autonomous system applications.

Welcome

Welcome to Visloc-Retrieval ✨

The library can be installed with pip:

pip install git+https://github.com/Tarekbouamer/visloc_retrieval.git

Main system requirements:

Python 3.7
CUDA 11.3
Faiss library (faiss-gpu 1.7.2)

conda create -n loc
conda activate loc
conda install pytorch torchvision cudatoolkit=11.3 -c pytorch
conda install -c conda-forge faiss-gpu 
pip install gdown timm==0.5.4

Pretrained models can be loaded using retrieval.create_model:

import retrieval

extractor = retrieval.create_model('resnet50_c4_gem_1024')
extractor.eval()

List available model architectures:

import retrieval

model_names = retrieval.list_models('*resne*t*')
print(model_names)
# [
#   'resnet50_c4_gem_1024',
#   ...
# ]
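
The wildcard pattern passed to list_models can be illustrated with a small standalone sketch. It assumes shell-style matching as implemented by Python's fnmatch module; whether the library filters names exactly this way is an assumption. The registry MODEL_NAMES and the helper filter_models are hypothetical and are not part of the retrieval package; the names in the registry are the ones reported in the Results section.

```python
import fnmatch

# Hypothetical registry for illustration only; the names are the model
# names that appear in the Results tables of this README.
MODEL_NAMES = [
    "sfm_resnet50_gem_2048",
    "sfm_resnet50_c4_gem_1024",
    "sfm_resnet101_gem_2048",
    "gl18_resnet101_gem_2048",
    "sfm_resnet18_how_128",
    "sfm_resnet50_c4_how_128",
]


def filter_models(pattern="*"):
    """Return the sorted names that match a shell-style wildcard pattern."""
    return sorted(fnmatch.filter(MODEL_NAMES, pattern))


# '*resne*t*' matches both 'resnet' and 'resnext' style names.
resnet_like = filter_models("*resne*t*")
how_models = filter_models("*how*")
print(resnet_like)
print(how_models)
```

A pattern such as '*how*' narrows the list to the local-feature models in this hypothetical registry.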


Datasets

The models are trained on well-known public datasets such as:

Retrieval-SfM-120k-images
Google Landmark v1, v2 and v2_clean.

We evaluate trained models on the ROxford5k and RParis6k benchmarks, and with R1M when possible. The datasets contain 4993 and 6322 images respectively, each with 70 query images. Optionally, a set of 1 million distractor images (R1M) can be added to each dataset for large-scale benchmarking.
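
For readers new to retrieval benchmarks, the sketch below shows how a ranked result list is usually scored. It assumes the reported metric is mean average precision (mAP), which is what the Revisited Oxford and Paris protocol uses; this README does not name the metric explicitly. The version here is deliberately simplified: the official protocol also removes "junk" images from each ranking and interpolates precision slightly differently. The function names are illustrative and not part of this library.

```python
def average_precision(ranked_ids, positives):
    """Average precision of one ranked list.

    ranked_ids: database image ids, best match first.
    positives:  set of ids that are relevant to the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in positives:
            hits += 1
            # Precision measured at the rank of each relevant image.
            precision_sum += hits / rank
    return precision_sum / len(positives) if positives else 0.0


def mean_average_precision(rankings, ground_truth):
    """Mean of the per-query average precision values."""
    aps = [average_precision(rankings[q], ground_truth[q]) for q in rankings]
    return sum(aps) / len(aps)


# Toy example with two queries over a four-image database.
rankings = {"q1": ["a", "b", "c", "d"], "q2": ["d", "c", "b", "a"]}
ground_truth = {"q1": {"a", "c"}, "q2": {"d"}}
score = mean_average_precision(rankings, ground_truth)
print(round(100 * score, 2))
```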

The Retrieval-SfM-120k-images and Revisiting Oxford and Paris datasets can be downloaded using TODO script, or automatically by the training script below.

Note: other training and evaluation datasets will be added soon.

Training

We trained our models using the training script with a configuration provided in ini format, as in the default configuration. There you can specify the training loss, optimizer, scheduler, training and test datasets, data augmentation policy (if needed), etc.
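
To make the ini-based configuration more concrete, here is a minimal sketch of how such a file can be read with Python's standard configparser module. The section names, option names, and values below are hypothetical and chosen only for illustration; they are not copied from the library's defaults.ini, so check that file for the real options.

```python
import configparser

# Hypothetical configuration text. Section and option names are
# illustrative assumptions, not the library's actual schema.
EXAMPLE_INI = """
[optimizer]
type = adam
lr = 1e-4

[scheduler]
type = exponential
gamma = 0.99

[dataloader]
training_dataset = retrieval-SfM-120k
test_datasets = roxford5k, rparis6k
"""


def load_config(text):
    """Parse ini-formatted text into a ConfigParser object."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


config = load_config(EXAMPLE_INI)

# Typed accessors convert the raw strings stored in the file.
lr = config.getfloat("optimizer", "lr")
gamma = config.getfloat("scheduler", "gamma")

# Comma-separated options need to be split by hand.
test_datasets = [
    name.strip() for name in config.get("dataloader", "test_datasets").split(",")
]
print(lr, gamma, test_datasets)
```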

Start your training:

DATA_DIR='/......./data'
EXPERIMENT='./experiments/'
CONFIG='./image_retrieval/configuration/defaults/defaults.ini'

python3 ./scripts/train.py \
      --directory $EXPERIMENT \
      --data $DATA_DIR \
      --local_rank 0 \
      --config $CONFIG \
      --eval                # evaluate during training and save best model

Testing

You can evaluate our pretrained models using the test script and the default configuration file (config.ini). The evaluation script extracts image features at a single scale with max_size=1024. Multi-scale feature extraction and evaluation are also supported; the list of scales can be passed as, for example, --scales 0.7071,1.0,1.4142. Our results are presented in the Results section.

DATA_DIR='/......./data'
MODEL='resnet50_gem_2048'
SCALES=0.7071,1.0,1.4142

python3 ./scripts/test.py \
      --data $DATA_DIR \
      --model $MODEL \
      --scales $SCALES
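
The comma-separated value given to --scales has to be turned into a list of floats before it can be used. The sketch below shows one way to do this with argparse; it is an illustration under the assumption that the option is a plain comma-separated string, and it is not the parsing code used by scripts/test.py. The helper parse_scales is a hypothetical name.

```python
import argparse


def parse_scales(text):
    """Turn a string such as '0.7071,1.0,1.4142' into a list of floats."""
    scales = [float(item) for item in text.split(",") if item.strip()]
    if not scales or any(scale <= 0 for scale in scales):
        raise argparse.ArgumentTypeError(f"invalid scale list: {text!r}")
    return scales


parser = argparse.ArgumentParser(description="Scale-list parsing sketch")
# A single scale of 1.0 is used when the option is omitted.
parser.add_argument("--scales", type=parse_scales, default=[1.0])

# Parse an explicit argument list instead of sys.argv for the demo.
args = parser.parse_args(["--scales", "0.7071,1.0,1.4142"])
single = parser.parse_args([])
print(args.scales, single.scales)
```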


Roadmap

Add Changelog
Add Shields.io badges
Add results on ROxford5k and RParis6k.
Knowledge distillation for smaller models and shorter descriptor sizes.


Results

🟦 Global Benchmark

Single-Scale

| Models | ROxford5k Easy | ROxford5k Medium | ROxford5k Hard | RParis6k Easy | RParis6k Medium | RParis6k Hard |
|---|---|---|---|---|---|---|
| sfm_resnet50_gem_2048 | 83.83 | 66.01 | 38.96 | 91.83 | 77.16 | 55.82 |
| sfm_resnet50_c4_gem_1024 | 79.22 | 60.53 | 34.30 | 89.24 | 71.77 | 49.14 |
| sfm_resnet101_gem_2048 | 82.80 | 66.26 | 40.39 | 91.29 | 75.23 | 53.21 |
| sfm_resnet101_c4_gem_1024 | 82.12 | 62.81 | 36.56 | 90.44 | 74.64 | 52.67 |
| gl18_resnet101_gem_2048 | 81.79 | 65.58 | 40.72 | 91.38 | 76.71 | 56.63 |
| sfm_resnet18_how_128 | 61.61 | 46.67 | 22.37 | 80.52 | 62.20 | 33.79 |
| sfm_resnet50_c4_how_128 | 51.80 | 36.76 | 11.99 | 75.29 | 58.14 | 31.94 |

Multi-Scale

| Models | ROxford5k Easy | ROxford5k Medium | ROxford5k Hard | RParis6k Easy | RParis6k Medium | RParis6k Hard |
|---|---|---|---|---|---|---|
| sfm_resnet50_gem_2048 | 84.96 | 67.19 | 40.45 | 92.67 | 78.39 | 57.84 |
| sfm_resnet50_c4_gem_1024 | 80.99 | 61.90 | 34.90 | 90.20 | 72.58 | 49.98 |
| sfm_resnet101_gem_2048 | 83.65 | 66.88 | 40.60 | 92.11 | 76.63 | 55.11 |
| sfm_resnet101_c4_gem_1024 | 83.94 | 64.41 | 38.09 | 91.66 | 76.70 | 55.28 |
| gl18_resnet101_gem_2048 | 84.76 | 68.05 | 43.42 | 93.25 | 79.75 | 61.14 |
| sfm_resnet18_how_128 | 63.70 | 48.19 | 24.93 | 83.03 | 64.51 | 36.35 |
| sfm_resnet50_c4_how_128 | 52.71 | 37.36 | 12.37 | 75.82 | 58.71 | 32.56 |

🟩 Local Feature Aggregation Benchmark

Single-Scale

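| Models | Algo | N | ROxford5k Easy | ROxford5k Medium | ROxford5k Hard | RParis6k Easy | RParis6k Medium | RParis6k Hard |
|---|---|---|---|---|---|---|---|---|
| sfm_resnet18_how_128 | ASMK-64k | 1000 | 68.21 | 54.50 | 31.51 | 83.07 | 64.53 | 37.68 |
| sfm_resnet50_c4_how_128 | ASMK-64k | 1000 | 76.60 | 61.69 | 39.83 | 89.87 | 72.35 | 50.32 |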

Multi-Scale

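| Models | Algo | N | ROxford5k Easy | ROxford5k Medium | ROxford5k Hard | RParis6k Easy | RParis6k Medium | RParis6k Hard |
|---|---|---|---|---|---|---|---|---|
| sfm_resnet18_how_128 | ASMK-64k | 1000 | 84.68 | 68.06 | 44.70 | 91.82 | 73.69 | 49.87 |
| sfm_resnet50_c4_how_128 | ASMK-64k | 1000 | 85.72 | 69.84 | 47.27 | 92.46 | 75.28 | 54.73 |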

ℹ️ For multi-scale benchmarking of our trained models, we extract global descriptors at 3 image scales [0.7071, 1.0, 1.4142], with a minimum size of 100 and a maximum area of 2000*2000.
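
As an illustration of how the scale list interacts with the size limits, the sketch below computes the resized dimensions an image would have at each scale. How the library actually enforces the limits is not described here, so the policy in this sketch is an assumption: a scale is skipped when the shorter side would fall below the minimum size, and an image is shrunk uniformly when it would exceed the maximum area. The function scaled_sizes is a hypothetical helper, and sizes are taken to be in pixels.

```python
import math


def scaled_sizes(width, height, scales=(0.7071, 1.0, 1.4142),
                 min_size=100, max_area=2000 * 2000):
    """Return (scale, width, height) for every usable scale.

    Assumed policy (not taken from the library): shrink uniformly if the
    scaled image exceeds max_area, then skip the scale if the shorter
    side is below min_size.
    """
    sizes = []
    for scale in scales:
        w, h = width * scale, height * scale
        if w * h > max_area:
            # Uniform shrink factor that brings the area down to max_area.
            shrink = math.sqrt(max_area / (w * h))
            w, h = w * shrink, h * shrink
        w, h = int(round(w)), int(round(h))
        if min(w, h) < min_size:
            continue
        sizes.append((scale, w, h))
    return sizes


typical = scaled_sizes(1024, 768)
small = scaled_sizes(120, 120)
large = scaled_sizes(4000, 4000, scales=(1.0,))
print(typical, small, large)
```

Note that 0.7071 and 1.4142 are approximately 1/sqrt(2) and sqrt(2), so each step in the scale list roughly halves or doubles the image area.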


Licenses

Code

The code here is licensed under Apache 2.0. We have linked the sources and references for third-party components released under permissive licenses such as MIT and BSD. If you think we have missed anything, please create an issue.

Timm Weights

So far, the pretrained weights come from the timm library, with the following licensing considerations:

ImageNet was released for non-commercial research purposes only (https://image-net.org/download).
The Facebook WSL, SSL, and SWSL ResNe(Xt) models, ... and the Google Noisy Student EfficientNet have an explicit non-commercial license (CC-BY-NC 4.0, https://github.com/facebookresearch/semi-supervised-ImageNet1K-models, https://github.com/facebookresearch/WSL-Images).
The Google models do not appear to have any restriction beyond the Apache 2.0 license (and ImageNet concerns).

Visloc_retrieval Weights

Different datasets are used to train our models:

SfM dataset (120k and 30k): the license is not stated at http://cmp.felk.cvut.cz/cnnimageretrieval/.
Google Landmarks Dataset v2: the images have CC-BY licenses without the NonDerivs (ND) restriction. To verify the license for a particular image, please refer to the train_attribution.csv file (https://github.com/cvdfoundation/google-landmark#train-image-licenses).
Google Landmarks Dataset v1: the images listed in this dataset are publicly available on the web and may have different licenses (https://www.kaggle.com/datasets/google/google-landmarks-dataset).

In either case, refer to the timm library for the licenses of the pretrained models, and/or to the dataset providers with any questions.


Citing

