[Paper Review] Fast and Accurate Single-Image Depth Estimation on Mobile Devices, Mobile AI 2021 Challenge: Report


공부중인학생 2022. 1. 3. 03:13

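Depth estimation is one of the computer vision techniques needed on mobile devices, but existing solutions are computationally expensive, which makes on-device inference difficult. This paper reports on the approaches that challenge participants tried in order to address that problem and summarizes their results.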

1. Environments

Raspberry Pi 4 (Broadcom BCM2711, Cortex-A72, 1.5 GHz)
Raspberry Pi OS (Linux)
TensorFlow Lite 2.5.0 Linux build
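
Since runtime on this device feeds directly into the final score, here is a minimal sketch of how per-inference latency could be measured. The helper names `measure_runtime` and `make_tflite_runner`, and the warmup and run counts, are my own assumptions and not the challenge's official benchmarking procedure. The TensorFlow import is kept inside the second helper so that the timing function works on its own.

```python
import time


def measure_runtime(run_once, warmup=3, runs=10):
    """Return the mean wall-clock seconds per call of `run_once`."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    for _ in range(warmup):
        run_once()  # warm caches before timing
    start = time.perf_counter()
    for _ in range(runs):
        run_once()
    return (time.perf_counter() - start) / runs


def make_tflite_runner(model_path):
    """Build a zero-argument callable that runs one TFLite inference.

    Imports are local so measure_runtime() can be used without TensorFlow.
    The input is an all-zero dummy tensor (an assumption for timing only).
    """
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    dummy = np.zeros(inp["shape"], dtype=inp["dtype"])

    def run_once():
        interpreter.set_tensor(inp["index"], dummy)
        interpreter.invoke()

    return run_once
```

With a converted model on disk (the path is hypothetical), `measure_runtime(make_tflite_runner("model.tflite"))` gives seconds per frame, and its reciprocal gives FPS.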

2. Dataset

Pairs of RGB images and 16-bit depth maps were collected using a ZED stereo camera
- average depth estimation error of less than 0.2 m
- objects located closer than 8 m
- around 8.3k VGA image pairs (640x480 pixels)

3. Scoring System

- Performance Measure

Root Mean Squared Error (RMSE, absolute depth estimation accuracy)
Relative Depth: Scale Invariant Root Mean Squared Error (si-RMSE, relative position of the object)
Average $\log_{10}$ and Relative (REL) error
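
To make the four measures concrete, here is a minimal sketch that computes them on flat lists of positive depth values. It uses the commonly cited definitions (si-RMSE as the standard deviation of the log-depth differences); the challenge's actual evaluation scripts may differ in details such as masking or units, and the function name `depth_metrics` is my own.

```python
import math


def depth_metrics(pred, target):
    """Return RMSE, si-RMSE, mean log10 error and mean relative error.

    `pred` and `target` are flat sequences of positive depths in the same unit.
    """
    n = len(target)
    if n == 0 or n != len(pred):
        raise ValueError("pred and target must be non-empty and equally long")

    # RMSE: absolute accuracy, expressed in the depth unit.
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / n)

    # si-RMSE: take log-depth differences and remove their mean, so a
    # global scale factor on the prediction cancels out.
    d = [math.log(p) - math.log(t) for p, t in zip(pred, target)]
    mean_d = sum(d) / n
    si_rmse = math.sqrt(sum((x - mean_d) ** 2 for x in d) / n)

    # Average log10 error and average relative (REL) error.
    log10_err = sum(
        abs(math.log10(p) - math.log10(t)) for p, t in zip(pred, target)
    ) / n
    rel = sum(abs(p - t) / t for p, t in zip(pred, target)) / n

    return {"rmse": rmse, "si_rmse": si_rmse, "log10": log10_err, "rel": rel}
```

A prediction that is exactly twice the ground truth everywhere has a large RMSE but an si-RMSE of essentially zero, which is why si-RMSE is described as measuring relative position rather than absolute depth.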

- Final Score

$$ \text{Final Score} = \frac{2^{-10 \cdot \text{si-RMSE}}}{C \cdot \text{runtime}} $$
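
A small sketch of the formula as written in this review helps show the trade-off it encodes. Here `c` stands for the normalization constant $C$, and the default of 1.0 is only a placeholder I chose; the runtime unit (seconds) is likewise an assumption.

```python
def final_score(si_rmse, runtime, c=1.0):
    """Score as written in this review: 2^(-10 * si-RMSE) / (C * runtime)."""
    if runtime <= 0 or c <= 0:
        raise ValueError("runtime and c must be positive")
    return 2.0 ** (-10.0 * si_rmse) / (c * runtime)
```

Under this form, halving the runtime doubles the score, and so does lowering si-RMSE by 0.1, so a model can trade a little accuracy for speed and still come out ahead.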

4. Results and Discussion

- 10 of the 140 participating teams completed the challenge

All participants used an encoder-decoder architecture.
Nearly all participants used an image classification backbone, and most architectures added skip connections between the encoder and decoder to improve performance.
- Only team HIT-AIIA used EfficientNet-B1; the rest used MobileNet.
Knowledge distillation was also used.
The top-performing team, Tencent GY-Lab, used a FastDepth model and reached roughly 10 FPS on the Raspberry Pi 4.

5. Challenge Method

(1) Tencent GY-Lab

U-Net-like architecture with a MobileNet v3 based encoder
The output of each block goes through a Feature Fusion Module (FFM) that concatenates it with the decoder features, which improves reliability
A ViT-Large based teacher network was trained first, then knowledge distillation was applied
The model was converted PyTorch → ONNX → TFLite
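
The teacher-then-student training described above can be illustrated with a minimal loss sketch. This review does not state the loss the team actually used, so everything here is an assumption: the mean L1 terms, the weight `alpha`, and the function name `distillation_loss` are hypothetical and only show the general idea of a student learning from both the ground truth and a frozen teacher's predictions.

```python
def distillation_loss(student, teacher, target, alpha=0.5):
    """Blend a ground-truth term and a teacher-imitation term (both mean L1).

    All three arguments are flat sequences of depth values of equal length.
    `alpha` weights the ground-truth term; (1 - alpha) weights the teacher term.
    """
    n = len(target)
    if n == 0 or not (len(student) == len(teacher) == n):
        raise ValueError("inputs must be non-empty and equally long")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")

    # How far the student is from the measured depth.
    gt_term = sum(abs(s - t) for s, t in zip(student, target)) / n
    # How far the student is from the (frozen) teacher's prediction.
    kd_term = sum(abs(s - q) for s, q in zip(student, teacher)) / n

    return alpha * gt_term + (1.0 - alpha) * kd_term
```

Setting `alpha=1.0` reduces this to ordinary supervised training, while smaller values lean more on the teacher's output.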

(2) SMART

FastDepth architecture with a MobileNet v1 backbone
A ResNeSt-101 based teacher network was trained first, then knowledge distillation was applied

(3) Airia-Team 1

Dense features are extracted with a MobileNet v3 backbone
Residual Feature Distillation Blocks (RFDB) and Single Residual Blocks (SRB) are applied.

For model compression I had only ever used residual blocks, bottlenecks, and pruning, so I should try encoder-decoder architectures and knowledge distillation as well.


I'm not a CS major, so I'm working hard to keep up! https://github.com/audrb1999
