[Paper] Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints

1. Motivation

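Self-supervised monocular depth estimation usually assumes that the camera intrinsic matrix is known in advance. In practice, however, the camera parameters are often unavailable. This paper proposes having the network predict the unknown camera parameters itself and integrating those predictions into the loss function.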

2. Related Work

Related prior work includes the following:

[12] Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras: studies depth learning when the camera parameters are unknown
[30] Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction
[31] UnDeepVO: Monocular Visual Odometry through Unsupervised Deep Learning

These works either treat the camera parameters as unknown or explore learning depth and pose jointly.

3. Proposed Method

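The paper makes two key contributions:

Joint Depth + Intrinsics Prediction with a Single Network

A single unified network predicts the camera intrinsic matrix (focal length, principal point, etc.) together with the depth map. This makes self-supervised depth learning possible even when the camera parameters are unknown.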
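
To make the role of the predicted intrinsics concrete, here is a minimal sketch of how a focal length and principal point form a pinhole intrinsic matrix and how that matrix projects 3D points to pixels. The function names `build_intrinsics` and `project` and the numeric values are illustrative assumptions, not taken from the paper, and the values that would come from the network are passed in as plain numbers.

```python
import numpy as np


def build_intrinsics(fx, fy, cx, cy):
    """Assemble a 3x3 pinhole intrinsic matrix (hypothetical helper).

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    In the setting described above these would be network outputs.
    """
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]], dtype=np.float64)


def project(points_cam, K):
    """Project (N, 3) camera-frame points to (N, 2) pixel coordinates."""
    uvw = points_cam @ K.T               # homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]      # divide by depth


K = build_intrinsics(fx=100.0, fy=100.0, cx=50.0, cy=40.0)
points = np.array([[0.0, 0.0, 1.0],     # on the optical axis
                   [1.0, 0.0, 2.0]])    # 1 unit right, 2 units deep
pixels = project(points, K)
```

A point on the optical axis lands on the principal point, which is why an error in the predicted principal point or focal length shifts every reprojected pixel.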

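Loss Function Design with Integrated Intrinsics

The network predicts the unknown camera parameters directly, and these predictions are integrated into the photometric reconstruction loss. Spatio-temporal constraints are used to enforce consistency in the predictions.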
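
The sketch below shows one common way intrinsics enter a photometric reconstruction loss: each target pixel is back-projected with its depth and K^-1, moved by the relative pose (R, t), and projected back with K into the source frame. This is the standard view-synthesis formulation for this family of methods and is an assumption here, not the paper's exact loss. The names `reproject_pixels` and `photometric_l1` are hypothetical, and nearest-neighbour sampling stands in for the differentiable bilinear sampling that training would normally use.

```python
import numpy as np


def reproject_pixels(depth, K, R, t):
    """Map each target pixel to its location in the source frame.

    depth: (H, W) target depth; K: (3, 3) intrinsics;
    R: (3, 3) rotation and t: (3,) translation from target to source camera.
    Returns an (H, W, 2) array of (u, v) source coordinates.
    """
    h, w = depth.shape
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T      # K^-1 p
    points = rays * depth[..., None]     # D(p) K^-1 p
    moved = points @ R.T + t             # R X + t
    proj = moved @ K.T                   # K (R X + t)
    return proj[..., :2] / proj[..., 2:3]


def photometric_l1(target, source, coords):
    """Mean absolute difference over pixels that land inside the source image."""
    h, w = target.shape[:2]
    u = np.rint(coords[..., 0]).astype(int)
    v = np.rint(coords[..., 1]).astype(int)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    warped = source[v.clip(0, h - 1), u.clip(0, w - 1)]  # nearest-neighbour warp
    return np.abs(target - warped)[valid].mean()


depth = np.full((4, 5), 2.0)
K = np.array([[10.0, 0.0, 2.0], [0.0, 10.0, 1.5], [0.0, 0.0, 1.0]])
image = np.arange(20, dtype=np.float64).reshape(4, 5)

same = reproject_pixels(depth, K, np.eye(3), np.zeros(3))
shifted = reproject_pixels(depth, K, np.eye(3), np.array([0.2, 0.0, 0.0]))
loss_identity = photometric_l1(image, image, same)
```

With no camera motion every pixel maps to itself and the loss is zero. With a sideways translation the pixel shift is fx * tx / depth, so the predicted focal length and the predicted depth are coupled through this loss.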

4. Experiments

Evaluation of depth and intrinsics prediction performance in the unknown camera setting
Experimental verification that the spatio-temporal consistency constraints contribute to training stability

5. Conclusion & Limitation

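The main contribution of this work is integrating intrinsics prediction with depth estimation so that depth can be learned even when the camera parameters are unknown. However, because the network must also predict the intrinsics, training becomes more complex, and because the accuracy of the parameter estimates directly affects depth quality, stable training may be difficult.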
