DepthAnything, which estimates the depth of anything

MLBoy
2 min read · Apr 11, 2024

Depth Anything estimates depth from a single image.

Because depth can be estimated this easily, you can build many things on top of it.
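
As one illustration of what a depth map enables, the sketch below separates the nearest part of a scene from the rest by thresholding. It assumes the depth map is a 2-D NumPy array in which larger values mean closer to the camera (check this on your own output and flip the comparison if it is the other way round). `foreground_mask` and `keep_fraction` are hypothetical names, not part of Depth Anything.

```python
import numpy as np

def foreground_mask(depth, keep_fraction=0.3):
    """Boolean mask of the pixels with the largest depth values.

    Assumption: larger values mean "closer to the camera".
    """
    depth = np.asarray(depth, dtype=np.float32)
    # Value above which roughly `keep_fraction` of the pixels lie
    threshold = np.quantile(depth, 1.0 - keep_fraction)
    return depth >= threshold
```

A mask like this could then be used to blur or replace only the background of the original image.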

How to use

Install

git clone https://github.com/LiheYoung/Depth-Anything.git
cd Depth-Anything
pip install -r requirements.txt

Execution

from depth_anything.dpt import DepthAnything
from depth_anything.util.transform import Resize, NormalizeImage, PrepareForNet
import cv2
import torch
from torchvision.transforms import Compose
import torch.nn.functional as F
import numpy as np

encoder = 'vits' # can also be 'vitb' or 'vitl'
depth_anything = DepthAnything.from_pretrained('LiheYoung/depth_anything_{:}14'.format(encoder)).eval()
transform = Compose([
    Resize(
        width=518,
        height=518,
        resize_target=False,
        keep_aspect_ratio=True,
        ensure_multiple_of=14,
        resize_method='lower_bound',
        image_interpolation_method=cv2.INTER_CUBIC,
    ),
    NormalizeImage(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    PrepareForNet(),
])

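# OpenCV loads BGR; convert to RGB and scale to [0, 1]
image = cv2.cvtColor(cv2.imread('bridge.jpg'), cv2.COLOR_BGR2RGB) / 255.0
h, w = image.shape[:2]  # original size, taken before the transform resizes the image
image = transform({'image': image})['image']
image = torch.from_numpy(image).unsqueeze(0)

# depth shape: 1xHxW
with torch.no_grad():
    depth = depth_anything(image)

# resize the prediction back to the original resolution
depth = F.interpolate(depth[None], (h, w), mode='bilinear', align_corners=False)[0, 0]
# scale to 0-255 and save as an 8-bit grayscale image
depth = (depth - depth.min()) / (depth.max() - depth.min()) * 255.0
depth = depth.cpu().numpy().astype(np.uint8)
cv2.imwrite('depth.jpg', depth)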
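
The min-max scaling above divides by `depth.max() - depth.min()`, which is zero for a perfectly flat prediction. Below is a minimal sketch of a guarded version that works on a NumPy array. `to_uint8` and `eps` are hypothetical names, not part of the repository.

```python
import numpy as np

def to_uint8(depth, eps=1e-8):
    """Min-max scale a depth map to 0-255, returning a uint8 array."""
    depth = np.asarray(depth, dtype=np.float32)
    span = depth.max() - depth.min()
    if span < eps:
        # Flat map: nothing to scale, so return all zeros instead of dividing by zero
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - depth.min()) / span * 255.0
    return scaled.astype(np.uint8)
```

With a torch tensor, you would call it as `to_uint8(depth.detach().cpu().numpy())`.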

You can run it on video as well, using the run_video.py script in the repository.

python run_video.py --encoder vitl --video-path assets/examples_video --outdir video_depth_vis
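
For a sense of what per-frame processing looks like, here is a minimal sketch. It is not the code in `run_video.py`. `read_frames`, `depth_frames`, and the `estimate` callable are hypothetical names, and `estimate` stands for whatever function turns one RGB frame into an HxW depth map (for example, the single-image steps wrapped in a function).

```python
import numpy as np

def read_frames(path):
    """Yield RGB frames from a video file using OpenCV."""
    import cv2  # imported lazily so depth_frames can be used without OpenCV
    capture = cv2.VideoCapture(path)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of the video, or the file could not be read
                break
            yield cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    finally:
        capture.release()

def depth_frames(frames, estimate):
    """Yield (frame, depth) pairs, calling `estimate` once per frame."""
    for frame in frames:
        depth = np.asarray(estimate(frame))
        if depth.shape != frame.shape[:2]:
            raise ValueError("estimate must return an HxW map matching the frame")
        yield frame, depth
```

Usage would look like `for frame, depth in depth_frames(read_frames('input.mp4'), estimate): ...`, where `input.mp4` is a placeholder path.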

🐣

I’m a freelance engineer.
Work consultation
Please feel free to contact me with a brief description of your project.
rockyshikoku@gmail.com

I am creating applications using machine learning and AR technology.

I share information about machine learning and AR.

GitHub

Twitter
Medium
