import torch


def get_ndc_rays(H, W, focal, near, rays_o, rays_d):
    """
    Transform rays from world coordinates to NDC.
    NDC: space in which the canvas is a cube with sides [-1, 1] along each axis.
    For a detailed derivation, please see:
    http://www.songho.ca/opengl/gl_projectionmatrix.html
    https://github.com/bmild/nerf/files/4451808/ndc_derivation.pdf
    In practice, use NDC "if and only if" the scene is unbounded (has a large depth).
    See https://github.com/bmild/nerf/issues/18
    Inputs:
        H, W, focal: image height, width and focal length
        near: (N_rays) or float, the depth of the near plane
        rays_o: (N_rays, 3), the origins of the rays in world coordinates
        rays_d: (N_rays, 3), the directions of the rays in world coordinates
    Outputs:
        rays_o: (N_rays, 3), the origins of the rays in NDC
        rays_d: (N_rays, 3), the directions of the rays in NDC
    """
    # Shift ray origins to the near plane (z = -near)
    t = -(near + rays_o[..., 2]) / rays_d[..., 2]
    rays_o = rays_o + t[..., None] * rays_d

    # Store some intermediate homogeneous results
    ox_oz = rays_o[..., 0] / rays_o[..., 2]
    oy_oz = rays_o[..., 1] / rays_o[..., 2]

    # Projection
    o0 = -1. / (W / (2. * focal)) * ox_oz
    o1 = -1. / (H / (2. * focal)) * oy_oz
    o2 = 1. + 2. * near / rays_o[..., 2]

    d0 = -1. / (W / (2. * focal)) * (rays_d[..., 0] / rays_d[..., 2] - ox_oz)
    d1 = -1. / (H / (2. * focal)) * (rays_d[..., 1] / rays_d[..., 2] - oy_oz)
    d2 = 1 - o2

    rays_o = torch.stack([o0, o1, o2], -1)  # (B, 3)
    rays_d = torch.stack([d0, d1, d2], -1)  # (B, 3)

    return rays_o, rays_d
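As a quick sanity check of the function above, the following sketch repeats the same arithmetic for a single ray in plain Python, so it runs without torch. The helper name `ndc_ray_single` and the sample camera values are illustrative assumptions, not part of the NeRF code.

```python
def ndc_ray_single(H, W, focal, near, o, d):
    """Scalar re-implementation of get_ndc_rays for one ray (illustrative only)."""
    # Shift the origin onto the near plane z = -near
    t = -(near + o[2]) / d[2]
    o = [o[i] + t * d[i] for i in range(3)]

    # Intermediate homogeneous ratios
    ox_oz = o[0] / o[2]
    oy_oz = o[1] / o[2]

    # Scale factors -f / (W/2) and -f / (H/2)
    sx = -1.0 / (W / (2.0 * focal))
    sy = -1.0 / (H / (2.0 * focal))

    o_ndc = [sx * ox_oz, sy * oy_oz, 1.0 + 2.0 * near / o[2]]
    d_ndc = [sx * (d[0] / d[2] - ox_oz),
             sy * (d[1] / d[2] - oy_oz),
             1.0 - o_ndc[2]]
    return o_ndc, d_ndc


# Hypothetical camera: 400x400 image, focal length 500 px, near plane at depth 1
o_ndc, d_ndc = ndc_ray_single(400, 400, 500.0, 1.0,
                              o=[0.0, 0.0, 0.0], d=[0.1, -0.2, -1.0])
print(o_ndc[2])             # NDC z of the shifted origin (on the near plane)
print(o_ndc[2] + d_ndc[2])  # NDC z reached at t' = 1 (infinite depth)
```

For a ray that starts at the camera centre, the x and y components of the NDC direction come out as zero, because every point on such a ray projects to the same image location.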
The formulas used in this function are derived below.

NDC ray space derivation
The standard 3D perspective projection matrix for homogeneous coordinates is:

$$
M=\left(\begin{array}{cccc} \frac{n}{r} & 0 & 0 & 0 \\ 0 & \frac{n}{t} & 0 & 0 \\ 0 & 0 & \frac{-(f+n)}{f-n} & \frac{-2 f n}{f-n} \\ 0 & 0 & -1 & 0 \end{array}\right)
$$
Here $n$ and $f$ are the near and far clipping planes, and $r$ and $t$ are the right and top bounds of the scene at the near clipping plane. (Note that this uses the convention in which the camera looks in the $-z$ direction.)
To project a homogeneous point $(x, y, z, 1)^{\top}$, we left-multiply by $M$ and then divide by the fourth coordinate:

$$
\begin{aligned} &\left(\begin{array}{cccc} \frac{n}{r} & 0 & 0 & 0 \\ 0 & \frac{n}{t} & 0 & 0 \\ 0 & 0 & \frac{-(f+n)}{f-n} & \frac{-2 f n}{f-n} \\ 0 & 0 & -1 & 0 \end{array}\right)\left(\begin{array}{l} x \\ y \\ z \\ 1 \end{array}\right)=\left(\begin{array}{c} \frac{n}{r} x \\ \frac{n}{t} y \\ \frac{-(f+n)}{f-n} z-\frac{2 f n}{f-n} \\ -z \end{array}\right) \\ & \text { project } \rightarrow\left(\begin{array}{c} \frac{n}{r} \frac{x}{-z} \\ \frac{n}{t} \frac{y}{-z} \\ \frac{(f+n)}{f-n}-\frac{2 f n}{f-n} \frac{1}{-z} \end{array}\right) \end{aligned}
$$

The projected point is now in normalized device coordinate (NDC) space, where the original viewing frustum has been mapped to the cube $[-1,1]^3$.
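The projection step can be checked numerically. The sketch below builds the matrix $M$ in plain Python, applies it to a point, and performs the perspective divide; the function names and the frustum values are illustrative assumptions.

```python
def perspective_matrix(n, f, r, t):
    """4x4 perspective matrix M from the text (camera looks down -z)."""
    return [
        [n / r, 0.0, 0.0, 0.0],
        [0.0, n / t, 0.0, 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2.0 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def project(M, p):
    """Left-multiply the homogeneous point (x, y, z, 1) by M, then divide by w."""
    hom = [p[0], p[1], p[2], 1.0]
    x, y, z, w = (sum(M[i][j] * hom[j] for j in range(4)) for i in range(4))
    return [x / w, y / w, z / w]


# Illustrative frustum: near = 1, far = 10, right = 0.5, top = 0.25
M = perspective_matrix(n=1.0, f=10.0, r=0.5, t=0.25)
near_corner = project(M, [0.5, 0.25, -1.0])  # top-right corner of the near plane
far_center = project(M, [0.0, 0.0, -10.0])   # centre of the far plane
print(near_corner, far_center)
```

Up to floating-point rounding, the near-plane corner lands on the cube corner with NDC depth -1, and the far-plane centre lands at NDC depth +1.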
Our goal is to take a ray $\mathbf{o}+t \mathbf{d}$ and compute a ray origin $\mathbf{o}^{\prime}$ and direction $\mathbf{d}^{\prime}$ in NDC space such that for every $t$ there exists a corresponding $t^{\prime}$ with $\pi(\mathbf{o}+t \mathbf{d})=\mathbf{o}^{\prime}+t^{\prime} \mathbf{d}^{\prime}$ (where $\pi$ is the projection using the matrix above).
To reduce clutter, we rewrite the projected point as $\left(a_x x / z, a_y y / z, a_z+b_z / z\right)^{\top}$. The constraints we need to satisfy are:

$$
\left(\begin{array}{c} a_x \frac{o_x+t d_x}{o_z+t d_z} \\ a_y \frac{o_y+t d_y}{o_z+t d_z} \\ a_z+\frac{b_z}{o_z+t d_z} \end{array}\right)=\left(\begin{array}{c} o_x^{\prime}+t^{\prime} d_x^{\prime} \\ o_y^{\prime}+t^{\prime} d_y^{\prime} \\ o_z^{\prime}+t^{\prime} d_z^{\prime} \end{array}\right)
$$

For convenience, we require that $t^{\prime}=0$ and $t=0$ map to the same point. This gives the NDC-space origin $\mathbf{o}^{\prime}$:

$$
\mathbf{o}^{\prime}=\left(\begin{array}{c} o_x^{\prime} \\ o_y^{\prime} \\ o_z^{\prime} \end{array}\right)=\left(\begin{array}{c} a_x \frac{o_x}{o_z} \\ a_y \frac{o_y}{o_z} \\ a_z+\frac{b_z}{o_z} \end{array}\right)=\pi(\mathbf{o})
$$
This is exactly the projection $\pi(\mathbf{o})$ of the original ray's origin. Substituting it back into the constraints, we can now solve for $t^{\prime}$ and $\mathbf{d}^{\prime}$:
$$
\begin{aligned} \left(\begin{array}{l} t^{\prime} d_x^{\prime} \\ t^{\prime} d_y^{\prime} \\ t^{\prime} d_z^{\prime} \end{array}\right) & =\left(\begin{array}{c} a_x \frac{o_x+t d_x}{o_z+t d_z}-a_x \frac{o_x}{o_z} \\ a_y \frac{o_y+t d_y}{o_z+t d_z}-a_y \frac{o_y}{o_z} \\ a_z+\frac{b_z}{o_z+t d_z}-a_z-\frac{b_z}{o_z} \end{array}\right) \\ & =\left(\begin{array}{c} a_x \frac{o_z\left(o_x+t d_x\right)-o_x\left(o_z+t d_z\right)}{\left(o_z+t d_z\right) o_z} \\ a_y \frac{o_z\left(o_y+t d_y\right)-o_y\left(o_z+t d_z\right)}{\left(o_z+t d_z\right) o_z} \\ b_z \frac{o_z-\left(o_z+t d_z\right)}{\left(o_z+t d_z\right) o_z} \end{array}\right) \\ & =\left(\begin{array}{c} a_x \frac{t d_z}{o_z+t d_z}\left(\frac{d_x}{d_z}-\frac{o_x}{o_z}\right) \\ a_y \frac{t d_z}{o_z+t d_z}\left(\frac{d_y}{d_z}-\frac{o_y}{o_z}\right) \\ -b_z \frac{t d_z}{o_z+t d_z} \frac{1}{o_z} \end{array}\right) \end{aligned}
$$

We can factor out a common expression that depends only on $t$ to obtain:

$$
\begin{aligned} & t^{\prime}=\frac{t d_z}{o_z+t d_z}=1-\frac{o_z}{o_z+t d_z} \\ & \mathbf{d}^{\prime}=\left(\begin{array}{c} a_x\left(\frac{d_x}{d_z}-\frac{o_x}{o_z}\right) \\ a_y\left(\frac{d_y}{d_z}-\frac{o_y}{o_z}\right) \\ -b_z \frac{1}{o_z} \end{array}\right) \end{aligned}
$$

Note that, as desired, $t^{\prime}=0$ when $t=0$. Moreover, $t^{\prime} \rightarrow 1$ as $t \rightarrow \infty$.
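The identity $\pi(\mathbf{o}+t \mathbf{d})=\mathbf{o}^{\prime}+t^{\prime} \mathbf{d}^{\prime}$ can be verified numerically from the closed-form expressions. The following is a minimal sketch in plain Python; the function names and the constants are arbitrary illustrative choices, not values from the NeRF code.

```python
def pi(p, ax, ay, az, bz):
    """Projection written as (a_x x/z, a_y y/z, a_z + b_z/z)."""
    x, y, z = p
    return [ax * x / z, ay * y / z, az + bz / z]


def ndc_ray(o, d, ax, ay, az, bz):
    """NDC origin o' = pi(o) and direction d' from the closed-form expressions."""
    o_p = pi(o, ax, ay, az, bz)
    d_p = [ax * (d[0] / d[2] - o[0] / o[2]),
           ay * (d[1] / d[2] - o[1] / o[2]),
           -bz / o[2]]
    return o_p, d_p


def t_prime(t, o, d):
    """Reparameterized ray parameter t' = t d_z / (o_z + t d_z)."""
    return t * d[2] / (o[2] + t * d[2])


# Arbitrary illustrative values (camera looks down -z, so z < 0 along the ray)
ax, ay, az, bz = -2.0, -3.0, 1.0, 2.0
o, d = [0.3, -0.1, -1.0], [0.2, 0.4, -1.5]

o_p, d_p = ndc_ray(o, d, ax, ay, az, bz)
errors = []
for t in [0.0, 0.5, 2.0, 100.0]:
    point = [o[i] + t * d[i] for i in range(3)]
    lhs = pi(point, ax, ay, az, bz)                 # project the 3D point
    tp = t_prime(t, o, d)
    rhs = [o_p[i] + tp * d_p[i] for i in range(3)]  # walk along the NDC ray
    errors.append(max(abs(l - r) for l, r in zip(lhs, rhs)))
    print(t, tp, errors[-1])
```

The printed error should stay at floating-point noise level for every $t$, while $t^{\prime}$ starts at 0 and approaches 1 as $t$ grows.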
Going back to the original projection matrix, we see that our constants are:

$$
\begin{aligned} a_x & =-\frac{n}{r} \\ a_y & =-\frac{n}{t} \\ a_z & =\frac{f+n}{f-n} \\ b_z & =\frac{2 f n}{f-n} \end{aligned}
$$

Using the standard pinhole camera model, we can reparameterize these as

$$
\begin{aligned} & a_x=-\frac{f_{cam}}{W / 2} \\ & a_y=-\frac{f_{cam}}{H / 2} \end{aligned}
$$
where $W$ and $H$ are the width and height of the image and $f_{cam}$ is the focal length of the pinhole camera.
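The reparameterization follows from similar triangles in the pinhole model. As a worked step, assuming the principal point lies at the image centre so that the frustum is symmetric (an assumption the text does not state explicitly), the image half-width $W/2$ at distance $f_{cam}$ corresponds to the right bound $r$ at distance $n$, and likewise for the height:

$$
\frac{r}{n}=\frac{W / 2}{f_{cam}} \;\Rightarrow\; \frac{n}{r}=\frac{f_{cam}}{W / 2}, \qquad \frac{t}{n}=\frac{H / 2}{f_{cam}} \;\Rightarrow\; \frac{n}{t}=\frac{f_{cam}}{H / 2}
$$

Here $t$ is the top bound of the frustum, not the ray parameter.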
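In NeRF, we assume that the far scene bound is at infinity (this costs us very little, since NDC uses the z dimension to represent inverse depth, i.e., disparity). In this limit, the z constants simplify to

$$
\begin{aligned} & a_z=1 \\ & b_z=2 n \end{aligned}
$$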
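The limit can be made explicit by dividing the numerator and denominator of each constant by $f$:

$$
a_z=\frac{f+n}{f-n}=\frac{1+n / f}{1-n / f} \rightarrow 1, \qquad b_z=\frac{2 f n}{f-n}=\frac{2 n}{1-n / f} \rightarrow 2 n \qquad (f \rightarrow \infty)
$$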
Putting everything together, we obtain the expressions in the ndc_rays() function of the NeRF code:

$$
\begin{aligned} & \mathbf{o}^{\prime}=\left(\begin{array}{c} -\frac{f_{cam}}{W / 2} \frac{o_x}{o_z} \\ -\frac{f_{cam}}{H / 2} \frac{o_y}{o_z} \\ 1+\frac{2 n}{o_z} \end{array}\right) \\ & \mathbf{d}^{\prime}=\left(\begin{array}{c} -\frac{f_{cam}}{W / 2}\left(\frac{d_x}{d_z}-\frac{o_x}{o_z}\right) \\ -\frac{f_{cam}}{H / 2}\left(\frac{d_y}{d_z}-\frac{o_y}{o_z}\right) \\ -2 n \frac{1}{o_z} \end{array}\right) \end{aligned}
$$

The final trick used in NeRF is to shift $\mathbf{o}$ to the ray's intersection with the near plane at $z=-n$ (before this NDC conversion) by taking $\mathbf{o}_n=\mathbf{o}+t_n \mathbf{d}$ for $t_n=-\left(n+o_z\right) / d_z$.
Once we are in NDC space, we can simply sample $t^{\prime}$ linearly from 0 to 1 to obtain a sampling that is linear in disparity from $n$ to $\infty$ in the original space!
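To illustrate the last claim, the sketch below maps linearly spaced values of $t^{\prime}$ back to depth, assuming the ray origin has already been shifted onto the near plane ($o_z=-n$). Inverting $t^{\prime}=1-o_z /\left(o_z+t d_z\right)$ then gives a depth of $n /\left(1-t^{\prime}\right)$. The function name and the sample values are illustrative assumptions.

```python
def depth_from_t_prime(t_prime, near):
    """Depth |z| of a sample on a ray whose origin sits on the near plane.

    From t' = 1 - o_z / (o_z + t d_z) with o_z = -near, the camera-space z of
    the sample is o_z / (1 - t'), so its depth is near / (1 - t').
    """
    return near / (1.0 - t_prime)


near = 2.0
num_samples = 5
# Linear samples of t' in [0, 1); t' = 1 itself corresponds to infinite depth
t_primes = [i / num_samples for i in range(num_samples)]
depths = [depth_from_t_prime(tp, near) for tp in t_primes]
disparities = [1.0 / z for z in depths]

for tp, z, disp in zip(t_primes, depths, disparities):
    print(f"t'={tp:.1f}  depth={z:.3f}  disparity={disp:.3f}")
```

The depths grow faster and faster as $t^{\prime}$ approaches 1, while the disparities decrease in equal steps, which is what sampling linearly in disparity means.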