GAMES101_notes_01
GAMES101: 现代计算机图形学入门
笔记01:Introduction, Transformation, Rasterization
Lesson 02 Review of Linear Algebra 向量与线性代数
图形学依赖
- 基础数学:线性代数 微积分 统计学;
- 基础物理学:光学 力学;
- 其他: 信号处理 数值分析
线性代数
向量 矩阵 。。。
Lesson 03 Transformation 变换
为什么研究变换
- Modeling 模型变换
- Viewing 视角变换(从三维到二维的)投影
广泛应用
二维变换:旋转 放缩 裁剪
-
缩放变换(Scale):
\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}s_x & 0 \\ 0 & s_y\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix}\] -
反射变换(Reflection):
\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}-1 & 0 \\ 0 & 1\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix}\] -
切变(Shear)
\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}1 & a \\ 0 & 1\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix}\] -
旋转变换(Rotate):默认关于原点逆时针旋转。
\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}\cos \theta & -\sin \theta \\ \sin \theta & \cos \theta\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix}\]
以上变换的共同点:线性变换(可以通过矩阵来表示):
\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}a & b \\ c & d\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix}\]齐次坐标
平移变换(Transition):无法通过上述形式的线性变换矩阵来表示。
\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}a & b \\ c & d\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix} + \begin{bmatrix}t_x\\ t_y\end{bmatrix}\]由此引入了 齐次坐标(Homogenous Coordinates) 这个概念: 对于二维向量和二维点,增加第三个分量($w$-coordinate)
- 二维点: $(x, y, 1)^\mathrm{T}$
- 二维向量: $(x, y, 0)^\mathrm{T}$
由此平移操作可以用一个矩阵乘法的形式表示:
\[\begin{bmatrix}x'\\ y' \\ w'\end{bmatrix} = \begin{bmatrix}1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1\end{bmatrix} \begin{bmatrix}x\\ y \\ w\end{bmatrix}\]向量的$w$分量为$0$的原因:因为向量具有 平移不变性
- vector + vector = vector
- point - point = vector
- point + vector = point
- point + point = ??
$ \begin{pmatrix}x\ y \ w\end{pmatrix} $ 对应二维点 $ \begin{pmatrix}x/w\ y/w \ 1\end{pmatrix}, w \ne 0 $
由此两个齐次坐标点的相加可以视为两点的 加权平均
仿射变换
\[\begin{pmatrix}x'\\ y' \\ 1\end{pmatrix} = \begin{pmatrix}a & b & t_x \\ c & d & t_y \\ 0 & 0 & 1\end{pmatrix} \begin{pmatrix}x\\ y \\ 1\end{pmatrix}\]Scale:
\[S(s_x, s_y) = \begin{pmatrix}s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1\end{pmatrix}\]Rotation:
\[R(\alpha) = \begin{pmatrix}\cos \alpha & -\sin \alpha & 0 \\ \sin \alpha & \cos \alpha & 0 \\ 0 & 0 & 1\end{pmatrix}\]Translation:
\[T(t_x, t_y) = \begin{pmatrix}1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1\end{pmatrix}\]逆变换、变换组合、变换分解
\[A_n(\dots A_2(A_1 (x))) = \mathbf{A}_n \cdots \mathbf{A}_2\cdot \mathbf{A}_1 \cdot \begin{pmatrix}x \\ y \\ 1\end{pmatrix}\]复杂变换的分解:
- 如何关于固定点$c$旋转?
- 将图形平移$-c$;
- 旋转
- 平移$c$
三维变换
仍然利用齐次坐标系统:
- 三维点坐标: $(x, y, z, 1)^\mathrm{T}$
- 三维向量坐标: $(x, y, z, 0)^\mathrm{T}$
定义$(x, y, z, w)^\mathrm{T}(w\ne 0)$表示三维空间中点 \((x/w, y/w, z/w)^\mathrm{T}\)
三维空间中仿射变换矩阵: \(\begin{pmatrix}x'\\ y' \\ z'\\ 1\end{pmatrix} = \begin{pmatrix}a & b & c & t_x \\ d & e & f & t_y \\ g & h & i & t_z \\ 0 & 0 & 0 & 1\end{pmatrix} \begin{pmatrix}x\\ y \\ z\\ 1\end{pmatrix}\) 上述形式的仿射变换矩阵可以视作 先进行线性变换再进行平移变换
Lesson 04 Transformation Cont
对上节课的补充:旋转变换的逆变换矩阵形式
\[R(\alpha) = \begin{pmatrix}\cos \alpha & -\sin \alpha & 0 \\ \sin \alpha & \cos \alpha & 0 \\ 0 & 0 & 1\end{pmatrix}\] \[R(-\alpha) = \begin{pmatrix}\cos \alpha & \sin \alpha & 0 \\ -\sin \alpha & \cos \alpha & 0 \\ 0 & 0 & 1\end{pmatrix} = R(\alpha)^{-1} = R(\alpha)^\mathrm{T}\]三维坐标系下的变换
- 三维点坐标: $(x, y, z, 1)^\mathrm{T}$
- 三维向量坐标: $(x, y, z, 0)^\mathrm{T}$
Scale \(\mathbf{S}(s_x, s_y, s_z)= \begin{pmatrix}s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}\)
Translation \(\mathbf{T}(t_x, t_y, t_z)= \begin{pmatrix}1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1\end{pmatrix}\)
Rotation
\[\mathbf{R}_x(\alpha)= \begin{pmatrix}1 & 0 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha & 0 \\ 0 & \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}\] \[\mathbf{R}_y(\alpha)= \begin{pmatrix}\cos\alpha & 0 & \sin\alpha & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\alpha & 0 & \cos\alpha & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}\] \[\mathbf{R}_z(\alpha)= \begin{pmatrix}\cos\alpha & -\sin\alpha & 0 & 0 \\ \sin\alpha & \cos\alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}\]如何考虑复杂的三维旋转变换?
\[\mathbf{R}_{xyz}(\alpha, \beta, \gamma) = \mathbf{R}_x(\alpha)\mathbf{R}_y(\beta)\mathbf{R}_z(\gamma)\]$\alpha, \beta, \gamma$被称为欧拉角
Rodrigues’ Rotation Formula
Rotation by angle $\alpha$ around axis $\mathbf{n}$
\[\mathbf{R}(\mathbf{n}, \alpha) = \cos(\alpha) \mathbf{I} + (1-\cos(\alpha))\mathbf{n}\mathbf{n}^\mathrm{T} + \sin \alpha \begin{pmatrix}0 & -n_z & n_y \\ n_z & 0 & -n_x \\ -n_y & n_x & 0\end{pmatrix}\]Viewing transformation
- View/Camera transformation 观测变换/视图变换
- Projection transformation 投影变换
- Orthographic projection 正交
- Perspective projection 透视
如何拍一张照片?MVP Transformations
- model transformation 模型变换
- view transformation 视图变换
- projection transformation 投影变换
如何确定相机位置?
- Position $\vec{e}$
- Look-at/ gaze direction $\hat{g}$
- Up direction $\hat{t}$
约定:相机始终位于 原点,永远向$-z$方向看,永远以$y$轴为固定向上方向,改变其他物体的位置而不改变相机的标准位置
在数学上通过矩阵变换将相机移到标准位置:
- 将位置 $\vec{e}$ 移到原点;
- 旋转 $\hat{g}$ 到 $-z$ 方向;
- 旋转 $\hat{t}$ 到 $y$ 方向。
进行矩阵变换则为:
\[\mathbf{M}_{view} = \mathbf{R}_{view}\mathbf{T}_{view}\]其中
- \[\mathbf{T}_{view}= \begin{pmatrix} 1 & 0 & 0 & -x_e \\ 0 & 1 & 0 & -y_e \\ 0 & 0 & 1 & -z_e \\ 0 & 0 & 0 & 1 \end{pmatrix}\]
- 由于对于 $\mathbf{R}_{view}$ 的逆矩阵,是将 $x$ 轴旋转到 $\hat{g}\times \hat{t}$,将 $y$ 轴旋转到 $\hat{t}$ ,将 $z$ 轴旋转到 $-\hat{g}$ 上,故 \(\mathbf{R}_{view}^{-1}= \begin{pmatrix} x_{\hat{g}\times\hat{t}} & x_{\hat{t}} & x_{-\hat{g}} & 0 \\ y_{\hat{g}\times\hat{t}} & y_{\hat{t}} & y_{-\hat{g}} & 0 \\ z_{\hat{g}\times\hat{t}} & z_{\hat{t}} & z_{-\hat{g}} & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}\)
- 因此 \(\mathbf{R}_{view}= (\mathbf{R}_{view}^{-1})^\mathrm{T} = \begin{pmatrix} x_{\hat{g}\times\hat{t}} & y_{\hat{g}\times\hat{t}} & z_{\hat{g}\times\hat{t}} & 0 \\ x_{\hat{t}} & y_{\hat{t}} & z_{\hat{t}} & 0 \\ x_{-\hat{g}} & y_{-\hat{g}} & z_{-\hat{g}} & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}\)
Projection Transformation
- 正交投影
- 透视投影
正交投影
简单理解正交投影:
- 将照相机固定在原点,看向 $-z$ 方向,上方为 $y$ 方向;
- 抛掉所有 $z$ 方向坐标;
- 平移和放缩结果的矩形画面到 $[-1, 1]^2$上。
约定俗成的方法:
- 将包裹全部物体的长方体 $[l, r]\times[b, t]\times[f, n]$ 映射到标准立方体(canonical cube) $[-1, 1]^3$ ,先做平移,将中心移到原点之后再进行放缩。
- 变换矩阵为: \(\mathbf{M}_{ortho} = \begin{pmatrix} \frac{2}{r - l} & 0 & 0 & 0 \\ 0 & \frac{2}{t - b} & 0 & 0 \\ 0 & 0 & \frac{2}{n - f} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & -\frac{r + l}{2} \\ 0 & 1 & 0 & -\frac{t + b}{2} \\ 0 & 0 & 1 & -\frac{n + f}{2} \\ 0 & 0 & 0 & 1 \end{pmatrix}\)
透视投影
回顾数学基础:
- 对于齐次坐标系, $(x, y, z, 1)$ 和 $(kx, ky, kz, k), k \ne 0$ 都表示三维空间中 $(x, y, z)$ 这个点
如何做透视投影:
- 首先,将棱台的远平面压缩成和近平面相同大小的长方形 ($\mathbf{M_{persp\rightarrow ortho}}$)
- 将压缩后的远平面通过正交投影 ($\mathbf{M}_{ortho}$)
如何求$\mathbf{M}_{persp\rightarrow ortho}$?
根据相似三角形的理论,如果近平面到原点的距离是 $n$,那么点 $(x, y, z)$ 通过相似变换映射到近平面上的点 $(x’, y’, z’)$ 应当满足 \(y' = \frac{n}{z}y,\quad x' = \frac{n}{z} x.\) 由于这样的相似变换总是会非线性的改变在棱台内点的 $z$ 分量,因此我们希望能够在这些变换中找到满足如下两点性质的一类变换:
- 对位于近平面,即 $z = n$ 的所有点,在相似变换后仍然保持 $z’ = n$;
- 对位于远平面,即 $z = f$ 的所有点,在相似变换后仍然保持 $z’ = f$.
由此我们可以通过逐行凑矩阵的方式,推得透视投影变换矩阵 $\mathbf{M}_{persp\rightarrow ortho}$ 应当为
\[\mathbf{M}_{persp\rightarrow ortho} = \begin{pmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n+f & -nf\\ 0 & 0 & 1 & 0 \end{pmatrix}\]可以验证
\[\mathbf{M}_{persp\rightarrow ortho} \begin{pmatrix} x \\ y \\ n \\ 1 \end{pmatrix} = \begin{pmatrix} nx \\ ny \\ n^2 \\ n \end{pmatrix} \Leftrightarrow \begin{pmatrix} x \\ y \\ n \\ 1 \end{pmatrix}\] \[\mathbf{M}_{persp\rightarrow ortho} \begin{pmatrix} x \\ y \\ f \\ 1 \end{pmatrix} = \begin{pmatrix} nx \\ ny \\ f^2 \\ f \end{pmatrix} \Leftrightarrow \begin{pmatrix} \frac{nx}{f} \\ \frac{ny}{f} \\ f \\ 1 \end{pmatrix}\]满足我们最开始的要求。
最后,我们结合正交投影部分的结论,可以推出
\[\mathbf{M}_{persp} = \mathbf{M}_{othro} \mathbf{M}_{persp\rightarrow ortho}\]Lesson 05 Rasterization 光栅化 1(Triangles)
- Viewport transformation
- Rasterization
- Different raster displays
- Rasterizing a triangle
Viewport transformation
关于视锥的更多定义:
- 近平面的宽高比 aspect ratio = width/height
- 垂直可视角度 vertical field-of-view(fovY) 近平面上下边中点和原点张成的角度 \(\tan \frac{fovY}{2} =\frac{t}{|n|}\\ aspect = \frac{r}{t}\)
What’s after MVP? -> $[-1, 1]^3$的立方体
From Canonical Cube to Screen: Rasterization(光栅化)
What is a screen?
- An array of pixels
- Size of the array: resolution
- A typical kind of raster display
Raster = screen in German
- Rasterize = drawing onto the screen
Pixel (FYI, short for “picture element”)
- For now: A pixel is a little square with uniform color
- Color is a mixture of (red, green, blue)
定义屏幕空间: 左下角为原点,向右为$x$轴正方向,向上为$y$轴正方向,并做如下约定:
- 像素坐标由 $(x, y)$ 的形式组成,其中 $x, y$ 均为整数;
- 像素坐标取值从 $(0, 0)$ 到 $(width - 1, height - 1)$;
- 每颗像素 $(x, y)$ 的中心位于 $(x + 0.5, y + 0.5)$;
- 屏幕覆盖范围是 $[0, width]\times [0, height]$
接下来如何处理?
- 对于 $z$ 方向的值:先不管;
- 对于 $x-y$ 平面:将平面 $[-1, 1]^2$ 映射到 $[0, width]\times [0, height]$
视口变换
\[\mathbf{M}_{viewport} = \begin{pmatrix} \frac{width}{2} & 0 & 0 & \frac{width}{2} \\ 0 & \frac{height}{2} & 0 & \frac{height}{2} \\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}\]接下来:通过光栅化,将三角形转化成像素
三角形:基本图形元素
Why triangles?
- Most basic polygon
- Can break up other polygons
- Unique properties
- Guaranteed to be planar
- Well-defined interior
- Well-defined method for interpolating values at vertices over triangle (barycentric interpoation)
What Pixel Values Arrpoximate a Triangle?
- Input: position of triangle vertices projected on screen
- Output: set of pixel values approximating triangle
A simple approach: Sampling(采样)
Sampling is a core idea in graphics.
- We sample time(1D), area(2D), directin(2D), volume(3D)…
Sample If Each Pixel Center is Inside Triangle
\[inside(t, x, y) = \left\{ \begin{array}{lr} 1 & \mathrm{Point(x, y)\ is\ in\ triangle\ t}\\ 0 & \mathrm{else} \end{array} \right.\]for (int x = 0; x < xmax; ++x)
for(int y = 0; y < ymax; ++y)
image[x][y] = inside(tri, x + 0.5, y + 0.5);
应当如何实现 $inside(tri, x, y)$ ? -> 利用向量叉积。
- 将三角形三个顶点记为A、B、C,待测试点记为P;
- 分别测试AB和BP、BC和CP、CA和AP的叉积;
- 如果三个结果符号相同,则认为P在三角形ABC内;否则认为P在三角形ABC外。
进一步的优化
- (Axis-Aligned )Bounding Box(包围盒,AABB),可以减少检测像素的数量。
- Incremental Triangle Traversal:可以优化应用AABB效果不好的情况。
光栅化中的一个问题:Aliasing(走样)–Jaggies(锯齿)
Lesson 06 Rasterization 2 (Antialiasing and Z-buffering)
采样理论
- Sampling is Ubiquitous in Computer Graphics
- Rasterization = Sample 2D Positions
- Photograph = Sample Image Sensor Plane
- Video = Sample Time
- Sampling Artifacts: (Errors/Mistakes/Inaccuracies) in Computer Graphics
- Jaggies(Staircase Pattern)
- Moire Patterns in Imaging
- Wagon wheel effect(False Motion)
- Behind the Aliasing Artifacts: Signals are changing too fast(high frequency) but sampled too slowly.
Antialiasing Idea: Blurring(Pre-Filtering) Before Sampling
举例:对三角形光栅化的时候,在采样之前对边缘进行模糊,之后再进行采样,可以使得边缘更加平滑。
Blurred aliasing: Sample then filter -> WRONG!
Filter then sample -> RIGHT!
Frequency Domain
Fourier Transform
\[F(\omega) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i \omega x}dx\]Inverse Fourier Transform
\[f(x) = \int_{-\infty}^{\infty}F(\omega)e^{2\pi i \omega x}d\omega\]Undersampling Creates Frequency Aliases. High-frequency signal is insufficiently sampled: samples erroneously appear to be from a low-frequency signal.
Two frequencies that are indistinguishable at a given sampling rate are called “aliass”.
Filtering = Getting rid of certain frequency contents
Filtering = Convolution (= Averaging)
Convolution Theorem
Convolution in the spatial domain is equal to multiplication in the frequency domain, and vice versa
Sampling = Repeating Frequency Contents
Aliasing = Mixed Frequency Contents
How Can We Reduce Aliasing Error?
- Option 1: Increase samplint rate
- Essentially increasing the distance between replicas in the Fourier domain
- Higher resolution displays, sensors, framebuffers…
- But: costly & may need very high resolution
- Option 2: Antialiasing
- Making Fourier contents “narrower” before repeating
- i.e. Filtering out high frequencies before sampling
Antialiasing By Averaging Values in Pixel Area
Solution:
- Convolve f(x,y) by a 1-pixel box-blur
- Recall: convolving = filtering = averaging
- Then Sample at every pixel’s center
Antialiasing By Supersampling(MSAA, multisampling antialiasing)
Supersampling: Approximate the effect of the 1-pixel box filter by sampling multiple locations within a pixel and averaing their values
MSAA解决的是对信号的模糊的操作,而采样的操作其实与MSAA没有关系(并不是直接提高屏幕的分辨率)
Antialiasing Today
- No free lunch!
- What’s the cost of MSAA?
- Milestones
- FXAA(Fast Approximate AA) 一种后处理解决方案
- TAA(Temporal AA) 考虑随时间变化时前后帧的叠加结果
- Super resolution/ Super sampling 采样率不足的问题
- From low resolution to high resolution
- Essentially still “not enough samples” problem
- DLSS(Deep Learning Super Sampling)