GAMES101_notes_01

2 Jan 2023

GAMES101: 现代计算机图形学入门

笔记01：Introduction, Transformation, Rasterization

Lesson 02 Review of Linear Algebra 向量与线性代数

图形学依赖

基础数学：线性代数微积分统计学；
基础物理学：光学力学；
其他：信号处理数值分析

线性代数

向量矩阵。。。

Lesson 03 Transformation 变换

为什么研究变换

Modeling 模型变换
Viewing 视角变换（从三维到二维的）投影

广泛应用

二维变换：旋转放缩裁剪

缩放变换(Scale)：
\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}s_x & 0 \\ 0 & s_y\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix}\]
反射变换(Reflection):
\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}-1 & 0 \\ 0 & 1\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix}\]
切变(Shear)
\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}1 & a \\ 0 & 1\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix}\]
旋转变换(Rotate)：默认关于原点逆时针旋转。
\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}\cos \theta & -\sin \theta \\ \sin \theta & \cos \theta\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix}\]

以上变换的共同点：线性变换（可以通过矩阵来表示）：

\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}a & b \\ c & d\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix}\]

齐次坐标

平移变换(Transition)：无法通过上述形式的线性变换矩阵来表示。

\[\begin{bmatrix}x'\\ y'\end{bmatrix} = \begin{bmatrix}a & b \\ c & d\end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix} + \begin{bmatrix}t_x\\ t_y\end{bmatrix}\]

由此引入了 齐次坐标(Homogenous Coordinates) 这个概念：对于二维向量和二维点，增加第三个分量($w$-coordinate)

二维点： $(x, y, 1)^\mathrm{T}$
二维向量： $(x, y, 0)^\mathrm{T}$

由此平移操作可以用一个矩阵乘法的形式表示：

\[\begin{bmatrix}x'\\ y' \\ w'\end{bmatrix} = \begin{bmatrix}1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1\end{bmatrix} \begin{bmatrix}x\\ y \\ w\end{bmatrix}\]

向量的$w$分量为$0$的原因：因为向量具有 平移不变性

vector + vector = vector
point - point = vector
point + vector = point
point + point = ？？

$ \begin{pmatrix}x\ y \ w\end{pmatrix} $ 对应二维点 $ \begin{pmatrix}x/w\ y/w \ 1\end{pmatrix}, w \ne 0 $

由此两个齐次坐标点的相加可以视为两点的 加权平均

仿射变换

\[\begin{pmatrix}x'\\ y' \\ 1\end{pmatrix} = \begin{pmatrix}a & b & t_x \\ c & d & t_y \\ 0 & 0 & 1\end{pmatrix} \begin{pmatrix}x\\ y \\ 1\end{pmatrix}\]

Scale:

\[S(s_x, s_y) = \begin{pmatrix}s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1\end{pmatrix}\]

Rotation:

\[R(\alpha) = \begin{pmatrix}\cos \alpha & -\sin \alpha & 0 \\ \sin \alpha & \cos \alpha & 0 \\ 0 & 0 & 1\end{pmatrix}\]

Translation:

\[T(t_x, t_y) = \begin{pmatrix}1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1\end{pmatrix}\]

逆变换、变换组合、变换分解

\[A_n(\dots A_2(A_1 (x))) = \mathbf{A}_n \cdots \mathbf{A}_2\cdot \mathbf{A}_1 \cdot \begin{pmatrix}x \\ y \\ 1\end{pmatrix}\]

复杂变换的分解：

如何关于固定点$c$旋转？

将图形平移$-c$；
旋转
平移$c$

\[\mathbf{M} = \mathbf{T}(c)\cdot \mathbf{R}(\alpha) \cdot \mathbf{T}(-c)\]

三维变换

仍然利用齐次坐标系统：

三维点坐标： $(x, y, z, 1)^\mathrm{T}$
三维向量坐标： $(x, y, z, 0)^\mathrm{T}$

定义$(x, y, z, w)^\mathrm{T}(w\ne 0)$表示三维空间中点 $(x/w, y/w, z/w)^\mathrm{T}$

三维空间中仿射变换矩阵： $\begin{pmatrix}x'\\ y' \\ z'\\ 1\end{pmatrix} = \begin{pmatrix}a & b & c & t_x \\ d & e & f & t_y \\ g & h & i & t_z \\ 0 & 0 & 0 & 1\end{pmatrix} \begin{pmatrix}x\\ y \\ z\\ 1\end{pmatrix}$ 上述形式的仿射变换矩阵可以视作 先进行线性变换再进行平移变换

Lesson 04 Transformation Cont

对上节课的补充：旋转变换的逆变换矩阵形式

\[R(\alpha) = \begin{pmatrix}\cos \alpha & -\sin \alpha & 0 \\ \sin \alpha & \cos \alpha & 0 \\ 0 & 0 & 1\end{pmatrix}\] \[R(-\alpha) = \begin{pmatrix}\cos \alpha & \sin \alpha & 0 \\ -\sin \alpha & \cos \alpha & 0 \\ 0 & 0 & 1\end{pmatrix} = R(\alpha)^{-1} = R(\alpha)^\mathrm{T}\]

三维坐标系下的变换

三维点坐标： $(x, y, z, 1)^\mathrm{T}$
三维向量坐标： $(x, y, z, 0)^\mathrm{T}$

Scale $\mathbf{S}(s_x, s_y, s_z)= \begin{pmatrix}s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}$

Translation $\mathbf{T}(t_x, t_y, t_z)= \begin{pmatrix}1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1\end{pmatrix}$

Rotation

\[\mathbf{R}_x(\alpha)= \begin{pmatrix}1 & 0 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha & 0 \\ 0 & \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}\] \[\mathbf{R}_y(\alpha)= \begin{pmatrix}\cos\alpha & 0 & \sin\alpha & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\alpha & 0 & \cos\alpha & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}\] \[\mathbf{R}_z(\alpha)= \begin{pmatrix}\cos\alpha & -\sin\alpha & 0 & 0 \\ \sin\alpha & \cos\alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}\]

如何考虑复杂的三维旋转变换？

\[\mathbf{R}_{xyz}(\alpha, \beta, \gamma) = \mathbf{R}_x(\alpha)\mathbf{R}_y(\beta)\mathbf{R}_z(\gamma)\]

$\alpha, \beta, \gamma$被称为欧拉角

Rodrigues’ Rotation Formula

Rotation by angle $\alpha$ around axis $\mathbf{n}$

\[\mathbf{R}(\mathbf{n}, \alpha) = \cos(\alpha) \mathbf{I} + (1-\cos(\alpha))\mathbf{n}\mathbf{n}^\mathrm{T} + \sin \alpha \begin{pmatrix}0 & -n_z & n_y \\ n_z & 0 & -n_x \\ -n_y & n_x & 0\end{pmatrix}\]

Viewing transformation

View/Camera transformation 观测变换/视图变换
Projection transformation 投影变换
- Orthographic projection 正交
- Perspective projection 透视

如何拍一张照片？MVP Transformations

model transformation 模型变换
view transformation 视图变换
projection transformation 投影变换

如何确定相机位置？

Position $\vec{e}$
Look-at/ gaze direction $\hat{g}$
Up direction $\hat{t}$

约定：相机始终位于原点，永远向$-z$方向看，永远以$y$轴为固定向上方向，改变其他物体的位置而不改变相机的标准位置

在数学上通过矩阵变换将相机移到标准位置：

将位置 $\vec{e}$ 移到原点;
旋转 $\hat{g}$ 到 $-z$ 方向；
旋转 $\hat{t}$ 到 $y$ 方向。

进行矩阵变换则为：

\[\mathbf{M}_{view} = \mathbf{R}_{view}\mathbf{T}_{view}\]

其中

\[\mathbf{T}_{view}= \begin{pmatrix} 1 & 0 & 0 & -x_e \\ 0 & 1 & 0 & -y_e \\ 0 & 0 & 1 & -z_e \\ 0 & 0 & 0 & 1 \end{pmatrix}\]
由于对于 $\mathbf{R}_{view}$ 的逆矩阵，是将 $x$ 轴旋转到 $\hat{g}\times \hat{t}$，将 $y$ 轴旋转到 $\hat{t}$ ，将 $z$ 轴旋转到 $-\hat{g}$ 上，故 $\mathbf{R}_{view}^{-1}= \begin{pmatrix} x_{\hat{g}\times\hat{t}} & x_{\hat{t}} & x_{-\hat{g}} & 0 \\ y_{\hat{g}\times\hat{t}} & y_{\hat{t}} & y_{-\hat{g}} & 0 \\ z_{\hat{g}\times\hat{t}} & z_{\hat{t}} & z_{-\hat{g}} & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}$
因此 $\mathbf{R}_{view}= (\mathbf{R}_{view}^{-1})^\mathrm{T} = \begin{pmatrix} x_{\hat{g}\times\hat{t}} & y_{\hat{g}\times\hat{t}} & z_{\hat{g}\times\hat{t}} & 0 \\ x_{\hat{t}} & y_{\hat{t}} & z_{\hat{t}} & 0 \\ x_{-\hat{g}} & y_{-\hat{g}} & z_{-\hat{g}} & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}$

Projection Transformation

正交投影
透视投影

正交投影

简单理解正交投影：

将照相机固定在原点，看向 $-z$ 方向，上方为 $y$ 方向；
抛掉所有 $z$ 方向坐标；
平移和放缩结果的矩形画面到 $[-1, 1]^2$上。

约定俗成的方法：

将包裹全部物体的长方体 $[l, r]\times[b, t]\times[f, n]$ 映射到标准立方体(canonical cube) $[-1, 1]^3$ ，先做平移，将中心移到原点之后再进行放缩。
变换矩阵为： $\mathbf{M}_{ortho} = \begin{pmatrix} \frac{2}{r - l} & 0 & 0 & 0 \\ 0 & \frac{2}{t - b} & 0 & 0 \\ 0 & 0 & \frac{2}{n - f} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & -\frac{r + l}{2} \\ 0 & 1 & 0 & -\frac{t + b}{2} \\ 0 & 0 & 1 & -\frac{n + f}{2} \\ 0 & 0 & 0 & 1 \end{pmatrix}$

透视投影

回顾数学基础：

对于齐次坐标系， $(x, y, z, 1)$ 和 $(kx, ky, kz, k), k \ne 0$ 都表示三维空间中 $(x, y, z)$ 这个点

如何做透视投影：

首先，将棱台的远平面压缩成和近平面相同大小的长方形 ($\mathbf{M_{persp\rightarrow ortho}}$)
将压缩后的远平面通过正交投影 ($\mathbf{M}_{ortho}$)

如何求$\mathbf{M}_{persp\rightarrow ortho}$？

根据相似三角形的理论，如果近平面到原点的距离是 $n$，那么点 $(x, y, z)$ 通过相似变换映射到近平面上的点 $(x’, y’, z’)$ 应当满足 $y' = \frac{n}{z}y,\quad x' = \frac{n}{z} x.$ 由于这样的相似变换总是会非线性的改变在棱台内点的 $z$ 分量，因此我们希望能够在这些变换中找到满足如下两点性质的一类变换：

对位于近平面，即 $z = n$ 的所有点，在相似变换后仍然保持 $z’ = n$；
对位于远平面，即 $z = f$ 的所有点，在相似变换后仍然保持 $z’ = f$.

由此我们可以通过逐行凑矩阵的方式，推得透视投影变换矩阵 $\mathbf{M}_{persp\rightarrow ortho}$ 应当为

\[\mathbf{M}_{persp\rightarrow ortho} = \begin{pmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n+f & -nf\\ 0 & 0 & 1 & 0 \end{pmatrix}\]

可以验证

\[\mathbf{M}_{persp\rightarrow ortho} \begin{pmatrix} x \\ y \\ n \\ 1 \end{pmatrix} = \begin{pmatrix} nx \\ ny \\ n^2 \\ n \end{pmatrix} \Leftrightarrow \begin{pmatrix} x \\ y \\ n \\ 1 \end{pmatrix}\] \[\mathbf{M}_{persp\rightarrow ortho} \begin{pmatrix} x \\ y \\ f \\ 1 \end{pmatrix} = \begin{pmatrix} nx \\ ny \\ f^2 \\ f \end{pmatrix} \Leftrightarrow \begin{pmatrix} \frac{nx}{f} \\ \frac{ny}{f} \\ f \\ 1 \end{pmatrix}\]

满足我们最开始的要求。

最后，我们结合正交投影部分的结论，可以推出

\[\mathbf{M}_{persp} = \mathbf{M}_{othro} \mathbf{M}_{persp\rightarrow ortho}\]

Lesson 05 Rasterization 光栅化 1(Triangles)

Viewport transformation
Rasterization
- Different raster displays
- Rasterizing a triangle

Viewport transformation

关于视锥的更多定义：

近平面的宽高比 aspect ratio = width/height
垂直可视角度 vertical field-of-view(fovY) 近平面上下边中点和原点张成的角度 $\tan \frac{fovY}{2} =\frac{t}{|n|}\\ aspect = \frac{r}{t}$

What’s after MVP? -> $[-1, 1]^3$的立方体

From Canonical Cube to Screen: Rasterization(光栅化)

What is a screen?

An array of pixels
Size of the array: resolution
A typical kind of raster display

Raster = screen in German

Rasterize = drawing onto the screen

Pixel (FYI, short for “picture element”)

For now: A pixel is a little square with uniform color
Color is a mixture of (red, green, blue)

定义屏幕空间：左下角为原点，向右为$x$轴正方向，向上为$y$轴正方向，并做如下约定：

像素坐标由 $(x, y)$ 的形式组成，其中 $x, y$ 均为整数；
像素坐标取值从 $(0, 0)$ 到 $(width - 1, height - 1)$;
每颗像素 $(x, y)$ 的中心位于 $(x + 0.5, y + 0.5)$；
屏幕覆盖范围是 $[0, width]\times [0, height]$

接下来如何处理？

对于 $z$ 方向的值：先不管；
对于 $x-y$ 平面：将平面 $[-1, 1]^2$ 映射到 $[0, width]\times [0, height]$

视口变换

\[\mathbf{M}_{viewport} = \begin{pmatrix} \frac{width}{2} & 0 & 0 & \frac{width}{2} \\ 0 & \frac{height}{2} & 0 & \frac{height}{2} \\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}\]

接下来：通过光栅化，将三角形转化成像素

三角形：基本图形元素

Why triangles?

Most basic polygon
- Can break up other polygons
Unique properties
- Guaranteed to be planar
- Well-defined interior
- Well-defined method for interpolating values at vertices over triangle (barycentric interpoation)

What Pixel Values Arrpoximate a Triangle?

Input: position of triangle vertices projected on screen
Output: set of pixel values approximating triangle

A simple approach: Sampling(采样)

Sampling is a core idea in graphics.

We sample time(1D), area(2D), directin(2D), volume(3D)…

Sample If Each Pixel Center is Inside Triangle

\[inside(t, x, y) = \left\{ \begin{array}{lr} 1 & \mathrm{Point(x, y)\ is\ in\ triangle\ t}\\ 0 & \mathrm{else} \end{array} \right.\]

for (int x = 0; x < xmax; ++x)
    for(int y = 0; y < ymax; ++y)
        image[x][y] = inside(tri, x + 0.5, y + 0.5);

应当如何实现 $inside(tri, x, y)$ ? -> 利用向量叉积。

将三角形三个顶点记为A、B、C，待测试点记为P；
分别测试AB和BP、BC和CP、CA和AP的叉积；
如果三个结果符号相同，则认为P在三角形ABC内；否则认为P在三角形ABC外。

进一步的优化

(Axis-Aligned )Bounding Box(包围盒，AABB)，可以减少检测像素的数量。
Incremental Triangle Traversal：可以优化应用AABB效果不好的情况。

光栅化中的一个问题：Aliasing(走样)–Jaggies(锯齿)

Lesson 06 Rasterization 2 (Antialiasing and Z-buffering)

采样理论

Sampling is Ubiquitous in Computer Graphics
- Rasterization = Sample 2D Positions
- Photograph = Sample Image Sensor Plane
- Video = Sample Time
Sampling Artifacts: (Errors/Mistakes/Inaccuracies) in Computer Graphics
- Jaggies(Staircase Pattern)
- Moire Patterns in Imaging
- Wagon wheel effect(False Motion)
Behind the Aliasing Artifacts: Signals are changing too fast(high frequency) but sampled too slowly.

Antialiasing Idea: Blurring(Pre-Filtering) Before Sampling

举例：对三角形光栅化的时候，在采样之前对边缘进行模糊，之后再进行采样，可以使得边缘更加平滑。

Blurred aliasing: Sample then filter -> WRONG!

Filter then sample -> RIGHT!

Frequency Domain

Fourier Transform

\[F(\omega) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i \omega x}dx\]

Inverse Fourier Transform

\[f(x) = \int_{-\infty}^{\infty}F(\omega)e^{2\pi i \omega x}d\omega\]

Undersampling Creates Frequency Aliases. High-frequency signal is insufficiently sampled: samples erroneously appear to be from a low-frequency signal.

Two frequencies that are indistinguishable at a given sampling rate are called “aliass”.

Filtering = Getting rid of certain frequency contents

Filtering = Convolution (= Averaging)

Convolution Theorem

Convolution in the spatial domain is equal to multiplication in the frequency domain, and vice versa

Sampling = Repeating Frequency Contents

Aliasing = Mixed Frequency Contents

How Can We Reduce Aliasing Error?

Option 1: Increase samplint rate
- Essentially increasing the distance between replicas in the Fourier domain
- Higher resolution displays, sensors, framebuffers…
- But: costly & may need very high resolution
Option 2: Antialiasing
- Making Fourier contents “narrower” before repeating
- i.e. Filtering out high frequencies before sampling

Antialiasing By Averaging Values in Pixel Area

Solution:

Convolve f(x,y) by a 1-pixel box-blur
- Recall: convolving = filtering = averaging
Then Sample at every pixel’s center

Antialiasing By Supersampling(MSAA, multisampling antialiasing)

Supersampling: Approximate the effect of the 1-pixel box filter by sampling multiple locations within a pixel and averaing their values

MSAA解决的是对信号的模糊的操作，而采样的操作其实与MSAA没有关系（并不是直接提高屏幕的分辨率）

Antialiasing Today

No free lunch!
- What’s the cost of MSAA？
Milestones
- FXAA(Fast Approximate AA) 一种后处理解决方案
- TAA(Temporal AA) 考虑随时间变化时前后帧的叠加结果
Super resolution/ Super sampling 采样率不足的问题
- From low resolution to high resolution
- Essentially still “not enough samples” problem
- DLSS(Deep Learning Super Sampling)