CS2900 Revision Notes

Multi-dimensional Data Processing Revision Notes


Section 1: Vectors & Matrices Basics

1.1 Vector Operations

Dot Product:
For vectors ( \underline{u} = (u_1, u_2, …, u_n) ) and ( \underline{v} = (v_1, v_2, …, v_n) ):
[ \underline{u} \cdot \underline{v} = u_1v_1 + u_2v_2 + … + u_nv_n ]
Example: For ( \underline{u} = \begin{pmatrix}1\\2\\7\\-3\end{pmatrix} ) and ( \underline{v} = \begin{pmatrix}-4\\2\\1\\1\end{pmatrix} ):
[ \underline{u} \cdot \underline{v} = (1)(-4) + (2)(2) + (7)(1) + (-3)(1) = -4 + 4 + 7 - 3 = 4 ]

Vector Norm (Length):
[ ||\underline{u}|| = \sqrt{u_1^2 + u_2^2 + … + u_n^2} ]

Cosine Similarity:
[ \cos\theta = \frac{\underline{u} \cdot \underline{v}}{||\underline{u}|| \cdot ||\underline{v}||} ]
Used to measure the angle between two vectors.
Example: For ( \underline{u} = \begin{pmatrix}1\\1\\1\\\sqrt{2}\\\sqrt{2}\end{pmatrix} ) and ( \underline{v} = \begin{pmatrix}2\\-1\\0\\0\\0\end{pmatrix} ):
[ ||\underline{u}|| = \sqrt{1^2 + 1^2 + 1^2 + (\sqrt{2})^2 + (\sqrt{2})^2} = \sqrt{7}, \quad ||\underline{v}|| = \sqrt{2^2 + (-1)^2} = \sqrt{5} ]
[ \underline{u} \cdot \underline{v} = 2 - 1 + 0 + 0 + 0 = 1 ]
[ \cos\theta = \frac{1}{\sqrt{7}\sqrt{5}} = \frac{1}{\sqrt{35}} \approx 0.169 ]
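These operations are one-liners in NumPy; a quick check of the worked example above (NumPy is an assumption here, though the course uses Python):

    import numpy as np

    u = np.array([1.0, 1.0, 1.0, np.sqrt(2), np.sqrt(2)])
    v = np.array([2.0, -1.0, 0.0, 0.0, 0.0])

    dot = np.dot(u, v)                   # u . v = 1
    norm_u = np.linalg.norm(u)           # sqrt(7)
    norm_v = np.linalg.norm(v)           # sqrt(5)
    print(dot / (norm_u * norm_v))       # cosine similarity: 1/sqrt(35) ≈ 0.169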


1.2 Matrix Terminology

Diagonal Matrix:
Square matrix where all non-diagonal entries are zero.
Example: ( \mathbf{D} = \begin{pmatrix}d_1 & 0\\0 & d_2\end{pmatrix} ).

Symmetric Matrix:
Matrix equals its transpose: ( \mathbf{A} = \mathbf{A}^\intercal ).

Orthogonal Matrix:
Columns are orthonormal vectors: ( \mathbf{Q}^\intercal\mathbf{Q} = \mathbf{I} ).
Key property: Preserves vector lengths, i.e., ( ||\mathbf{Q}\underline{u}|| = ||\underline{u}|| ).

Rank of Matrix:
Maximum number of linearly independent rows/columns.
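A minimal sketch (assuming NumPy) verifying these properties on a 2-D rotation matrix, a standard example of an orthogonal matrix:

    import numpy as np

    theta = np.pi / 4
    Q = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])    # 2-D rotation: orthogonal

    print(np.allclose(Q.T @ Q, np.eye(2)))             # True: Q^T Q = I

    u = np.array([3.0, 4.0])
    print(np.linalg.norm(Q @ u), np.linalg.norm(u))    # both 5.0: length preserved

    A = np.array([[1.0, 2.0],
                  [2.0, 4.0]])                         # row 2 = 2 x row 1
    print(np.linalg.matrix_rank(A))                    # rank 1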


Section 2: Singular Value Decomposition (SVD)

2.1 Definition

For an ( M \times N ) matrix ( \mathbf{B} ), SVD is:
[ \mathbf{B} = \mathbf{U}\mathbf{D}\mathbf{V}^\intercal ]

  • ( \mathbf{U} ): ( M \times M ) orthogonal matrix (left singular vectors).
  • ( \mathbf{D} ): ( M \times N ) matrix with the singular values ( d_i \geq 0 ) on its diagonal (conventionally in decreasing order) and zeros elsewhere.
  • ( \mathbf{V} ): ( N \times N ) orthogonal matrix whose columns are the right singular vectors; its transpose ( \mathbf{V}^\intercal ) appears in the decomposition.
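In NumPy (an assumption; any SVD routine works), np.linalg.svd returns the singular values as a 1-D array, so ( \mathbf{D} ) has to be rebuilt before reconstructing ( \mathbf{B} ):

    import numpy as np

    B = np.array([[3.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0]])        # M x N = 2 x 3

    U, s, Vt = np.linalg.svd(B)            # U: 2x2, s: singular values, Vt: 3x3
    D = np.zeros(B.shape)
    np.fill_diagonal(D, s)                 # put singular values on the diagonal

    print(np.allclose(U @ D @ Vt, B))      # True: B = U D V^T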

2.2 Pseudo-Inverse

For a matrix ( \mathbf{B} ) with SVD as above, its pseudo-inverse ( \mathbf{B}^\dagger ) is:
[ \mathbf{B}^\dagger = \mathbf{V}\mathbf{D}^\dagger\mathbf{U}^\intercal ]
where ( \mathbf{D}^\dagger ) is the ( N \times M ) matrix obtained by transposing ( \mathbf{D} ) and replacing each non-zero singular value ( d_i ) with ( 1/d_i ).
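A sketch of building ( \mathbf{B}^\dagger ) from the SVD and checking it against NumPy's built-in (the matrix is made up for illustration):

    import numpy as np

    B = np.array([[3.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0]])

    U, s, Vt = np.linalg.svd(B)
    D_dag = np.zeros((B.shape[1], B.shape[0]))    # D^+ is N x M
    np.fill_diagonal(D_dag, 1.0 / s)              # all of B's singular values are
                                                  # non-zero, so invert them all
    B_dag = Vt.T @ D_dag @ U.T                    # B^+ = V D^+ U^T
    print(np.allclose(B_dag, np.linalg.pinv(B)))  # True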


Section 3: PageRank Algorithm

3.1 Key Concepts

Adjacency Matrix (A):

  • Represents directed links between web pages.
  • ( A_{ij} = 1 ) if page ( i ) links to page ( j ); 0 otherwise.

Diffusion Matrix (P):
[ P_{ij} = \frac{A_{ij}}{\text{deg}(i)} ]
Where ( \text{deg}(i) ) = number of outgoing links from page ( i ).

Teleportation Adjustment:
Dangling nodes (pages with no outgoing links) are first patched in ( \mathbf{P}' ) by giving them uniform outgoing links; teleportation then mixes in random jumps so that every page stays reachable and the ranking is unique:
[ \mathbf{P}'' = c\mathbf{P}' + (1 - c)\mathbf{v}\mathbf{e}^\intercal ]

  • ( c ): Damping factor (typically 0.85).
  • ( \mathbf{v} = (1/n, …, 1/n)^\intercal ): Uniform distribution vector.
  • ( \mathbf{e} ): Vector of all ones, so ( \mathbf{v}\mathbf{e}^\intercal ) has every column equal to ( \mathbf{v} ).
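A minimal sketch (assuming NumPy; the 4-page link graph is made up) of building ( \mathbf{P} ), applying the dangling-node fix, and forming ( \mathbf{P}'' ). Since ( P_{ij} = A_{ij}/\text{deg}(i) ) makes rows sum to 1, the transpose is taken so the column form of the formula above applies:

    import numpy as np

    # Made-up 4-page web: A[i, j] = 1 if page i links to page j.
    A = np.array([[0, 1, 1, 0],
                  [1, 0, 0, 1],
                  [0, 0, 0, 1],
                  [0, 0, 0, 0]], dtype=float)   # page 3 is dangling

    n, c = A.shape[0], 0.85
    v = np.full(n, 1.0 / n)                     # uniform teleportation vector

    deg = A.sum(axis=1)                         # out-degrees
    P_prime = A / np.maximum(deg, 1)[:, None]   # P_ij = A_ij / deg(i)
    P_prime[deg == 0] = v                       # dangling rows jump uniformly

    P_cs = P_prime.T                            # column-stochastic form
    P_dprime = c * P_cs + (1 - c) * np.outer(v, np.ones(n))  # cP' + (1-c) v e^T
    print(np.allclose(P_dprime.sum(axis=0), 1.0))            # True: columns sum to 1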

3.2 Power Method Steps

  1. Initialize ( \mathbf{x}^{(0)} = \mathbf{v} ).
  2. Iterate until convergence:
    [ \mathbf{x}^{(k+1)} = c\mathbf{P}\mathbf{x}^{(k)} + (1 - c)\mathbf{v} ]
  3. Check stability using ( ||\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}|| < \epsilon ).
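Continuing the sketch from Section 3.1 (P_cs and v are reused from there; eps and max_iter are illustrative choices, not course-specified):

    import numpy as np

    def pagerank(P_cs, v, c=0.85, eps=1e-10, max_iter=100):
        """Power method: x <- c P x + (1 - c) v until the change is below eps."""
        x = v.copy()                                # step 1: x^(0) = v
        for _ in range(max_iter):
            x_next = c * (P_cs @ x) + (1 - c) * v   # step 2: the update
            if np.linalg.norm(x_next - x) < eps:    # step 3: stability check
                return x_next
            x = x_next
        return x

    ranks = pagerank(P_cs, v)
    print(ranks, ranks.sum())                       # PageRank scores; they sum to 1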

Section 4: Vector Projections & Orthogonality

4.1 Projection Formula

The projection of ( \underline{u} ) onto ( \underline{v} ):
[ \text{proj}_{\underline{v}}(\underline{u}) = \frac{\underline{u} \cdot \underline{v}}{||\underline{v}||^2}\underline{v} ]

4.2 Orthogonal Component

Vector ( \underline{w} = \underline{u} - \text{proj}_{\underline{v}}(\underline{u}) ) is orthogonal to ( \underline{v} ).
Property: ( \underline{w} \cdot \underline{v} = 0 ).
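A quick NumPy check of both formulas (the vectors are made up):

    import numpy as np

    u = np.array([3.0, 4.0])
    v = np.array([1.0, 0.0])

    proj = (np.dot(u, v) / np.dot(v, v)) * v   # proj_v(u) = (u.v / ||v||^2) v
    w = u - proj                               # orthogonal component

    print(proj)                                # [3. 0.]
    print(np.dot(w, v))                        # 0.0: w is orthogonal to v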


Section 5: Graph Theory & Adjacency Matrices

5.1 Directed vs. Undirected Graphs

  • Directed: Links have direction (adjacency matrix asymmetric).
  • Undirected: Links are bidirectional (adjacency matrix symmetric).

5.2 Path Counting

The ( (i, j) ) entry of ( \mathbf{A}^k ) gives the number of paths of length ( k ) from node ( i ) to node ( j ).

Example: For adjacency matrix ( \mathbf{A} ):

  • Paths of length 2: ( (\mathbf{A}^2)_{ij} ) counts the length-2 paths from ( i ) to ( j ).
  • Paths of length 3: ( (\mathbf{A}^3)_{ij} ) counts the length-3 paths, and so on (see the sketch below).
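A small check (assuming NumPy; the 3-node graph is made up):

    import numpy as np

    # Directed graph: 0 -> 1, 1 -> 2, 0 -> 2.
    A = np.array([[0, 1, 1],
                  [0, 0, 1],
                  [0, 0, 0]])

    A2 = np.linalg.matrix_power(A, 2)
    print(A2[0, 2])                 # 1: the single length-2 path 0 -> 1 -> 2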

Section 6: Python & Special Matrix Classes

6.1 specialMatrix Structure

  • Purpose: Efficiently store structured matrices (e.g., diagonal, triangular).
  • Function func1: Likely performs matrix-vector multiplication optimized for sparsity (see the sketch after Section 6.2).
  • Trace Calculation: Sum of diagonal elements.

    def trace(self):
        return sum(self.diagonal_elements)

6.2 Advantages of Custom Classes

  • Memory efficiency for sparse matrices.
  • Faster computations by avoiding unnecessary zero operations.
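The notes only hint at what specialMatrix looks like, so the following is a hedged sketch rather than the course's actual code (the class and attribute names are assumptions): a diagonal variant with the O(n) matrix-vector product that func1 plausibly implements.

    import numpy as np

    class DiagonalSpecialMatrix:
        """Stores only the n diagonal entries instead of a full n x n array."""

        def __init__(self, diagonal_elements):
            self.diagonal_elements = np.asarray(diagonal_elements, dtype=float)

        def matvec(self, x):
            # O(n) instead of O(n^2): skips every off-diagonal zero.
            return self.diagonal_elements * x

        def trace(self):
            return float(np.sum(self.diagonal_elements))

    D = DiagonalSpecialMatrix([1.0, 2.0, 7.0])
    print(D.matvec(np.array([1.0, 1.0, 1.0])))   # [1. 2. 7.]
    print(D.trace())                             # 10.0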

Section 7: Covariance & SVD in Data

7.1 Covariance Matrix

For a centered data matrix ( \mathbf{X} ):
[ \mathbf{C}_X = \frac{1}{n-1}\mathbf{X}^\intercal\mathbf{X} ]
Decomposition via SVD: if ( \mathbf{X} = \mathbf{U}\mathbf{D}\mathbf{V}^\intercal ), then
[ \mathbf{C}_X = \mathbf{V}\mathbf{\Lambda}\mathbf{V}^\intercal, \quad \mathbf{\Lambda} = \frac{\mathbf{D}^\intercal\mathbf{D}}{n-1} ]

  • Columns of ( \mathbf{V} ) are the principal directions.
  • ( \mathbf{\Lambda} ) contains the eigenvalues (the variances along those directions).
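A sketch (assuming NumPy; the data is randomly generated) checking that the SVD of centered data reproduces the covariance eigendecomposition:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    X = X - X.mean(axis=0)                   # centre the data

    n = X.shape[0]
    C = (X.T @ X) / (n - 1)                  # covariance matrix

    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    Lam = d**2 / (n - 1)                     # eigenvalues from singular values

    print(np.allclose(C, Vt.T @ np.diag(Lam) @ Vt))   # True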
