Statistics Term Sheet

Core Measures

Mean (μ or x̄)

Definition: Average value of a dataset
Formula:

  • Population: μ = Σxᵢ / N
  • Sample: x̄ = Σxᵢ / n

Purpose: Central tendency measure
Example: For data [4, 8, 13, 7], mean = 32/4 = 8
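
A minimal Python check of the example above (assuming numpy is available):

import numpy as np

data = np.array([4, 8, 13, 7])
mean = data.sum() / data.size    # x̄ = Σxᵢ / n
print(mean)                      # 8.0, same as np.mean(data)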

Variance (σ² or s²)

Definition: Average squared distance from the mean
Formula:

  • Population: σ² = Σ(xᵢ-μ)² / N
  • Sample: s² = Σ(xᵢ-x̄)² / (n-1)

Units: Original units squared
Purpose: Measures spread/dispersion of data
Example: If values typically deviate about ±3 from the mean, the variance is about 9
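
A short numpy sketch of both variance formulas, reusing the [4, 8, 13, 7] example (the specific values are only illustrative):

import numpy as np

data = np.array([4, 8, 13, 7])
mu = data.mean()

pop_var = np.mean((data - mu) ** 2)                     # σ² = Σ(xᵢ-μ)² / N, same as np.var(data)
samp_var = np.sum((data - mu) ** 2) / (data.size - 1)   # s² = Σ(xᵢ-x̄)² / (n-1), same as np.var(data, ddof=1)
print(pop_var, samp_var)                                # 10.5 14.0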

Standard Deviation (σ or s)

Definition: Square root of variance
Formula:

  • Population: σ = √(σ²)
  • Sample: s = √(s²)

Units: Same as original data
Purpose: Interpretable measure of spread
Relationship: σ = √(variance)
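
The same data, showing that the standard deviation is just the square root of the variance (numpy's ddof argument switches between the population and sample versions):

import numpy as np

data = np.array([4, 8, 13, 7])
print(np.std(data))              # population σ
print(np.std(data, ddof=1))      # sample s (n-1 in the denominator)
print(np.sqrt(np.var(data)))     # identical to the first line: σ = √(σ²)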

Relationships Between Variables

Covariance (Cov(X,Y))

Definition: Measures how two variables change together
Formula: Cov(X,Y) = Σ(xᵢ-x̄)(yᵢ-ȳ) / (n-1)
Units: Units of X × Units of Y
Range: -∞ to +∞
Interpretation:

  • Positive: Variables increase together
  • Negative: One increases as other decreases
  • Zero: No linear relationship
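
A quick numpy check of the covariance formula (the x and y values below are made up purely for illustration):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 5.0, 8.0])

cov_manual = np.sum((x - x.mean()) * (y - y.mean())) / (x.size - 1)   # Σ(xᵢ-x̄)(yᵢ-ȳ) / (n-1)
cov_numpy = np.cov(x, y)[0, 1]                                        # off-diagonal of the 2×2 covariance matrix
print(cov_manual, cov_numpy)                                          # positive: x and y increase together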

Correlation (r or ρ)

Definition: Standardized measure of linear relationship
Formula: r = Cov(X,Y) / (σₓ × σᵧ)
Units: Dimensionless
Range: -1 to +1
Interpretation:

  • +1: Perfect positive linear relationship
  • -1: Perfect negative linear relationship
  • 0: No linear relationship
  • |r| around 0.7 or higher: Strong relationship (rule of thumb)
  • |r| around 0.3 or lower: Weak relationship (rule of thumb)
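
The same illustrative x and y, showing that correlation is just covariance rescaled by the two standard deviations:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 5.0, 8.0])

r_manual = np.cov(x, y, ddof=1)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))   # Cov(X,Y) / (σₓ × σᵧ)
r_numpy = np.corrcoef(x, y)[0, 1]
print(r_manual, r_numpy)    # both ≈ 0.98: strong positive linear relationship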

Standardization

Z-Scores

Definition: Number of standard deviations from the mean
Formula: Z = (x - μ) / σ
Units: Standard deviations
Range: Typically -3 to +3
Purpose:

  • Standardize different scales
  • Compare across datasets
  • Identify outliers (|Z| > 2 or 3)

Interpretation: Z = 2 means “2 standard deviations above mean”
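
A small numpy sketch of z-scoring the earlier [4, 8, 13, 7] data (the population σ is used here; that choice is an assumption):

import numpy as np

data = np.array([4, 8, 13, 7])
z = (data - data.mean()) / data.std()    # Z = (x - μ) / σ
print(z)                                 # 13 sits ≈ 1.54 standard deviations above the mean
print(np.abs(z) > 2)                     # crude outlier flag using the |Z| > 2 rule of thumb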

Covariance Matrices

Definition: Square matrix containing variances and covariances of multiple variables
Structure:

  • Size: n×n matrix for n variables
  • Symmetric: Cov(Xᵢ,Xⱼ) = Cov(Xⱼ,Xᵢ)
  • Positive semi-definite: All eigenvalues ≥ 0

Formula: For variables X₁, X₂, …, Xₙ
     [Var(X₁)   Cov(X₁,X₂) ... Cov(X₁,Xₙ)]
S =  [Cov(X₂,X₁)  Var(X₂)  ... Cov(X₂,Xₙ)]
     [    ...        ...    ...     ...   ]
     [Cov(Xₙ,X₁) Cov(Xₙ,X₂) ...  Var(Xₙ) ]

Uses:

  • PCA analysis
  • Multivariate statistics
  • Portfolio risk analysis
  • Machine learning feature relationships
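
To make the matrix structure above concrete, here is a small numpy sketch on made-up data (five observations of three variables; the numbers are illustrative only):

import numpy as np

X = np.array([[2.0, 1.0, 4.0],    # rows = observations, columns = variables X₁, X₂, X₃
              [3.0, 2.0, 6.0],
              [5.0, 2.5, 7.0],
              [6.0, 4.0, 9.0],
              [8.0, 5.0, 12.0]])

S = np.cov(X, rowvar=False)                      # 3×3 sample covariance matrix
print(np.allclose(S, S.T))                       # True: symmetric
print(np.all(np.linalg.eigvalsh(S) >= -1e-10))   # True: eigenvalues ≥ 0 (positive semi-definite)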

Matrix Elements

Diagonal Elements

Definition: Elements where row index = column index
In Covariance Matrix: Cov(Xᵢ,Xᵢ) = Var(Xᵢ)
In Correlation Matrix: Corr(Xᵢ,Xᵢ) = 1
Purpose: Self-relationships (variances or perfect correlation)
Example: In 2×2 matrix, positions (1,1) and (2,2)

Off-Diagonal Elements

Definition: Elements where row index ≠ column index
In Covariance Matrix: Cov(Xᵢ,Xⱼ) where i≠j
In Correlation Matrix: Corr(Xᵢ,Xⱼ) where i≠j
Purpose: Cross-relationships between different variables
Example: In 2×2 matrix, positions (1,2) and (2,1)
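
A brief numpy illustration of the diagonal/off-diagonal split, reusing the illustrative x and y from above:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 5.0, 8.0])

S = np.cov(x, y)                 # 2×2 covariance matrix
print(np.diag(S))                # diagonal: Var(X), Var(Y)
print(S[0, 1], S[1, 0])          # off-diagonal: Cov(X,Y) = Cov(Y,X)

R = np.corrcoef(x, y)
print(np.diag(R))                # correlation matrix diagonal: always [1. 1.]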

Linear Algebra Concepts

Eigenvectors (v)

Definition: “Special directions” where the matrix only stretches/shrinks the vector, never rotates it
Mathematical: Solutions to Av = λv
Properties:

  • Direction vectors that remain unchanged under matrix transformation
  • Only magnitude changes, not direction
  • Conventionally normalized to unit length (||v|| = 1)

In PCA: Point in directions of maximum/minimum variance

Eigenvalues (λ)

Definition: “How much stretching/shrinking” happens in each special direction
Mathematical: Scalar values that satisfy Av = λv
Interpretation:

  • λ > 1: Vector gets stretched (amplified)
  • 0 < λ < 1: Vector gets shrunk
  • λ < 0: Vector gets flipped and scaled
  • λ = 0: Vector collapses to zero

In PCA: Measure amount of variance in each principal direction

Eigenvalue-Eigenvector Relationship

Fundamental Equation: Av = λv
Process:

  1. Find eigenvalues by solving: det(A - λI) = 0
  2. For each λ, find eigenvector by solving: (A - λI)v = 0

Result: Each eigenvalue has corresponding eigenvector
In PCA: Largest eigenvalue gives first principal component direction
Key Insight: The “natural coordinate system” is simply the data’s preferred viewing angle - the orientation that captures maximum variance with minimum dimensions.
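
A minimal numpy sketch of this process on a small symmetric matrix (the matrix entries are invented; np.linalg.eigh is used because it is designed for symmetric matrices):

import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])                    # symmetric, covariance-like matrix

eigvals, eigvecs = np.linalg.eigh(A)          # eigenvalues in ascending order, eigenvectors as columns
order = np.argsort(eigvals)[::-1]             # reorder: largest λ first, as PCA does
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

v1 = eigvecs[:, 0]                            # direction of the largest eigenvalue
print(np.allclose(A @ v1, eigvals[0] * v1))   # True: Av = λv
print(np.linalg.norm(v1))                     # 1.0: returned eigenvectors have unit length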

Mathematical Foundation

THEOREM (provable): “Eigenvectors of symmetric matrices corresponding to distinct eigenvalues are orthogonal”

This theorem is part of the Spectral Theorem for Symmetric Matrices, which states:

  • Every symmetric matrix can be diagonalized by orthogonal eigenvectors
  • This is a fundamental result in linear algebra

Implication for PCA: Since covariance matrices are always symmetric, orthogonal principal components are mathematically guaranteed.
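
A short numpy check of this implication on a randomly generated covariance matrix (random data is used only to produce a symmetric matrix):

import numpy as np

rng = np.random.default_rng(0)
S = np.cov(rng.normal(size=(100, 3)), rowvar=False)   # symmetric 3×3 covariance matrix

eigvals, eigvecs = np.linalg.eigh(S)
print(np.allclose(eigvecs.T @ eigvecs, np.eye(3)))             # True: eigenvectors are orthonormal
print(np.allclose(eigvecs @ np.diag(eigvals) @ eigvecs.T, S))  # True: S = VΛVᵀ (spectral theorem)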

Matrix Examples

2×2 Covariance Matrix

[Var(X)    Cov(X,Y)]  ← Diagonal: variances
[Cov(X,Y)  Var(Y)  ]  ← Off-diagonal: covariances

2×2 Correlation Matrix

[1      Corr(X,Y)]  ← Diagonal: always 1
[Corr(X,Y)    1   ]  ← Off-diagonal: correlations

Key Relationships

  1. Variance to Standard Deviation: σ = √(σ²)
  2. Covariance to Correlation: r = Cov(X,Y) / (σₓ × σᵧ)
  3. Raw to Standardized: Z = (x - μ) / σ
  4. Matrix Symmetry: Cov(X,Y) = Cov(Y,X)
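
These four relationships can be verified numerically in a few lines (again with the illustrative x and y used earlier):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 5.0, 8.0])

print(np.isclose(np.sqrt(np.var(x, ddof=1)), np.std(x, ddof=1)))            # 1. σ = √(σ²)
r = np.cov(x, y, ddof=1)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))
print(np.isclose(r, np.corrcoef(x, y)[0, 1]))                               # 2. covariance → correlation
print((x - x.mean()) / x.std())                                             # 3. raw → standardized z-scores
print(np.isclose(np.cov(x, y)[0, 1], np.cov(y, x)[0, 1]))                   # 4. Cov(X,Y) = Cov(Y,X)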

Quick Reference

Measure             Range         Units            Purpose
Mean                Any           Original         Central tendency
Variance            0 to ∞        Original²        Spread
Std Dev             0 to ∞        Original         Interpretable spread
Covariance          -∞ to ∞       X×Y units        Raw relationship
Correlation         -1 to +1      None             Standardized relationship
Z-Score             -∞ to ∞       Std devs         Standardized position
Eigenvalues         0 to ∞*       Variance units   Variance in PC direction
Eigenvectors        Unit length   Dimensionless    Principal directions
Covariance Matrix   Symmetric     Mixed units      All variable relationships

*For covariance matrices (positive semi-definite)

Learning