Statistics Term Sheet
Core Measures
Mean (μ or x̄)
Definition: Average value of a dataset
Formula:
- Population:
μ = Σxᵢ / N
- Sample:
x̄ = Σxᵢ / n
Purpose: Central tendency measure
Example: For data [4, 8, 13, 7], mean = 32/4 = 8
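A minimal sketch of the calculation, assuming NumPy and the example data above:

```python
import numpy as np

data = np.array([4, 8, 13, 7])

# Mean: sum of the values divided by how many there are
mean_manual = data.sum() / len(data)   # 32 / 4 = 8.0
mean_numpy = data.mean()               # same result via NumPy

print(mean_manual, mean_numpy)         # 8.0 8.0
```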
Variance (σ² or s²)
Definition: Average squared distance from the mean
Formula:
- Population:
σ² = Σ(xᵢ-μ)² / N
- Sample:
s² = Σ(xᵢ-x̄)² / (n-1)
Units: Original units squared
Purpose: Measures spread/dispersion of data
Example: If every point lies about 3 units from the mean, variance ≈ 9
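A short sketch contrasting the population divisor N with the sample divisor n − 1, assuming NumPy and the same illustrative data:

```python
import numpy as np

data = np.array([4, 8, 13, 7])
deviations = data - data.mean()

pop_var = (deviations ** 2).sum() / len(data)            # divide by N
sample_var = (deviations ** 2).sum() / (len(data) - 1)   # divide by n - 1

# NumPy's ddof argument sets the divisor: ddof=0 -> N, ddof=1 -> n - 1
print(pop_var, np.var(data, ddof=0))      # matches
print(sample_var, np.var(data, ddof=1))   # matches
```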
Standard Deviation (σ or s)
Definition: Square root of variance
Formula:
- Population:
σ = √(σ²)
- Sample:
s = √(s²)
Units: Same as original data
Purpose: Interpretable measure of spread
Relationship: σ = √(variance)
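A brief check that the standard deviation is just the square root of the variance, assuming NumPy:

```python
import numpy as np

data = np.array([4, 8, 13, 7])

s_squared = np.var(data, ddof=1)   # sample variance
s = np.std(data, ddof=1)           # sample standard deviation

print(np.isclose(s, np.sqrt(s_squared)))   # True: s = √(s²)
```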
Relationships Between Variables
Covariance (Cov(X,Y))
Definition: Measures how two variables change together
Formula: Cov(X,Y) = Σ(xᵢ-x̄)(yᵢ-ȳ) / (n-1)
Units: Units of X × Units of Y
Range: -∞ to +∞
Interpretation:
- Positive: Variables increase together
- Negative: One increases as other decreases
- Zero: No linear relationship
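A sketch of the sample covariance formula next to NumPy's built-in; x and y below are made-up paired observations:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 5.0])

n = len(x)
cov_manual = ((x - x.mean()) * (y - y.mean())).sum() / (n - 1)

# np.cov returns the 2x2 covariance matrix; the off-diagonal entry is Cov(X, Y)
cov_numpy = np.cov(x, y)[0, 1]

print(cov_manual, cov_numpy)   # positive here: x and y tend to rise together
```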
Correlation (r or ρ)
Definition: Standardized measure of linear relationship
Formula: r = Cov(X,Y) / (σₓ × σᵧ)
Units: Dimensionless
Range: -1 to +1
Interpretation:
- +1: Perfect positive linear relationship
- -1: Perfect negative linear relationship
- 0: No linear relationship
- |r| ≈ 0.7 or above: commonly described as a strong relationship
- |r| ≈ 0.3 or below: commonly described as a weak relationship
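A sketch showing that dividing the covariance by both standard deviations reproduces NumPy's correlation coefficient (same made-up data as above):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 5.0])

cov_xy = np.cov(x, y)[0, 1]
r_manual = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

r_numpy = np.corrcoef(x, y)[0, 1]   # dimensionless, always in [-1, +1]

print(r_manual, r_numpy)            # same value
```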
Standardization
Z-Scores
Definition: Number of standard deviations from the mean
Formula: Z = (x - μ) / σ
Units: Standard deviations
Range: Unbounded in principle; values typically fall between -3 and +3 for roughly normal data
Purpose:
- Standardize different scales
- Compare across datasets
- Identify outliers (|Z| > 2 or 3)
Interpretation: Z = 2 means “2 standard deviations above mean”
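A small sketch of standardization and the usual |Z| cutoff for flagging outliers, assuming NumPy; the data and the threshold of 2 are illustrative:

```python
import numpy as np

data = np.array([52.0, 48.0, 50.0, 51.0, 49.0, 65.0])

z = (data - data.mean()) / data.std()   # Z = (x - mean) / std

print(np.round(z, 2))                               # standardized values
print("possible outliers:", data[np.abs(z) > 2])    # flags the 65 here
```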
Covariance Matrices
Definition: Square matrix containing the variances and covariances of multiple variables
Structure:
- Size: n×n matrix for n variables
- Symmetric:
Cov(Xᵢ,Xⱼ) = Cov(Xⱼ,Xᵢ)
- Positive semi-definite: All eigenvalues ≥ 0
Formula: For variables X₁, X₂, …, Xₙ:
    [ Var(X₁)      Cov(X₁,X₂)   ...   Cov(X₁,Xₙ) ]
S = [ Cov(X₂,X₁)   Var(X₂)      ...   Cov(X₂,Xₙ) ]
    [ ...          ...          ...   ...        ]
    [ Cov(Xₙ,X₁)   Cov(Xₙ,X₂)   ...   Var(Xₙ)    ]
Uses:
- Principal component analysis (PCA)
- Multivariate statistics
- Portfolio risk analysis
- Machine learning feature relationships
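A sketch that builds a covariance matrix for three made-up variables and checks the structural properties listed above, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3))   # 100 observations of 3 variables

S = np.cov(data, rowvar=False)     # rowvar=False: columns are variables

print(S.shape)                     # (3, 3): n x n for n variables
print(np.allclose(S, S.T))         # True: symmetric
print(np.all(np.linalg.eigvalsh(S) >= -1e-12))   # True: eigenvalues ≥ 0 (up to rounding)
```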
Matrix Elements
Diagonal Elements
Definition: Elements where row index = column index
In Covariance Matrix: Cov(Xᵢ,Xᵢ) = Var(Xᵢ)
In Correlation Matrix: Corr(Xᵢ,Xᵢ) = 1
Purpose: Self-relationships (variances or perfect correlation)
Example: In 2×2 matrix, positions (1,1) and (2,2)
Off-Diagonal Elements
Definition: Elements where row index ≠ column index
In Covariance Matrix: Cov(Xᵢ,Xⱼ) where i ≠ j
In Correlation Matrix: Corr(Xᵢ,Xⱼ) where i ≠ j
Purpose: Cross-relationships between different variables
Example: In 2×2 matrix, positions (1,2) and (2,1)
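A sketch confirming that the diagonal holds the individual variances while the off-diagonal holds the pairwise covariances, assuming NumPy and made-up data:

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(50, 2))    # 50 observations, 2 variables

S = np.cov(data, rowvar=False)

# Diagonal: Var(X1), Var(X2), computed column by column
print(np.allclose(np.diag(S), np.var(data, axis=0, ddof=1)))   # True

# Off-diagonal: Cov(X1, X2) appears twice because the matrix is symmetric
print(S[0, 1], S[1, 0])            # equal values
```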
Linear Algebra Concepts
Eigenvectors (v)
Definition: “Special directions” where the matrix only stretches/shrinks the vector, never rotates it
Mathematical: Solutions to Av = λv
Properties:
- Direction vectors that remain unchanged under matrix transformation
- Only magnitude changes, not direction
- Usually normalized to unit length (||v|| = 1)
In PCA: Point in the directions of maximum/minimum variance
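A brief sketch of the “direction is preserved” idea for a made-up symmetric 2×2 matrix, assuming NumPy:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # symmetric, like a covariance matrix

vals, vecs = np.linalg.eigh(A)      # eigenvectors are the columns of vecs

v = vecs[:, 0]
print(np.isclose(np.linalg.norm(v), 1.0))   # unit length

# A @ v lies on the same line as v: normalizing it recovers v (up to sign)
w = A @ v
w_hat = w / np.linalg.norm(w)
print(np.allclose(w_hat, v) or np.allclose(w_hat, -v))   # True
```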
Eigenvalues (λ)
Definition: “How much stretching/shrinking” happens in each special direction
Mathematical: Scalar values that satisfy Av = λv
Interpretation:
- λ > 1: Vector gets stretched (amplified)
- 0 < λ < 1: Vector gets shrunk
- λ < 0: Vector gets flipped and scaled
- λ = 0: Vector collapses to zero
In PCA: Measures the amount of variance along each principal direction
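A sketch of the PCA reading of eigenvalues: each one equals the variance of the data projected onto its eigenvector. NumPy and the made-up data below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0],
                                             [1.0, 1.0]])   # correlated columns

S = np.cov(data, rowvar=False)
vals, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order

for lam, v in zip(vals, vecs.T):
    projected = data @ v                # project every observation onto v
    print(lam, np.var(projected, ddof=1))   # the two numbers match
```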
Eigenvalue-Eigenvector Relationship
Fundamental Equation: Av = λv
Process:
- Find eigenvalues by solving:
det(A - λI) = 0
- For each λ, find eigenvector by solving:
(A - λI)v = 0
Result: Each eigenvalue has a corresponding eigenvector
In PCA: The largest eigenvalue gives the first principal component direction
Key Insight: The “natural coordinate system” is simply the data’s preferred viewing angle: the orientation that captures maximum variance with minimum dimensions.
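A sketch of the two-step process for a made-up 2×2 symmetric matrix, checked against np.linalg.eigh; sorting eigenvalues largest-first is how PCA orders its components:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# Step 1: eigenvalues from det(A - λI) = 0.
# For a 2x2 matrix this is λ² - trace(A)·λ + det(A) = 0.
lams = np.roots([1.0, -np.trace(A), np.linalg.det(A)])

# Step 2: for each λ, solve (A - λI) v = 0.
# For a 2x2 matrix with A[0,1] != 0, v = [A[0,1], λ - A[0,0]] works.
for lam in lams:
    v = np.array([A[0, 1], lam - A[0, 0]])
    v = v / np.linalg.norm(v)
    print(lam, np.allclose(A @ v, lam * v))   # True: A v = λ v

# np.linalg.eigh returns ascending eigenvalues; PCA wants descending
vals, vecs = np.linalg.eigh(A)
order = np.argsort(vals)[::-1]
print(vals[order])                  # largest eigenvalue first = first PC
```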
Mathematical Foundation
Theorem: Eigenvectors of a symmetric matrix that correspond to distinct eigenvalues are orthogonal.
This theorem is part of the Spectral Theorem for Symmetric Matrices, which states:
- Every symmetric matrix can be diagonalized by orthogonal eigenvectors
- This is a fundamental result in linear algebra
Implication for PCA: Since covariance matrices are always symmetric, orthogonal principal components are mathematically guaranteed.
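A quick numeric check of the orthogonality claim on a random symmetric matrix, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.normal(size=(4, 4))
A = (M + M.T) / 2                  # force symmetry

vals, V = np.linalg.eigh(A)        # columns of V are eigenvectors

# Orthonormal eigenvectors: Vᵀ V is the identity matrix
print(np.allclose(V.T @ V, np.eye(4)))   # True
```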
Matrix Examples
2×2 Covariance Matrix
[Var(X) Cov(X,Y)] ← Diagonal: variances
[Cov(X,Y) Var(Y) ] ← Off-diagonal: covariances
2×2 Correlation Matrix
[1 Corr(X,Y)] ← Diagonal: always 1
[Corr(X,Y) 1 ] ← Off-diagonal: correlations
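A sketch producing both 2×2 matrices from the same made-up data, and recovering the correlation matrix from the covariance matrix by dividing out the standard deviations (NumPy assumed):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 5.0])

S = np.cov(x, y)          # 2x2 covariance matrix: variances on the diagonal
R = np.corrcoef(x, y)     # 2x2 correlation matrix: ones on the diagonal

# Covariance to correlation: divide each entry by σᵢ · σⱼ
stds = np.sqrt(np.diag(S))
print(np.allclose(S / np.outer(stds, stds), R))   # True
```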
Key Relationships
- Variance to Standard Deviation:
σ = √(σ²)
- Covariance to Correlation:
r = Cov(X,Y) / (σₓ × σᵧ)
- Raw to Standardized:
Z = (x - μ) / σ
- Matrix Symmetry:
Cov(X,Y) = Cov(Y,X)
Quick Reference
| Measure | Range | Units | Purpose |
|---|---|---|---|
| Mean | Any | Original | Central tendency |
| Variance | 0 to ∞ | Original² | Spread |
| Std Dev | 0 to ∞ | Original | Interpretable spread |
| Covariance | -∞ to ∞ | X × Y units | Raw relationship |
| Correlation | -1 to +1 | None | Standardized relationship |
| Z-Score | -∞ to ∞ | Std devs | Standardized position |
| Eigenvalues | 0 to ∞* | Variance units | Variance in PC direction |
| Eigenvectors | Unit length | Dimensionless | Principal directions |
| Covariance Matrix | Symmetric | Mixed units | All variable relationships |
*For covariance matrices (positive semi-definite)