It is particularly useful in high-dimensional settings where the number of features exceeds the number of data points.
By constraining the covariance matrices in this manner, we achieve a balance between flexibility and computational efficiency while avoiding overfitting.
We consider a dataset represented as $X \in \mathbb{R}^{n \times m}$ with **$n$** data points and **$m$** features, potentially with $m \gg n$. A Gaussian Mixture Model with $K$ components over elements $x^{(i)} \in \mathbb{R}^m$ is defined as:

$$
p\left(x^{(i)}\right) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}\left(x^{(i)} \mid \mu_k, \Sigma_k\right),
$$

with non-negative mixture weights $\pi_k$ that sum to one. Each Gaussian component has mean $\mu_k \in \mathbb{R}^m$ and a covariance matrix of the form:
$$
\Sigma_k = \text{diag}(d_k) + L_k L_k^T\text{,}
$$

where:
- $d_k \in \mathbb{R}^m$ is the diagonal vector capturing independent feature noise,
- $L_k \in \mathbb{R}^{m \times k}$ (with rank $k \ll m$) captures the dominant covariance structure via a low-rank factor.
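As a minimal NumPy illustration (independent of the package's own API), the structured covariance can be materialized explicitly for small $m$ — in practice it never is, but this makes the parameter savings and positive-definiteness concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
m, k = 200, 5
d = 0.1 + rng.random(m)       # per-feature noise variances (must be positive)
L = rng.normal(size=(m, k))   # low-rank factor

# A full covariance needs m*(m+1)/2 free parameters;
# the diag-plus-low-rank form needs only m*(k+1).
full_params = m * (m + 1) // 2
structured_params = m * (k + 1)
assert structured_params < full_params

# diag(d) + L L^T is positive definite whenever all entries of d are positive.
Sigma = np.diag(d) + L @ L.T
assert np.all(np.linalg.eigvalsh(Sigma) > 0)
```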
The EM algorithm for fitting GMMs alternates between **E**xpectation and **M**aximization:
1. **E-step:** Compute the responsibilities based on the current parameters:

   $$
   \gamma_k^{(i)} = \frac{\pi_k \, \mathcal{N}\left(x^{(i)} \mid \mu_k, \Sigma_k\right)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}\left(x^{(i)} \mid \mu_j, \Sigma_j\right)}
   $$
   Since this can be prohibitively expensive in large dimensions, we leverage the structure of the covariance matrices (via the Woodbury identity) to calculate the responsibilities using an $\mathcal{O}(k^3)$ matrix inversion rather than a full $\mathcal{O}(m^3)$ inversion.
2. **M-step:** Update the parameters by maximizing the expected complete-data log-likelihood.
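The structured inversion behind the E-step can be sketched as follows (the function name is illustrative, not the package API): the Woodbury identity reduces inverting $\text{diag}(d) + L L^T$ to inverting a single $k \times k$ matrix.

```python
import numpy as np

def lowrank_diag_inv(d, L):
    """Invert Sigma = diag(d) + L L^T via the Woodbury identity.

    Only the k x k matrix (I + L^T diag(d)^{-1} L) is densely inverted,
    giving O(k^3 + m k^2) cost instead of O(m^3).
    """
    m, k = L.shape
    Dinv_L = L / d[:, None]                          # diag(d)^{-1} L, shape (m, k)
    core = np.linalg.inv(np.eye(k) + L.T @ Dinv_L)   # the only dense inverse, k x k
    # Returned as a dense m x m matrix for clarity; in practice one applies
    # this expression to vectors without ever forming the full inverse.
    return np.diag(1.0 / d) - Dinv_L @ core @ Dinv_L.T
```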
Additionally, since updating $\Sigma_k$ directly is intractable in high dimensions, we perform an **inner EM loop** during the maximization using **Factor Analysis**, where $L_k$ and $d_k$ are iteratively estimated to approximate the full covariance structure.
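One standard form of such an inner loop is the classic EM update for factor analysis, written here against a component's (weighted) sample covariance $S$; this is a sketch under that assumption, with illustrative names, not the package's implementation:

```python
import numpy as np

def fa_em_step(S, L, psi):
    """One factor-analysis EM update fitting S ~ L L^T + diag(psi).

    S: (m, m) sample covariance, L: (m, k) factor loadings,
    psi: (m,) positive diagonal noise variances.
    """
    m, k = L.shape
    # beta = L^T (L L^T + diag(psi))^{-1}, via the Woodbury identity:
    Dinv_L = L / psi[:, None]                         # diag(psi)^{-1} L
    G = np.linalg.inv(np.eye(k) + L.T @ Dinv_L)       # k x k inverse only
    beta = G @ Dinv_L.T                               # (k, m)
    bS = beta @ S
    Ezz = np.eye(k) - beta @ L + bS @ beta.T          # average E[z z^T | x]
    L_new = np.linalg.solve(Ezz, bS).T                # = S beta^T Ezz^{-1}
    psi_new = np.diag(S - L_new @ bS)                 # diagonal residual variance
    return L_new, psi_new
```

Each such step is guaranteed not to decrease the Gaussian likelihood of the data under the covariance $L L^T + \text{diag}(\psi)$, which is what makes it safe to nest inside the outer M-step.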
This low-rank plus diagonal structure is particularly advantageous in settings such as **time-series analysis (e.g. for finance), text modeling, gene expression analysis, and compressed sensing**, where $m \gg n$ leads to singular or poorly conditioned full covariance estimates. Our package leverages efficient matrix decompositions and batched computations to ensure scalability, making it well-suited for large-scale, high-dimensional datasets.