A practical decision framework for choosing between LU, QR, SVD, Cholesky, Eigendecomposition, Schur, Jordan, and LDL factorizations. Stop guessing and start picking the right tool for your problem.
Matrix decomposition is the single most important toolkit in numerical linear algebra. Every serious computation -- solving linear systems, fitting models, compressing data, analyzing stability -- relies on factoring matrices into structured pieces. But there are at least eight major decompositions, and choosing the wrong one means wasted computation, numerical instability, or flat-out incorrect results.
The difference is not academic. Choosing Cholesky over general LU when your matrix is symmetric positive definite cuts your computation in half and eliminates the need for pivoting. Using SVD when QR would suffice triples your runtime for no additional benefit. Attempting eigendecomposition on a defective matrix produces garbage unless you switch to Schur form. And applying LU to a rectangular least-squares problem simply does not work -- it requires a square matrix.
This guide gives you a concrete decision framework. We cover when each decomposition applies, how they compare head-to-head on speed, stability, and applicability, what to use for every common application domain, and the mistakes that trip up students and practitioners alike. By the end, you will never have to guess which decomposition to reach for.
The eight decompositions we cover are: LU, QR, SVD, Cholesky, eigendecomposition, Schur, Jordan, and LDLᵀ.
Walk through these questions in order. Each branch leads you to the best decomposition for your situation.

**Step 1: Is your matrix square?**

- No: Non-square matrices eliminate most decompositions. Use QR for least squares or SVD for rank analysis and the pseudoinverse.
- Yes: Continue to Step 2.

**Step 2: Is it symmetric? If so, is it positive definite?**

- Symmetric positive definite: Use Cholesky (A = LLᵀ) -- the fastest option.
- Symmetric but not positive definite: Use LDLᵀ -- handles indefinite symmetric matrices without square roots.
- Not symmetric: Continue to Step 3.

**Step 3: What do you need?**

- Solve Ax = b: LU with partial pivoting (PA = LU) is the standard for solving square linear systems. Especially efficient for multiple right-hand sides.
- Eigenvalues and eigenvectors: If the matrix is diagonalizable, use A = PDP⁻¹. If it may be defective (repeated eigenvalues, missing eigenvectors), use Schur instead.
- Least squares: QR decomposition is the standard for least squares with full column rank. If rank-deficient, switch to SVD.
- Rank, pseudoinverse, or low-rank approximation: SVD reveals the rank, provides the best low-rank approximation (Eckart-Young theorem), and powers PCA, image compression, and recommendation systems.
This flowchart covers the most common scenarios. For edge cases -- defective matrices, theoretical canonical forms, and specialized applications -- read the detailed comparisons below.
Each comparison addresses the key question: when should you pick one decomposition over the other? We cover speed, stability, and applicability.
**LU vs QR** (square systems): LU costs 2n³/3 flops vs QR's 2n³ -- roughly 3x faster for square matrices.

**QR vs SVD** (least squares): QR is the default for full-rank problems; a common pattern is to solve with a QR-based lstsq and fall back to SVD when rank issues are detected.

**Cholesky vs LU** (symmetric positive definite systems): Cholesky costs n³/3 flops vs LU's 2n³/3 -- twice as fast, with half the storage.

All eight major decompositions at a glance. Use this table as a quick reference for matrix requirements, computational cost, and primary applications.
| Decomposition | Form | Matrix Requirements | Complexity | Best For |
|---|---|---|---|---|
| LU | PA = LU | Square, non-singular | 2n³/3 | Solving Ax = b, determinants, matrix inversion |
| QR | A = QR | Any (m ≥ n) | 2mn² | Least squares, eigenvalue algorithms, orthogonalization |
| SVD | A = UΣVᵀ | Any matrix | ~11mn² | Rank, pseudoinverse, PCA, low-rank approximation |
| Cholesky | A = LLᵀ | Symmetric positive definite | n³/3 | SPD systems, optimization, Monte Carlo, Kalman filters |
| Eigen | A = PDP⁻¹ | Square, diagonalizable | ~10n³ | Eigenvalues, stability analysis, matrix powers, PCA |
| Schur | A = QTQ* | Square | ~10n³ | Matrix functions, defective matrices, eigenvalue extraction |
| Jordan | A = PJP⁻¹ | Square (theoretical only) | n/a (not numerically stable) | Theoretical analysis, canonical form, proofs |
| LDLᵀ | A = LDLᵀ | Symmetric | n³/3 | Indefinite symmetric systems, inertia, KKT systems |
A few notes on reading this table. Complexity figures are leading-order flop counts; constant factors and lower-order terms are omitted. "Any matrix" means the decomposition works for rectangular matrices of any dimension. "Diagonalizable" means the matrix must have n linearly independent eigenvectors, which is generic (almost all matrices satisfy this) but not universal.
Not sure which decomposition fits your problem? Find your application below.
You have a coefficient matrix A and a right-hand side vector b, and need to find x. The choice depends entirely on the structure of A.
**LU**: General square systems. The default choice. Compute PA = LU once, then solve Ly = Pb (forward substitution) and Ux = y (back substitution). If you have multiple right-hand sides b₁, b₂, ..., the LU factorization is reused -- only the O(n²) substitution is repeated each time.

**Cholesky**: If A is symmetric positive definite, Cholesky is 2x faster than LU and uses half the storage. Always prefer it when applicable.

**QR**: For ill-conditioned square systems where you need maximum numerical accuracy, QR provides better stability at the cost of roughly 3x more computation.
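As a concrete sketch of the factor-once, solve-many pattern (using SciPy; the matrix values are illustrative, and this particular A happens to be SPD, so Cholesky applies as well):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve, cho_factor, cho_solve

# General square system: factor once, reuse for many right-hand sides.
A = np.array([[4.0, 2.0, 1.0],
              [2.0, 5.0, 3.0],
              [1.0, 3.0, 6.0]])
lu, piv = lu_factor(A)                  # PA = LU, O(n^3) once
b1 = np.array([1.0, 2.0, 3.0])
b2 = np.array([4.0, 5.0, 6.0])
x1 = lu_solve((lu, piv), b1)            # O(n^2) per right-hand side
x2 = lu_solve((lu, piv), b2)

# This A is also symmetric positive definite, so Cholesky applies
# and is roughly twice as fast with half the storage.
c, low = cho_factor(A)
x1_chol = cho_solve((c, low), b1)
```

Both routes produce the same solution; the point is that the expensive factorization is shared across all right-hand sides.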
You have more equations than unknowns (m > n) and want the best approximate solution. This is the foundation of linear regression, curve fitting, and data fitting across all of science and engineering.
**QR**: The standard method. Decompose A = QR, then solve Rx = Qᵀb. Fast, numerically stable, and sufficient when A has full column rank.

**SVD**: When A may be rank-deficient or nearly so. SVD gives the minimum-norm least-squares solution and lets you inspect which singular values are too small (numerically zero). The solution is x = VΣ⁺Uᵀb.

**Normal Equations**: Forming AᵀAx = Aᵀb and solving with Cholesky is the fastest approach, but it is numerically less stable because the condition number squares: cond(AᵀA) = cond(A)². Only use this when A is well-conditioned.
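A minimal comparison of the QR and SVD routes, assuming NumPy and a randomly generated full-rank design matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 3))       # 100 equations, 3 unknowns
b = rng.standard_normal(100)

# QR route: A = QR, then solve the triangular system Rx = Q^T b.
Q, R = np.linalg.qr(A)                  # reduced QR: Q is 100x3, R is 3x3
x_qr = np.linalg.solve(R, Q.T @ b)

# SVD route (what np.linalg.lstsq uses internally): also handles
# rank-deficient A, at a higher cost.
x_svd, *_ = np.linalg.lstsq(A, b, rcond=None)
```

For a full-rank problem the two answers agree to machine precision; the SVD only earns its extra cost when rank deficiency is a real possibility.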
Eigenvalue problems arise in vibration analysis, stability of dynamical systems, quantum mechanics, graph theory (spectral methods), and many areas of physics and engineering.
**Eigendecomposition**: The direct approach: A = PDP⁻¹, where D contains eigenvalues and P contains the corresponding eigenvectors. For symmetric matrices, this is especially clean since P is orthogonal and all eigenvalues are real.

**Schur**: When the matrix may be defective (non-diagonalizable) or when you only need eigenvalues without eigenvectors. The Schur form A = QTQ* always exists and places eigenvalues on the diagonal of the upper triangular matrix T. Most practical eigenvalue algorithms compute the Schur form internally.
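The classic defective matrix [[0, 1], [0, 0]] makes this contrast concrete; a sketch with NumPy and SciPy:

```python
import numpy as np
from scipy.linalg import schur

# A defective matrix: eigenvalue 0 repeated, but only one eigenvector.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

# eig runs without error, but the eigenvector matrix P is numerically
# singular, so A = P D P^{-1} cannot actually be formed.
w, P = np.linalg.eig(A)
cond_P = np.linalg.cond(P)              # astronomically large

# The Schur form always exists: A = Q T Q^T with eigenvalues on T's diagonal.
T, Q = schur(A)
```

The huge condition number of P is the numerical symptom of defectiveness; the Schur form sidesteps it entirely.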
Principal Component Analysis is arguably the most widely used dimensionality reduction technique in data science, compressing high-dimensional data while preserving maximum variance.
**SVD**: Compute the SVD of the centered data matrix directly: X = UΣVᵀ. The right singular vectors (columns of V) are the principal components. The singular values give the explained variance. This approach is numerically superior to the covariance method.

**Eigendecomposition**: Applied to the covariance matrix C = XᵀX/(n-1). Eigenvalues give variances, eigenvectors give principal directions. This works but squares the condition number and is less numerically stable than SVD applied directly to the data matrix.
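The two routes can be checked against each other on toy data; a sketch assuming NumPy (the data matrix is synthetic):

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))  # correlated data
Xc = X - X.mean(axis=0)                 # center first -- essential for PCA

# SVD route: rows of Vt are the principal components.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
var_svd = s**2 / (len(X) - 1)           # explained variance per component

# Covariance route: same answer in exact arithmetic, but forming X^T X
# squares the condition number.
C = Xc.T @ Xc / (len(X) - 1)
w, V = np.linalg.eigh(C)                # eigh returns ascending eigenvalues
```

On well-conditioned data the variances match; on ill-conditioned data the covariance route loses small components first.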
The matrix exponential is central to solving linear ordinary differential equations, control theory (state-space models), and continuous-time Markov chains.
**Eigendecomposition**: If A = PDP⁻¹, then e^(At) = Pe^(Dt)P⁻¹, where e^(Dt) simply exponentiates each diagonal entry. Clean and efficient when A is diagonalizable.

**Schur**: For non-diagonalizable or non-normal matrices, use the Schur form A = QTQ*, then e^(At) = Qe^(Tt)Q*. The exponential of the upper triangular T can be computed efficiently using the Parlett recurrence. This Schur-Parlett approach is what general matrix-function routines (like MATLAB's funm) use internally; dedicated exponential routines such as expm typically use scaling-and-squaring with Padé approximants instead.
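A small illustration of the diagonalizable case, checked against SciPy's expm (the 2×2 system matrix is illustrative):

```python
import numpy as np
from scipy.linalg import expm

# A stable linear system x' = Ax with eigenvalues -1 and -2.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
t = 0.5

# Eigendecomposition route: e^(At) = P e^(Dt) P^{-1}.
w, P = np.linalg.eig(A)
eAt_eig = (P @ np.diag(np.exp(w * t)) @ np.linalg.inv(P)).real

# Production route: scipy.linalg.expm (scaling-and-squaring with Pade),
# which also works for defective and non-normal matrices.
eAt = expm(A * t)
```

For diagonalizable A with well-conditioned P the two agree; when P is ill-conditioned, trust expm.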
Newton-type methods for unconstrained and constrained optimization require solving a linear system at every iteration, where the coefficient matrix is the Hessian (or an approximation of it).
**Cholesky**: When the objective is convex (Hessian is SPD), Cholesky is the fastest and most stable option. This is the standard in interior-point methods and trust-region methods when positive definiteness is guaranteed.

**LDLᵀ**: For non-convex optimization where the Hessian can be indefinite (near saddle points, or for KKT systems in constrained optimization). LDLᵀ with Bunch-Kaufman pivoting handles indefiniteness gracefully and reveals the inertia of the Hessian.

**LU**: General fallback for non-symmetric coefficient matrices or augmented systems that lose symmetry.
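The inertia-revealing property of LDLᵀ can be sketched with scipy.linalg.ldl on a toy indefinite matrix (the values are illustrative):

```python
import numpy as np
from scipy.linalg import ldl

# An indefinite symmetric matrix, e.g. a Hessian near a saddle point.
H = np.array([[2.0, 0.0, 1.0],
              [0.0, -3.0, 0.0],
              [1.0, 0.0, 1.0]])

# Bunch-Kaufman pivoted factorization: H = L D L^T with block-diagonal D.
L, D, perm = ldl(H)

# By Sylvester's law of inertia, D has the same signature as H, so the
# counts of positive/negative eigenvalues are readable from D directly.
eigs_D = np.linalg.eigvalsh(D)
n_pos = int(np.sum(eigs_D > 0))
n_neg = int(np.sum(eigs_D < 0))
```

Here H has two positive and one negative eigenvalue, which an optimizer would read as "not at a minimum: a descent direction exists."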
Signal processing routinely works with correlation matrices, covariance matrices, Toeplitz structures, and subspace methods.
**SVD**: Powers the MUSIC algorithm for spectral estimation, optimal rank-1 beamforming, and subspace-based methods for direction of arrival. The SVD separates signal and noise subspaces cleanly.

**Eigendecomposition**: Applied to autocorrelation matrices for spectral analysis (ESPRIT, MUSIC). For real symmetric covariance matrices, eigendecomposition and SVD produce equivalent results.

**Cholesky**: Whitening transforms use Cholesky: if C = LLᵀ, then L⁻¹x transforms correlated signals into uncorrelated (white) signals. Also used for matched filtering and prewhitening.
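A minimal whitening sketch, assuming NumPy and synthetic correlated signals:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two correlated signals: x = B z with white z, so cov(x) = B B^T.
B = np.array([[2.0, 0.0],
              [1.5, 0.5]])
x = B @ rng.standard_normal((2, 10000))

C = np.cov(x)                           # empirical covariance
L = np.linalg.cholesky(C)               # C = L L^T
w = np.linalg.solve(L, x)               # whitened: cov(w) = L^{-1} C L^{-T} = I
```

Because the same empirical covariance is used for both the factor and the check, the whitened covariance is the identity up to floating-point error.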
Behind the high-level APIs, machine learning algorithms depend heavily on matrix decompositions at their computational core.
**SVD**: Recommendation systems (collaborative filtering via low-rank matrix factorization), latent semantic analysis (LSA/LSI) for natural language processing, image compression, matrix completion (the Netflix Prize approach), and computing pseudoinverses for linear regression.

**Eigendecomposition**: Spectral clustering (eigenvectors of the graph Laplacian), kernel PCA, Google's PageRank algorithm (dominant eigenvector of the link matrix), Laplacian eigenmaps for manifold learning, and the power method in iterative eigenvalue algorithms.

**Cholesky**: Gaussian process regression (inverting the kernel matrix K via K = LLᵀ), sampling from multivariate Gaussian distributions, and preconditioning in iterative solvers. Cholesky also enables efficient log-determinant computation for the GP marginal likelihood.

**QR**: The numerical backbone of linear regression in production libraries like R, statsmodels, and scikit-learn. QR factorization ensures numerically stable coefficient estimates even for near-collinear predictors.
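The Cholesky log-determinant trick mentioned above, sketched on a toy RBF kernel matrix (the kernel choice and jitter value are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.standard_normal((50, 2))

# A toy RBF kernel matrix with diagonal jitter (SPD by construction).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq) + 1e-6 * np.eye(50)

# log det K via Cholesky: since det K = (prod diag L)^2, the log-determinant
# is a sum of logs of diagonal entries -- no overflow, unlike np.linalg.det.
L = np.linalg.cholesky(K)
logdet = 2.0 * np.sum(np.log(np.diag(L)))
```

The same L is then reused for solves against K, which is why GP libraries factor the kernel matrix exactly once per hyperparameter setting.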
These are the pitfalls that catch students, engineers, and even experienced practitioners off guard when choosing and applying matrix decompositions.
LU decomposition requires a square matrix. If you have an overdetermined system (more equations than unknowns), you cannot apply LU directly to the rectangular matrix A. Students sometimes try to make it work by truncating rows or padding with zeros -- both produce wrong answers.
Not every square matrix is diagonalizable. Defective matrices (where the geometric multiplicity of an eigenvalue is less than its algebraic multiplicity) do not have n linearly independent eigenvectors and cannot be written as A = PDP⁻¹. A classic example is the matrix [[0, 1], [0, 0]] -- it has eigenvalue 0 with algebraic multiplicity 2 but geometric multiplicity 1.
Cholesky decomposition requires the matrix to be both symmetric AND positive definite. A common mistake is applying it to a matrix that is symmetric but has a negative or zero eigenvalue (semi-definite or indefinite). The algorithm will encounter a square root of a negative number and fail -- or worse, silently produce incorrect results in some implementations.
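Because of this failure mode, attempting the factorization is itself the standard cheap test for positive definiteness; a sketch in NumPy (the helper name is ours):

```python
import numpy as np

def is_positive_definite(A):
    """Cheap SPD test: symmetric check plus an attempted Cholesky."""
    if not np.allclose(A, A.T):
        return False                    # must be symmetric first
    try:
        np.linalg.cholesky(A)           # raises LinAlgError if not PD
        return True
    except np.linalg.LinAlgError:
        return False

spd = is_positive_definite(np.array([[2.0, 1.0], [1.0, 2.0]]))       # eigenvalues 1, 3
indef = is_positive_definite(np.array([[1.0, 2.0], [2.0, 1.0]]))     # eigenvalues 3, -1
```

This is cheaper than computing eigenvalues and fails loudly rather than silently, which is exactly what you want before trusting a Cholesky-based solve.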
Jordan normal form is discontinuous: an infinitesimal perturbation can change the Jordan structure entirely. A 3×3 Jordan block splits into three 1×1 blocks under a perturbation smaller than machine epsilon. Any numerical computation of JNF in floating-point arithmetic is inherently meaningless because the result depends on rounding errors, not on the actual matrix.
SVD is the "Swiss army knife" of decompositions, but it is significantly more expensive than QR. For a standard full-rank least-squares problem, using SVD costs 3-6 times more than QR with no additional benefit. Over-reliance on SVD is a common source of unnecessary computational expense in data pipelines and production systems.
Solving AᵀAx = Aᵀb (the normal equations approach) squares the condition number: cond(AᵀA) = cond(A)². If A has condition number 10⁶, the normal equations have condition number 10¹², and you lose 12 digits of precision in double-precision arithmetic (which only has about 16 digits). You are left with roughly 4 correct digits.
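The squaring is easy to demonstrate by constructing a matrix with known singular values; a sketch assuming NumPy:

```python
import numpy as np

# Build A with singular values 1, 1e-3, 1e-6, so cond(A) = 1e6 by construction.
rng = np.random.default_rng(3)
U, _ = np.linalg.qr(rng.standard_normal((50, 3)))   # orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))    # orthogonal
A = U @ np.diag([1.0, 1e-3, 1e-6]) @ V.T

cond_A = np.linalg.cond(A)              # ~1e6
cond_normal = np.linalg.cond(A.T @ A)   # ~1e12: the squaring in action
```

With cond(AᵀA) near 10¹², a Cholesky solve of the normal equations has only about 4 trustworthy digits, while QR applied to A itself keeps roughly 10.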
Put theory into practice. Enter your matrix and see each decomposition computed step by step with detailed explanations.
Factor a matrix into lower and upper triangular matrices. Essential for solving systems of linear equations efficiently. Open calculator →

Decompose into an orthogonal matrix Q and upper triangular R. Used in least squares regression and eigenvalue algorithms. Open calculator →

The most general matrix decomposition. Factorize any matrix into U, Σ, Vᵀ. Powers recommendation systems and data compression. Open calculator →

Efficient factorization of symmetric positive definite matrices into LLᵀ. Widely used in Monte Carlo simulations and optimization. Open calculator →

Find eigenvalues and eigenvectors. Fundamental to PCA, quantum mechanics, vibration analysis, and stability theory. Open calculator →