%------------------------------------------------------------------------------- % The UserGuide.stex file. Processed into UserGuide.tex via sed. %------------------------------------------------------------------------------- \documentclass[11pt]{article} \newcommand{\m}[1]{{\bf{#1}}} % for matrices and vectors \newcommand{\tr}{^{\sf T}} % transpose \newcommand{\he}{^{\sf H}} % complex conjugate transpose \newcommand{\implies}{\rightarrow} \topmargin 0in \textheight 9in \oddsidemargin 0pt \evensidemargin 0pt \textwidth 6.5in \begin{document} \author{Timothy A. Davis \\ Dept. of Computer and Information Science and Engineering \\ Univ. of Florida, Gainesville, FL} \title{UMFPACK Version 5.1 User Guide} \date{May 31, 2007} \maketitle %------------------------------------------------------------------------------- \begin{abstract} UMFPACK is a set of routines for solving unsymmetric sparse linear systems, $\m{Ax}=\m{b}$, using the Unsymmetric MultiFrontal method and direct sparse LU factorization. It is written in ANSI/ISO C, with a MATLAB interface. UMFPACK relies on the Level-3 Basic Linear Algebra Subprograms (dense matrix multiply) for its performance. This code works on Windows and many versions of Unix (Sun Solaris, Red Hat Linux, IBM AIX, SGI IRIX, and Compaq Alpha). \end{abstract} %------------------------------------------------------------------------------- Technical Report TR-04-003 (revised) UMFPACK Version 5.1, Copyright\copyright 1995-2006 by Timothy A. Davis. All Rights Reserved. UMFPACK is available under alternate licences; contact T. Davis for details. {\bf UMFPACK License:} Your use or distribution of UMFPACK or any modified version of UMFPACK implies that you agree to this License. This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA Permission is hereby granted to use or copy this program under the terms of the GNU LGPL, provided that the Copyright, this License, and the Availability of the original version is retained on all copies. User documentation of any code that uses this code or any modified version of this code must cite the Copyright, this License, the Availability note, and "Used by permission." Permission to modify the code and to distribute modified code is granted, provided the Copyright, this License, and the Availability note are retained, and a notice that the code was modified is included. {\bf Availability:} http://www.cise.ufl.edu/research/sparse/umfpack {\bf Acknowledgments:} This work was supported by the National Science Foundation, under grants DMS-9504974, DMS-9803599, and CCR-0203270. The upgrade to Version 4.1 and the inclusion of the symmetric and 2-by-2 pivoting strategies were done while the author was on sabbatical at Stanford University and Lawrence Berkeley National Laboratory. %------------------------------------------------------------------------------- \newpage %------------------------------------------------------------------------------- \tableofcontents %------------------------------------------------------------------------------- \newpage \section{Overview} %------------------------------------------------------------------------------- UMFPACK\footnote{Pronounced with two syllables: umph-pack} Version 5.0 is a set of routines for solving systems of linear equations, $\m{Ax}=\m{b}$, when $\m{A}$ is sparse and unsymmetric. It is based on the Unsymmetric-pattern MultiFrontal method \cite{DavisDuff97,DavisDuff99}. UMFPACK factorizes $\m{PAQ}$, $\m{PRAQ}$, or $\m{PR}^{-1}\m{AQ}$ into the product $\m{LU}$, where $\m{L}$ and $\m{U}$ are lower and upper triangular, respectively, $\m{P}$ and $\m{Q}$ are permutation matrices, and $\m{R}$ is a diagonal matrix of row scaling factors (or $\m{R}=\m{I}$ if row-scaling is not used). Both $\m{P}$ and $\m{Q}$ are chosen to reduce fill-in (new nonzeros in $\m{L}$ and $\m{U}$ that are not present in $\m{A}$). The permutation $\m{P}$ has the dual role of reducing fill-in and maintaining numerical accuracy (via relaxed partial pivoting and row interchanges). The sparse matrix $\m{A}$ can be square or rectangular, singular or non-singular, and real or complex (or any combination). Only square matrices $\m{A}$ can be used to solve $\m{Ax}=\m{b}$ or related systems. Rectangular matrices can only be factorized. UMFPACK first finds a column pre-ordering that reduces fill-in, without regard to numerical values. It scales and analyzes the matrix, and then automatically selects one of three strategies for pre-ordering the rows and columns: {\em unsymmetric}, {\em 2-by-2}, and {\em symmetric}. These strategies are described below. First, all pivots with zero Markowitz cost are eliminated and placed in the LU factors. The remaining submatrix $\m{S}$ is then analyzed. The following rules are applied, and the first one that matches defines the strategy. \begin{itemize} \item Rule 1: $\m{A}$ rectangular $\implies$ unsymmetric. \item Rule 2: If the zero-Markowitz elimination results in a rectangular $\m{S}$, or an $\m{S}$ whose diagonal has not been preserved, the unsymmetric strategy is used. \item The symmetry $\sigma_1$ of $\m{S}$ is computed. It is defined as the number of {\em matched} off-diagonal entries, divided by the total number of off-diagonal entries. An entry $s_{ij}$ is matched if $s_{ji}$ is also an entry. They need not be numerically equal. An {\em entry} is a value in $\m{A}$ which is present in the input data structure. All nonzeros are entries, but some entries may be numerically zero. Rule 3: $\sigma_1 < 0.1 \implies$ unsymmetric. The matrix is very unsymmetric. \item Let $d$ be the number of nonzero entries on the diagonal of $\m{S}$. Let $\m{S}$ be $\nu$-by-$\nu$. Rule 4: $(\sigma_1 \ge 0.7) \:\wedge\: (d = \nu) \implies$ symmetric. The matrix has a nearly symmetric nonzero pattern, and a zero-free diagonal. \end{itemize} If the strategy has not yet been determined, the 2-by-2 strategy is attempted. A row permutation $\m{P}_2$ is found which attempts to reduce the number of small diagonal entries of $\m{P}_2 \m{S}$. An entry $s_{ij}$ is determined to be small if $|s_{ij}| < 0.01 \max |s_{*j}|$, or large otherwise. If $s_{ii}$ is numerically small, the method attempts to swap two rows $i$ and $j$, such that both $s_{ij}$ and $s_{ji}$ are large. Once these rows are swapped, they remain in place. Let $\sigma_2$ be the symmetry of $\m{P}_2 \m{S}$, and let $d_2$ be the number of nonzero entries (either small or large) on the diagonal of $\m{P}_2 \m{S}$. \begin{itemize} \item Rule 5: ($\sigma_2 > 1.1 \sigma_1) \:\wedge\: (d_2 > 0.9 \nu) \implies$ 2-by-2. The 2-by-2 permutation has made the matrix significantly more symmetric. \item Rule 6: $\sigma_2 < 0.7 \sigma_1 \implies$ unsymmetric. The 2-by-2 strategy has significantly deteriorated the symmetry, \item Rule 7: $\sigma_2 < 0.25 \implies$ unsymmetric. The matrix is still very unsymmetric. \item Rule 8: $\sigma_2 \ge 0.51 \implies$ 2-by-2. The matrix is roughly symmetric. \item Rule 9: $\sigma_2 \ge 0.999 \sigma_1 \implies$ 2-by-2. The 2-by-2 permutation has preserved symmetry, or made it only slightly worse. \item Rule 10: if no rule has yet triggered, use the unsymmetric strategy. \end{itemize} Each strategy is described below: \begin{itemize} \item {\em unsymmetric}: The column pre-ordering of $\m{S}$ is computed by a modified version of COLAMD \cite{DavisGilbertLarimoreNg00_algo,DavisGilbertLarimoreNg00,Larimore98}. The method finds a symmetric permutation $\m{Q}$ of the matrix $\m{S}\tr\m{S}$ (without forming $\m{S}\tr\m{S}$ explicitly). This is a good choice for $\m{Q}$, since the Cholesky factors of $\m{(SQ)\tr(SQ)}$ are an upper bound (in terms of nonzero pattern) of the factor $\m{U}$ for the unsymmetric LU factorization ($\m{PSQ}=\m{LU}$) regardless of the choice of $\m{P}$ \cite{GeorgeNg85,GeorgeNg87,GilbertNg93}. This modified version of COLAMD also computes the column elimination tree and post-orders the tree. It finds the upper bound on the number of nonzeros in L and U. It also has a different threshold for determining dense rows and columns. During factorization, the column pre-ordering can be modified. Columns within a single super-column can be reshuffled, to reduce fill-in. Threshold partial pivoting is used with no preference given to the diagonal entry. Within a given pivot column $j$, an entry $a_{ij}$ can be chosen if $|a_{ij}| \ge 0.1 \max |a_{*j}|$. Among those numerically acceptable entries, the sparsest row $i$ is chosen as the pivot row. \item {\em 2-by-2}: The symmetric strategy (see below) is applied to the matrix $\m{P}_2 \m{S}$, rather than $\m{S}$. \item {\em symmetric}: The column ordering is computed from AMD \cite{AmestoyDavisDuff96,AmestoyDavisDuff03}, applied to the pattern of $\m{S}+\m{S}\tr$ followed by a post-ordering of the supernodal elimination tree of $\m{S}+\m{S}\tr$. No modification of the column pre-ordering is made during numerical factorization. Threshold partial pivoting is used, with a strong preference given to the diagonal entry. The diagonal entry is chosen if $a_{jj} \ge 0.001 \max |a_{*j}|$. Otherwise, a sparse row is selected, using the same method used by the unsymmetric strategy. \end{itemize} The symmetric and 2-by-2 strategies, and their automatic selection, are new to Version 4.1. Version 4.0 only used the unsymmetric strategy. Once the strategy is selected, the factorization of the matrix $\m{A}$ is broken down into the factorization of a sequence of dense rectangular frontal matrices. The frontal matrices are related to each other by a supernodal column elimination tree, in which each node in the tree represents one frontal matrix. This analysis phase also determines upper bounds on the memory usage, the floating-point operation count, and the number of nonzeros in the LU factors. UMFPACK factorizes each {\em chain} of frontal matrices in a single working array, similar to how the unifrontal method \cite{dusc:96} factorizes the whole matrix. A chain of frontal matrices is a sequence of fronts where the parent of front $i$ is $i$+1 in the supernodal column elimination tree. For the nonsingular matrices factorized with the unsymmetric strategy, there are exactly the same number of chains as there are leaves in the supernodal column elimination tree. UMFPACK is an outer-product based, right-looking method. At the $k$-th step of Gaussian elimination, it represents the updated submatrix $\m{A}_k$ as an implicit summation of a set of dense sub-matrices (referred to as {\em elements}, borrowing a phrase from finite-element methods) that arise when the frontal matrices are factorized and their pivot rows and columns eliminated. Each frontal matrix represents the elimination of one or more columns; each column of $\m{A}$ will be eliminated in a specific frontal matrix, and which frontal matrix will be used for which column is determined by the pre-analysis phase. The pre-analysis phase also determines the worst-case size of each frontal matrix so that they can hold any candidate pivot column and any candidate pivot row. From the perspective of the analysis phase, any candidate pivot column in the frontal matrix is identical (in terms of nonzero pattern), and so is any row. However, the numeric factorization phase has more information than the analysis phase. It uses this information to reorder the columns within each frontal matrix to reduce fill-in. Similarly, since the number of nonzeros in each row and column are maintained (more precisely, COLMMD-style approximate degrees \cite{GilbertMolerSchreiber}), a pivot row can be selected based on sparsity-preserving criteria (low degree) as well as numerical considerations (relaxed threshold partial pivoting). When the symmetric or 2-by-2 strategies are used, the column preordering is not refined during numeric factorization. Row pivoting for sparsity and numerical accuracy is performed if the diagonal entry is too small. More details of the method, including experimental results, are described in \cite{Davis03,Davis03_algo}, available at http://www.cise.ufl.edu/tech-reports. %------------------------------------------------------------------------------- \section{Availability} %------------------------------------------------------------------------------- In addition to appearing as a Collected Algorithm of the ACM, UMFPACK is available at http://www.cise.ufl.edu/research/sparse. It is included as a built-in routine in MATLAB. Version 4.0 (in MATLAB 6.5) does not have the symmetric or 2-by-2 strategies and it takes less advantage of the level-3 BLAS \cite{DaydeDuff99,ACM679a,ATLAS,GotoVandeGeijn02}. Versions 5.0 through v4.1 tend to be much faster than Version 4.0, particularly on unsymmetric matrices with mostly symmetric nonzero pattern (such as finite element and circuit simulation matrices). Version 3.0 and following make use of a modified version of COLAMD V2.0 by Timothy A.~Davis, Stefan Larimore, John Gilbert, and Esmond Ng. The original COLAMD V2.1 is available in as a built-in routine in MATLAB V6.0 (or later), and at http://www.cise.ufl.edu/research/sparse. These codes are also available in Netlib \cite{netlib} at http://www.netlib.org. UMFPACK Versions 2.2.1 and earlier, co-authored with Iain Duff, are available at http://www.cise.ufl.edu/research/sparse and as MA38 (functionally equivalent to Version 2.2.1) in the Harwell Subroutine Library. %------------------------------------------------------------------------------- \section{Primary changes from prior versions} %------------------------------------------------------------------------------- A detailed list of changes is in the {\tt ChangeLog} file. %------------------------------------------------------------------------------- \subsection{Version 5.1.0} %------------------------------------------------------------------------------- Port of MATLAB interface to 64-bit MATLAB. %------------------------------------------------------------------------------- \subsection{Version 5.0.3} %------------------------------------------------------------------------------- Renamed the MATLAB function to {\tt umfpack2}, so as not to confict with itself (the MATLAB built-in version of UMFPACK). %------------------------------------------------------------------------------- \subsection{Version 5.0} %------------------------------------------------------------------------------- Changed {\tt long} to {\tt UF\_long}, controlled by the {\tt UFconfig.h} file. A {\tt UF\_long} is normally just {\tt long}, except on the Windows 64 (WIN64) platform. In that case, it becomes {\tt \_\_int64}. %------------------------------------------------------------------------------- \subsection{Version 4.6} %------------------------------------------------------------------------------- Added additional options to {\tt umf\_solve.c}. %------------------------------------------------------------------------------- \subsection{Version 4.5} %------------------------------------------------------------------------------- Added function pointers for malloc, calloc, realloc, free, printf, hypot, and complex divisiion, so that these functions can be redefined at run-time. Added a version number so you can determine the version of UMFPACK at run time or compile time. UMFPACK requires AMD v2.0 or later. %------------------------------------------------------------------------------- \subsection{Version 4.4} %------------------------------------------------------------------------------- Bug fix in strategy selection in {\tt umfpack\_*\_qsymbolic}. Added packed complex case for all complex input/output arguments. Added {\tt umfpack\_get\_determinant}. Added minimal support for Microsoft Visual Studio (the {\tt umf\_multicompile.c} file). %------------------------------------------------------------------------------- \subsection{Version 4.3.1} %------------------------------------------------------------------------------- Minor bug fix in the forward/backsolve. This bug had the effect of turning off iterative refinement when solving $\m{A}\tr\m{x}=\m{b}$ after factorizing $\m{A}$. UMFPACK mexFunction now factorizes $\m{A}\tr$ in its forward-slash operation. %------------------------------------------------------------------------------- \subsection{Version 4.3} %------------------------------------------------------------------------------- No changes are visible to the C or MATLAB user, except the presence of one new control parameter in the {\tt Control} array, and three new statistics in the {\tt Info} array. The primary change is the addition of an (optional) drop tolerance. %------------------------------------------------------------------------------- \subsection{Version 4.1} %------------------------------------------------------------------------------- The following is a summary of the main changes that are visible to the C or MATLAB user: \begin{enumerate} \item New ordering strategies added. No changes are required in user code (either C or MATLAB) to use the new default strategy, which is an automatic selection of the unsymmetric, symmetric, or 2-by-2 strategies. \item Row scaling added. This is only visible to the MATLAB caller when using the form {\tt [L,U,P,Q,R] = umfpack (A)}, to retrieve the LU factors. Likewise, it is only visible to the C caller when the LU factors are retrieved, or when solving systems with just $\m{L}$ or $\m{U}$. New C-callable and MATLAB-callable routines are included to get and to apply the scale factors computed by UMFPACK. Row scaling is enabled by default, but can be disabled. Row scaling usually leads to a better factorization, particularly when the symmetric strategy is used. \item Error code {\tt UMFPACK\_ERROR\_problem\_to\_large} removed. Version 4.0 would generate this error when the upper bound memory usage exceeded 2GB (for the {\tt int} version), even when the actual memory usage was less than this. The new version properly handles this case, and can successfully factorize the matrix if sufficient memory is available. \item New control parameters and statistics provided. \item The AMD symmetric approximate minimum degree ordering routine added \cite{AmestoyDavisDuff96,AmestoyDavisDuff03}. It is used by UMFPACK, and can also be called independently from C or MATLAB. \item The {\tt umfpack} mexFunction now returns permutation matrices, not permutation vectors, when using the form {\tt [L,U,P,Q] = umfpack (A)} or the new form {\tt [L,U,P,Q,R] = umfpack (A)}. \item New arguments added to the user-callable routines {\tt umfpack\_*\_symbolic}, {\tt umfpack\_*\_qsymbolic}, {\tt umfpack\_*\_get\_numeric}, and {\tt umfpack\_*\_get\_symbolic}. The symbolic analysis now makes use of the numerical values of the matrix $\m{A}$, to guide the 2-by-2 strategy. The subsequent matrix passed to the numeric factorization step does not have to have the same numerical values. All of the new arguments are optional. If you do not wish to include them, simply pass {\tt NULL} pointers instead. The 2-by-2 strategy will assume all entries are numerically large, for example. \item New routines added to save and load the {\tt Numeric} and {\tt Symbolic} objects to and from a binary file. \item A Fortran interface added. It provides access to a subset of UMFPACK's features. \item You can compute an incomplete LU factorization, by dropping small entries from $\m{L}$ and $\m{U}$. By default, no nonzero entry is dropped, no matter how small in absolute value. This feature is new to Version 4.3. \end{enumerate} %------------------------------------------------------------------------------- \section{Using UMFPACK in MATLAB} %------------------------------------------------------------------------------- The easiest way to use UMFPACK is within MATLAB. Version 4.3 is a built-in routine in MATLAB 7.0.4, and is used in {\tt x = A}$\backslash${\tt b} when {\tt A} is sparse, square, unsymmetric (or symmetric but not positive definite), and with nonzero entries that are not confined in a narrow band. It is also used for the {\tt [L,U,P,Q] = lu (A)} usage of {\tt lu}. Type {\tt help lu} in MATLAB 6.5 or later for more details. To use the UMFPACK mexFunction, you must download and compile it, since the mexFunction itself is not part of MATLAB. The following discussion assumes that you have MATLAB Version 6.0 or later (which includes the BLAS, and the {\tt colamd} ordering routine). To compile both the UMFPACK and AMD mexFunctions, just type {\tt make} in the Unix system shell, while in the {\tt UMFPACK} directory. You can also type {\tt umfpack\_make} in MATLAB, if you are in the {\tt UMFPACK/MATLAB} directory, or if that directory is in your MATLAB path. This works on any system with MATLAB, including Windows. See Section~\ref{Install} for more details on how to install UMFPACK. Once installed, the UMFPACK mexFunction can analyze, factor, and solve linear systems. Table~\ref{matlab} summarizes some of the more common uses of the UMFPACK mexFunction within MATLAB. An optional input argument can be used to modify the control parameters for UMFPACK, and an optional output argument provides statistics on the factorization. Refer to the AMD User Guide for more details about the AMD mexFunction. \begin{table} \caption{Using UMFPACK's MATLAB interface} \label{matlab} \vspace{0.1in} {\footnotesize \begin{tabular}{l|l|l} \hline Function & Using UMFPACK & MATLAB 6.0 equivalent \\ \hline & & \\ \begin{minipage}[t]{1.5in} Solve $\m{Ax}=\m{b}$. \end{minipage} & \begin{minipage}[t]{2.2in} \begin{verbatim} x = umfpack (A,'\',b) ; \end{verbatim} \end{minipage} & \begin{minipage}[t]{2.2in} \begin{verbatim} x = A \ b ; \end{verbatim} \end{minipage} \\ & & \\ \hline & & \\ \begin{minipage}[t]{1.5in} Solve $\m{Ax}=\m{b}$ using a different row and column pre-ordering (symmetric ordering). \end{minipage} & \begin{minipage}[t]{2.2in} \begin{verbatim} S = spones (A) ; Q = symamd (S+S') ; Control = umfpack ; Control (6) = 3 ; x = umfpack (A,Q,'\',b,Control) ; \end{verbatim} \end{minipage} & \begin{minipage}[t]{2.2in} \begin{verbatim} spparms ('autommd',0) ; S = spones (A) ; Q = symamd (S+S') ; x = A (Q,Q) \ b (Q) ; x (Q) = x ; spparms ('autommd',1) ; \end{verbatim} \end{minipage} \\ & & \\ \hline & & \\ \begin{minipage}[t]{1.5in} Solve $\m{A}\tr\m{x}\tr = \m{b}\tr$. \end{minipage} & \begin{minipage}[t]{2.2in} \begin{verbatim} x = umfpack (b,'/',A) ; \end{verbatim} Note: $\m{A}$ is factorized. \end{minipage} & \begin{minipage}[t]{2.2in} \begin{verbatim} x = b / A ; \end{verbatim} Note: $\m{A}\tr$ is factorized. \end{minipage} \\ & & \\ \hline & & \\ \begin{minipage}[t]{1.5in} Scale and factorize $\m{A}$, then solve $\m{Ax}=\m{b}$. \end{minipage} & \begin{minipage}[t]{2.2in} \begin{verbatim} [L,U,P,Q,R] = umfpack (A) ; c = P * (R \ b) ; x = Q * (U \ (L \ c)) ; \end{verbatim} \end{minipage} & \begin{minipage}[t]{2.2in} \begin{verbatim} [m n] = size (A) ; r = full (sum (abs (A), 2)) ; r (find (r == 0)) = 1 ; R = spdiags (r, 0, m, m) ; I = speye (n) ; Q = I (:, colamd (A)) ; [L,U,P] = lu ((R\A)*Q) ; c = P * (R \ b) ; x = Q * (U \ (L \ c)) ; \end{verbatim} \end{minipage} \\ & & \\ \hline \end{tabular} } \end{table} Note: in MATLAB 6.5 or later, use {\tt spparms ('autoamd',0)} in addition to {\tt spparms ('autommd',0)}, in Table~\ref{matlab}, to turn off MATLAB's default reordering. UMFPACK requires {\tt b} to be a dense vector (real or complex) of the appropriate dimension. This is more restrictive than what you can do with MATLAB's backslash or forward slash. See {\tt umfpack\_solve} for an M-file that removes this restriction. This restriction does not apply to the built-in backslash operator in MATLAB 6.5 or later, which uses UMFPACK to factorize the matrix. You can do this yourself in MATLAB: {\footnotesize \begin{verbatim} [L,U,P,Q,R] = umfpack (A) ; x = Q * (U \ (L \ (P * (R \ b)))) ; \end{verbatim} } or, with no row scaling: {\footnotesize \begin{verbatim} [L,U,P,Q] = umfpack (A) ; x = Q * (U \ (L \ (P * b))) ; \end{verbatim} } The above examples do not make use of the iterative refinement that is built into {\tt x = }{\tt umfpack (A,'}$\backslash${\tt ',b)} however. MATLAB's {\tt [L,U,P] = lu(A)} returns a lower triangular {\tt L}, an upper triangular {\tt U}, and a permutation matrix {\tt P} such that {\tt P*A} is equal to {\tt L*U}. UMFPACK behaves differently. By default, it scales the rows of {\tt A} and reorders the columns of {\tt A} prior to factorization, so that {\tt L*U} is equal to {\tt P*(R}$\backslash${\tt A)*Q}, where {\tt R} is a diagonal sparse matrix of scale factors for the rows of {\tt A}. The scale factors {\tt R} are applied to {\tt A} via the MATLAB expression {\tt R}$\backslash${\tt A} to avoid multiplying by the reciprocal, which can be numerically inaccurate. There are more options; you can provide your own column pre-ordering (in which case UMFPACK does not call COLAMD or AMD), you can modify other control settings (similar to the {\tt spparms} in MATLAB), and you can get various statistics on the analysis, factorization, and solution of the linear system. Type {\tt umfpack\_details} and {\tt umfpack\_report} in MATLAB for more information. Two demo M-files are provided. Just type {\tt umfpack\_simple} and {\tt umfpack\_demo} to run them. The output of these two programs should be about the same as the files {\tt umfpack\_simple.m.out} and {\tt umfpack\_demo.m.out} that are provided. Factorizing {\tt A'} (or {\tt A.'}) and using the transposed factors can sometimes be faster than factorizing {\tt A}. It can also be preferable to factorize {\tt A'} if {\tt A} is rectangular. UMFPACK pre-orders the columns to maintain sparsity; the row ordering is not determined until the matrix is factorized. Thus, if {\tt A} is {\tt m} by {\tt n} with rank {\tt m} and {\tt m} $<$ {\tt n}, then {\tt umfpack} might not find a factor {\tt U} with a zero-free diagonal. Unless the matrix ill-conditioned or poorly scaled, factorizing {\tt A'} in this case will guarantee that both factors will have zero-free diagonals. Here's how you can factorize {\tt A'} and get the factors of {\tt A} instead: \begin{verbatim} [l,u,p,q] = umfpack (A') ; L = u' ; U = l' ; P = q ; Q = p ; clear l u p q \end{verbatim} This is an alternative to {\tt [L,U,P,Q]=umfpack(A)}. A simple M-file ({\tt umfpack\_btf}) is provided that first permutes the matrix to upper block triangular form, using MATLAB's {\tt dmperm} routine, and then solves each block. The LU factors are not returned. Its usage is simple: {\tt x = umfpack\_btf(A,b)}. Type {\tt help umfpack\_btf} for more options. An estimate of the 1-norm of {\tt L*U-P*A*Q} can be computed in MATLAB as {\tt lu\_normest(P*A*Q,L,U)}, using the {\tt lu\_normest.m} M-file by Hager and Davis \cite{DavisHager99} that is included with the UMFPACK distribution. With row scaling enabled, use {\tt lu\_normest(P*(R}$\backslash${\tt A)*Q,L,U)} instead. One issue you may encounter is how UMFPACK allocates its memory when being used in a mexFunction. One part of its working space is of variable size. The symbolic analysis phase determines an upper bound on the size of this memory, but not all of this memory will typically be used in the numerical factorization. UMFPACK tries to allocate a decent amount of working space. This is 70\% of the upper bound, by default, for the unsymmetric strategy. For the symmetric strategy, the fraction of the upper bound is computed automatically (assuming a best-case scenario with no numerical pivoting required during numeric factorization). If this initial allocation fails, it reduces its request and uses less memory. If the space is not large enough during factorization, it is increased via {\tt mxRealloc}. However, {\tt mxMalloc} and {\tt mxRealloc} abort the {\tt umfpack} mexFunction if they fail, so this strategy does not work in MATLAB. To compute the determinant with UMFPACK: \begin{verbatim} d = umfpack (A, 'det') ; [d e] = umfpack (A, 'det') ; \end{verbatim} The first case is identical to MATLAB's {\tt det}. The second case returns the determinant in the form $d \times 10^e$, which avoids overflow if $e$ is large. %------------------------------------------------------------------------------- \section{Using UMFPACK in a C program} \label{C} %------------------------------------------------------------------------------- The C-callable UMFPACK library consists of 32 user-callable routines and one include file. All but three of the routines come in four versions, with different sizes of integers and for real or complex floating-point numbers: \begin{enumerate} \item {\tt umfpack\_di\_*}: real double precision, {\tt int} integers. \item {\tt umfpack\_dl\_*}: real double precision, {\tt UF\_long} integers. \item {\tt umfpack\_zi\_*}: complex double precision, {\tt int} integers. \item {\tt umfpack\_zl\_*}: complex double precision, {\tt UF\_long} integers. \end{enumerate} where {\tt *} denotes the specific name of one of the routines. Routine names beginning with {\tt umf\_} are internal to the package, and should not be called by the user. The include file {\tt umfpack.h} must be included in any C program that uses UMFPACK. The other three routines are the same for all four versions. In addition, the C-callable AMD library distributed with UMFPACK includes 4 user-callable routines (in two versions with {\tt int} and {\tt UF\_long} integers) and one include file. Refer to the AMD documentation for more details. Use only one version for any one problem; do not attempt to use one version to analyze the matrix and another version to factorize the matrix, for example. The notation {\tt umfpack\_di\_*} refers to all user-callable routines for the real double precision and {\tt int} integer case. The notation {\tt umfpack\_*\_numeric}, for example, refers all four versions (real/complex, int/UF\_long) of a single operation (in this case numeric factorization). %------------------------------------------------------------------------------- \subsection{The size of an integer} %------------------------------------------------------------------------------- The {\tt umfpack\_di\_*} and {\tt umfpack\_zi\_*} routines use {\tt int} integer arguments; those starting with {\tt umfpack\_dl\_} or {\tt umfpack\_zl\_} use {\tt UF\_long} integer arguments. If you compile UMFPACK in the standard ILP32 mode (32-bit {\tt int}'s, {\tt long}'s, and pointers) then the versions are essentially identical. You will be able to solve problems using up to 2GB of memory. If you compile UMFPACK in the standard LP64 mode, the size of an {\tt int} remains 32-bits, but the size of a {\tt long} and a pointer both get promoted to 64-bits. In the LP64 mode, the {\tt umfpack\_dl\_*} and {\tt umfpack\_zl\_*} routines can solve huge problems (not limited to 2GB), limited of course by the amount of available memory. The only drawback to the 64-bit mode is that not all BLAS libraries support 64-bit integers. This limits the performance you will obtain. Those that do support 64-bit integers are specific to particular architectures, and are not portable. UMFPACK and AMD should be compiled in the same mode. If you compile UMFPACK and AMD in the LP64 mode, be sure to add {\tt -DLP64} to the compilation command. See the examples in the {\tt UFconfig/UFconfig.mk} file. %------------------------------------------------------------------------------- \subsection{Real and complex floating-point} %------------------------------------------------------------------------------- The {\tt umfpack\_di\_*} and {\tt umfpack\_dl\_*} routines take (real) double precision arguments, and return double precision arguments. In the {\tt umfpack\_zi\_*} and {\tt umfpack\_zl\_*} routines, these same arguments hold the real part of the matrices; and second double precision arrays hold the imaginary part of the input and output matrices. Internally, complex numbers are stored in arrays with their real and imaginary parts interleaved, as required by the BLAS (``packed'' complex form). New to Version 4.4 is the option of providing input/output arguments in packed complex form. %------------------------------------------------------------------------------- \subsection{Primary routines, and a simple example} %------------------------------------------------------------------------------- Five primary UMFPACK routines are required to factorize $\m{A}$ or solve $\m{Ax}=\m{b}$. They are fully described in Section~\ref{Primary}: \begin{itemize} \item {\tt umfpack\_*\_symbolic}: Pre-orders the columns of $\m{A}$ to reduce fill-in. Returns an opaque {\tt Symbolic} object as a {\tt void *} pointer. The object contains the symbolic analysis and is needed for the numeric factorization. This routine requires only $O(|\m{A}|)$ space, where $|\m{A}|$ is the number of nonzero entries in the matrix. It computes upper bounds on the nonzeros in $\m{L}$ and $\m{U}$, the floating-point operations required, and the memory usage of {\tt umfpack\_*\_numeric}. The {\tt Symbolic} object is small; it contains just the column pre-ordering, the supernodal column elimination tree, and information about each frontal matrix. It is no larger than about $13n$ integers if $\m{A}$ is $n$-by-$n$. \item {\tt umfpack\_*\_numeric}: Numerically scales and then factorizes a sparse matrix into $\m{PAQ}$, $\m{PRAQ}$, or $\m{PR}^{-1}\m{AQ}$ into the product $\m{LU}$, where $\m{P}$ and $\m{Q}$ are permutation matrices, $\m{R}$ is a diagonal matrix of scale factors, $\m{L}$ is lower triangular with unit diagonal, and $\m{U}$ is upper triangular. Requires the symbolic ordering and analysis computed by {\tt umfpack\_*\_symbolic} or {\tt umfpack\_*\_qsymbolic}. Returns an opaque {\tt Numeric} object as a {\tt void *} pointer. The object contains the numerical factorization and is used by {\tt umfpack\_*\_solve}. You can factorize a new matrix with a different values (but identical pattern) as the matrix analyzed by {\tt umfpack\_*\_symbolic} or {\tt umfpack\_*\_qsymbolic} by re-using the {\tt Symbolic} object (this feature is available when using UMFPACK in a C or Fortran program, but not in MATLAB). The matrix $\m{U}$ will have zeros on the diagonal if $\m{A}$ is singular; this produces a warning, but the factorization is still valid. \item {\tt umfpack\_*\_solve}: Solves a sparse linear system ($\m{Ax}=\m{b}$, $\m{A}\tr\m{x}=\m{b}$, or systems involving just $\m{L}$ or $\m{U}$), using the numeric factorization computed by {\tt umfpack\_*\_numeric}. Iterative refinement with sparse backward error \cite{ardd:89} is used by default. The matrix $\m{A}$ must be square. If it is singular, then a divide-by-zero will occur, and your solution with contain IEEE Inf's or NaN's in the appropriate places. \item {\tt umfpack\_*\_free\_symbolic}: Frees the {\tt Symbolic} object created by {\tt umfpack\_*\_symbolic} or {\tt umfpack\_*\_qsymbolic}. \item {\tt umfpack\_*\_free\_numeric}: Frees the {\tt Numeric} object created by {\tt umfpack\_*\_numeric}. \end{itemize} Be careful not to free a {\tt Symbolic} object with {\tt umfpack\_*\_free\_numeric}. Nor should you attempt to free a {\tt Numeric} object with {\tt umfpack\_*\_free\_symbolic}. Failure to free these objects will lead to memory leaks. The matrix $\m{A}$ is represented in compressed column form, which is identical to the sparse matrix representation used by MATLAB. It consists of three or four arrays, where the matrix is {\tt m}-by-{\tt n}, with {\tt nz} entries. For the {\tt int} version of UMFPACK: {\footnotesize \begin{verbatim} int Ap [n+1] ; int Ai [nz] ; double Ax [nz] ; \end{verbatim} } For the {\tt UF\_long} version of UMFPACK: {\footnotesize \begin{verbatim} UF_long Ap [n+1] ; UF_long Ai [nz] ; double Ax [nz] ; \end{verbatim} } The complex versions add another array for the imaginary part: {\footnotesize \begin{verbatim} double Az [nz] ; \end{verbatim} } Alternatively, if {\tt Az} is {\tt NULL}, the real part of the $k$th entry is located in {\tt Ax[2*k]} and the imaginary part is located in {\tt Ax[2*k+1]}, and the {\tt Ax} array is of size {\tt 2*nz}. All nonzeros are entries, but an entry may be numerically zero. The row indices of entries in column {\tt j} are stored in {\tt Ai[Ap[j]} \ldots {\tt Ap[j+1]-1]}. The corresponding numerical values are stored in {\tt Ax[Ap[j]} \ldots {\tt Ap[j+1]-1]}. The imaginary part, for the complex versions, is stored in {\tt Az[Ap[j]} \ldots {\tt Ap[j+1]-1]} (see above for the packed complex case). No duplicate row indices may be present, and the row indices in any given column must be sorted in ascending order. The first entry {\tt Ap[0]} must be zero. The total number of entries in the matrix is thus {\tt nz = Ap[n]}. Except for the fact that extra zero entries can be included, there is thus a unique compressed column representation of any given matrix $\m{A}$. For a more flexible method for providing an input matrix to UMFPACK, see Section~\ref{triplet}. Here is a simple main program, {\tt umfpack\_simple.c}, that illustrates the basic usage of UMFPACK. See Section~\ref{Synopsis} for a short description of each calling sequence, including a list of options for the first argument of {\tt umfpack\_di\_solve}. {\footnotesize \begin{verbatim} INCLUDE umfpack_simple.c via sed \end{verbatim} } The {\tt Ap}, {\tt Ai}, and {\tt Ax} arrays represent the matrix \[ \m{A} = \left[ \begin{array}{rrrrr} 2 & 3 & 0 & 0 & 0 \\ 3 & 0 & 4 & 0 & 6 \\ 0 & -1 & -3 & 2 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 4 & 2 & 0 & 1 \\ \end{array} \right]. \] and the solution to $\m{Ax}=\m{b}$ is $\m{x} = [1 \, 2 \, 3 \, 4 \, 5]\tr$. The program uses default control settings and does not return any statistics about the ordering, factorization, or solution ({\tt Control} and {\tt Info} are both {\tt (double *) NULL}). It also ignores the status value returned by most user-callable UMFPACK routines. %------------------------------------------------------------------------------- \subsection{A note about zero-sized arrays} %------------------------------------------------------------------------------- UMFPACK uses many user-provided arrays of size {\tt m} or {\tt n} (the order of the matrix), and of size {\tt nz} (the number of nonzeros in a matrix). UMFPACK does not handle zero-dimensioned arrays; it returns an error code if {\tt m} or {\tt n} are zero. However, {\tt nz} can be zero, since all singular matrices are handled correctly. If you attempt to {\tt malloc} an array of size {\tt nz} = 0, however, {\tt malloc} will return a null pointer which UMFPACK will report as a missing argument. If you {\tt malloc} an array of size {\tt nz} to pass to UMFPACK, make sure that you handle the {\tt nz} = 0 case correctly (use a size equal to the maximum of {\tt nz} and 1, or use a size of {\tt nz+1}). %------------------------------------------------------------------------------- \subsection{Alternative routines} %------------------------------------------------------------------------------- Three alternative routines are provided that modify UMFPACK's default behavior. They are fully described in Section~\ref{Alternative}: \begin{itemize} \item {\tt umfpack\_*\_defaults}: Sets the default control parameters in the {\tt Control} array. These can then be modified as desired before passing the array to the other UMFPACK routines. Control parameters are summarized in Section~\ref{control_param}. Three particular parameters deserve special notice. UMFPACK uses relaxed partial pivoting, where a candidate pivot entry is numerically acceptable if its magnitude is greater than or equal to a tolerance parameter times the magnitude of the largest entry in the same column. The parameter {\tt Control [UMFPACK\_PIVOT\_TOLERANCE]} has a default value of 0.1, and is used for the unsymmetric strategy. For complex matrices, a cheap approximation of the absolute value is used for the threshold pivoting test ($|a| \approx |a_{\mbox{real}}|+|a_{\mbox{imag}}|$). For the symmetric strategy, a second tolerance is used for diagonal entries: \newline {\tt Control [UMFPACK\_SYM\_PIVOT\_TOLERANCE]}, with a default value of 0.001. The first parameter (with a default of 0.1) is used for any off-diagonal candidate pivot entries. These two parameters may be too small for some matrices, particularly for ill-conditioned or poorly scaled ones. With the default pivot tolerances and default iterative refinement, {\tt x = umfpack (A,'}$\backslash${\tt ',b)} is just as accurate as (or more accurate) than {\tt x = A}$\backslash${\tt b} in MATLAB 6.1 for nearly all matrices. If {\tt Control [UMFPACK\_PIVOT\_TOLERANCE]} is zero, than any nonzero entry is acceptable as a pivot (this is changed from Version 4.0, which treated a value of 0.0 the same as 1.0). If the symmetric strategy is used, and {\tt Control [UMFPACK\_SYM\_PIVOT\_TOLERANCE]} is zero, then any nonzero entry on the diagonal is accepted as a pivot. Off-diagonal pivoting will still occur if the diagonal entry is exactly zero. The {\tt Control [UMFPACK\_SYM\_PIVOT\_TOLERANCE]} parameter is new to Version 4.1. It is similar in function to the pivot tolerance for left-looking methods (the MATLAB {\tt THRESH} option in {\tt [L,U,P] = lu (A, THRESH)}, and the pivot tolerance parameter in SuperLU). The parameter {\tt Control [UMFPACK\_STRATEGY]} can be used to bypass UMFPACK's automatic strategy selection. The automatic strategy nearly always selects the best method. When it does not, the different methods nearly always give about the same quality of results. There may be cases where the automatic strategy fails to pick a good strategy. Also, you can save some computing time if you know the right strategy for your set of matrix problems. \item {\tt umfpack\_*\_qsymbolic}: An alternative to {\tt umfpack\_*\_symbolic}. Allows the user to specify his or her own column pre-ordering, rather than using the default COLAMD or AMD pre-orderings. For example, a graph partitioning-based order of $\m{A}\tr\m{A}$ would be suitable for UMFPACK's unsymmetric strategy. A partitioning of $\m{A}+\m{A}\tr$ would be suitable for UMFPACK's symmetric or 2-by-2 strategies. \item {\tt umfpack\_*\_wsolve}: An alternative to {\tt umfpack\_*\_solve} which does not dynamically allocate any memory. Requires the user to pass two additional work arrays. \end{itemize} %------------------------------------------------------------------------------- \subsection{Matrix manipulation routines} \label{triplet} %------------------------------------------------------------------------------- The compressed column data structure is compact, and simplifies the UMFPACK routines that operate on the sparse matrix $\m{A}$. However, it can be inconvenient for the user to generate. Section~\ref{Manipulate} presents the details of routines for manipulating sparse matrices in {\em triplet} form, compressed column form, and compressed row form (the transpose of the compressed column form). The triplet form of a matrix consists of three or four arrays. For the {\tt int} version of UMFPACK: {\footnotesize \begin{verbatim} int Ti [nz] ; int Tj [nz] ; double Tx [nz] ; \end{verbatim} } For the {\tt UF\_long} version: {\footnotesize \begin{verbatim} UF_long Ti [nz] ; UF_long Tj [nz] ; double Tx [nz] ; \end{verbatim} } The complex versions use another array to hold the imaginary part: {\footnotesize \begin{verbatim} double Tz [nz] ; \end{verbatim} } The {\tt k}-th triplet is $(i,j,a_{ij})$, where $i =$ {\tt Ti[k]}, $j =$ {\tt Tj[k]}, and $a_{ij} =$ {\tt Tx[k]}. For the complex versions, {\tt Tx[k]} is the real part of $a_{ij}$ and {\tt Tz[k]} is the imaginary part. The triplets can be in any order in the {\tt Ti}, {\tt Tj}, and {\tt Tx} arrays (and {\tt Tz} for the complex versions), and duplicate entries may exist. If {\tt Tz} is NULL, then the array {\tt Tx} becomes of size {\tt 2*nz}, and the real and imaginary parts of the {\tt k}-th triplet are located in {\tt Tx[2*k]} and {\tt Tx[2*k+1]}, respectively. Any duplicate entries are summed when the triplet form is converted to compressed column form. This is a convenient way to create a matrix arising in finite-element methods, for example. Four routines are provided for manipulating sparse matrices: \begin{itemize} \item {\tt umfpack\_*\_triplet\_to\_col}: Converts a triplet form of a matrix to compressed column form (ready for input to \newline {\tt umfpack\_*\_symbolic}, {\tt umfpack\_*\_qsymbolic}, and {\tt umfpack\_*\_numeric}). Identical to {\tt A = spconvert(i,j,x)} in MATLAB, except that zero entries are not removed, so that the pattern of entries in the compressed column form of $\m{A}$ are fully under user control. This is important if you want to factorize a new matrix with the {\tt Symbolic} object from a prior matrix with the same pattern as the new one. \item {\tt umfpack\_*\_col\_to\_triplet}: The opposite of {\tt umfpack\_*\_triplet\_to\_col}. Identical to {\tt [i,j,x] = find(A)} in MATLAB, except that numerically zero entries may be included. \item {\tt umfpack\_*\_transpose}: Transposes and optionally permutes a column form matrix \cite{Gustavson78}. Identical to {\tt R = A(P,Q)'} (linear algebraic transpose, using the complex conjugate) or {\tt R = A(P,Q).'} (the array transpose) in MATLAB, except for the presence of numerically zero entries. Factorizing $\m{A}\tr$ and then solving $\m{Ax}=\m{b}$ with the transposed factors can sometimes be much faster or much slower than factorizing $\m{A}$. It is highly dependent on your particular matrix. \item {\tt umfpack\_*\_scale}: Applies the row scale factors to a user-provided vector. This is not required to solve the sparse linear system $\m{Ax}=\m{b}$ or $\m{A}\tr\m{x}=\m{b}$, since {\tt umfpack\_*\_solve} applies the scale factors for those systems. \end{itemize} It is quite easy to add matrices in triplet form, subtract them, transpose them, permute them, construct a submatrix, and multiply a triplet-form matrix times a vector. UMFPACK does not provide code for these basic operations, however. Refer to the discussion of {\tt umfpack\_*\_triplet\_to\_col} in Section~\ref{Manipulate} for more details on how to compute these operations in your own code. The only primary matrix operation not provided by UMFPACK is the multiplication of two sparse matrices \cite{Gustavson78}. The CHOLMOD provides many of these matrix operations, which can then be used in conjunction with UMFPACK. See my web page for details. %------------------------------------------------------------------------------- \subsection{Getting the contents of opaque objects} %------------------------------------------------------------------------------- There are cases where you may wish to do more with the LU factorization of a matrix than solve a linear system. The opaque {\tt Symbolic} and {\tt Numeric} objects are just that - opaque. You cannot do anything with them except to pass them back to subsequent calls to UMFPACK. Three routines are provided for copying their contents into user-provided arrays using simpler data structures. Four routines are provided for saving and loading the {\tt Numeric} and {\tt Symbolic} objects to/from binary files. An additional routine is provided that computes the determinant. They are fully described in Section~\ref{Get}: \begin{itemize} \item {\tt umfpack\_*\_get\_lunz}: Returns the number of nonzeros in $\m{L}$ and $\m{U}$. \item {\tt umfpack\_*\_get\_numeric}: Copies $\m{L}$, $\m{U}$, $\m{P}$, $\m{Q}$, and $\m{R}$ from the {\tt Numeric} object into arrays provided by the user. The matrix $\m{L}$ is returned in compressed row form (with the column indices in each row sorted in ascending order). The matrix $\m{U}$ is returned in compressed column form (with sorted columns). There are no explicit zero entries in $\m{L}$ and $\m{U}$, but such entries may exist in the {\tt Numeric} object. The permutations $\m{P}$ and $\m{Q}$ are represented as permutation vectors, where {\tt P[k] = i} means that row {\tt i} of the original matrix is the the {\tt k}-th row of $\m{PAQ}$, and where {\tt Q[k] = j} means that column {\tt j} of the original matrix is the {\tt k}-th column of $\m{PAQ}$. This is identical to how MATLAB uses permutation vectors (type {\tt help colamd} in MATLAB 6.1 or later). \item {\tt umfpack\_*\_get\_symbolic}: Copies the contents of the {\tt Symbolic} object (the initial row and column preordering, supernodal column elimination tree, and information about each frontal matrix) into arrays provided by the user. \item {\tt umfpack\_*\_get\_determinant}: Computes the determinant from the diagonal of $\m{U}$ and the permutations $\m{P}$ and $\m{Q}$. This is mostly of theoretical interest. It is not a good test to determine if your matrix is singular or not. \item {\tt umfpack\_*\_save\_numeric}: Saves a copy of the {\tt Numeric} object to a file, in binary format. \item {\tt umfpack\_*\_load\_numeric}: Creates a {\tt Numeric} object by loading it from a file created by {\tt umfpack\_*\_save\_numeric}. \item {\tt umfpack\_*\_save\_symbolic}: Saves a copy of the {\tt Symbolic} object to a file, in binary format. \item {\tt umfpack\_*\_load\_symbolic}: Creates a {\tt Symbolic} object by loading it from a file created by {\tt umfpack\_*\_save\_symbolic}. \end{itemize} UMFPACK itself does not make use of these routines; they are provided solely for returning the contents of the opaque {\tt Symbolic} and {\tt Numeric} objects to the user, and saving/loading them to/from a binary file. None of them do any computation, except for {\tt umfpack\_*\_get\_determinant}. %------------------------------------------------------------------------------- \subsection{Reporting routines} \label{Reporting} %------------------------------------------------------------------------------- None of the UMFPACK routines discussed so far prints anything, even when an error occurs. UMFPACK provides you with nine routines for printing the input and output arguments (including the {\tt Control} settings and {\tt Info} statistics) of UMFPACK routines discussed above. They are fully described in Section~\ref{Report}: \begin{itemize} \item {\tt umfpack\_*\_report\_status}: Prints the status (return value) of other {\tt umfpack\_*} routines. \item {\tt umfpack\_*\_report\_info}: Prints the statistics returned in the {\tt Info} array by {\tt umfpack\_*\_*symbolic}, {\tt umfpack\_*\_numeric}, and {\tt umfpack\_*\_*solve}. \item {\tt umfpack\_*\_report\_control}: Prints the {\tt Control} settings. \item {\tt umfpack\_*\_report\_matrix}: Verifies and prints a compressed column-form or compressed row-form sparse matrix. \item {\tt umfpack\_*\_report\_triplet}: Verifies and prints a matrix in triplet form. \item {\tt umfpack\_*\_report\_symbolic}: Verifies and prints a {\tt Symbolic} object. \item {\tt umfpack\_*\_report\_numeric}: Verifies and prints a {\tt Numeric} object. \item {\tt umfpack\_*\_report\_perm}: Verifies and prints a permutation vector. \item {\tt umfpack\_*\_report\_vector}: Verifies and prints a real or complex vector. \end{itemize} The {\tt umfpack\_*\_report\_*} routines behave slightly differently when compiled into the C-callable UMFPACK library than when used in the MATLAB mexFunction. MATLAB stores its sparse matrices using the same compressed column data structure discussed above, where row and column indices of an $m$-by-$n$ matrix are in the range 0 to $m-1$ or $n-1$, respectively\footnote{Complex matrices in MATLAB use the split array form, with one {\tt double} array for the real part and another array for the imaginary part. UMFPACK supports that format, as well as the packed complex format (new to Version 4.4).} It prints them as if they are in the range 1 to $m$ or $n$. The UMFPACK mexFunction behaves the same way. You can control how much the {\tt umfpack\_*\_report\_*} routines print by modifying the {\tt Control [UMFPACK\_PRL]} parameter. Its default value is 1. Here is a summary of how the routines use this print level parameter: \begin{itemize} \item {\tt umfpack\_*\_report\_status}: No output if the print level is 0 or less, even when an error occurs. If 1, then error messages are printed, and nothing is printed if the status is {\tt UMFPACK\_OK}. A warning message is printed if the matrix is singular. If 2 or more, then the status is always printed. If 4 or more, then the UMFPACK Copyright is printed. If 6 or more, then the UMFPACK License is printed. See also the first page of this User Guide for the Copyright and License. \item {\tt umfpack\_*\_report\_control}: No output if the print level is 1 or less. If 2 or more, all of {\tt Control} is printed. \item {\tt umfpack\_*\_report\_info}: No output if the print level is 1 or less. If 2 or more, all of {\tt Info} is printed. \item all other {\tt umfpack\_*\_report\_*} routines: If the print level is 2 or less, then these routines return silently without checking their inputs. If 3 or more, the inputs are fully verified and a short status summary is printed. If 4, then the first few entries of the input arguments are printed. If 5, then all of the input arguments are printed. \end{itemize} This print level parameter has an additional effect on the MATLAB mexFunction. If zero, then no warnings of singular or nearly singular matrices are printed (similar to the MATLAB commands {\tt warning off MATLAB:singularMatrix} and {\tt warning off MATLAB:nearlySingularMatrix}). %------------------------------------------------------------------------------- \subsection{Utility routines} %------------------------------------------------------------------------------- UMFPACK v4.0 included a routine that returns the time used by the process, {\tt umfpack\_timer}. The routine uses either {\tt getrusage} (which is preferred), or the ANSI C {\tt clock} routine if that is not available. It is fully described in Section~\ref{Utility}. It is still available in UMFPACK v4.1 and following, but not used internally. Two new timing routines are provided in UMFPACK Version 4.1 and following, {\tt umfpack\_tic} and {\tt umfpack\_toc}. They use POSIX-compliant {\tt sysconf} and {\tt times} routines to find both the CPU time and wallclock time. These three routines are the only user-callable routine that is identical in all four {\tt int}/{\tt UF\_long}, real/complex versions (there is no {\tt umfpack\_di\_timer} routine, for example). %------------------------------------------------------------------------------- \subsection{Control parameters} \label{control_param} %------------------------------------------------------------------------------- UMFPACK uses an optional {\tt double} array (currently of size 20) to modify its control parameters. If you pass {\tt (double *) NULL} instead of a {\tt Control} array, then defaults are used. These defaults provide nearly optimal performance (both speed, memory usage, and numerical accuracy) for a wide range of matrices from real applications. This array will almost certainly grow in size in future releases, so be sure to dimension your {\tt Control} array to be of size {\tt UMFPACK\_CONTROL}. That constant is currently defined to be 20, but may increase in future versions, since all 20 entries are in use. The contents of this array may be modified by the user (see {\tt umfpack\_*\_defaults}). Each user-callable routine includes a complete description of how each control setting modifies its behavior. Table~\ref{control} summarizes the entire contents of the {\tt Control} array. Note that ANSI C uses 0-based indexing, while MATLAB uses 1-based indexing. Thus, {\tt Control(1)} in MATLAB is the same as {\tt Control[0]} or {\tt Control[UMFPACK\_PRL]} in ANSI C. \begin{table} \caption{UMFPACK Control parameters} \label{control} {\footnotesize \begin{tabular}{llll} \hline MATLAB & ANSI C & default & description \\ \hline {\tt Control(1)} & {\tt Control[UMFPACK\_PRL]} & 1 & printing level \\ {\tt Control(2)} & {\tt Control[UMFPACK\_DENSE\_ROW]} & 0.2 & dense row parameter \\ {\tt Control(3)} & {\tt Control[UMFPACK\_DENSE\_COL]} & 0.2 & dense column parameter \\ {\tt Control(4)} & {\tt Control[UMFPACK\_PIVOT\_TOLERANCE]} & 0.1 & partial pivoting tolerance \\ {\tt Control(5)} & {\tt Control[UMFPACK\_BLOCK\_SIZE]} & 32 & BLAS block size \\ {\tt Control(6)} & {\tt Control[UMFPACK\_STRATEGY]} & 0 (auto) & select strategy \\ {\tt Control(7)} & {\tt Control[UMFPACK\_ALLOC\_INIT]} & 0.7 & initial memory allocation \\ {\tt Control(8)} & {\tt Control[UMFPACK\_IRSTEP]} & 2 & max iter. refinement steps \\ {\tt Control(13)} & {\tt Control[UMFPACK\_2BY2\_TOLERANCE]} & 0.01 & defines ``large'' entries \\ {\tt Control(14)} & {\tt Control[UMFPACK\_FIXQ]} & 0 (auto) & fix or modify Q \\ {\tt Control(15)} & {\tt Control[UMFPACK\_AMD\_DENSE]} & 10 & AMD dense row/column parameter \\ {\tt Control(16)} & {\tt Control[UMFPACK\_SYM\_PIVOT\_TOLERANCE]} & 0.001 & for diagonal entries \\ {\tt Control(17)} & {\tt Control[UMFPACK\_SCALE]} & 1 (sum) & row scaling (none, sum, or max) \\ {\tt Control(18)} & {\tt Control[UMFPACK\_FRONT\_ALLOC\_INIT]} & 0.5 & frontal matrix allocation ratio \\ {\tt Control(19)} & {\tt Control[UMFPACK\_DROPTOL]} & 0 & drop tolerance \\ {\tt Control(20)} & {\tt Control[UMFPACK\_AGGRESSIVE]} & 1 (yes) & aggressive absorption \\ & & & in AMD and COLAMD \\ % \hline \multicolumn{4}{l}{Can only be changed at compile time:} \\ {\tt Control(9)} & {\tt Control[UMFPACK\_COMPILED\_WITH\_BLAS]} & - & true if BLAS is used \\ {\tt Control(10)} & {\tt Control[UMFPACK\_COMPILED\_FOR\_MATLAB]} & - & true for mexFunction \\ {\tt Control(11)} & {\tt Control[UMFPACK\_COMPILED\_WITH\_GETRUSAGE]} & - & 1 if {\tt getrusage} used \\ {\tt Control(12)} & {\tt Control[UMFPACK\_COMPILED\_IN\_DEBUG\_MODE]} & - & true if debug mode enabled \\ \hline \end{tabular} } \end{table} Let $\alpha_r = ${\tt Control [UMFPACK\_DENSE\_ROW]}, $\alpha_c = ${\tt Control [UMFPACK\_DENSE\_COL]}, and $\alpha = ${\tt Control [UMFPACK\_AMD\_DENSE]}. Suppose the submatrix $\m{S}$, obtained after eliminating pivots with zero Markowitz cost, is $m$-by-$n$. Then a row is considered ``dense'' if it has more than $\max (16, 16 \alpha_r \sqrt{n})$ entries. A column is considered ``dense'' if it has more than $\max (16, 16 \alpha_c \sqrt{m})$ entries. These rows and columns are treated different in COLAMD and during numerical factorization. In COLAMD, dense columns are placed last in their natural order, and dense rows are ignored. During numerical factorization, dense rows are stored differently. In AMD, a row/column of the square matrix $\m{S}+\m{S}\tr$ is considered ``dense'' if it has more than $\max (16, \alpha \sqrt{n})$ entries. These rows/columns are placed last in AMD's output ordering. For more details on the control parameters, refer to the documentation of {\tt umfpack\_*\_qsymbolic}, {\tt umfpack\_*\_numeric}, {\tt umfpack\_*\_solve}, and the {\tt umfpack\_*\_report\_*} routines, in Sections~\ref{Primary}~through~\ref{Report}, below. %------------------------------------------------------------------------------- \subsection{Error codes} \label{error_codes} %------------------------------------------------------------------------------- Many of the routines return a {\tt status} value. This is also returned as the first entry in the {\tt Info} array, for those routines with that argument. The following list summarizes all of the error codes in UMFPACK. Each error code is given a specific name in the {\tt umfpack.h} include file, so you can use those constants instead of hard-coded values in your program. Future versions may report additional error codes. A value of zero means everything was successful, and the matrix is non-singular. A value greater than zero means the routine was successful, but a warning occurred. A negative value means the routine was not successful. In this case, no {\tt Symbolic} or {\tt Numeric} object was created. \begin{itemize} \item {\tt UMFPACK\_OK}, (0): UMFPACK was successful. \item {\tt UMFPACK\_WARNING\_singular\_matrix}, (1): Matrix is singular. There are exact zeros on the diagonal of $\m{U}$. \item {\tt UMFPACK\_WARNING\_determinant\_underflow}, (2): The determinant is nonzero, but smaller in magnitude than the smallest positive floating-point number. \item {\tt UMFPACK\_WARNING\_determinant\_overflow}, (3): The determinant is larger in magnitude than the largest positive floating-point number (IEEE Inf). \item {\tt UMFPACK\_ERROR\_out\_of\_memory}, (-1): Not enough memory. The ANSI C {\tt malloc} or {\tt realloc} routine failed. \item {\tt UMFPACK\_ERROR\_invalid\_Numeric\_object}, (-3): Routines that take a {\tt Numeric} object as input (or load it from a file) check this object and return this error code if it is invalid. This can be caused by a memory leak or overrun in your program, which can overwrite part of the Numeric object. It can also be caused by passing a Symbolic object by mistake, or some other pointer. If you try to factorize a matrix using one version of UMFPACK and then use the factors in another version, this error code will trigger as well. You cannot factor your matrix using version 4.0 and then solve with version 4.1, for example.\footnote{ Exception: v4.3, v4.3.1, and v4.4 use identical data structures for the {\tt Numeric} and {\tt Symbolic} objects}. You cannot use different precisions of the same version (real and complex, for example). It is possible for the {\tt Numeric} object to be corrupted by your program in subtle ways that are not detectable by this quick check. In this case, you may see an {\tt UMFPACK\_ERROR\_different\_pattern} error code, or even an {\tt UMFPACK\_ERROR\_internal\_error}. \item {\tt UMFPACK\_ERROR\_invalid\_Symbolic\_object}, (-4): Routines that take a {\tt Symbolic} object as input (or load it from a file) check this object and return this error code if it is invalid. The causes of this error are analogous to the {\tt UMFPACK\_ERROR\_invalid\_Numeric\_object} error described above. \item {\tt UMFPACK\_ERROR\_argument\_missing}, (-5): Some arguments of some are optional (you can pass a {\tt NULL} pointer instead of an array). This error code occurs if you pass a {\tt NULL} pointer when that argument is required to be present. \item {\tt UMFPACK\_ERROR\_n\_nonpositive} (-6): The number of rows or columns of the matrix must be greater than zero. \item {\tt UMFPACK\_ERROR\_invalid\_matrix} (-8): The matrix is invalid. For the column-oriented input, this error code will occur if the contents of {\tt Ap} and/or {\tt Ai} are invalid. {\tt Ap} is an integer array of size {\tt n\_col+1}. On input, it holds the ``pointers'' for the column form of the sparse matrix $\m{A}$. Column {\tt j} of the matrix A is held in {\tt Ai [(Ap [j])} \ldots {\tt (Ap [j+1]-1)]}. The first entry, {\tt Ap [0]}, must be zero, and {\tt Ap [j]} $\le$ {\tt Ap [j+1]} must hold for all {\tt j} in the range 0 to {\tt n\_col-1}. The value {\tt nz = Ap [n\_col]} is thus the total number of entries in the pattern of the matrix A. {\tt nz} must be greater than or equal to zero. The nonzero pattern (row indices) for column {\tt j} is stored in {\tt Ai [(Ap [j])} \ldots {\tt (Ap [j+1]-1)]}. The row indices in a given column {\tt j} must be in ascending order, and no duplicate row indices may be present. Row indices must be in the range 0 to {\tt n\_row-1} (the matrix is 0-based). Some routines take a triplet-form input, with arguments {\tt nz}, {\tt Ti}, and {\tt Tj}. This error code is returned if {\tt nz} is less than zero, if any row index in {\tt Ti} is outside the range 0 to {\tt n\_col-1}, or if any column index in {\tt Tj} is outside the range 0 to {\tt n\_row-1}. \item {\tt UMFPACK\_ERROR\_different\_pattern}, (-11): The most common cause of this error is that the pattern of the matrix has changed between the symbolic and numeric factorization. It can also occur if the {\tt Numeric} or {\tt Symbolic} object has been subtly corrupted by your program. \item {\tt UMFPACK\_ERROR\_invalid\_system}, (-13): The {\tt sys} argument provided to one of the solve routines is invalid. \item {\tt UMFPACK\_ERROR\_invalid\_permutation}, (-15): The permutation vector provided as input is invalid. \item {\tt UMFPACK\_ERROR\_file\_IO}, (-17): This error code is returned by the routines that save and load the {\tt Numeric} or {\tt Symbolic} objects to/from a file, if a file I/O error has occurred. The file may not exist or may not be readable, you may be trying to create a file that you don't have permission to create, or you may be out of disk space. The file you are trying to read might be the wrong one, and an earlier end-of-file condition would then result in this error. \item {\tt UMFPACK\_ERROR\_internal\_error}, (-911): An internal error has occurred, of unknown cause. This is either a bug in UMFPACK, or the result of a memory overrun from your program. Try modifying the file {\tt AMD/Include/amd\_internal.h} and adding the statement {\tt \#undef NDEBUG}, to enable the debugging mode. Recompile UMFPACK and rerun your program. A failed assertion might occur which can give you a better indication as to what is going wrong. Be aware that UMFPACK will be extraordinarily slow when running in debug mode. If all else fails, contact the developer (davis@cise.ufl.edu) with as many details as possible. \end{itemize} %------------------------------------------------------------------------------- \subsection{Larger examples} %------------------------------------------------------------------------------- Full examples of all user-callable UMFPACK routines are available in four stand-alone C main programs, {\tt umfpack\_*\_demo.c}. Another example is the UMFPACK mexFunction, {\tt umfpackmex.c}. The mexFunction accesses only the user-callable C interface to UMFPACK. The only features that it does not use are the support for the triplet form (MATLAB's sparse arrays are already in the compressed column form) and the ability to reuse the {\tt Symbolic} object to numerically factorize a matrix whose pattern is the same as a prior matrix analyzed by {\tt umfpack\_*\_symbolic} or {\tt umfpack\_*\_qsymbolic}. The latter is an important feature, but the mexFunction does not return its opaque {\tt Symbolic} and {\tt Numeric} objects to MATLAB. Instead, it gets the contents of these objects after extracting them via the {\tt umfpack\_*\_get\_*} routines, and returns them as MATLAB sparse matrices. The {\tt umf4.c} program for reading matrices in Harwell/Boeing format \cite{DuffGrimesLewis87b} is provided. It requires three Fortran 77 programs ({\tt readhb.f}, {\tt readhb\_nozeros.f}, and {\tt readhb\_size.f}) for reading in the sample Harwell/Boeing files in the {\tt UMFPACK/Demo/HB} directory. More matrices are available at http://www.cise.ufl.edu/research/sparse/matrices. Type {\tt make hb} in the {\tt UMFPACK/Demo/HB} directory to compile and run this demo. This program was used for the experimental results in \cite{Davis03}. %------------------------------------------------------------------------------- \section{Synopsis of C-callable routines} \label{Synopsis} %------------------------------------------------------------------------------- Each subsection, below, summarizes the input variables, output variables, return values, and calling sequences of the routines in one category. Variables with the same name as those already listed in a prior category have the same size and type. The real, {\tt UF\_long} integer {\tt umfpack\_dl\_*} routines are identical to the real, {\tt int} routines, except that {\tt \_di\_} is replaced with {\tt \_dl\_} in the name, and all {\tt int} arguments become {\tt UF\_long}. Similarly, the complex, {\tt UF\_long} integer {\tt umfpack\_zl\_*} routines are identical to the complex, {\tt int} routines, except that {\tt \_zi\_} is replaced with {\tt \_zl\_} in the name, and all {\tt int} arguments become {\tt UF\_long}. Only the real and complex {\tt int} versions are listed in the synopsis below. The matrix $\m{A}$ is {\tt m}-by-{\tt n} with {\tt nz} entries. The {\tt sys} argument of {\tt umfpack\_*\_solve} is an integer in the range 0 to 14 which defines which linear system is to be solved. \footnote{Integer values for {\tt sys} are used instead of strings (as in LINPACK and LAPACK) to avoid C-to-Fortran portability issues.} Valid values are listed in Table~\ref{sys}. The notation $\m{A}\he$ refers to the matrix transpose, which is the complex conjugate transpose for complex matrices ({\tt A'} in MATLAB). The array transpose is $\m{A}\tr$, which is {\tt A.'} in MATLAB. \begin{table} \begin{center} \caption{UMFPACK {\tt sys} parameter} \label{sys} {\footnotesize \begin{tabular}{ll|l} \hline Value & & system \\ \hline & & \\ {\tt UMFPACK\_A} & (0) & $\m{Ax}=\m{b}$ \\ {\tt UMFPACK\_At} & (1) & $\m{A}\he\m{x}=\m{b}$ \\ {\tt UMFPACK\_Aat} & (2) & $\m{A}\tr\m{x}=\m{b}$ \\ & & \\ \hline & & \\ {\tt UMFPACK\_Pt\_L} & (3) & $\m{P}\tr\m{Lx}=\m{b}$ \\ {\tt UMFPACK\_L} & (4) & $\m{Lx}=\m{b}$ \\ {\tt UMFPACK\_Lt\_P} & (5) & $\m{L}\he\m{Px}=\m{b}$ \\ {\tt UMFPACK\_Lat\_P} & (6) & $\m{L}\tr\m{Px}=\m{b}$ \\ {\tt UMFPACK\_Lt} & (7) & $\m{L}\he\m{x}=\m{b}$ \\ {\tt UMFPACK\_Lat} & (8) & $\m{L}\tr\m{x}=\m{b}$ \\ & & \\ \hline & & \\ {\tt UMFPACK\_U\_Qt} & (9) & $\m{UQ}\tr\m{x}=\m{b}$ \\ {\tt UMFPACK\_U} & (10) & $\m{Ux}=\m{b}$ \\ {\tt UMFPACK\_Q\_Ut} & (11) & $\m{QU}\he\m{x}=\m{b}$ \\ {\tt UMFPACK\_Q\_Uat} & (12) & $\m{QU}\tr\m{x}=\m{b}$ \\ {\tt UMFPACK\_Ut} & (13) & $\m{U}\he\m{x}=\m{b}$ \\ {\tt UMFPACK\_Uat} & (14) & $\m{U}\tr\m{x}=\m{b}$ \\ & & \\ \hline \end{tabular} } \end{center} \end{table} %------------------------------------------------------------------------------- \subsection{Primary routines: real/{\tt int}} %------------------------------------------------------------------------------- {\footnotesize \begin{verbatim} #include "umfpack.h" int status, sys, n, m, nz, Ap [n+1], Ai [nz] ; double Control [UMFPACK_CONTROL], Info [UMFPACK_INFO], Ax [nz], X [n], B [n] ; void *Symbolic, *Numeric ; status = umfpack_di_symbolic (m, n, Ap, Ai, Ax, &Symbolic, Control, Info) ; status = umfpack_di_numeric (Ap, Ai, Ax, Symbolic, &Numeric, Control, Info) ; status = umfpack_di_solve (sys, Ap, Ai, Ax, X, B, Numeric, Control, Info) ; umfpack_di_free_symbolic (&Symbolic) ; umfpack_di_free_numeric (&Numeric) ; \end{verbatim} } %------------------------------------------------------------------------------- \subsection{Alternative routines: real/{\tt int}} %------------------------------------------------------------------------------- {\footnotesize \begin{verbatim} int Qinit [n], Wi [n] ; double W [5*n] ; umfpack_di_defaults (Control) ; status = umfpack_di_qsymbolic (m, n, Ap, Ai, Ax, Qinit, &Symbolic, Control, Info) ; status = umfpack_di_wsolve (sys, Ap, Ai, Ax, X, B, Numeric, Control, Info, Wi, W) ; \end{verbatim} } %------------------------------------------------------------------------------- \subsection{Matrix manipulation routines: real/{\tt int}} %------------------------------------------------------------------------------- {\footnotesize \begin{verbatim} int Ti [nz], Tj [nz], P [m], Q [n], Rp [m+1], Ri [nz], Map [nz] ; double Tx [nz], Rx [nz], Y [m], Z [m] ; status = umfpack_di_col_to_triplet (n, Ap, Tj) ; status = umfpack_di_triplet_to_col (m, n, nz, Ti, Tj, Tx, Ap, Ai, Ax, Map) ; status = umfpack_di_transpose (m, n, Ap, Ai, Ax, P, Q, Rp, Ri, Rx) ; status = umfpack_di_scale (Y, Z, Numeric) ; \end{verbatim} } %------------------------------------------------------------------------------- \subsection{Getting the contents of opaque objects: real/{\tt int}} %------------------------------------------------------------------------------- The {\tt filename} string should be large enough to hold the name of a file. {\footnotesize \begin{verbatim} int lnz, unz, Lp [m+1], Lj [lnz], Up [n+1], Ui [unz], do_recip ; double Lx [lnz], Ux [unz], D [min (m,n)], Rs [m], Mx [1], Ex [1] ; int nfr, nchains, P1 [m], Q1 [n], Front_npivcol [n+1], Front_parent [n+1], Front_1strow [n+1], Front_leftmostdesc [n+1], Chain_start [n+1], Chain_maxrows [n+1], Chain_maxcols [n+1] ; char filename [100] ; status = umfpack_di_get_lunz (&lnz, &unz, &m, &n, &nz_udiag, Numeric) ; status = umfpack_di_get_numeric (Lp, Lj, Lx, Up, Ui, Ux, P, Q, D, &do_recip, Rs, Numeric) ; status = umfpack_di_get_symbolic (&m, &n, &n1, &nz, &nfr, &nchains, P1, Q1, Front_npivcol, Front_parent, Front_1strow, Front_leftmostdesc, Chain_start, Chain_maxrows, Chain_maxcols, Symbolic) ; status = umfpack_di_load_numeric (&Numeric, filename) ; status = umfpack_di_save_numeric (Numeric, filename) ; status = umfpack_di_load_symbolic (&Symbolic, filename) ; status = umfpack_di_save_symbolic (Symbolic, filename) ; status = umfapck_di_get_determinant (Mx, Ex, Numeric, Info) ; \end{verbatim} } %------------------------------------------------------------------------------- \subsection{Reporting routines: real/{\tt int}} %------------------------------------------------------------------------------- {\footnotesize \begin{verbatim} umfpack_di_report_status (Control, status) ; umfpack_di_report_control (Control) ; umfpack_di_report_info (Control, Info) ; status = umfpack_di_report_matrix (m, n, Ap, Ai, Ax, 1, Control) ; status = umfpack_di_report_matrix (m, n, Rp, Ri, Rx, 0, Control) ; status = umfpack_di_report_numeric (Numeric, Control) ; status = umfpack_di_report_perm (m, P, Control) ; status = umfpack_di_report_perm (n, Q, Control) ; status = umfpack_di_report_symbolic (Symbolic, Control) ; status = umfpack_di_report_triplet (m, n, nz, Ti, Tj, Tx, Control) ; status = umfpack_di_report_vector (n, X, Control) ; \end{verbatim} } %------------------------------------------------------------------------------- \subsection{Primary routines: complex/{\tt int}} %------------------------------------------------------------------------------- {\footnotesize \begin{verbatim} double Az [nz], Xx [n], Xz [n], Bx [n], Bz [n] ; status = umfpack_zi_symbolic (m, n, Ap, Ai, Ax, Az, &Symbolic, Control, Info) ; status = umfpack_zi_numeric (Ap, Ai, Ax, Az, Symbolic, &Numeric, Control, Info) ; status = umfpack_zi_solve (sys, Ap, Ai, Ax, Az, Xx, Xz, Bx, Bz, Numeric, Control, Info) ; umfpack_zi_free_symbolic (&Symbolic) ; umfpack_zi_free_numeric (&Numeric) ; \end{verbatim} } The arrays {\tt Ax}, {\tt Bx}, and {\tt Xx} double in size if any imaginary argument ({\tt Az}, {\tt Xz}, or {\tt Bz}) is {\tt NULL}. %------------------------------------------------------------------------------- \subsection{Alternative routines: complex/{\tt int}} %------------------------------------------------------------------------------- {\footnotesize \begin{verbatim} double Wz [10*n] ; umfpack_zi_defaults (Control) ; status = umfpack_zi_qsymbolic (m, n, Ap, Ai, Ax, Az, Qinit, &Symbolic, Control, Info) ; status = umfpack_zi_wsolve (sys, Ap, Ai, Ax, Az, Xx, Xz, Bx, Bz, Numeric, Control, Info, Wi, Wz) ; \end{verbatim} } %------------------------------------------------------------------------------- \subsection{Matrix manipulation routines: complex/{\tt int}} %------------------------------------------------------------------------------- {\footnotesize \begin{verbatim} double Tz [nz], Rz [nz], Yx [m], Yz [m], Zx [m], Zz [m] ; status = umfpack_zi_col_to_triplet (n, Ap, Tj) ; status = umfpack_zi_triplet_to_col (m, n, nz, Ti, Tj, Tx, Tz, Ap, Ai, Ax, Az, Map) ; status = umfpack_zi_transpose (m, n, Ap, Ai, Ax, Az, P, Q, Rp, Ri, Rx, Rz, 1) ; status = umfpack_zi_transpose (m, n, Ap, Ai, Ax, Az, P, Q, Rp, Ri, Rx, Rz, 0) ; status = umfpack_zi_scale (Yx, Yz, Zx, Zz, Numeric) ; \end{verbatim} } The arrays {\tt Tx}, {\tt Rx}, {\tt Yx}, and {\tt Zx} double in size if any imaginary argument ({\tt Tz}, {\tt Rz}, {\tt Yz}, or {\tt Zz}) is {\tt NULL}. %------------------------------------------------------------------------------- \subsection{Getting the contents of opaque objects: complex/{\tt int}} %------------------------------------------------------------------------------- {\footnotesize \begin{verbatim} double Lz [lnz], Uz [unz], Dx [min (m,n)], Dz [min (m,n)], Mz [1] ; status = umfpack_zi_get_lunz (&lnz, &unz, &m, &n, &nz_udiag, Numeric) ; status = umfpack_zi_get_numeric (Lp, Lj, Lx, Lz, Up, Ui, Ux, Uz, P, Q, Dx, Dz, &do_recip, Rs, Numeric) ; status = umfpack_zi_get_symbolic (&m, &n, &n1, &nz, &nfr, &nchains, P1, Q1, Front_npivcol, Front_parent, Front_1strow, Front_leftmostdesc, Chain_start, Chain_maxrows, Chain_maxcols, Symbolic) ; status = umfpack_zi_load_numeric (&Numeric, filename) ; status = umfpack_zi_save_numeric (Numeric, filename) ; status = umfpack_zi_load_symbolic (&Symbolic, filename) ; status = umfpack_zi_save_symbolic (Symbolic, filename) ; status = umfapck_zi_get_determinant (Mx, Mz, Ex, Numeric, Info) ; \end{verbatim} } The arrays {\tt Lx}, {\tt Ux}, {\tt Dx}, and {\tt Mx} double in size if any imaginary argument ({\tt Lz}, {\tt Uz}, {\tt Dz}, or {\tt Mz}) is {\tt NULL}. %------------------------------------------------------------------------------- \subsection{Reporting routines: complex/{\tt int}} %------------------------------------------------------------------------------- {\footnotesize \begin{verbatim} umfpack_zi_report_status (Control, status) ; umfpack_zi_report_control (Control) ; umfpack_zi_report_info (Control, Info) ; status = umfpack_zi_report_matrix (m, n, Ap, Ai, Ax, Az, 1, Control) ; status = umfpack_zi_report_matrix (m, n, Rp, Ri, Rx, Rz, 0, Control) ; status = umfpack_zi_report_numeric (Numeric, Control) ; status = umfpack_zi_report_perm (m, P, Control) ; status = umfpack_zi_report_perm (n, Q, Control) ; status = umfpack_zi_report_symbolic (Symbolic, Control) ; status = umfpack_zi_report_triplet (m, n, nz, Ti, Tj, Tx, Tz, Control) ; status = umfpack_zi_report_vector (n, Xx, Xz, Control) ; \end{verbatim} } The arrays {\tt Ax}, {\tt Rx}, {\tt Tx}, and {\tt Xx} double in size if any imaginary argument ({\tt Az}, {\tt Rz}, {\tt Tz}, or {\tt Xz}) is {\tt NULL}. %------------------------------------------------------------------------------- \subsection{Utility routines} %------------------------------------------------------------------------------- These routines are the same in all four versions of UMFPACK. {\footnotesize \begin{verbatim} double t, s [2] ; t = umfpack_timer ( ) ; umfpack_tic (s) ; umfpack_toc (s) ; \end{verbatim} } %------------------------------------------------------------------------------- \subsection{AMD ordering routines} %------------------------------------------------------------------------------- UMFPACK makes use of the AMD ordering package for its symmetric ordering strategy. You may also use these four user-callable routines in your own C programs. You need to include the {\tt amd.h} file only if you make direct calls to the AMD routines themselves. The {\tt int} versions are summarized below; {\tt UF\_long} versions are also available. Refer to the AMD User Guide for more information, or to the file {\tt amd.h} which documents these routines. {\footnotesize \begin{verbatim} #include "amd.h" double amd_control [AMD_CONTROL], amd_info [AMD_INFO] ; amd_defaults (amd_control) ; status = amd_order (n, Ap, Ai, P, amd_control, amd_info) ; amd_control (amd_control) ; amd_info (amd_info) ; \end{verbatim} } %------------------------------------------------------------------------------- \section{Using UMFPACK in a Fortran program} %------------------------------------------------------------------------------- UMFPACK includes a basic Fortran 77 interface to some of the C-callable UMFPACK routines. Since interfacing C and Fortran programs is not portable, this interface might not work with all C and Fortran compilers. Refer to Section~\ref{Install} for more details. The following Fortran routines are provided. The list includes the C-callable routines that the Fortran interface routine calls. Refer to the corresponding C routines in Section~\ref{C} for more details on what the Fortran routine does. \begin{itemize} \item {\tt umf4def}: sets the default control parameters ({\tt umfpack\_di\_defaults}). \item {\tt umf4sym}: pre-ordering and symbolic factorization ({\tt umfpack\_di\_symbolic}). \item {\tt umf4num}: numeric factorization ({\tt umfpack\_di\_numeric}). \item {\tt umf4solr}: solve a linear system with iterative refinement ({\tt umfpack\_di\_solve}). \item {\tt umf4sol}: solve a linear system without iterative refinement ({\tt umfpack\_di\_solve}). Sets {\tt Control [UMFPACK\_IRSTEP]} to zero, and does not require the matrix $\m{A}$. \item {\tt umf4scal}: scales a vector using UMFPACK's scale factors ({\tt umfpack\_di\_scale}). \item {\tt umf4fnum}: free the {\tt Numeric} object ({\tt umfpack\_di\_free\_numeric}). \item {\tt umf4fsym}: free the {\tt Symbolic} object ({\tt umfpack\_di\_free\_symbolic}). \item {\tt umf4pcon}: prints the control parameters ({\tt umfpack\_di\_report\_control}). \item {\tt umf4pinf}: print statistics ({\tt umfpack\_di\_report\_info}). \item {\tt umf4snum}: save the {\tt Numeric} object to a file ({\tt umfpack\_di\_save\_numeric}). \item {\tt umf4ssym}: save the {\tt Symbolic} object to a file ({\tt umfpack\_di\_save\_symbolic}). \item {\tt umf4lnum}: load the {\tt Numeric} object from a file ({\tt umfpack\_di\_load\_numeric}). \item {\tt umf4lsym}: load the {\tt Symbolic} object from a file ({\tt umfpack\_di\_load\_symbolic}). \end{itemize} The matrix $\m{A}$ is passed to UMFPACK in compressed column form, with 0-based indices. In Fortran, for an {\tt m}-by-{\tt n} matrix $\m{A}$ with {\tt nz} entries, the row indices of the first column (column 1) are in {\tt Ai (Ap(1)+1} \ldots {\tt Ap(2))}, with values in {\tt Ax (Ap(1)+1} \ldots {\tt Ap(2))}. The last column (column {\tt n}) is in {\tt Ai (Ap(n)+1} \ldots {\tt Ap(n+1))} and {\tt Ax (Ap(n)+1} \ldots {\tt Ap(n+1))}. The number of entries in the matrix is thus {\tt nz = Ap (n+1)}. The row indices in {\tt Ai} are in the range 0 to {\tt m}-1. They must be sorted, with no duplicate entries allowed. None of the UMFPACK routines modify the input matrix $\m{A}$. The following definitions apply for the Fortran routines: {\footnotesize \begin{verbatim} integer m, n, Ap (n+1), Ai (nz), symbolic, numeric, filenum, status double precision Ax (nz), control (20), info (90), x (n), b (n) \end{verbatim} } UMFPACK's status is returned in either a {\tt status} argument, or in {\tt info (1)}. It is zero if UMFPACK was successful, 1 if the matrix is singular (this is a warning, not an error), and negative if an error occurred. Section~\ref{error_codes} summarizes the possible values of {\tt status} and {\tt info (1)}. See Table~\ref{sys} for a list of the values of the {\tt sys} argument. See Table~\ref{control} for a list of the control parameters (the Fortran usage is the same as the MATLAB usage for this array). For the {\tt Numeric} and {\tt Symbolic} handles, it is probably safe to assume that a Fortran {\tt integer} is sufficient to store a C pointer. If that does not work, try defining {\tt numeric} and {\tt symbolic} in your Fortran program as integer arrays of size 2. You will need to define them as {\tt integer*8} if you compile UMFPACK in the 64-bit mode. To avoid passing strings between C and Fortran in the load/save routines, a file number is passed instead, and the C interface constructs a file name (if {\tt filenum} is 42, the {\tt Numeric} file name is {\tt n42.umf}, and the {\tt Symbolic} file name is {\tt s42.umf}). The following is a summary of the calling sequence of each Fortran interface routine. An example of their use is in the {\tt Demo/umf4hb.f} file. That routine also includes an example of how to convert a 1-based sparse matrix into 0-based form. For more details on the arguments of each routine, refer to the arguments of the same name in the corresponding C-callable routine, in Sections~\ref{Primary}~through~\ref{Utility}. The only exception is the {\tt control} argument of {\tt umf4sol}, which sets {\tt control (8)} to zero to disable iterative refinement. Note that the solve routines do not overwrite {\tt b} with the solution, but return their solution in a different array, {\tt x}. {\footnotesize \begin{verbatim} call umf4def (control) call umf4sym (m, n, Ap, Ai, Ax, symbolic, control, info) call umf4num (Ap, Ai, Ax, symbolic, numeric, control, info) call umf4solr (sys, Ap, Ai, Ax, x, b, numeric, control, info) call umf4sol (sys, x, b, numeric, control, info) call umf4scal (x, b, numeric, status) call umf4fnum (numeric) call umf4fsym (symbolic) call umf4pcon (control) call umf4pinf (control) call umf4snum (numeric, filenum, status) call umf4ssym (symbolic, filenum, status) call umf4lnum (numeric, filenum, status) call umf4lsym (symbolic, filenum, status) \end{verbatim} } Access to the complex routines in UMFPACK is provided by the interface routines in {\tt umf4\_f77zwrapper.c}. The following is a synopsis of each routine. All the arguments are the same as the real versions, except {\tt Az}, {\tt xz}, and {\tt bz} are the imaginary parts of the matrix, solution, and right-hand side, respectively. The {\tt Ax}, {\tt x}, and {\tt b} are the real parts. {\footnotesize \begin{verbatim} call umf4zdef (control) call umf4zsym (m, n, Ap, Ai, Ax, Az, symbolic, control, info) call umf4znum (Ap, Ai, Ax, Az, symbolic, numeric, control, info) call umf4zsolr (sys, Ap, Ai, Ax, Az, x, xz, b, bz, numeric, control, info) call umf4zsol (sys, x, xz, b, bz, numeric, control, info) call umf4zscal (x, xz, b, bz, numeric, status) call umf4zfnum (numeric) call umf4zfsym (symbolic) call umf4zpcon (control) call umf4zpinf (control) call umf4zsnum (numeric, filenum, status) call umf4zssym (symbolic, filenum, status) call umf4zlnum (numeric, filenum, status) call umf4zlsym (symbolic, filenum, status) \end{verbatim} } The Fortran interface does not support the packed complex case. %------------------------------------------------------------------------------- \section{Installation} \label{Install} %------------------------------------------------------------------------------- %------------------------------------------------------------------------------- \subsection{Installing the C library} %------------------------------------------------------------------------------- The following discussion assumes you have the {\tt make} program, either in Unix, or in Windows with Cygwin\footnote{www.cygwin.com}. You can skip this section and go to next one if all you want to use is the UMFPACK and AMD mexFunctions in MATLAB. You will need to install both UMFPACK v5.0 and AMD v2.0 to use UMFPACK. The {\tt UMFPACK} and {\tt AMD} subdirectories must be placed side-by-side within the same directory. AMD is a stand-alone package that is required by UMFPACK. UMFPACK can be compiled without the BLAS \cite{DaydeDuff99,ACM679a,ATLAS,GotoVandeGeijn02}, but your performance will be much less than what it should be. System-dependent configurations are in the {\tt UFconfig/UFconfig.mk} file. The default settings will work on most systems, except that UMFPACK will be compiled so that it does not use the BLAS. Sample configurations are provided for Linux, Sun Solaris, SGI IRIX, IBM AIX, and the DEC/Compaq Alpha. To compile and install both packages, go to the {\tt UMFPACK} directory and type {\tt make}. This will compile the libraries ({\tt AMD/Lib/libamd.a} and {\tt UMFPACK/Lib/libumfpack.a}). A demo of the AMD ordering routine will be compiled and tested in the {\tt AMD/Demo} directory, and five demo programs will then be compiled and tested in the {\tt UMFPACK/Demo} directory. The outputs of these demo programs will then be compared with output files in the distribution. Expect to see a few differences, such as residual norms, compile-time control settings, and perhaps memory usage differences. To use {\tt make} to compile the MATLAB mexFunctions for MATLAB and AMD, you can either type {\tt make mex} in the UMFPACK directory. You may first need to edit the {\tt UFconfig/UFconfig.mk} file to modify the definition of the {\tt MEX}, if you have a version of MATLAB older than Version 7.2. Remove the {\tt -largeArrayDims} definition. If you use the MATLAB command {\tt umfpack\_make} in the MATLAB directory, then this case is handled for you automatically. If you have the GNU version of {\tt make}, the {\tt Lib/GNUmakefile} and {\tt MATLAB/GNUmakefile} files are used. These are much more concise than what the ``old'' version of {\tt make} can handle. If you do not have GNU {\tt make}, the {\tt Lib/Makefile} and {\tt MATLAB/Makefile} files are used instead. Each UMFPACK source file is compiled into four versions ({\tt double} / complex, and {\tt int} / {\tt UF\_long}). A proper old-style {\tt Makefile} is cumbersome in this case, so these two {\tt Makefile}'s have been constructed by brute force. They ignore dependencies, and simply compile everything. I highly recommend using GNU {\tt make} if you wish to modify UMFPACK. If you compile UMFPACK and AMD and then later change the {\tt UFconfig/UFconfig.mk} file then you should type {\tt make purge} and then {\tt make} to recompile. Here are the various parameters that you can control in your {\tt UFconfig/UFconfig.mk} file: \begin{itemize} \item {\tt CC = } your C compiler, such as {\tt cc}. \item {\tt RANLIB = } your system's {\tt ranlib} program, if needed. \item {\tt CFLAGS = } optimization flags, such as {\tt -O}. \item {\tt UMFPACK\_CONFIG = } configuration settings for the BLAS, memory allocation routines, and timing routines. \item {\tt LIB = } your libraries, such as {\tt -lm} or {\tt -lblas}. \item {\tt RM =} the command to delete a file. \item {\tt MV =} the command to rename a file. \item {\tt MEX =} the command to compile a MATLAB mexFunction. If you are using MATLAB 5, you need to add {\tt -DNBLAS} and {\tt -DNUTIL} to this command. \item {\tt F77 =} the command to compile a Fortran program (optional). \item {\tt F77FLAGS =} the Fortran compiler flags (optional). \item {\tt F77LIB =} the Fortran libraries (optional). \end{itemize} The {\tt UMFPACK\_CONFIG} string can include combinations of the following; most deal with how the BLAS are called: \begin{itemize} \item {\tt -DNBLAS} if you do not have any BLAS at all. \item {\tt -DNSUNPERF} if you are on Solaris but do not have the Sun Performance Library (for the BLAS). \item {\tt -DLONGBLAS} if your BLAS takes non-{\tt int} integer arguments. \item {\tt -DBLAS\_INT = } the integer used by the BLAS. \item {\tt -DBLAS\_NO\_UNDERSCORE} for controlling how C calls the Fortran BLAS. This is set automatically for Windows, Sun Solaris, SGI Irix, Red Hat Linux, Compaq Alpha, and AIX (the IBM RS 6000). \item {\tt -DGETRUSAGE} if you have the {\tt getrusage} function. \item {\tt -DNPOSIX} if you do not have the POSIX-compliant {\tt sysconf} and {\tt times} routines used by {\tt umfpack\_tic} and {\tt umfpack\_toc}. \item {\tt -DNRECIPROCAL} controls a trade-off between speed and accuracy. If defined (or if the pivot value itself is less than $10^{-12}$), then the pivot column is divided by the pivot value during numeric factorization. Otherwise, it is multiplied by the reciprocal of the pivot, which is faster but can be less accurate. The default is to multiply by the reciprocal unless the pivot value is small. This option also modifies how the rows of the matrix $\m{A}$ are scaled. If {\tt -DNRECIPROCAL} is defined (or if any scale factor is less than $10^{-12}$), entries in the rows of $\m{A}$ are divided by the scale factors. Otherwise, they are multiplied by the reciprocal. When compiling the complex routines with the GNU {\tt gcc} compiler, the pivot column is always divided by the pivot entry, because of a numerical accuracy issue encountered with {\tt gcc} version 3.2 with a few complex matrices on a Pentium 4M (running Linux). You can still use {\tt -DNRECIPROCAL} to control how the scale factors for the rows of $\m{A}$ are applied. \item {\tt -DNO\_DIVIDE\_BY\_ZERO} controls how UMFPACK treats zeros on the diagonal of $\m{U}$, for a singular matrix $\m{A}$. If defined, then no division by zero is performed (a zero entry on the diagonal of $\m{U}$ is treated as if it were equal to one). By default, UMFPACK will divide by zero. \item {\tt -DNO\_TIMER} controls whether or not timing routines are to be called. If defined, no timers are used. Timers are included by default. \end{itemize} If a Fortran BLAS package is used you may see compiler warnings. The BLAS routines {\tt dgemm}, {\tt dgemv}, {\tt dger}, {\tt dtrsm}, {\tt dtrsv}, {\tt dscal} and their corresponding complex versions are used. Header files are not provided for the Fortran BLAS. You may safely ignore all of these warnings. I highly recommend the recent BLAS by Goto and van de Geijn \cite{GotoVandeGeijn02}. Using this BLAS increased the performance of UMFPACK by up to 50\% on a Dell Latitude C840 laptop (2GHz Pentium 4M, 512K L2 cache, 1GB main memory). The peak performance of {\tt umfpack\_di\_numeric} with Goto and van de Geijn's BLAS is 1.6 Gflops on this computer. In MATLAB, the peak performance of UMFPACK on a dense matrix (stored in sparse format) is 900 Mflops, as compared to 1 Gflop for {\tt x = A}$\backslash${\tt b} when {\tt A} is stored as a regular full matrix. When you compile your program that uses the C-callable UMFPACK library, you need to link your program with both libraries ({\tt UMFPACK/Lib/libumfpack.a} and {\tt AMD/Lib/libamd.a}) and you need to tell your compiler to look in the directories {\tt UMFPACK/Include} and {\tt AMD/Include} for include files. See {\tt UMFPACK/Demo/Makefile} for an example. You do not need to directly include any AMD include files in your program, unless you directly call AMD routines. You only need the \begin{verbatim} #include "umfpack.h" \end{verbatim} statement, as described in Section~\ref{Synopsis}. If you would like to compile both 32-bit and 64-bit versions of the libraries, you will need to do it in two steps. Modify your {\tt UFconfig/UFconfig.mk} file, and select the 32-bit option. Type {\tt make} in the {\tt UMFPACK} directory, which creates the {\tt UMFPACK/Lib/libumfpack.a} and {\tt AMD/Lib/libamd.a} libraries. Rename those two files. Edit your {\tt UFconfig/UFconfig.mk} file and select the 64-bit option. Type {\tt make purge}, and then {\tt make}, and you will create the 64-bit libraries. You can use the same {\tt umfpack.h} include file for both 32-bit and 64-bit versions. Simply link your program with the appropriate 32-bit or 64-bit compiled version of the UMFPACK and AMD libraries. Type {\tt make hb} in the {\tt UMFPACK/Demo/HB} directory to compile and run a C program that reads in and factorizes Harwell/Boeing matrices. Note that this uses a stand-alone Fortran program to read in the Fortran-formatted Harwell/Boeing matrices and write them to a file which can be read by a C program. The {\tt umf\_multicompile.c} file has been added to assist in the compilation of UMFPACK in Microsoft Visual Studio, for Windows. %------------------------------------------------------------------------------- \subsection{Installing the MATLAB interface} %------------------------------------------------------------------------------- If all you want to do is use the UMFPACK mexFunction in MATLAB, you can skip the use of the {\tt make} command described above. Simply type {\tt umfpack\_make} in MATLAB while in the {\tt UMFPACK/MATLAB} directory. You can also type {\tt amd\_make} in the {\tt AMD/MATLAB} directory to compile the stand-alone AMD mexFunction (this is not required to compile the UMFPACK mexFunction). This works on any computer with MATLAB, including Windows. % You will be prompted to select several configuration options, including % whether or not to use the BLAS. % MATLAB 5.3 (or earlier) does not include the BLAS, so you either have to % compile UMFPACK without the BLAS (UMFPACK will be slow), or modify your % {\tt /bin/mexopts.sh} by adding your BLAS library % to the {\tt CLIBS} string, % where {\tt } is the directory in which MATLAB is installed. If you are using Windows and the {\tt lcc} compiler bundled with MATLAB 6.1, then you may need to copy the {\tt UMFPACK}$\backslash${\tt MATLAB}$\backslash${\tt lcc\_lib}$\backslash${\tt libmwlapack.lib} file into the {\tt }$\backslash${\tt extern}$\backslash${\tt lib}$\backslash${\tt win32}$\backslash${\tt lcc}$\backslash$ directory. Next, type {\tt mex -setup} at the MATLAB prompt, and ask MATLAB to select the {\tt lcc} compiler. MATLAB 6.1 has built-in BLAS, but in that version of MATLAB the BLAS cannot be accessed by a mexFunction compiled by {\tt lcc} without first copying this file to the location listed above. If you have MATLAB 6.5 or later, you can probably skip this step. %------------------------------------------------------------------------------- \subsection{Installing the Fortran interface} %------------------------------------------------------------------------------- Once the 32-bit C-callable UMFPACK library is compiled, you can also compile the Fortran interface, by typing {\tt make fortran}. This will create the {\tt umf4hb} program, test it, and compare the output with the file {\tt umf4hb.out} in the distribution. If you compiled UMFPACK in 64-bit mode, you need to use {\tt make fortran64} instead, which compiles the {\tt umf4hb64} program and compares its output with the file {\tt umf4hb64.out}. Refer to the comments in the {\tt Demo/umf4\_f77wrapper.c} file for more details. This interface is {\bf highly} non-portable, since it depends on how C and Fortran are interfaced. Because of this issue, the interface is included in the {\tt Demo} directory, and not as a primary part of the UMFPACK library. The interface routines are not included in the compiled {\tt UMFPACK/Lib/libumfpack.a} library, but left as stand-alone compiled files ({\tt umf4\_f77wrapper.o} and {\tt umf4\_f77wrapper64.o} in the {\tt Demo} directory). You may need to modify the interface routines in the file {\tt umf4\_f77wrapper.c} if you are using compilers for which this interface has not been tested. %------------------------------------------------------------------------------- \subsection{Known Issues} %------------------------------------------------------------------------------- The Microsoft C or C++ compilers on a Pentium badly break the IEEE 754 standard, and do not treat NaN's properly. According to IEEE 754, the expression {\tt (x != x)} is supposed to be true if and only if {\tt x} is NaN. For non-compliant compilers in Windows that expression is always false, and another test must be used: {\tt (x < x)} is true if and only if {\tt x} is NaN. For compliant compilers, {\tt (x < x)} is always false, for any value of {\tt x} (including NaN). To cover both cases, UMFPACK when running under Microsoft Windows defines the following macro, which is true if and only if {\tt x} is NaN, regardless of whether your compiler is compliant or not: \begin{verbatim} #define SCALAR_IS_NAN(x) (((x) != (x)) || ((x) < (x))) \end{verbatim} If your compiler breaks this test, then UMFPACK will fail catastrophically if it encounters a NaN. You will not just see NaN's in your output; UMFPACK will probably crash with a segmentation fault. In that case, you might try to see if the common (but non-ANSI C) routine {\tt isnan} is available, and modify the macro {\tt SCALAR\_IS\_NAN} in {\tt umf\_version.h} accordingly. The simpler (and IEEE 754-compliant) test {\tt (x != x)} is always true with Linux on a PC, and on every Unix compiler I have tested. Some compilers will complain about the Fortran BLAS being defined implicitly. C prototypes for the BLAS are not used, except the C-BLAS. Some compilers will complain about unrecognized {\tt \#pragma}'s. You may safely ignore all of these warnings. %------------------------------------------------------------------------------- \section{Future work} \label{Future} %------------------------------------------------------------------------------- Here are a few features that are not in the current version of UMFPACK, in no particular order. They may appear in a future release of UMFPACK. If you are interested, let me know and I could consider including them: \begin{enumerate} \item Remove the restriction that the column-oriented form be given with sorted columns. This has already been done in AMD Version 2.0. \item Future versions may have different default {\tt Control} parameters. Future versions may return more statistics in the {\tt Info} array, and they may use more entries in the {\tt Control} array. These two arrays will probably become larger, since there are very few unused entries. If they change in size, the constants {\tt UMFPACK\_CONTROL} and {\tt UMFPACK\_INFO} defined in {\tt umfpack.h} will be changed to reflect their new size. Your C program should use these constants when declaring the size of these two arrays. Do not define them as {\tt Control [20]} and {\tt Info [90]}. \item Forward/back solvers for the conventional row or column-form data structure for $\m{L}$ and $\m{U}$ (the output of {\tt umfpack\_*\_di\_get\_numeric}). This would enable a separate solver that could be used to write a MATLAB mexFunction {\tt x = lu\_refine (A, b, L, U, P, Q, R)} that gives MATLAB access to the iterative refinement algorithm with sparse backward error analysis. It would also be easier to handle sparse right-hand sides in this data structure, and end up with good asymptotic run-time in this case (particularly for $\m{Lx}=\m{b}$; see \cite{GilbertPeierls88}). See also CSparse and CXSparse for software for handling sparse right-hand sides. \item Complex absolute value computations could be based on FDLIBM (see \newline http://www.netlib.org/fdlibm), using the {\tt hypot(x,y)} routine. \item When using iterative refinement, the residual $\m{Ax}-\m{b}$ could be returned by {\tt umfpack\_solve}. \item The solve routines could handle multiple right-hand sides, and sparse right-hand sides. See {\tt umfpack\_solve} for the MATLAB version of this feature. See also CSparse and CXSparse for software for handling sparse right-hand sides. \item An option to redirect the error and diagnostic output. \item Permutation to block-triangular-form \cite{Duff78a} for the C-callable interface. There are two routines in the ACM Collected Algorithms (529 and 575) \cite{Duff81b,Duff78b} that could be translated from Fortran to C and included in UMFPACK. This would result in better performance for matrices from circuit simulation and chemical process engineering. See {\tt umfpack\_btf.m} for the MATLAB version of this feature. KLU includes this feature. See also {\tt cs\_dmperm} in CSparse and CXSparse. \item The ability to use user-provided work arrays, so that {\tt malloc}, {\tt free}, and {\tt realloc} realloc are not called. The {\tt umfpack\_*\_wsolve} routine is one example. \item A method that takes time proportional to the number of nonzeros in $\m{A}$ to compute the symbolic factorization \cite{GilbertNgPeyton94}. This would improve the performance of the symmetric and 2-by-2 strategies, and the unsymmetric strategy when dense rows are present. The current method takes time proportional to the number of nonzeros in the upper bound of $\m{U}$. The method used in UMFPACK exploits super-columns, however, so this bound is rarely reached. See {\tt cs\_counts} in CSparse and CXSparse, and {\tt cholmod\_analyze} in CHOLMOD. \item Other basic sparse matrix operations, such as sparse matrix multiplication, could be included. \item A more complete Fortran interface. \item A C++ interface. \item A parallel version using MPI. This would require a large amount of effort. \end{enumerate} %------------------------------------------------------------------------------- \newpage \section{The primary UMFPACK routines} \label{Primary} %------------------------------------------------------------------------------- The include files are the same for all four versions of UMFPACK. The generic integer type is {\tt Int}, which is an {\tt int} or {\tt UF\_long}, depending on which version of UMFPACK you are using. \subsection{umfpack\_*\_symbolic} {\footnotesize \begin{verbatim} INCLUDE umfpack_symbolic.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_numeric} {\footnotesize \begin{verbatim} INCLUDE umfpack_numeric.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_solve} {\footnotesize \begin{verbatim} INCLUDE umfpack_solve.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_free\_symbolic} {\footnotesize \begin{verbatim} INCLUDE umfpack_free_symbolic.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_free\_numeric} {\footnotesize \begin{verbatim} INCLUDE umfpack_free_numeric.h via sed \end{verbatim} } %------------------------------------------------------------------------------- \newpage \section{Alternative routines} \label{Alternative} %------------------------------------------------------------------------------- \subsection{umfpack\_*\_defaults} {\footnotesize \begin{verbatim} INCLUDE umfpack_defaults.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_qsymbolic} {\footnotesize \begin{verbatim} INCLUDE umfpack_qsymbolic.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_wsolve} {\footnotesize \begin{verbatim} INCLUDE umfpack_wsolve.h via sed \end{verbatim} } %------------------------------------------------------------------------------- \newpage \section{Matrix manipulation routines} \label{Manipulate} %------------------------------------------------------------------------------- \subsection{umfpack\_*\_col\_to\_triplet} {\footnotesize \begin{verbatim} INCLUDE umfpack_col_to_triplet.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_triplet\_to\_col} {\footnotesize \begin{verbatim} INCLUDE umfpack_triplet_to_col.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_transpose} {\footnotesize \begin{verbatim} INCLUDE umfpack_transpose.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_scale} {\footnotesize \begin{verbatim} INCLUDE umfpack_scale.h via sed \end{verbatim} } %------------------------------------------------------------------------------- \newpage \section{Getting the contents of opaque objects} \label{Get} %------------------------------------------------------------------------------- \subsection{umfpack\_*\_get\_lunz} {\footnotesize \begin{verbatim} INCLUDE umfpack_get_lunz.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_get\_numeric} {\footnotesize \begin{verbatim} INCLUDE umfpack_get_numeric.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_get\_symbolic} {\footnotesize \begin{verbatim} INCLUDE umfpack_get_symbolic.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_save\_numeric} {\footnotesize \begin{verbatim} INCLUDE umfpack_save_numeric.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_load\_numeric} {\footnotesize \begin{verbatim} INCLUDE umfpack_load_numeric.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_save\_symbolic} {\footnotesize \begin{verbatim} INCLUDE umfpack_save_symbolic.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_load\_symbolic} {\footnotesize \begin{verbatim} INCLUDE umfpack_load_symbolic.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_get\_determinant} {\footnotesize \begin{verbatim} INCLUDE umfpack_get_determinant.h via sed \end{verbatim} } %------------------------------------------------------------------------------- \newpage \section{Reporting routines} \label{Report} %------------------------------------------------------------------------------- \subsection{umfpack\_*\_report\_status} {\footnotesize \begin{verbatim} INCLUDE umfpack_report_status.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_report\_control} {\footnotesize \begin{verbatim} INCLUDE umfpack_report_control.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_report\_info} {\footnotesize \begin{verbatim} INCLUDE umfpack_report_info.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_report\_matrix} {\footnotesize \begin{verbatim} INCLUDE umfpack_report_matrix.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_report\_numeric} {\footnotesize \begin{verbatim} INCLUDE umfpack_report_numeric.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_report\_perm} {\footnotesize \begin{verbatim} INCLUDE umfpack_report_perm.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_report\_symbolic} {\footnotesize \begin{verbatim} INCLUDE umfpack_report_symbolic.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_report\_triplet} {\footnotesize \begin{verbatim} INCLUDE umfpack_report_triplet.h via sed \end{verbatim} } \newpage \subsection{umfpack\_*\_report\_vector} {\footnotesize \begin{verbatim} INCLUDE umfpack_report_vector.h via sed \end{verbatim} } %------------------------------------------------------------------------------- \newpage \section{Utility routines} \label{Utility} %------------------------------------------------------------------------------- \subsection{umfpack\_timer} {\footnotesize \begin{verbatim} INCLUDE umfpack_timer.h via sed \end{verbatim} } \newpage \subsection{umfpack\_tic and umfpack\_toc} {\footnotesize \begin{verbatim} INCLUDE umfpack_tictoc.h via sed \end{verbatim} } %------------------------------------------------------------------------------- \newpage % References %------------------------------------------------------------------------------- \bibliographystyle{plain} \bibliography{UserGuide} \end{document}