Alternating Poisson Regression for fitting CP to sparse count data
References:
- E. C. Chi, T. G. Kolda, On Tensors, Sparsity, and Nonnegative Factorizations, SIAM J. Matrix Analysis and Applications, 33:1272-1299, 2012, https://doi.org/10.1137/110859063
- S. Hansen, T. Plantenga, and T. G. Kolda, Newton-Based Optimization for Kullback-Leibler Nonnegative Tensor Factorizations, Optimization Methods and Software, 30(5):955-979, 2015, https://doi.org/10.1080/10556788.2015.1009977
Set up a sample problem
We follow the general procedure for creating a problem outlined by Chi and Kolda (2012). This creates a sparse count tensor with a known solution. The solution is a CP decomposition with a few large entries in each column of the factor matrices. The solution is normalized and sorted by component size in descending order.
rng('default') %<- Setting random seed for reproducibility of this script

% Pick the size and rank
sz = [100 80 60];
R = 5;

% Generate factor matrices with a few large entries in each column; this
% will be the basis of our soln.
A = cell(3,1);
for n = 1:length(sz)
    A{n} = rand(sz(n), R);
    for r = 1:R
        p = randperm(sz(n));
        nbig = round( (1/R)*sz(n) );
        A{n}(p(1:nbig),r) = 100 * A{n}(p(1:nbig),r);
    end
end
lambda = rand(R,1);
S = ktensor(lambda, A);
S = normalize(S,'sort',1);

% Create sparse test problem based on provided solution.
nz = prod(sz) * .05;
info = create_problem('Soln', S, 'Sparse_Generation', nz);

% Extract data and solution
X = info.Data;
M_true = info.Soln;
Call CP-APR
Alternating Poisson Regression (APR) fits a nonnegative CP decomposition to sparse count data by minimizing the Kullback-Leibler divergence between the data and the model. It is implemented in the cp_apr function, whose interface is similar to that of cp_als but which uses a different objective function and optimization method, making it particularly well suited to sparse count data.
The cp_apr function is a wrapper that calls one of three specific algorithms, selected by the 'alg' parameter:
- 'pqnr' (Default): Row subproblems are solved by Projected Quasi-Newton with L-BFGS. This method generally offers a good balance of speed and robustness and is suitable for a wide range of problems. It approximates the Hessian using gradient information from previous iterations. It is based on the work by Hansen, Plantenga, and Kolda (2015).
- 'pdnr': Row subproblems are solved by Projected Damped Newton. This method uses the exact Hessian for the row subproblems, which can lead to higher accuracy per iteration but may be more computationally intensive, especially for large R, as it involves forming and solving an R x R system at each inner iteration for each row. It is based on the work by Hansen, Plantenga, and Kolda (2015).
- 'mu': Multiplicative Update. This is a simpler algorithm, often with cheaper iterations. It can be slower to converge to high accuracy compared to Newton-based methods but can be effective for very large, sparse problems or for obtaining an initial guess quickly. It is based on the work by Chi and Kolda (2012).
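Regardless of which algorithm is selected, cp_apr accepts a common set of options. The sketch below illustrates typical usage; the parameter names ('tol', 'maxiters', 'init') are assumed from the Tensor Toolbox cp_apr interface, so check 'help cp_apr' for the exact names in your version.

```matlab
% A hedged sketch of options shared by all three algorithms; parameter
% names are assumed from the Tensor Toolbox cp_apr interface.
% ktensor({U1,U2,U3}) builds an initial guess with unit weights.
Minit = ktensor({rand(sz(1),R), rand(sz(2),R), rand(sz(3),R)});
M = cp_apr(X, R, ...
    'alg', 'pqnr', ...    % choose 'pqnr', 'pdnr', or 'mu'
    'init', Minit, ...    % custom initial guess (a ktensor)
    'tol', 1e-4, ...      % KKT stopping tolerance
    'maxiters', 100, ...  % maximum outer iterations
    'printitn', 0);       % suppress per-iteration output
```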
The following example uses the default 'pqnr' algorithm.
% Compute a solution using the default 'pqnr' algorithm
fprintf('--- Running CP-APR with PQNR (default) ---\n');
M_pqnr = cp_apr(X, R, 'printitn', 10);

% Score the solution (a score of 1 is perfect)
factor_match_score_pqnr = score(M_pqnr, M_true, 'greedy', true)
--- Running CP-APR with PQNR (default) ---
CP_PQNR (alternating Poisson regression using quasi-Newton)
Precomputing sparse index sets...done
10. Ttl Inner Its: 1576, KKT viol = 2.00e-01, obj = 1.43584130e+04, nz: 291
20. Ttl Inner Its: 336, KKT viol = 6.45e-03, obj = 1.25598428e+04, nz: 279
===========================================
Final f = 1.255984e+04
Final least squares fit = 5.232259e-01
Final KKT violation = 9.5452289e-05
Total inner iterations = 36709
Total execution time = 1.30 secs
factor_match_score_pqnr =
0.9745
Example using the 'pdnr' algorithm
Here, we explicitly select the 'pdnr' algorithm. We also reduce the maximum number of iterations for this demonstration.
fprintf('--- Running CP-APR with PDNR ---\n');
M_pdnr = cp_apr(X, R, 'alg', 'pdnr', 'printitn', 5, 'maxiters', 50);

% Score the solution
factor_match_score_pdnr = score(M_pdnr, M_true, 'greedy', true)
--- Running CP-APR with PDNR ---
CP_PDNR (alternating Poisson regression using damped Newton)
Precomputing sparse index sets...done
5. Ttl Inner Its: 1396, KKT viol = 2.95e+01, obj = 1.53165862e+04, nz: 290
10. Ttl Inner Its: 520, KKT viol = 3.67e-01, obj = 1.43750374e+04, nz: 283
15. Ttl Inner Its: 442, KKT viol = 9.47e-01, obj = 1.43681008e+04, nz: 284
20. Ttl Inner Its: 461, KKT viol = 3.25e-01, obj = 1.43555648e+04, nz: 292
25. Ttl Inner Its: 247, KKT viol = 2.25e+00, obj = 1.25980093e+04, nz: 265
30. Ttl Inner Its: 450, KKT viol = 1.45e-02, obj = 1.25593395e+04, nz: 280
35. Ttl Inner Its: 257, KKT viol = 2.48e-03, obj = 1.25593387e+04, nz: 281
40. Ttl Inner Its: 249, KKT viol = 4.21e-04, obj = 1.25593387e+04, nz: 281
===========================================
Final f = 1.255934e+04
Final least squares fit = 5.232251e-01
Final KKT violation = 9.8393337e-05
Total inner iterations = 18511
Total execution time = 1.28 secs
factor_match_score_pdnr =
0.9745
Example using the 'mu' algorithm
This example demonstrates the 'mu' algorithm. We can also set parameters specific to 'mu', like 'kappa'.
fprintf('--- Running CP-APR with MU ---\n');
M_mu = cp_apr(X, R, 'alg', 'mu', 'printitn', 20, 'maxiters', 200, 'kappa', 50);

% Score the solution
factor_match_score_mu = score(M_mu, M_true, 'greedy', true)
--- Running CP-APR with MU ---
CP_APR:
Iter 20: Inner Its = 25 KKT violation = 1.177537e-02, nViolations = 0
Iter 40: Inner Its = 12 KKT violation = 1.177746e-02, nViolations = 0
Iter 60: Inner Its = 12 KKT violation = 1.175529e-02, nViolations = 0
Iter 80: Inner Its = 12 KKT violation = 1.152947e-02, nViolations = 0
Iter 100: Inner Its = 13 KKT violation = 1.005199e-02, nViolations = 0
Iter 120: Inner Its = 13 KKT violation = 4.297555e-03, nViolations = 0
Iter 140: Inner Its = 12 KKT violation = 6.217044e-04, nViolations = 0
Exiting because all subproblems reached KKT tol.
===========================================
Final f = 1.255984e+04
Final least squares fit = 5.232262e-01
Final KKT violation = 9.8838473e-05
Total inner iterations = 2416
Total execution time = 1.26 secs
factor_match_score_mu =
0.9745
Comparing Results
On this problem, all three methods find essentially the same solution, reaching the same factor match score, though convergence behavior differs. The Newton-based 'pqnr' and 'pdnr' methods typically drive the KKT violation down in fewer outer iterations, while 'mu' iterations are cheaper; which is fastest overall depends on the problem.
For this particular problem and random initialization:
fprintf('Factor Match Score (PQNR): %.4f\n', factor_match_score_pqnr);
fprintf('Factor Match Score (PDNR): %.4f\n', factor_match_score_pdnr);
fprintf('Factor Match Score (MU):   %.4f\n', factor_match_score_mu);
Factor Match Score (PQNR): 0.9745
Factor Match Score (PDNR): 0.9745
Factor Match Score (MU):   0.9745
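Beyond the factor match score, the fitted models can also be compared on the objective cp_apr actually minimizes: the Poisson log-likelihood of the data under each model. The sketch below assumes the tt_loglikelihood helper is available from the Tensor Toolbox (it is used internally by cp_apr); larger values indicate a better fit.

```matlab
% Hedged sketch: compare fitted models by Poisson log-likelihood.
% tt_loglikelihood(X, M) is assumed from the Tensor Toolbox.
ll_pqnr = tt_loglikelihood(X, M_pqnr);
ll_pdnr = tt_loglikelihood(X, M_pdnr);
ll_mu   = tt_loglikelihood(X, M_mu);
fprintf('Log-likelihood (PQNR): %.6e\n', ll_pqnr);
fprintf('Log-likelihood (PDNR): %.6e\n', ll_pdnr);
fprintf('Log-likelihood (MU):   %.6e\n', ll_mu);
```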
