I was toying around with RSpectra, and damn, its truncated SVD has gotten pretty good since you did your comparison a decade-ish ago.
y <- Matrix::rsparsematrix(10000, 5000, density=0.1)
system.time(iout <- irlba::irlba(y, nv=20, nu=20))
## user system elapsed
## 7.862 0.004 7.866
system.time(sout <- RSpectra::svds(y, k=20))
## user system elapsed
## 4.656 0.000 4.657
str(iout)
## List of 5
## $ d : num [1:20] 53.9 53.8 53.8 53.8 53.7 ...
## $ u : num [1:10000, 1:20] -0.02029 0.00435 0.01116 -0.01035 -0.0011 ...
## $ v : num [1:5000, 1:20] -0.02178 -0.0053 -0.0031 -0.00887 0.00627 ...
## $ iter : int 172
## $ mprod: int 1200
str(sout)
## List of 5
## $ d : num [1:20] 53.9 53.8 53.8 53.8 53.7 ...
## $ u : num [1:10000, 1:20] -0.02029 0.00435 0.01116 -0.01035 -0.0011 ...
## $ v : num [1:5000, 1:20] -0.02178 -0.0053 -0.0031 -0.00887 0.00627 ...
## $ niter: num 26
## $ nops : num 834
Almost twice as fast, and pretty much the same results once you take out the indeterminate sign. AFAICT this isn't a quality-of-implementation issue as my C++ port is about the same speed as your original R package and it uses different code/libraries for every step (matrix multiplication, the internal SVD, etc.). So there is some algorithmic improvement in Spectra's svds() that causes it to converge much faster, as evidenced by the lower niter and nops.
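For what it's worth, here's the kind of sign-correction check I mean (a sketch using the `iout`/`sout` objects from above): flip each column of one factorization to match the other's sign convention, then compare elementwise.

```r
# Align the per-column signs of sout's singular vectors with iout's,
# then measure the largest remaining discrepancy. Values should be
# small, roughly at the level of the convergence tolerance.
s <- sign(colSums(iout$u * sout$u))   # +1/-1 per column
max(abs(iout$u - sweep(sout$u, 2, s, "*")))
max(abs(iout$v - sweep(sout$v, 2, s, "*")))
max(abs(iout$d - sout$d))
```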
Is there any scope for similar improvements to IRLBA? It's not a big deal, but I'd like to try to squeeze a bit more performance out of my C++ library if I can. I don't know enough math to have any good ideas here (my best guess would be to put a cap on the number of iterations in the Lanczos process, to avoid a quadratic increase in the number of matrix-vector multiplications with increasing nv or nu), but if you know of any relevant advancements, I'd be happy to implement them and try them out.
Session information
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.5 LTS
Matrix products: default
BLAS: /home/luna/Software/R/trunk/lib/libRblas.so
LAPACK: /home/luna/Software/R/trunk/lib/libRlapack.so; LAPACK version 3.12.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Australia/Sydney
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.6.0 Matrix_1.7-5 tools_4.6.0 Rcpp_1.1.1
[5] RSpectra_0.16-2 grid_4.6.0 irlba_2.3.7 lattice_0.22-9