Problem

Does Julia have an efficient way of constructing a huge sparse matrix from a given list of entries (u,v,w), some of which can share the same location (u,v)? In that case their weights w must be summed. So u, v, w are input vectors and I wish to create a sparse matrix that has value w[i] at position u[i],v[i], with repeated positions combined by addition. For instance, the Mathematica code

n=10^6;   m=500*n;   
u=RandomInteger[{1,n},m];   
v=RandomInteger[{1,n},m];   
w=RandomInteger[{-9, 9}, m];   AbsoluteTiming[
SetSystemOptions["SparseArrayOptions"->{"TreatRepeatedEntries"->1}]; 
a= SparseArray[{u,v}\[Transpose] -> w, {n,n}]; ]

requires 135 sec and 60 GB of RAM. The equivalent Python code

import scipy.sparse as sp   
import numpy as np
import time
def ti(): return time.perf_counter() 
n=10**6; m=500*n;
u=np.random.randint(0,n,size=m);            
v=np.random.randint(0,n,size=m);            
w=np.random.randint(-9,10,size=m); t0=ti(); 
a=sp.csr_matrix((w,(u,v)),shape=(n,n),dtype=int); t1=ti(); print(t1-t0)

needs 36 sec and 20 GB, but doesn't support requirement (2) from the Background below (exact fractions as entries). The equivalent Julia code

using SparseArrays;
m=n=10^6; r=500*n;   
u=rand(1:m,r);   
v=rand(1:n,r);   
w=rand(-9:9,r);
function f() a=sparse(u,v,w,m,n,+); end;   
@time f()

needs 155 sec and 26 GB (the @time macro incorrectly reports that it used only 15 GB).

Is there a way to make this construction more efficient? Are there any Julia packages that are better at this?

Background

I have created a package for linear algebra and homological algebra computations. I did it in Mathematica, since the SparseArray implementation there is

  1. very efficient (fast and low RAM usage),
  2. supports exact fractions as matrix entries,
  3. supports polynomials as matrix entries.

However, it is not

  1. parallelized,
  2. open source,
  3. GPU or cluster enabled.

For various long-term reasons, I am considering migrating to a different programming language.

I have tried Python SciPy.sparse, but it doesn't satisfy (2). I'm not sure if C++ libraries support (2). As a test, I tried sorting an array of floats of size 10^9 and compared the performance of Mathematica, Python numpy, and Julia ThreadsX. I was incredibly impressed by the latter (3x faster than numpy, 10x faster than Mathematica). I am considering migrating to Julia, but I wish to first make sure that my library would perform better than it does in Mathematica.

P.S. How can I make my f() above not print/output anything when called?

P.P.S. I also asked this question here.

  • For your side question: `function f() a = whatever; return nothing; end;`, though I'm not sure why you'd ever want that; there is no significant overhead to returning `a` under Julia's pass-by-sharing. The `return` statement is also technically optional (every statement is an expression in Julia, and a function returns the result of the last expression if not otherwise specified), though most style guides recommend explicit `return`s. – cbk Jul 17 '21 at 20:23
  • @Leo, I would suggest you drop a message on discourse with more specific details. I would be surprised if you couldn't reduce the allocations, but without details it's impossible to tell where the bottlenecks are. – PatrickT Jul 17 '21 at 21:43
  • @PatrickT thank you, I now see it was a bit unclear what I was trying to achieve. I added a few lines and clarified the problem. – Leo Jul 17 '21 at 21:55

1 Answer

In this particular case, it looks like you may be able to get what you are after more simply with (e.g.)

julia> @time a = sprand(Float64, 10^6, 10^6, 500/10^6)
 17.880987 seconds (7 allocations: 7.459 GiB, 7.71% gc time)
1000000×1000000 SparseMatrixCSC{Float64, Int64} with 499998416 stored entries:
⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿
⋮ (spy-plot rows truncated)
⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛⠛

More generally, you may want to check out the underlying method SparseArrays.sparse!, which will allow more efficient in-place construction.

help?> SparseArrays.sparse!
  sparse!(I::AbstractVector{Ti}, J::AbstractVector{Ti}, V::AbstractVector{Tv},
          m::Integer, n::Integer, combine, klasttouch::Vector{Ti},
          csrrowptr::Vector{Ti}, csrcolval::Vector{Ti}, csrnzval::Vector{Tv},
          [csccolptr::Vector{Ti}], [cscrowval::Vector{Ti}, cscnzval::Vector{Tv}] ) where {Tv,Ti<:Integer}

  Parent of and expert driver for sparse; see sparse for basic usage. This method allows
  the user to provide preallocated storage for sparse's intermediate objects and result
  as described below. This capability enables more efficient successive construction of
  SparseMatrixCSCs from coordinate representations, and also enables extraction of an
  unsorted-column representation of the result's transpose at no additional cost.

  This method consists of three major steps: (1) Counting-sort the provided coordinate
  representation into an unsorted-row CSR form including repeated entries. (2) Sweep
  through the CSR form, simultaneously calculating the desired CSC form's column-pointer
  array, detecting repeated entries, and repacking the CSR form with repeated entries
  combined; this stage yields an unsorted-row CSR form with no repeated entries. (3)
  Counting-sort the preceding CSR form into a fully-sorted CSC form with no repeated
  entries.

  Input arrays csrrowptr, csrcolval, and csrnzval constitute storage for the
  intermediate CSR forms and require length(csrrowptr) >= m + 1, length(csrcolval) >=
  length(I), and length(csrnzval) >= length(I). Input array klasttouch, workspace for
  the second stage, requires length(klasttouch) >= n. Optional input arrays csccolptr,
  cscrowval, and cscnzval constitute storage for the returned CSC form S. csccolptr
  requires length(csccolptr) >= n + 1. If necessary, cscrowval and cscnzval are
  automatically resized to satisfy length(cscrowval) >= nnz(S) and length(cscnzval) >=
  nnz(S); hence, if nnz(S) is unknown at the outset, passing in empty vectors of the
  appropriate type (Vector{Ti}() and Vector{Tv}() respectively) suffices, or calling the
  sparse! method neglecting cscrowval and cscnzval.

  On return, csrrowptr, csrcolval, and csrnzval contain an unsorted-column
  representation of the result's transpose.

  You may reuse the input arrays' storage (I, J, V) for the output arrays (csccolptr,
  cscrowval, cscnzval). For example, you may call sparse!(I, J, V, csrrowptr, csrcolval,
  csrnzval, I, J, V).

  For the sake of efficiency, this method performs no argument checking beyond 1 <= I[k]
  <= m and 1 <= J[k] <= n. Use with care. Testing with --check-bounds=yes is wise.

  This method runs in O(m, n, length(I)) time. The HALFPERM algorithm described in F.
  Gustavson, "Two fast algorithms for sparse matrices: multiplication and permuted
  transposition," ACM TOMS 4(3), 250-269 (1978) inspired this method's use of a pair of
  counting sorts.
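
To make the calling convention concrete, here is a minimal sketch of using sparse! on the question's u, v, w. The buffer names are mine, the sizes follow the requirements quoted above, and the pointer/workspace arrays are zeroed to be safe; double-check against the docstring for your Julia version:

using SparseArrays
m = n = 10^6; r = 500*n
u = rand(1:m, r); v = rand(1:n, r); w = rand(-9:9, r)
# workspace and intermediate-CSR buffers, sized per the docstring
klasttouch = zeros(Int, n)               # length >= n
csrrowptr  = zeros(Int, m + 1)           # length >= m + 1
csrcolval  = Vector{Int}(undef, r)       # length >= length(u)
csrnzval   = Vector{Int}(undef, r)       # length >= length(w)
# reuse u, v, w as storage for the returned CSC arrays, as permitted above;
# repeated (u[i], v[i]) positions are combined with +
a = SparseArrays.sparse!(u, v, w, m, n, +, klasttouch,
                         csrrowptr, csrcolval, csrnzval,
                         u, v, w)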

If you need something even faster than this, in principle you can construct a sparse matrix with virtually no overhead by using the SparseMatrixCSC constructor directly, though you have to know what you're doing and what the fields mean. As one of the early contributors to what is now the SparseArrays stdlib (at the time part of Base) noted:

I consider using the SparseMatrixCSC constructor directly to be the I-know-what-I'm-doing interface. People use it to skip the overhead and checking of sparse, for example to create matrices with non-sorted rows, or explicit zeros, things like that. I'd be fine with making show do some validity checks on the assumptions it makes before trying to show the matrix, but having a lenient constructor as a direct way of saying "trust me, wrap this data in a SparseMatrixCSC" is a valuable thing.

If you want, you could certainly write parallelized functions (using ThreadsX.jl or the threaded version of LoopVectorization.jl or whatever else) to construct the underlying arrays directly in parallel, then wrap them with SparseMatrixCSC, as sketched below.
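
Here, only the fill of nzval is threaded; the structure encoded in colptr and rowval is arbitrary (it mirrors the edit example further below), and nothing is validated:

using SparseArrays
m = n = 10^6
colptr = collect(1:500:n*500 + 1)   # column pointers: 500 entries per column
rowval = repeat(1:2000:m, n)        # row indices, sorted within each column
nzval = Vector{Float64}(undef, n*500)
Threads.@threads for k in eachindex(nzval)
    nzval[k] = rand()               # fill the values in parallel
end
a = SparseMatrixCSC(m, n, colptr, rowval, nzval)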

For all the details on how to construct a SparseMatrixCSC directly, you might check out https://github.com/JuliaLang/julia/blob/master/stdlib/SparseArrays/src/sparsematrix.jl

Edit: to give an illustrative example (though see the warnings below):

using SparseArrays
m = n = 10^6
colptr = collect(1:500:n*500+1)   # column pointers: 500 entries per column
rowval = repeat(1:2000:m, n)      # row indices, sorted within each column
nzval = rand(Float64, n*500)      # the stored values
julia> @time a = SparseMatrixCSC(m,n,colptr,rowval,nzval)
  0.012364 seconds (1 allocation: 48 bytes)
1000000×1000000 SparseMatrixCSC{Float64, Int64} with 500000000 stored entries:
⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿
⋮ (spy-plot rows truncated)
⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿

julia> a[1,1]
0.9356478932120766

julia> a[2,1]
0.0

julia> a[2001,1]
0.2121877970335444

julia> a[:,1]
1000000-element SparseVector{Float64, Int64} with 500 stored entries:
  [1      ]  =  0.935648
  [2001   ]  =  0.212188
  [4001   ]  =  0.429638
  [6001   ]  =  0.0190535
  [8001   ]  =  0.0878085
  [10001  ]  =  0.24005
  [12001  ]  =  0.785151
  [14001  ]  =  0.348142
             ⋮
  [982001 ]  =  0.637904
  [984001 ]  =  0.136397
  [986001 ]  =  0.483078
  [988001 ]  =  0.862434
  [990001 ]  =  0.703863
  [992001 ]  =  0.00990588
  [994001 ]  =  0.629455
  [996001 ]  =  0.123507
  [998001 ]  =  0.411807

Two major caveats:

  1. There is no checking that rows are sorted in this example (a quick check is sketched after this list). If you don't know what that means or why it matters, this sort of low-level approach is not for you.
  2. This provides no checking against collisions. If you already know from the source of your data that there are no collisions, that is fine. For random matrices, it may or may not be fine statistically depending on the level of sparsity. But again, if you don't know what that means, stay away.
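
For caveat (1), a sanity check is cheap to write. This is a minimal sketch using only the public rowvals/nzrange accessors; strictly increasing row indices within each column also rule out duplicate entries in that column:

# check that row indices are strictly increasing within each column,
# which also implies no duplicate (row, col) entries
function rows_strictly_sorted(A::SparseMatrixCSC)
    rv = rowvals(A)
    for j in 1:size(A, 2)
        rng = nzrange(A, j)
        for k in first(rng):(last(rng) - 1)
            rv[k] < rv[k+1] || return false
        end
    end
    return true
end

rows_strictly_sorted(a)   # true for the example above, by construction
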
  • `sprand` is of no use to my problem, since I have specific entries that I must encode. So is there currently no Julia code for sparse array construction that could rival `scipy.sparse`? – Leo Jul 17 '21 at 20:56
  • Did you read the rest of the comment? I will edit to give an example of how to use the SparseMatrixCSC constructor directly, if that was not clear enough -- though how to use it for your problem would depend on the actual structure of your data. The other option is to use `sparse!` instead of `sparse`, though if you can use SparseMatrixCSC directly that will be dramatically faster. – cbk Jul 17 '21 at 21:32
  • `sparse!` will operate in-place -- not on `u`,`v`,`w`, but on `csrrowptr`, `csrcolval`, `csrnzval`,`csccolptr`, `cscrowval`, and `cscnzval`. If you wish to reuse u,v,w, though, then as the documentation I copied notes "You may reuse the input arrays' storage (I, J, V) for the output arrays (csccolptr, cscrowval, cscnzval)" – cbk Jul 17 '21 at 21:54
  • Dear cbk, I suspect my problem was unclear. I only have `u,v,w` as input and wish to create a sparse matrix which has value `w[i]` at position `u[i],v[i]` (some positions can be repeated, such entries must be summed). My `u,v,w` can get destroyed in the process, as long as I get the correct matrix. I do not have row pointers and columns indices, though (that's what should be computed). – Leo Jul 17 '21 at 22:00
  • That's fine with `sparse!` -- you just provide those arrays as buffers. – cbk Jul 17 '21 at 22:04
  • Yeah, that is explained in the documentation I pasted for `sparse!` in the answer above. If you're having trouble figuring out what that means, we can continue this conversation on Julia Slack, Zulip, or Discourse. Additionally, there will be folks there with far more sparse matrix experience than I. – cbk Jul 17 '21 at 22:12
  • I still don't know how to use sparse! with only `u,v,w,m,n` as input, would you be so kind and give me concrete code? I don't think your answer addresses my question. Also, I suggest we delete several above comments to free up space, and if we still run out, I will open this question at discourse.julialang.org. – Leo Jul 18 '21 at 11:56