I am working on a clustering analysis and have computed a distance matrix with a custom metric (a fusion of three differently weighted distance matrices). I am now trying to extract components from it using UMAP. I already did this successfully with MDS, but I would like to try UMAP as well.

I created the distance matrix with the dist() function and also converted it into a matrix. I then tried the umap() function from the uwot package on both objects, with the following results:

> UMAP_prep <- umap(Futbol_Sparse, metric = "precomputed", n_components = 5)
Error in matrix(0, nrow = n, ncol = k) : 
  invalid 'ncol' value (too large or NA)

and

> UMAP_prep <- umap(Futbol_Distances, metric = "precomputed", n_components = 5)
Error in 1:k : argument of length 0

I am aware that I could apply UMAP to my raw dataset instead (which I cannot share in full, since it contains 8901 observations with 67 predictors). Does anyone have an idea how I could apply UMAP directly to my distance matrix?

Thanks in advance.

EDIT: here is an extract of the sparse distance matrix, followed by the structure of the full object:

> a
6 x 6 sparse Matrix of class "dsCMatrix"
          1         2         3         4         5         6
1 .         0.1125300 0.2593345 0.3366033 0.1128020 0.3617233
2 0.1125300 .         0.2304761 0.1847940 0.2635693 0.4567474
3 0.2593345 0.2304761 .         0.1489901 0.2106683 0.4101453
4 0.3366033 0.1847940 0.1489901 .         0.1494022 0.1547576
5 0.1128020 0.2635693 0.2106683 0.1494022 .         0.4835147
6 0.3617233 0.4567474 0.4101453 0.1547576 0.4835147 .        

> str(Futbol_Sparse)
Formal class 'dsCMatrix' [package "Matrix"] with 7 slots
  ..@ i       : int [1:39609450] 0 0 1 0 1 2 0 1 2 3 ...
  ..@ p       : int [1:8902] 0 0 1 3 6 10 15 21 28 36 ...
  ..@ Dim     : int [1:2] 8901 8901
  ..@ Dimnames:List of 2
  .. ..$ : chr [1:8901] "1" "2" "3" "4" ...
  .. ..$ : chr [1:8901] "1" "2" "3" "4" ...
  ..@ x       : num [1:39609450] 0.113 0.259 0.23 0.337 0.185 ...
  ..@ uplo    : chr "U"
  ..@ factors : list()

Leonardo
  • Please share an example of your input data by pasting the output of `dput(Futbol_Sparse[1:6, 1:6])` and `str(Futbol_Sparse)`; here's why: https://stackoverflow.com/help/minimal-reproducible-example – I_O Aug 23 '23 at 12:42
  • @I_O I added the request output. Many thanks! – Leonardo Aug 23 '23 at 13:48
  • Thank you! The input values are indeed numeric (`int`), and `umap` does accept sparse matrices, so unfortunately I'm at a loss about the error source. You might share the output of `dput(x)` (where x is a small subset of your matrix) though, because this allows others to easily replicate your R object. – I_O Aug 23 '23 at 14:59
  • Weirdly enough, deleting the `metric = "precomputed"` argument does the trick for the sparse matrix. But if I keep `metric = "precomputed"`, I get the following error: `Error in matrix(0, nrow = n, ncol = k) : invalid 'ncol' value (too large or NA)`. So I am at a loss for what to do, since I could not find any documentation on reducing the dimensionality of a custom distance matrix with UMAP. – Leonardo Aug 23 '23 at 15:45
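
One direction that might be worth trying, offered as a sketch rather than a confirmed fix: `metric = "precomputed"` is the convention of the Python umap-learn package, and as far as I know it is not among the metrics documented for `uwot::umap()`. The uwot documentation instead describes a `dist` object passed as `X` as being treated as precomputed distances, in which case no `metric` argument is needed. The example below uses hypothetical toy data (`toy_data`, `toy_dist`) in place of the real 8901 x 67 dataset, with Manhattan distance standing in for the fused custom metric.

```r
library(uwot)

set.seed(42)

# Hypothetical stand-in for the real data: 200 observations, 10 numeric columns
toy_data <- matrix(rnorm(200 * 10), nrow = 200)

# Any precomputed distance would do here; Manhattan is only a placeholder
# for the custom fused metric described in the question
toy_dist <- dist(toy_data, method = "manhattan")

# Pass the dist object itself as X and leave out the metric argument;
# uwot then derives the nearest neighbours from the supplied distances
embedding <- umap(toy_dist, n_neighbors = 15, n_components = 5)

# One row per observation, one column per requested component
dim(embedding)
```

If the object returned by `dist()` is still available, or the converted matrix is turned back with `as.dist()`, the same call pattern might apply to the real data. Whether it also resolves the two errors shown in the question is untested here.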

0 Answers