What exactly do rad_orig and r_emb represent in umap.UMAP? (docs).
These values are available when the output_dens flag is set.
Reading the docs:
r_orig: array, shape (n_samples)
Local radii of data points in the original data space (log-transformed)
and
r_emb: array, shape (n_samples)
Local radii of data points in the embedding (log-transformed).
I tried to figure this out with an example, using a toy array of integers with shape (100, 50):
array([[15, 8, 16, ..., 12, 9, 14],
[ 4, 4, 5, ..., 4, 19, 15],
[ 2, 4, 16, ..., 4, 7, 8],
...,
[11, 17, 14, ..., 7, 18, 6],
[ 2, 16, 12, ..., 18, 17, 15],
[ 3, 11, 9, ..., 11, 14, 8]])
and executing
umap_trans = umap.UMAP(densmap=True, output_dens=True).fit(arr)
After that, r_orig looks something like
array([7.557382 , 7.6884522, 7.5413175, 7.5586753, 7.526751 , 7.633186 ,
7.579795 , 7.6138983, 7.4713755, 7.5365367, 7.63102 , 7.627236 ,
7.5586395, 7.5616612, 7.5946164, 7.626307 , 7.6850867, 7.5265946,
7.5604353, 7.5958605, 7.5464926, 7.5515323, 7.6224527, 7.5082755,
7.6015797, 7.5680337, 7.6188903, 7.5625224, 7.6245193, 7.5826597,
7.6149483, 7.5915165, 7.558839 , 7.613548 , 7.578578 , 7.613815 ,
7.684106 , 7.5169396, 7.5644665, 7.6615157, 7.6193194, 7.626235 ,
7.656492 , 7.58103 , 7.5389533, 7.641165 , 7.588751 , 7.554403 ,
7.647078 , 7.6455092, 7.561126 , 7.5732226, 7.6015496, 7.6265235,
7.564877 , 7.5956354, 7.6075587, 7.5987916, 7.626135 , 7.539194 ,
7.5905514, 7.6090746, 7.6593614, 7.6186256, 7.66446 , 7.5629582,
7.6118226, 7.54342 , 7.5881543, 7.563827 , 7.60424 , 7.6116834,
7.5791817, 7.5829387, 7.6135163, 7.562068 , 7.7188945, 7.5859914,
7.6612687, 7.5608892, 7.5465975, 7.5277977, 7.6697884, 7.5451746,
7.5410295, 7.5975976, 7.588921 , 7.6266494, 7.630443 , 7.621092 ,
7.5729136, 7.559135 , 7.665758 , 7.585926 , 7.7076025, 7.4915547,
7.6049953, 7.5991044, 7.637067 , 7.5531616], dtype=float32)
How are these numbers related to the original array? I couldn't find any further mathematical explanation of this.
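For comparison, here is a rough, dependency-free sketch of one plausible reading of "local radius (log-transformed)": the log of the mean squared Euclidean distance from each point to its nearest neighbours, which is how I understand the densMAP idea. This is not UMAP's actual computation. The function name log_local_radii, the uniform neighbour weights, k=15, the eps value and the randomly generated toy array are all my own assumptions (as far as I can tell, UMAP weights neighbours by its fuzzy-graph memberships instead).

```python
import math
import random


def log_local_radii(points, k=15, eps=1e-8):
    """Log of the mean squared Euclidean distance to the k nearest neighbours.

    Assumption: every neighbour gets the same weight. This is a simplification
    for illustration, not the weighting UMAP itself uses.
    """
    radii = []
    for i, p in enumerate(points):
        # Squared distances from point i to every other point, smallest first.
        sq_dists = sorted(
            sum((a - b) ** 2 for a, b in zip(p, q))
            for j, q in enumerate(points)
            if j != i
        )
        radii.append(math.log(eps + sum(sq_dists[:k]) / k))
    return radii


# Hypothetical toy data with the same shape as in the question:
# 100 samples, 50 integer features.
rng = random.Random(0)
arr = [[rng.randint(0, 19) for _ in range(50)] for _ in range(100)]

r_sketch = log_local_radii(arr)
print(len(r_sketch), min(r_sketch), max(r_sketch))
```

If this reading is roughly right, each entry depends only on how far a sample sits from its closest neighbours in the original space, so comparing r_sketch against r_orig on the same array would be one way to check it.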