UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. The UMAP algorithm is competitive with t-SNE for visualization quality, arguably preserves more of the global structure, and has superior run time performance. UMAP has no computational restrictions on embedding dimension, making it viable as a general-purpose dimension reduction technique for machine learning.
Questions tagged [runumap]
38 questions
1 vote, 1 answer
UMAP PicklingError: ("Can't pickle : ...)
Trying to run UMAP causes an error:
import pandas as pd
import numpy as np
import umap
df = pd.DataFrame(np.arange(25).reshape(-1,5))
um = umap.UMAP(random_state=0)
um.fit(df)
PicklingError: ("Can't pickle : it's…

user1717828
1 vote, 1 answer
Using map function from purrr to test 2 parameters on one UMAP function in R
Newbie needing help again. I'm playing around with a dataset using UMAP, a dimension reduction tool. Tools like this have 2 parameters that need to be tuned and inspected. Previously I used tSNE, which requires tuning one parameter. For tSNE, the parameter…

ML33M
0 votes, 0 answers
Applying UMAP to a distance matrix
I am working on a clustering analysis and computed a distance matrix with a custom metric (it is the fusion of three differently weighted distance matrices) and I am trying to get components out of it using UMAP (I already tried MDS with successful…

Leonardo
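
For readers with a similar setup, here is a minimal sketch of how several distance matrices might be fused with weights before embedding. The function name `fuse_distance_matrices`, the rescaling of each matrix to [0, 1], and the example weights are all assumptions for illustration, not the asker's method. The commented-out last line relies on umap-learn accepting `metric="precomputed"`.

```python
import numpy as np


def fuse_distance_matrices(matrices, weights):
    """Weighted average of square distance matrices (hypothetical helper)."""
    weights = np.asarray(weights, dtype=float)
    stacked = np.stack([np.asarray(m, dtype=float) for m in matrices])
    if stacked.ndim != 3 or stacked.shape[1] != stacked.shape[2]:
        raise ValueError("expected square matrices of identical shape")
    if weights.shape[0] != stacked.shape[0]:
        raise ValueError("need exactly one weight per matrix")
    # Assumption: rescale each matrix to [0, 1] so that no metric
    # dominates the fusion purely because of its scale.
    maxima = stacked.max(axis=(1, 2), keepdims=True)
    maxima[maxima == 0] = 1.0
    scaled = stacked / maxima
    # Normalised weights, contracted against the stack of matrices.
    fused = np.tensordot(weights / weights.sum(), scaled, axes=1)
    # Enforce symmetry and a zero diagonal.
    fused = (fused + fused.T) / 2.0
    np.fill_diagonal(fused, 0.0)
    return fused


# Toy data: three different metrics computed on the same six points.
rng = np.random.default_rng(0)
points = rng.normal(size=(6, 3))
diff = points[:, None, :] - points[None, :, :]
d_euclidean = np.linalg.norm(diff, axis=-1)
d_manhattan = np.abs(diff).sum(axis=-1)
d_chebyshev = np.abs(diff).max(axis=-1)

fused = fuse_distance_matrices(
    [d_euclidean, d_manhattan, d_chebyshev], weights=[0.5, 0.3, 0.2]
)

# With umap-learn installed and enough samples, the fused matrix could be
# passed as a precomputed metric:
# embedding = umap.UMAP(metric="precomputed").fit_transform(fused)
```

Whether rescaling is appropriate depends on how the three metrics were weighted in the first place; it is shown here only as one plausible choice.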
0 votes, 1 answer
How do I compute LSC scores ("stemness") onto a UMAP in Seurat?
In Seurat, I am working with a UMAP of blast cells and would like to use the LSC17 coefficients to produce a feature plot with "stemness" scores generated from the LSC17 onto the UMAP. I'm using this equation (Ng et al., 2016):
LSC17…

Jack Pep
0 votes, 0 answers
HDBSCAN clusters sentence embeddings in one cluster that are way too far apart
I have the task of clustering utterances to a chatbot based on sentence similarity in order to find out which topics users ask about and how important those topics are. I am converting the utterances into sentence embeddings using the…
0 votes, 0 answers
Plotting an array as a point in 2D space in Python?
I have some lists with a large length and I want to plot them in 2D (like a scatter plot). The thing is, I need to maintain their topology / preserve their distance when I do this mapping.
If the distance(A,B) > distance(A,C), it should stay that…
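
One possible way to approach this kind of distance-preserving mapping is classical multidimensional scaling (MDS), sketched below with NumPy only. This is a swapped-in technique rather than UMAP, and it is only an approximation: distances (and their ordering) are reproduced exactly only when the data already lie in a 2D subspace. The names `classical_mds` and `pairwise_distances` and the random toy data are assumptions for illustration.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance between every pair of rows of x."""
    return np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)


def classical_mds(distances, n_components=2):
    """Embed points so that Euclidean distances approximate `distances`."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # Keep the eigenvectors with the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy data: five "long lists" of 50 values each.
rng = np.random.default_rng(0)
lists = rng.normal(size=(5, 50))
original = pairwise_distances(lists)

coords = classical_mds(original)        # one 2D point per list
embedded = pairwise_distances(coords)   # distances after the mapping
```

Comparing `original` with `embedded` shows how much distortion the 2D mapping introduces for a given dataset; `coords[:, 0]` and `coords[:, 1]` can then be passed to any scatter-plot function.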
0 votes, 0 answers
Latest version of RAPIDS cuML in Kaggle notebooks
First of all, I am fairly new to running models on GPU, so sorry in advance for stupid questions.
I use RAPIDS cuML to GPU-accelerate some algorithms, but I noticed I cannot use the latest version (23.2.0) in a Kaggle notebook. When importing cuML,…

svaladou
0 votes, 1 answer
Clustering text. chatintents library Python. HDBSCAN, UMAP
I'm using chatintents (https://github.com/dborrelli/chat-intents) for automatic clustering. To embed sentences I use sentence transformers. The problem is when I set the maximum and minimum number of clusters and then run, the number of clusters…

Valentin Colella
0 votes, 0 answers
Superimpose two UMAPs
I am starting out with python and I would appreciate some help.
I need to superimpose two UMAPs and give each a single, different color in order to differentiate between them.
Does anyone know how to do this?
Thanks a lot in advance!

Maria Pereira
0 votes, 0 answers
UMAP validation to calculate trustworthiness_vector problem
I have a dataset with over 200,000 samples and 256 features. I used UMAP with n_components = 8, 16, 32, 64 to reduce the data dimension from 256 to 8, 16, 32, and 64, respectively. I do not have labels. I want to use umap validation embedding…

Minh Vu
0 votes, 0 answers
Error when loading UMAP. Cannot load set_parallel_chunksize
This is my code. I keep getting an error which tells me I cannot load set_parallel_chunksize:
import umap.umap_ as umap
from sklearn.mixture import GaussianMixture
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from…

Omar B
0 votes, 1 answer
Best parameters for UMAP + HistGradientBoostingClassifier
I'm trying to find the best parameters for the UMAP (dimensionality reduction) model together with HistGradientBoostingClassifier.
The loop I have created is:
vectorizer = TfidfVectorizer(use_idf=True, max_features = 6000)
corpus =…

Maite89
0 votes, 0 answers
Metric is neither callable, nor a recognised string issue
I am using the UMAP method from the following link: metrics in UMAP
Here is the code fragment:
import umap
embedding = umap.UMAP(n_components=2, metric='hellinger').fit(word_doc_matrix)
where word_doc_matrix is calculated using CountVectorizer…

Machine_Learning
0 votes, 0 answers
Embedding Projector visualisation: loading an Excel file
I have an Excel file which needs to be parsed into the Embedding Projector (TensorBoard visualisation). How can I do that? I am new to this.

shining_star
0 votes, 0 answers
Approach to visualize images in UMAP
I have two sets of images, say S1 and S2, which are subsets of their parent set S (the intersection of S1 and S2 is not necessarily empty, and the union of S1 and S2 does not necessarily equal S). I want to visualize these sets of images in a UMAP. S1 and S2 are used as…

Prithviraj Kanaujia