Questions tagged [runumap]

UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. The UMAP algorithm is competitive with t-SNE for visualization quality, arguably preserves more of the global structure, and has superior run time performance. UMAP has no computational restrictions on embedding dimension, making it viable as a general-purpose dimension reduction technique for machine learning.

38 questions
1
vote
1 answer

UMAP PicklingError: ("Can't pickle : ...)

Trying to run UMAP causes an error: import pandas as pd import numpy as np import umap df = pd.DataFrame(np.arange(25).reshape(-1,5)) um = umap.UMAP(random_state=0) um.fit(df) PicklingError: ("Can't pickle : it's…
user1717828
  • 7,122
  • 8
  • 34
  • 59
1
vote
1 answer

Using map function from purrr to test 2 parameters on one UMAP function in R

Newbie needs help again. I'm playing around with a dataset using UMAP, a dimension reduction tool. Methods like this have 2 parameters that need to be tuned and inspected. Previously I used tSNE, which requires tuning one parameter. For tSNE, the parameter…
ML33M
  • 341
  • 2
  • 19
0
votes
0 answers

Applying UMAP to a distance matrix

I am working on a clustering analysis and computed a distance matrix with a custom metric (it is the fusion of three differently weighted distance matrices) and I am trying to get components out of it using UMAP (I already tried MDS with successful…
Leonardo
  • 1
  • 1
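One way to sanity-check a fused matrix before embedding it is sketched below. This is a minimal illustration, not the asker's method: the helper names `fuse_distance_matrices` and `is_valid_distance_matrix` are hypothetical, and the weights are normalized to sum to 1 as an assumption.

```python
def fuse_distance_matrices(matrices, weights):
    """Return the weighted average of equally sized square distance matrices.

    Hypothetical helper: weights are normalized so they sum to 1.
    """
    if not matrices or len(matrices) != len(weights):
        raise ValueError("need one weight per matrix")
    n = len(matrices[0])
    for m in matrices:
        if len(m) != n or any(len(row) != n for row in m):
            raise ValueError("all matrices must be square and the same size")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    fused = [[0.0] * n for _ in range(n)]
    for m, w in zip(matrices, weights):
        for i in range(n):
            for j in range(n):
                fused[i][j] += (w / total) * m[i][j]
    return fused


def is_valid_distance_matrix(d, tol=1e-9):
    """Check for a zero diagonal, symmetry, and non-negative entries."""
    n = len(d)
    for i in range(n):
        if abs(d[i][i]) > tol:
            return False
        for j in range(n):
            if d[i][j] < -tol or abs(d[i][j] - d[j][i]) > tol:
                return False
    return True


a = [[0.0, 1.0], [1.0, 0.0]]
b = [[0.0, 3.0], [3.0, 0.0]]
fused = fuse_distance_matrices([a, b], weights=[1.0, 1.0])
print(fused, is_valid_distance_matrix(fused))
```

If the check passes, the matrix could then be converted to a NumPy array and passed to `umap.UMAP(metric="precomputed")`, which treats the input as pairwise distances rather than feature vectors.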
0
votes
1 answer

How do I compute LSC scores ("stemness") onto a UMAP in Seurat?

In Seurat, I am working with a UMAP of blast cells and would like to use the LSC17 coefficients to produce a feature plot with "stemness" scores generated from the LSC17 onto the UMAP. I'm using this equation (Ng et al., 2016): LSC17…
Jack Pep
  • 3
  • 2
0
votes
0 answers

HDBSCAN clusters sentence embeddings in one cluster that are way too far apart

I have the task of clustering utterances to a chatbot based on sentence similarity in order to find out which topics users ask about and how important those topics are. I am converting the utterances into sentence embeddings using the…
0
votes
0 answers

Plotting an array as a point in 2D space in Python?

I have some lists with a large length and I want to plot them in 2D (like a scatter plot). The thing is, I need to maintain their topology / preserve their distance when I do this mapping. If the distance(A,B) > distance(A,C), it should stay that…
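The ordering requirement described here can be measured directly. The sketch below assumes Euclidean distance and uses a hypothetical helper, `triplet_agreement`, that reports the fraction of triplets (A, B, C) with distance(A,B) > distance(A,C) whose ordering survives in the 2D layout. It only scores an existing embedding and does not produce one, and it is cubic in the number of points, so it is meant for small samples.

```python
import math
from itertools import permutations


def triplet_agreement(high, low):
    """Fraction of ordered triplets (a, b, c) for which
    d(a, b) > d(a, c) holds in `high` and still holds in `low`.

    `high` and `low` are sequences of points in the same order.
    """
    if len(high) != len(low):
        raise ValueError("both point sets must have the same length")
    checked = kept = 0
    for a, b, c in permutations(range(len(high)), 3):
        if math.dist(high[a], high[b]) > math.dist(high[a], high[c]):
            checked += 1
            if math.dist(low[a], low[b]) > math.dist(low[a], low[c]):
                kept += 1
    # With no comparable triplets there is nothing to violate.
    return kept / checked if checked else 1.0


original = [(0, 0, 0), (1, 0, 0), (3, 0, 0)]
faithful = [(0, 0), (1, 0), (3, 0)]
scrambled = [(0, 0), (3, 0), (1, 0)]
print(triplet_agreement(original, faithful))
print(triplet_agreement(original, scrambled))
```

A score of 1.0 means every distance ordering was preserved. In general no 2D layout can guarantee that for arbitrary high-dimensional data, so a score like this is better used to compare candidate embeddings than as a pass/fail test.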
0
votes
0 answers

Latest version of RAPIDS cuML in Kaggle notebooks

First of all, I am fairly new to running models on GPU, so sorry in advance for stupid questions. I use RAPIDS cuML to GPU-accelerate some algorithms, but I noticed I cannot use the latest version (23.2.0) in a Kaggle notebook. When importing cuML,…
0
votes
1 answer

Clustering text. Chatintents library Python. HDBSCAN, UMAP

I'm using chatintents (https://github.com/dborrelli/chat-intents) for automatic clustering. To embed sentences I use sentence transformers. The problem is when I set the maximum and minimum number of clusters and then run, the number of clusters…
0
votes
0 answers

Superimpose two UMAPs

I am starting out with Python and I would appreciate some help. I need to superimpose two UMAPs and give each a single different color in order to differentiate between them. Does anyone know how to do this? Thanks a lot in advance!
0
votes
0 answers

UMAP validation to calculate trustworthiness_vector problem

I have a dataset with over 200,000 data samples with 256 features. I used UMAP with n_components = 8, 16, 32, 64 to reduce the data dimension from 256 to 64, 32, 16, 8, respectively. I do not have labels. I want to use umap validation embedding…
Minh Vu
  • 11
  • 2
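For reference, trustworthiness at a single neighborhood size k can be written out in a few lines. The sketch below is a plain-Python illustration of the standard rank-based definition, assuming Euclidean distance; it is not the UMAP library's implementation, and the function names are hypothetical. It is quadratic in the number of points, so with a dataset this large it would only be practical on a random subsample.

```python
import math


def _ranks(points, i):
    """Map every other index j to its 1-based distance rank from point i.

    Ties are broken by index so both spaces are ranked consistently.
    """
    order = sorted(
        (j for j in range(len(points)) if j != i),
        key=lambda j: (math.dist(points[i], points[j]), j),
    )
    return {j: rank for rank, j in enumerate(order, start=1)}


def trustworthiness(high, low, k):
    """Trustworthiness of embedding `low` with respect to `high` at k neighbors.

    Penalizes points that are among the k nearest neighbors in the
    embedding but not among the k nearest in the original space.
    """
    n = len(high)
    if len(low) != n:
        raise ValueError("both point sets must have the same length")
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n / 2")
    penalty = 0
    for i in range(n):
        high_rank = _ranks(high, i)
        low_rank = _ranks(low, i)
        for j, r_low in low_rank.items():
            if r_low <= k and high_rank[j] > k:
                penalty += high_rank[j] - k
    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))


high = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)]
good = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
bad = [(0, 0), (4, 0), (1, 0), (3, 0), (2, 0)]
print(trustworthiness(high, good, k=2))
print(trustworthiness(high, bad, k=2))
```

Evaluating this for k = 1 up to some maximum gives one score per k, which makes it possible to compare the 8, 16, 32, and 64 dimensional embeddings without labels. scikit-learn also provides `sklearn.manifold.trustworthiness` for the single-k case.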
0
votes
0 answers

Error when loading UMAP. Cannot load set_parallel_chunksize

This is my code. I keep getting an error which tells me I cannot load set_parallel_chunksize: import umap.umap_ as umap from sklearn.mixture import GaussianMixture from sklearn.cluster import KMeans from sklearn.metrics import silhouette_score from…
Omar B
  • 45
  • 7
0
votes
1 answer

Best parameters for UMAP + HistGradientBoostingClassifier

I'm trying to find the best parameters for the UMAP (dimensionality reduction) model together with HistGradientBoostingClassifier. The loop I have created is: vectorizer = TfidfVectorizer(use_idf=True, max_features = 6000) corpus =…
Maite89
  • 273
  • 2
  • 8
0
votes
0 answers

Metric is neither callable, nor a recognised string issue

I am using the UMAP method from the following link: metrics in UMAP. Here is the code fragment: import umap embedding = umap.UMAP(n_components=2, metric='hellinger').fit(word_doc_matrix) where word_doc_matrix is calculated using CountVectorizer…
0
votes
0 answers

Embedding Projector visualisation: loading an Excel file

I have an Excel file which needs to be parsed into the Embedding Projector (TensorBoard visualisation). How can I do that? I am new to this.
0
votes
0 answers

Approach to visualize images in UMAP

I have two sets of images, say S1 and S2, which are subsets of their parent set S (S1 intersection S2 is not necessarily empty, and S1 union S2 does not necessarily equal S). I want to visualize these sets of images in a UMAP. S1 and S2 are used as…