Questions tagged [sampling]

In signal processing, sampling is the reduction of a continuous signal to a discrete signal. In statistics, sampling is the selection of a subset of individuals from within a statistical population to estimate characteristics of the whole population.

Use this tag for programming questions about sampling.

Sampling applies to functions that vary in space, time, or any other dimension, and similar results hold in two or more dimensions. See Wikipedia - sampling (signal processing) for more information.

For statistical sampling, see Wikipedia - sampling (statistics).
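
The two meanings of the tag can be illustrated with a minimal Python sketch. The helper names `sample_signal` and `sample_population` are hypothetical and chosen only for this illustration; the sketch assumes uniform sample spacing for the signal case and simple random sampling without replacement for the statistical case.

```python
import math
import random


def sample_signal(f, duration_s, rate_hz):
    """Evaluate a continuous-time function f(t) at evenly spaced instants."""
    n = int(duration_s * rate_hz)
    return [f(k / rate_hz) for k in range(n)]


def sample_population(population, k, seed=None):
    """Draw k distinct members uniformly at random (without replacement)."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def sine_5hz(t):
    """A 5 Hz sine wave, standing in for a continuous signal."""
    return math.sin(2 * math.pi * 5 * t)


# Signal processing: one second of the sine wave sampled at 100 Hz.
samples = sample_signal(sine_5hz, duration_s=1.0, rate_hz=100)

# Statistics: estimate a population mean from a 50-element subset.
population = list(range(1000))
subset = sample_population(population, k=50, seed=0)
estimate = sum(subset) / len(subset)
```

In the first case the result is a discrete sequence that stands in for the continuous signal; in the second, `estimate` approximates the mean of the full population from a subset.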

1593 questions
4
votes
1 answer

Translate dplyr slice_sample function into base R

Just a quick question: how to translate the dplyr function slice_sample into base R? Here is a toy dataset: y <- rnorm(20) x <- rnorm(20) z <- rep(1:4, 5) w <- rep(1:5, each=4) dd <- data.frame(id=z,cluster=w,x=x,y=y) Then I use slice_sample to…
cliu
  • 933
  • 6
  • 13
4
votes
3 answers

SMOTE - could not convert string to float

I think I'm missing something in the code below. from sklearn.model_selection import train_test_split from imblearn.over_sampling import SMOTE # Split into training and test sets # Testing Count Vectorizer X = df[['Spam']] y =…
Math
  • 191
  • 2
  • 5
  • 19
4
votes
1 answer

How to sample points from a data set using a grid?

So I have some data with around a million (r, phi) coordinates, along with their intensities. I want to sample this data in a grid pattern so I can reduce memory used, and plot faster. However I want to sample the data in X,Y as I will be converting…
4
votes
2 answers

recursive sampling in r

I'm trying to simulate death over 7 years with the cumulative probability as follows: tab <- data.frame(id=1:1000,char=rnorm(1000,7,4)) cum.prob <- c(0.05,0.07,0.08,0.09,0.1,0.11,0.12) How can I sample from tab$id without replacement in a…
Misha
  • 3,114
  • 8
  • 39
  • 60
4
votes
2 answers

Simplest way to capture raw audio from audio input for real time processing on a mac

What is the simplest way to capture audio from the built-in audio input and read the raw sampled values (as in a .wav) in real time as they come in when requested, like reading from a socket? Hopefully code that uses one of Apple's…
user497804
  • 292
  • 2
  • 7
4
votes
1 answer

Create numeric samples based on multiple conditions of multiple vectors

Given the following data frame: df <- tibble::tribble( ~pass_id, ~km_ini, ~km_fin, 1L, 0.89, 2.39, 2L, 1.53, 3.03, 3L, 21.9, 23.4, 4L, 23.4, 24.9, 5L, 24, 25.5, 6L, …
rdornas
  • 630
  • 7
  • 15
4
votes
0 answers

Fast geospatial sampling in R

I have a large set of polygons (about 20k) that I want to sample points from. I use the st_sample function from the sf package in R, but it's pretty slow. It takes about 5 minutes to sample from all polygons, and I need to repeat this task a large…
Ben
  • 429
  • 4
  • 11
4
votes
4 answers

How to downsample a signal preserving spikes?

I'm analyzing a signal sampled at 200 Hz for 6-8 seconds, and the important parts are the spikes, which last 1 second at most. Think, for example, of an earthquake... I have to downsample the signal by a factor of 2. I tried: from scipy import…
Marco Sulla
  • 15,299
  • 14
  • 65
  • 100
4
votes
2 answers

Initialization of Weighted Reservoir Sampling (A-Chao implementation)

I am trying to implement the A-Chao version of weighted reservoir sampling as shown in https://en.wikipedia.org/wiki/Reservoir_sampling#Algorithm_A-Chao But I found that the pseudo-code described on Wikipedia seems to be wrong, especially on the…
zzz
  • 975
  • 8
  • 14
4
votes
3 answers

When do feature selection in imblearn pipeline with cross-validation and grid search

Currently I am building a classifier with heavily imbalanced data. I am using the imblearn pipeline to first do StandardScaling, then SMOTE, and then the classification with GridSearchCV. This ensures that the upsampling is done during the…
Joost Jansen
  • 61
  • 1
  • 3
4
votes
1 answer

How does Numpy sample random numbers from a non-uniform distribution?

I have been learning about random sampling methods and am aware that Numpy uses Mersenne-Twister to generate uniform random numbers. How does it then use these to generate non-uniform distributions? For example: np.random.normal(mu,sigma,n) What…
j-a-maths
  • 45
  • 3
4
votes
1 answer

Android sampling rates variation of hardware Sensors on Nexus 6P

I'm developing an Android app for research, and I'm reading data from several sensors, such as the accelerometer, gyroscope, barometer, etc. So I have 4 Nexus 6P devices, all with the newest factory image and freshly set up with no other app installed than the…
Timm L
  • 43
  • 3
4
votes
1 answer

Weighted sampling without replacement using gonum

I have a big array of items and another array of weights of the same size. I would like to sample without replacement from the first array based on the weights from the second array. Is there a way to do this using gonum?
alpaca
  • 1,211
  • 13
  • 23
4
votes
1 answer

Tensorflow: Efficient multinomial sampling (Theano x50 faster?)

I want to be able to sample from a multinomial distribution very efficiently and apparently my TensorFlow code is very... very slow... The idea is that, I have: A vector: counts = [40, 50, 26, ..., 19] for example A matrix of probabilities: probs =…
priseJack
  • 396
  • 2
  • 14
4
votes
0 answers

Negative Sampling in Tensorflow without sampled_softmax_loss function

Is there a function that allows me to do negative sampling without using sampled_softmax_loss (Tensorflow negative sampling)? I am looking for a negative sampling method that takes the frequency of a label in the training data into account, and…
Long Le Minh
  • 335
  • 1
  • 2
  • 12