Questions tagged [sampling]

In signal processing, sampling is the reduction of a continuous signal to a discrete signal. In statistics, sampling is the selection of a subset of individuals from within a statistical population to estimate characteristics of the whole population.

Use this tag for questions about programming solutions that involve sampling.

Sampling can be applied to functions that vary in space, time, or any other dimension, and similar results hold in two or more dimensions. For more information, see Wikipedia - sampling (signal processing).

For statistical sampling, see Wikipedia - sampling (statistics).
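
As a rough illustration of the two senses of the word described in this tag, the sketch below uses only the Python standard library. The function names, the 5 Hz sine wave, and the population of 1000 integers are illustrative assumptions, not part of the tag description.

```python
import math
import random

def sample_signal(f, duration_s, rate_hz):
    """Signal-processing sense: evaluate a continuous function f(t)
    at evenly spaced instants t = k / rate_hz."""
    n = int(duration_s * rate_hz)
    return [f(k / rate_hz) for k in range(n)]

def sample_population(population, k, seed=None):
    """Statistical sense: draw k distinct individuals uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(population, k)

# Hypothetical inputs chosen only for the demonstration.
signal = sample_signal(lambda t: math.sin(2 * math.pi * 5 * t), 1.0, 100)
subset = sample_population(list(range(1000)), 10, seed=0)
```

Here `signal` holds 100 evenly spaced readings of the sine wave, and `subset` holds 10 distinct members of the population from which a population-level quantity could be estimated.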

1593 questions
7
votes
2 answers

Why set.seed() affects sample() in R

I always thought set.seed() only makes random variable generators (e.g., rnorm) generate a unique sequence for any specific set of input values. However, I'm wondering why, when we call set.seed(), the function sample() doesn't do its job…
rnorouzian
7
votes
1 answer

How to draw N random samples from a vector in R?

I have a vector with 663 elements. I would like to create random samples from the vector equal to the length of the vector (i.e. 663). Said differently, I would like to take random samples from all possible orderings of the 663 elements. My goal is…
RTrain3k
7
votes
1 answer

How to get a random (bootstrap) sample from pandas multiindex

I'm trying to create a bootstrapped sample from a multiindex dataframe in Pandas. Below is some code to generate the kind of data I need.
from itertools import product
import pandas as pd
import numpy as np
df = pd.DataFrame({'group1': [1, 1, 1,…
Chris
7
votes
1 answer

How to repeat 1000 times this random walk simulation in R?

I'm simulating a one-dimensional and symmetric random walk procedure: y[t] = y[t-1] + epsilon[t] where white noise is denoted by epsilon[t] ~ N(0,1) in time period t. There is no drift in this procedure. Also, RW is symmetric, because Pr(y[i] =…
Übel Yildmar
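
The question above concerns R, but the idea of repeating the simulation is language independent. The following is a minimal Python sketch of it, assuming a starting value y[0] = 0 and 100 steps per walk (neither is stated in the excerpt); `random_walk` and `simulate` are hypothetical helper names.

```python
import random

def random_walk(n_steps, rng):
    """One driftless walk: y[t] = y[t-1] + epsilon[t], epsilon[t] ~ N(0, 1).
    Assumes y[0] = 0; the returned list holds y[1] .. y[n_steps]."""
    y = 0.0
    path = []
    for _ in range(n_steps):
        y += rng.gauss(0.0, 1.0)
        path.append(y)
    return path

def simulate(n_walks=1000, n_steps=100, seed=42):
    """Repeat the walk n_walks times with one seeded generator,
    so the whole experiment is reproducible."""
    rng = random.Random(seed)
    return [random_walk(n_steps, rng) for _ in range(n_walks)]

walks = simulate()  # 1000 independent paths, each 100 steps long
```

Using a single seeded generator for all repetitions keeps the paths independent of each other while making the full set reproducible from one seed.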
7
votes
3 answers

Stratified sampling on factor

I have a dataset of 1000 rows with the following structure:
  device geslacht leeftijd type1 type2
1 mob    0        53       C     3
2 tab    1        64       G     7
3 pc     1        50       G     7
4 tab    0…
karmabob
7
votes
5 answers

Convert SoundFont to .wav files for each note

Is there a simple way of converting a sound font file to .wav files (or any common music format, really), separate for each note? So let's say I had a sound font a.sfz; I would like to get out of it a list of files A0.wav, A#0.wav, B0.wav, C1.wav,…
houbysoft
6
votes
2 answers

easy sampling of vectors from a sparse matrix, and creating a new matrix from the sample (python)

This question has two parts (maybe one solution?): Sample vectors from a sparse matrix: Is there an easy way to sample vectors from a sparse matrix? When I'm trying to sample lines using random.sample I get a TypeError: sparse matrix length is…
ScienceFriction
6
votes
2 answers

Randomly selecting values from an existing matrix after adding a vector (in R)

Thank you so much for your help in advance! I am trying to modify an existing matrix such that, when a new line is added to the matrix, it removes values from the preexisting matrix. For example, I have the matrix: [,1] [,2] [,3] [,4] 1 1 0 …
Laura
6
votes
1 answer

Pandas: Sampling from a DataFrame according to a target distribution

I have a Pandas DataFrame containing a dataset D of instances which all have some continuous value x. x is distributed in a certain way, say uniform, could be anything. I want to draw n samples from D for which x has a target distribution that I can…
meow
6
votes
2 answers

Create a random subsample by ID and with a certain factor distribution in R

I am working with R and have the following dataset which consists of sentences taken out of books and contains data about the book id, their cover colour (colour), and a sentence ID which is matched with the corresponding book. My dataset Book…
lole_emily
6
votes
1 answer

readframes return 2 byte in python

When readframes() is used in Python, the online documentation says the sampling frequency is returned, but it looks like it returns 2 bytes. I think there are 4 bytes in each frame: left = 2 bytes, right = 2 bytes. Do I have to check if it is mono or stereo and if it…
kim taeyun
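
For context, in Python's standard `wave` module `readframes(n)` returns the raw bytes of at most n frames, and one frame occupies `getnchannels() * getsampwidth()` bytes. The sketch below checks this on a small in-memory file; the 16-bit stereo format, the 44100 Hz rate, and the silent payload are assumptions chosen to match the 4-bytes-per-frame case in the question.

```python
import io
import wave

# Build a tiny 16-bit stereo WAV in memory (10 frames of silence).
buf = io.BytesIO()
with wave.open(buf, "wb") as writer:
    writer.setnchannels(2)      # stereo: left + right
    writer.setsampwidth(2)      # 2 bytes per sample per channel
    writer.setframerate(44100)  # assumed sampling frequency
    writer.writeframes(b"\x00" * (4 * 10))

buf.seek(0)
with wave.open(buf, "rb") as reader:
    bytes_per_frame = reader.getnchannels() * reader.getsampwidth()
    data = reader.readframes(3)  # raw bytes for 3 frames
```

With these settings `bytes_per_frame` is 4 and `data` is 12 bytes long. A mono file of the same sample width would give 2 bytes per frame, so checking `getnchannels()` before unpacking is a reasonable precaution.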
6
votes
1 answer

Comparison of two vectors resulted after simulation

I would like to apply the rejection sampling method to simulate a random vector Y = (Y_1, Y_2) with a uniform distribution on the unit disc D = {(x_1, x_2) \in R^2 : \sqrt{x_1^2 + x_2^2} ≤ 1} such that X = (X_1, X_2) is a random vector of a uniform…
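
A minimal sketch of rejection sampling on the unit disc follows. Because the excerpt is cut off, the proposal distribution is an assumption: X is taken to be uniform on the square [-1, 1] x [-1, 1]. The helper name `sample_unit_disc` is hypothetical.

```python
import random

def sample_unit_disc(rng):
    """Draw one point uniformly from the unit disc by rejection.
    Proposal (assumed): X uniform on the square [-1, 1] x [-1, 1]."""
    while True:
        x1 = rng.uniform(-1.0, 1.0)
        x2 = rng.uniform(-1.0, 1.0)
        # Accept only proposals that land inside the disc.
        if x1 * x1 + x2 * x2 <= 1.0:
            return (x1, x2)

rng = random.Random(0)
points = [sample_unit_disc(rng) for _ in range(1000)]
```

Under this proposal the acceptance probability is the ratio of the disc's area to the square's, pi/4 (about 0.785), so roughly 1.27 proposals are needed per accepted point.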
6
votes
1 answer

Is there a fast way to sample from a subset of GLn?

The rules of this problem are fairly specific because I'm actually looking at a subset of GLn, where the row and column vectors must have a certain form (call these vectors valid -- examples below), so please bear with me. Here are the rules: You…
PengOne
6
votes
5 answers

Bayesian network in Python: both construction and sampling

For a project, I need to create synthetic categorical data containing specific dependencies between the attributes. This can be done by sampling from a pre-defined Bayesian Network. After some exploration on the internet, I found that Pomegranate is…
Rutger Mauritz
6
votes
1 answer

Efficient algorithm for generating unique (non-repeating) random numbers

I want to solve the following problem. I have to sample from an extremely large set, of the order of 10^20 elements, extracting a sample without repetitions of about 10%-20% of its size. Given the size of the set, I believe that an algorithm like Fisher–Yates…
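
One technique sometimes used when the set is too large to shuffle in memory is to apply a keyed pseudorandom permutation to a counter, so that distinct inputs always give distinct outputs. The sketch below is an illustration of that idea, not the asker's method: it builds the permutation from a small Feistel network with cycle-walking, and the name `make_permutation`, the SHA-256 round function, and the choice of 4 rounds are all assumptions. Its statistical quality has not been vetted here.

```python
import hashlib

def make_permutation(n, key, rounds=4):
    """Return a function mapping each i in [0, n) to a distinct value in [0, n).
    Sketch only: assumes n fits in 128 bits (half-width of at most 64 bits)."""
    half_bits = max(1, ((n - 1).bit_length() + 1) // 2)
    mask = (1 << half_bits) - 1

    def round_fn(r, value):
        # Keyed round function built from SHA-256 (an illustrative choice).
        digest = hashlib.sha256(f"{key}:{r}:{value}".encode()).digest()
        return int.from_bytes(digest[:8], "big") & mask

    def encrypt(x):
        # Balanced Feistel network: a bijection on [0, 2**(2 * half_bits)).
        left, right = x >> half_bits, x & mask
        for r in range(rounds):
            left, right = right, left ^ round_fn(r, right)
        return (left << half_bits) | right

    def permute(i):
        if not 0 <= i < n:
            raise ValueError("index out of range")
        x = encrypt(i)
        while x >= n:  # cycle-walking: re-encrypt until back inside [0, n)
            x = encrypt(x)
        return x

    return permute

perm = make_permutation(10**20, key="demo")
sample = [perm(i) for i in range(1000)]  # 1000 distinct values below 10**20
```

Because the values are produced one index at a time, no list of the full set is ever stored; a larger sample is obtained simply by continuing the counter.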