Questions tagged [statistical-sampling]
33 questions
1
vote
1 answer
Sampling on an aggregated dataset
The input is a dataset where every row contains a member ID and the count for an event, say clicks. The member ID is unique.
Sample data:
M1,100
M2,100
M3,50
M4,50
The goal is to sample 1% of the clicks, where total clicks are given by summing up all clicks across all…

Duckling
- 923
- 7
- 12
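One possible reading of the question above, sketched in Python (the excerpt is truncated, so this is an assumption rather than the asker's method): treat every click as an individual unit, draw a simple random sample without replacement from all clicks, and map each drawn click back to its member through cumulative counts. The function name `sample_clicks` and the dictionary layout are hypothetical.

```python
import bisect
import itertools
import random


def sample_clicks(counts, fraction, seed=None):
    """Sample a fraction of individual clicks from per-member click counts.

    counts: dict mapping member ID -> number of clicks (hypothetical layout).
    Returns a dict mapping member ID -> number of sampled clicks.
    """
    rng = random.Random(seed)
    members = list(counts)
    # Running totals: member i owns click indices [cumulative[i-1], cumulative[i]).
    cumulative = list(itertools.accumulate(counts[m] for m in members))
    total = cumulative[-1] if cumulative else 0
    k = round(total * fraction)
    # Draw k distinct click indices without expanding the data row by row.
    picked = rng.sample(range(total), k)
    sampled = {m: 0 for m in members}
    for idx in picked:
        owner = members[bisect.bisect_right(cumulative, idx)]
        sampled[owner] += 1
    return sampled


if __name__ == "__main__":
    data = {"M1": 100, "M2": 100, "M3": 50, "M4": 50}
    print(sample_clicks(data, 0.01, seed=42))
```

With the sample data, the total is 300 clicks, so a 1% sample contains 3 clicks, and members with more clicks are proportionally more likely to contribute.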
1
vote
4 answers
Recognize the levels of 1D data by only knowing the number of levels
I have a sensor whose output data consists of one attribute (a single value). An example of a bunch of sequenced data is as…

asker
- 49
- 1
- 7
1
vote
1 answer
Why divide the sample standard deviation by sqrt(sample size) when calculating a z-score
I have been following Khan Academy videos to gain understanding of hypothesis testing, and I must confess that all my understanding thus far is based on that source.
Now, the following videos talk about z-score/hypothesis testing:
Hypothesis…

rj dj
- 260
- 1
- 5
- 22
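For context on the question above: the division by sqrt(n) comes from the fact that the mean of n independent draws has standard deviation sigma / sqrt(n), not sigma. A small simulation can make that visible. This is only an illustrative sketch; the values of `sigma`, `n`, and `trials` are arbitrary choices, not taken from the question.

```python
import math
import random
import statistics


def sd_of_sample_means(sigma, n, trials, seed=0):
    """Empirical standard deviation of the means of `trials` samples of size n."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


if __name__ == "__main__":
    sigma, n = 10.0, 25
    observed = sd_of_sample_means(sigma, n, trials=2000)
    expected = sigma / math.sqrt(n)  # standard error of the mean
    print(f"observed spread of sample means: {observed:.3f}")
    print(f"sigma / sqrt(n):                 {expected:.3f}")
```

The two printed numbers should be close, which is why a z-score for a sample mean is scaled by the standard error rather than by the standard deviation of individual observations.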
1
vote
1 answer
SAS - proc cusum not found
I have the following SAS code:
data vis;
input v;
datalines;
3169
3173
3162
3154
3139
3145
3160
3172
3175
3205
3203
3209
3208
3211
3214
3215
3209
3203
3185
3187
3192
3199
3197
3193
3190
3183
3197
3188
3183
3175
3174
3171
3180
3179
3175
3174
;
proc…

Stoner
- 846
- 1
- 10
- 30
1
vote
1 answer
Music genre classification with sklearn: how to accurately evaluate different models
I'm working on a project to classify 30 second samples of audio from 5 different genres (rock, electronic, rap, country, jazz). My dataset consists of 600 songs, exactly 120 for each genre. The features are a 1D array of 13 mfccs for each song and…

ohbrobig
- 939
- 2
- 13
- 34
1
vote
2 answers
How to remove a percentage from a dataset in Weka but keep the class balance?
I have a data set with 50% instances from class A and 50% instances of class B. I want to split my data set into a training set and a test set. I know the RemovePercentage filter exists but it doesn't care about the class balance. How do I remove…

Stanko
- 4,275
- 3
- 23
- 51
0
votes
0 answers
Error associated with using NumPyro to create a linear regression model
I'm using NumPyro to create a simple linear regression model with two variables; the aim is to obtain a graph similar to https://num.pyro.ai/en/latest/tutorials/bayesian_regression.html (3rd graph).
I have used NumPyro to generate 2000…
0
votes
1 answer
(R) Finding proportion of population defectives at probability 0.1 acceptance
I'm using the following R code:
library(AcceptanceSampling)
x <- OC2c(50, 2, type="hypergeom", N=4000)
plot(x, xlim=c(0,0.2))
which generates the plot:
I would like to find the proportion when P(accept) (Y-axis) is 0.1. Is there a way to do this…

Stoner
- 846
- 1
- 10
- 30
0
votes
0 answers
Is there a way to handle "cannot allocate vector of size" issue without dropping data?
This case differs from a previous question on this topic, which is why I'm asking. I have an already cleaned dataset containing 120 000 observations of 25 variables, and I am supposed to analyze it all through logistic regression and…

Aite97
- 155
- 1
- 9
0
votes
0 answers
What do you do if the sample size for an A/B test is larger than the population?
I have a list of 7337 customers (selected because they only had one booking from March-August 2018). We are going to contact them and are trying to test the impact of these activities on their sales. The idea is that contacting them will cause them…

datababie
- 1
- 3
0
votes
1 answer
Generate n samples, Rejection sampling in R
Rejection Sampling
I'm working with rejection sampling with a truncated normal distribution; see the R code below. How can I make the sampling stop at a specific n, for example 1000 observations?
I.e. I want to stop the sampling when the number of…

Hans Christensen
- 39
- 8
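The question's R code is not shown in the excerpt, so the following is a generic sketch in Python rather than a fix for it. The idea it illustrates is to loop on the number of accepted draws instead of the number of proposals, so sampling stops exactly when n observations have been kept. The function name and all parameters are hypothetical.

```python
import random


def truncated_normal(n, mu, sigma, lower, upper, seed=None):
    """Draw exactly n values from Normal(mu, sigma) restricted to [lower, upper].

    Naive rejection sampling: propose from the untruncated normal and keep
    only proposals that fall inside the interval.
    """
    if lower >= upper:
        raise ValueError("lower must be smaller than upper")
    rng = random.Random(seed)
    accepted = []
    # The stopping rule counts accepted values, not proposals.
    while len(accepted) < n:
        x = rng.gauss(mu, sigma)
        if lower <= x <= upper:
            accepted.append(x)
    return accepted


if __name__ == "__main__":
    draws = truncated_normal(1000, mu=0.0, sigma=1.0, lower=-1.0, upper=2.0, seed=7)
    print(len(draws), min(draws), max(draws))
```

Note that this simple scheme can take a very long time when the interval lies far in a tail of the distribution, because almost every proposal is rejected.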
0
votes
2 answers
What's wrong with this simple method to sample from multinomial in C#?
I wanted to implement a simple method to sample from a multinomial distribution in C# (the first argument is an array of integers we want to sample and the second one is the probabilities of selecting each of those integers).
When I do this with…

Rohit Pandey
- 2,443
- 7
- 31
- 54
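The C# code from the question is not included in the excerpt, so no diagnosis is attempted here. As a point of comparison, a common way to draw one value given an array of values and their probabilities is inverse-CDF sampling over the cumulative probabilities. The sketch below is in Python, and the name `sample_categorical` is hypothetical.

```python
import bisect
import itertools
import random


def sample_categorical(values, probs, rng=random):
    """Return one element of `values`, chosen with the given probabilities."""
    if len(values) != len(probs) or not values:
        raise ValueError("values and probs must be non-empty and of equal length")
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    cumulative = list(itertools.accumulate(probs))
    if cumulative[-1] <= 0:
        raise ValueError("probabilities must sum to a positive number")
    # Scale the uniform draw by the total, so probs need not sum to exactly 1.
    u = rng.random() * cumulative[-1]
    i = bisect.bisect_right(cumulative, u)
    # Guard against floating-point rounding pushing the index past the end.
    return values[min(i, len(values) - 1)]


if __name__ == "__main__":
    rng = random.Random(123)
    draws = [sample_categorical([1, 2, 3], [0.2, 0.3, 0.5], rng) for _ in range(10000)]
    for v in (1, 2, 3):
        print(v, draws.count(v) / len(draws))
```

In Python, the standard library's `random.choices(values, weights=probs)` covers the same use case; the explicit version is shown only because it maps directly onto a hand-written implementation in another language.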
0
votes
1 answer
Fit a line to small multiples
I want to fit a line that goes through the mean of sampling distributions on a shared plot. This code creates a similar data set to the one I am using. It creates a sampling distribution and plots the distributions on the same graphs. Then, I draw a…

Jay Schyler Raadt
- 75
- 10
0
votes
2 answers
Simple random sampling while pulling data from a warehouse (Oracle engine) using PROC SQL in SAS
I need to pull a humongous amount of data, say 600-700 variables from different tables in a data warehouse. The dataset in its raw form will easily reach 150 GB (79 million rows), and for my analysis I need only a million rows. How can I…

Rohan
- 93
- 1
- 8
0
votes
1 answer
Creating a stratified sample in SAS with known strata
I have a target population with some characteristics and I have been asked to select an appropriate control based on these characteristics. I am trying to do a stratified sample using Base SAS, but I need to be able to define my 4 strata percentages from my…

Annita
- 1
- 1