Questions tagged [probability]

Consider whether your question would be a better fit for stats.stackexchange.com. Probability touches upon uncertainty, random phenomena, random numbers, random variables, probability distributions, sampling, and combinatorics.

See also https://statistics.stackexchange.com

Probability theory is a branch of mathematics that studies uncertainty and random phenomena. It operates by introducing a sample space (a set), and associating probabilities (numbers between 0 and 1, inclusive) to certain subsets of this set, in a manner that satisfies some sensible axioms. If the sample space can be thought of as the real line, we obtain random variables; if it is a Euclidean space, we obtain random vectors. Random variables and random vectors have associated probability distributions, which can be characterized by probability density functions, cumulative distribution functions, moments, and characteristic or moment generating functions.
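As a small illustration of the sample-space picture, here is a sketch in Python for a finite case. The fair six-sided die, the uniform measure, and the names `sample_space` and `prob` are all assumptions chosen for the example; they are not part of the tag description.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (an illustrative choice).
sample_space = frozenset(range(1, 7))

def prob(event):
    """Probability of an event (a subset of the sample space) under the uniform measure."""
    event = frozenset(event)
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

even = {2, 4, 6}

# Two of the axioms, checked on this example:
total = prob(sample_space)                 # the whole space has probability 1
disjoint_sum = prob({1}) + prob({2})       # additivity on disjoint events...
union = prob({1, 2})                       # ...matches the probability of the union

# A random variable is a function on the sample space; here X(w) = w.
expected_value = sum(Fraction(w) * prob({w}) for w in sample_space)
```

Exact fractions are used so that the additivity check is not blurred by floating-point rounding.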

Typically, questions with this tag will deal with computing (exactly or approximately) probabilities of certain events (from winning a lottery to server outages), drawing random samples, approximating distributions, etc. There might be an overlap with statistics and/or statistical packages (R, SAS, Stata).

Synonym: probability-theory

4021 questions
1
vote
2 answers

R: select a subset based on probability

I'm new to R. I have a normal distribution. n <- rnorm(1000, mean=10, sd=2) As an exercise I'd like to create a subset based on a probability curve derived from the values. E.g., for values <5, I'd like to keep a random 25% of entries, for values >15, I'd…
MrSparkly
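One possible reading of this question is value-dependent thinning: each entry is kept independently with a probability that depends on its value. The sketch below is in Python rather than R. Only the 25% figure for values below 5 comes from the excerpt; the other keep-probabilities and the function names are assumptions, since the excerpt is cut off.

```python
import random

def keep_probability(value):
    """Hypothetical keep-probability curve; only the 25% for values < 5 is from the question."""
    if value < 5:
        return 0.25
    if value > 15:
        return 0.50   # assumed: the excerpt is cut off before this number
    return 1.0        # assumed: keep everything in between

def thin(values, rng):
    """Keep each value independently with probability keep_probability(value)."""
    return [v for v in values if rng.random() < keep_probability(v)]

rng = random.Random(0)
values = [rng.gauss(10, 2) for _ in range(1000)]   # analogue of rnorm(1000, mean=10, sd=2)
subset = thin(values, rng)
```

Note that this keeps roughly 25% of the low values on average, not exactly 25%; an exact fraction would need sampling without replacement within each group.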
1
vote
1 answer

Probabilistic Record Linkage in Pandas

I have two dataframes (X & Y). I would like to link them together and to predict the probability that each potential match is correct. X = pd.DataFrame({'A': ["One", "Two", "Three"]}) Y = pd.DataFrame({'A': ["One", "To", "Free"]})
R. Cox
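A rough starting point, assuming a plain string-similarity score is acceptable as a stand-in: the sketch below uses plain lists instead of the question's dataframes and scores every candidate pair with `difflib.SequenceMatcher`. The score is a heuristic in [0, 1], not a calibrated match probability; a proper probabilistic linkage model would need more than this.

```python
from difflib import SequenceMatcher

# Same values as the 'A' columns in the question, held in plain lists.
X = ["One", "Two", "Three"]
Y = ["One", "To", "Free"]

def similarity(a, b):
    """Similarity ratio in [0, 1]; a heuristic score, not a calibrated probability."""
    return SequenceMatcher(None, a, b).ratio()

# Score every candidate pair and keep the best-scoring Y for each X.
scores = {(x, y): similarity(x, y) for x in X for y in Y}
best = {x: max(Y, key=lambda y: scores[(x, y)]) for x in X}
```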
1
vote
1 answer

Octave generating a random number with known probability

I want to generate a random number within a range and with a given probability in Octave, but I'm not sure how to: 0.5 chance of 1-50, 0.3 chance of 51-80, 0.2 chance of 81-100. Thanks.
N. Y
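A two-stage draw is one way to do this: first pick a band according to the stated probabilities, then pick a number inside the band. The sketch is in Python rather than Octave, and it assumes integer outcomes that are uniform within each band, which the question does not state.

```python
import random

# (probability, low, high) bands taken from the question; bounds are inclusive.
BANDS = [(0.5, 1, 50), (0.3, 51, 80), (0.2, 81, 100)]

def draw(rng):
    """Pick a band by its probability, then a uniform integer inside it (assumed)."""
    weights = [p for p, _, _ in BANDS]
    _, low, high = rng.choices(BANDS, weights=weights, k=1)[0]
    return rng.randint(low, high)

rng = random.Random(42)
samples = [draw(rng) for _ in range(100_000)]
share_low = sum(s <= 50 for s in samples) / len(samples)   # should be close to 0.5
```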
1
vote
1 answer

Probability of selecting exactly n elements

I have a list of about 100 000 event probabilities stored in a vector. I want to know if it is possible to calculate the probability of n events occurring (e.g. what is the probability that exactly 1000 events occur). I managed to calculate…
MrLoedus
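If the events can be treated as independent (an assumption; the excerpt does not say), the number of events that occur follows a Poisson binomial distribution, and a simple dynamic program gives the exact probabilities. A minimal sketch, with `count_distribution` as a made-up name:

```python
def count_distribution(probs):
    """dist[k] = P(exactly k of the events occur), assuming independence."""
    dist = [1.0]                      # with no events, the count is 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)    # this event does not occur
            nxt[k + 1] += mass * p        # this event occurs
        dist = nxt
    return dist

dist = count_distribution([0.5, 0.5, 0.2])
p_exactly_one = dist[1]
```

The cost is quadratic in the number of events, so for a vector of about 100 000 entries a pure-Python loop like this would be slow; capping the table at n + 1 entries or using an approximation are possible ways around that.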
1
vote
1 answer

Spark MultilayerPerceptronClassifier Class Probabilities

I am an experienced Python programmer trying to transition some Python code to Spark for a classification task. This is my first time working in Spark/Scala. In Python, both Keras/TensorFlow and scikit-learn neural networks do a great job on the…
RKB
1
vote
1 answer

How to generate a probability distribution on an image

I have a question as follows: Suppose I have an image (size = 360x640, row by col), and I have a center coordinate that is, say, (20, 100). What I want is to generate a probability distribution that has the highest value at that center (20, 100), and…
HenryChen
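One way to get a distribution that peaks at the given center is an isotropic Gaussian bump normalized to sum to 1 over the pixels. The excerpt is cut off, so the Gaussian shape, the spread `SIGMA`, and the reading of (20, 100) as (row, col) are all assumptions in this sketch.

```python
import math

ROWS, COLS = 360, 640
CENTER = (20, 100)   # assumed to be (row, col)
SIGMA = 30.0         # hypothetical spread in pixels

def gaussian_map(rows, cols, center, sigma):
    """Per-pixel probabilities that peak at `center` and sum to 1 over the image."""
    cr, cc = center
    weights = [[math.exp(-((r - cr) ** 2 + (c - cc) ** 2) / (2 * sigma ** 2))
                for c in range(cols)] for r in range(rows)]
    total = sum(map(sum, weights))
    return [[w / total for w in row] for row in weights]

pmf = gaussian_map(ROWS, COLS, CENTER, SIGMA)
# Locate the pixel with the largest probability as (value, row, col).
peak = max((p, r, c) for r, row in enumerate(pmf) for c, p in enumerate(row))
```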
1
vote
1 answer

Iterate two arguments with map2 (purrr function)

I want to calculate all possible predictions with different probabilities of my data with multiple models. The result is a list. df<-iris df$y<-sample(0:1,nrow(df),replace=TRUE) set.seed(101) #Now Selecting 80% of data as sample from total 'n' rows…
liguang
1
vote
1 answer

How to plot histogram of simulated geometric random variables using Python?

I must simulate 100,000 geometric random variables with parameter p = 0.01 and plot the results on a histogram, with buckets for each of the values 1 to 1000. What are buckets and how do I create the histogram? This is what I have so far. p = 0.01 n…
Chance Gordon
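"Buckets" here are histogram bins: one counter per value from 1 to 1000. A sketch of the simulation and the counting step follows, using only the standard library and inverse-transform sampling. It assumes the geometric variable counts trials up to and including the first success (support 1, 2, ...), and it tallies the few draws above 1000 separately.

```python
import math
import random

def geometric(p, rng):
    """Number of trials up to and including the first success (support 1, 2, ...)."""
    u = 1.0 - rng.random()          # uniform on (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - p)) + 1

rng = random.Random(1)
p, n, max_value = 0.01, 100_000, 1000
draws = [geometric(p, rng) for _ in range(n)]

# One bucket (bin) per integer value 1..max_value; larger draws are tallied apart.
counts = [0] * (max_value + 1)      # index 0 is unused
overflow = 0
for x in draws:
    if x <= max_value:
        counts[x] += 1
    else:
        overflow += 1

mean = sum(draws) / n               # should be near 1 / p = 100
```

With matplotlib available, something like `plt.bar(range(1, max_value + 1), counts[1:])` would then draw the histogram.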
1
vote
0 answers

How to fix "Error in xy.coords(x, y, setLab = FALSE) : 'x' and 'y' lengths differ" in convpow for narrow uniform distribution?

I'm using the distr package in R for a research project that involves the convolution of multiple i.i.d. uniform random variables. The details of the project are not important, but sometimes the min and max of the distributions are very close to…
fraz1
1
vote
1 answer

Selecting two numbers from a list in python with a probability that decays as the relative distance between them

I am trying to take a list and choose a number i from it at random. After that, I want to select a second element j. The probability of choosing a j decays as 1/|i-j|. For example, the relative probability of it choosing a j four steps away…
jcp
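A minimal sketch of this, assuming distance is measured in list positions and that i itself is excluded from the second draw (both are readings of the excerpt, not stated facts). `pick_pair` is a made-up name.

```python
import random

def pick_pair(items, rng):
    """Pick index i uniformly, then index j != i with weight 1 / |i - j|."""
    i = rng.randrange(len(items))
    others = [j for j in range(len(items)) if j != i]
    weights = [1.0 / abs(i - j) for j in others]
    j = rng.choices(others, weights=weights, k=1)[0]   # weights need not sum to 1
    return items[i], items[j]

pair = pick_pair(list(range(10)), random.Random(7))
```

Under these weights a neighbour one step away is four times as likely to be picked as an element four steps away.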
1
vote
0 answers

Working on a probability & provably fair chest opening function, am I thinking right?

So I'm working on a "chest opening simulator" with a client of mine and I have completed the whole system except for the actual probability part. He sent me this calculation and algorithm for how the different rarities and items should…
Edwin
1
vote
0 answers

Simulate random vector conditionally on a subdomain

Suppose I have a bivariate random vector which I can simulate from, taking values in a given domain, and for the sake of simplicity let's suppose that it takes values in the whole of $R^2$. Suppose now that the density of my random vector in a given…
lrnv
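The excerpt is cut off, so the following only illustrates the title: plain rejection sampling, which draws from the unconditional simulator and keeps the first draw that lands in the subdomain. The simulator (two independent standard normals) and the subdomain (the open positive quadrant) are hypothetical stand-ins.

```python
import random

def sample_pair(rng):
    """Stand-in simulator: two independent standard normals (hypothetical)."""
    return rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)

def in_subdomain(point):
    """Hypothetical subdomain: the open positive quadrant."""
    x, y = point
    return x > 0 and y > 0

def sample_conditional(rng, max_tries=100_000):
    """Draw from the simulator until a point falls in the subdomain."""
    for _ in range(max_tries):
        point = sample_pair(rng)
        if in_subdomain(point):
            return point
    raise RuntimeError("subdomain was not hit; its probability may be too small")

rng = random.Random(5)
points = [sample_conditional(rng) for _ in range(1000)]
```

Rejection sampling becomes impractical when the subdomain has very small probability, which may well be the situation the full question is about.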
1
vote
1 answer

How to compute a reasonable number of bits for a checksum?

I have around 1500 bytes of data that I want to construct a checksum for, so that if the data gets corrupted the chance of the checksum still matching the data is less than, say, 1 in 10^15, i.e. a low enough probability that I can treat it as it is…
WilliamKF
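A back-of-the-envelope sketch, under the idealized assumption that a corrupted message matches a b-bit checksum with probability 2^-b (that is, the checksum behaves like a uniformly random function of the data). Real checksums such as CRCs have different guarantees for specific error patterns, so this is only the random-match estimate.

```python
import math

def min_checksum_bits(max_match_probability):
    """Smallest b with 2**-b <= max_match_probability, under the uniform assumption."""
    return math.ceil(-math.log2(max_match_probability))

bits = min_checksum_bits(1e-15)
```

Under this model the bound does not depend on the 1500-byte message length, only on the target probability.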
1
vote
2 answers

Using a number array to adjust the chances of a specific return value?

I am creating a method called getRandomLetter() that I want to randomly return a single letter from the English alphabet. I want the method to be more likely to return a letter with a higher number associated with it. I have two arrays. One has…
LuminousNutria
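One common technique for this is to build running totals of the weights and locate a uniform random integer among them with binary search. The sketch below is in Python, with `get_random_letter` standing in for the question's getRandomLetter(). The weights are hypothetical positive integers, since the question's arrays are not shown in the excerpt.

```python
import bisect
import itertools
import random
import string

LETTERS = string.ascii_uppercase
# Hypothetical integer weights, one per letter; the question's numbers are not shown.
WEIGHTS = [1 + (i % 5) for i in range(len(LETTERS))]
CUMULATIVE = list(itertools.accumulate(WEIGHTS))   # running totals of the weights

def get_random_letter(rng):
    """Return a letter with probability proportional to its weight."""
    r = rng.randrange(CUMULATIVE[-1])              # uniform integer in [0, total weight)
    return LETTERS[bisect.bisect_right(CUMULATIVE, r)]

rng = random.Random(11)
letters = [get_random_letter(rng) for _ in range(76_000)]
```

Here 'A' has weight 1 and 'E' has weight 5, so 'E' should come up about five times as often.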
1
vote
1 answer

Fit data to other formulations of Gumbel and Weibull models in Matlab

I need to fit Extreme Value distributions to wind speed data. I'm using Matlab for doing this. It may not be evident to a user that there are alternative formulations of the Gumbel and Weibull models than those that Matlab has built in in its…
Oliver Amundsen