Questions tagged [entropy]

Entropy is a measure of the uncertainty in a random variable.

The term usually refers to the Shannon entropy, which quantifies the expected value of the information contained in a message. Entropy is typically measured in bits, nats, or bans. Shannon entropy is the average unpredictability in a random variable, which is equivalent to its information content.
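In the usual notation, for a discrete random variable X taking values x_i with probabilities p(x_i):

    H(X) = -\sum_i p(x_i) \log_2 p(x_i)

A fair coin has H = 1 bit; a biased or constant source has less.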

596 questions
5 votes, 1 answer

Understanding Shannon entropy of a data set

I'm reading Machine Learning In Action and am going through the decision tree chapter. I understand that decision trees are built such that splitting the data set gives you a way to structure your branches and leaves. This gives you more likely…
devshorts (8,572 rep; badges: 4/50/73)
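A minimal sketch of the entropy computation a decision-tree split is scored with (plain Python; shannon_entropy is a hypothetical helper, not the book's code):

    import math
    from collections import Counter

    def shannon_entropy(labels):
        # Entropy in bits of a list of class labels.
        counts = Counter(labels)
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    # A 50/50 split is maximally uncertain (1 bit); a pure node scores 0.
    print(shannon_entropy(["yes", "yes", "no", "no"]))    # 1.0
    print(shannon_entropy(["yes", "yes", "yes", "yes"]))  # 0.0 (printed as -0.0)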
5 votes, 1 answer

Weird output while finding entropy of frames of a video in OpenCV

    #include
    #include
    #include
    #include
    #include
    #include
    using namespace std;

    typedef struct histBundle {
        double rCh[256];
        double gCh[256];
        double bCh[256];
    } bundleForHist;

    bundleForHist…
Animesh Pandey (5,900 rep; badges: 13/64/130)
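The include targets above were stripped by the page's HTML rendering and are not recoverable. For reference, a hedged sketch of the same per-frame computation in Python/OpenCV (frame_entropy is a hypothetical helper), assuming the histogram route the C++ code appears to take:

    import cv2
    import numpy as np

    def frame_entropy(frame):
        # Shannon entropy (bits) of a grayscale frame's intensity histogram.
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [256], [0, 256]).ravel()
        p = hist / hist.sum()
        p = p[p > 0]              # skip empty bins; 0 * log(0) is taken as 0
        return float(-np.sum(p * np.log2(p)))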
4 votes, 2 answers

Networkx - entropy of subgraphs generated from detected communities

I have 4 functions for some statistical calculations in complex networks analysis.

    import networkx as nx
    import numpy as np
    import math
    from astropy.io import fits

Degree distribution of graph:

    def degree_distribution(G):
        vk =…
8-Bit Borges (9,643 rep; badges: 29/101/198)
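A hedged sketch of one way to do this with networkx alone (degree_entropy and the community call are assumptions, not the asker's code): compute the entropy of each community subgraph's degree distribution.

    import math
    import networkx as nx

    def degree_entropy(G):
        # Shannon entropy (bits) of the empirical degree distribution.
        degrees = [d for _, d in G.degree()]
        n = len(degrees)
        return -sum((degrees.count(k) / n) * math.log2(degrees.count(k) / n)
                    for k in set(degrees))

    G = nx.karate_club_graph()
    communities = nx.algorithms.community.greedy_modularity_communities(G)
    for i, nodes in enumerate(communities):
        print(i, round(degree_entropy(G.subgraph(nodes)), 3))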
4 votes, 1 answer

Limiting density of discrete points (LDDP) in python

Shannon's entropy from information theory measures the uncertainty or disorder in a discrete random variable's empirical distribution, while differential entropy measures it for a continuous r.v. The classical definition of differential entropy was…
develarist (1,224 rep; badges: 1/13/34)
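For reference, Jaynes' construction (as I recall it; worth checking against the question's sources): discretize with N points whose density approaches a measure m(x); then as N grows

    H_N(X) \approx \log N - \int p(x) \log \frac{p(x)}{m(x)} \, dx

so the divergent \log N term is split off, and the remaining integral, unlike naive differential entropy, is invariant under a change of variables.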
4 votes, 1 answer

Are Entropy results order-dependent when using SameTest

Mathematica's Entropy function is order-dependent when using the SameTest option. That is: Entropy[RandomSample[Range[11]], SameTest->(Abs[#1-#2]>1&) ] will give different results many times. I assume that this is because Entropy[] is in fact…
berniethejet (236 rep; badges: 2/11)
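Not Mathematica, but a small Python sketch of the suspected cause: greedy grouping under a non-transitive sameness test depends on encounter order (the test here, |a - b| <= 1, is a stand-in for the question's predicate, not its exact logic):

    import math
    import random

    def group_sizes(xs, same):
        reps, sizes = [], []
        for x in xs:
            for i, r in enumerate(reps):
                if same(x, r):        # joins the first matching group
                    sizes[i] += 1
                    break
            else:
                reps.append(x)
                sizes.append(1)
        return sizes

    def entropy(sizes):
        n = sum(sizes)
        return -sum((s / n) * math.log2(s / n) for s in sizes)

    xs = list(range(1, 12))
    random.shuffle(xs)        # different shuffles give different entropies
    print(entropy(group_sizes(xs, lambda a, b: abs(a - b) <= 1)))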
4 votes, 2 answers

Generate random number sequence with certain entropy

I need to generate a partly random sequence of numbers such that the sequence overall has a certain entropy level. E.g. if I fed the generated data into gzip, it would be able to compress it. And in fact, this would be the exact application for the…
JATothrim (842 rep; badges: 1/8/24)
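A rough sketch of one common approach, assuming per-symbol entropy is what matters: mix incompressible bytes with constant filler and tune the ratio (bytes_with_entropy is a hypothetical helper):

    import os
    import random

    def bytes_with_entropy(n, p_random):
        # p_random in [0, 1]: 0 gives a constant (highly compressible) stream,
        # 1 gives uniform random (incompressible) bytes.
        return bytes(os.urandom(1)[0] if random.random() < p_random else 0
                     for _ in range(n))

    # gzip should compress bytes_with_entropy(1 << 20, 0.2) far better than
    # bytes_with_entropy(1 << 20, 1.0).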
4 votes, 2 answers

Calculating entropy from co-occurrence matrix in Matlab

I am trying to extract the entropy from co-occurrence matrices with zero entries in Matlab. From the definition of entropy of a co-occurrence matrix, H = -\sum_{i,j} c_{ij} \log c_{ij} has to be calculated, where c_{ij} stands for the (i,j) entry of the co-occurrence matrix. Thus it…
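The zero entries are the usual stumbling block, since 0 * log(0) must be taken as 0. A sketch of the same guard in Python/NumPy (Matlab's equivalent is indexing out the zeros before the log):

    import numpy as np

    def cooccurrence_entropy(C):
        # H = -sum_ij c_ij * log2(c_ij), with 0 * log(0) taken as 0.
        p = np.asarray(C, dtype=float)
        p = p / p.sum()           # normalize counts to probabilities if needed
        nz = p[p > 0]             # zero entries contribute nothing
        return float(-np.sum(nz * np.log2(nz)))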
4 votes, 2 answers

Can Pandas DataFrame efficiently calculate PMI (Pointwise Mutual Information)?

I've looked around and surprisingly haven't found an easy-to-use framework or existing code for the calculation of Pointwise Mutual Information (Wiki PMI), despite libraries like Scikit-learn offering a metric for overall Mutual Information (by…
jfive (1,291 rep; badges: 3/14/21)
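A sketch of how the counting can be pushed onto pandas groupby (the pmi helper and the column names are illustrative, not an existing API):

    import numpy as np
    import pandas as pd

    def pmi(df, x, y):
        # PMI(a, b) = log2( p(a, b) / (p(a) * p(b)) ) for two discrete columns.
        p_xy = (df.groupby([x, y]).size() / len(df)).reset_index(name="p_xy")
        p_x = df[x].value_counts(normalize=True)
        p_y = df[y].value_counts(normalize=True)
        p_xy["pmi"] = np.log2(p_xy["p_xy"].values /
                              (p_x.loc[p_xy[x]].values * p_y.loc[p_xy[y]].values))
        return p_xy

    df = pd.DataFrame({"word": ["a", "a", "b", "b", "b"],
                       "context": ["x", "y", "x", "x", "y"]})
    print(pmi(df, "word", "context"))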
4 votes, 4 answers

Safe mixing of entropy sources

Let us assume we're generating very large (e.g. 128- or 256-bit) numbers to serve as keys for a block cipher. Let us further assume that we wear tinfoil hats (at least when outside). Being so paranoid, we want to be sure of our available entropy, but…
Nicholas Knight (15,774 rep; badges: 5/45/57)
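For reference, the standard argument: XOR of independent sources is at least as unpredictable as the strongest one. A toy sketch (mix_sources is hypothetical; real designs often hash the concatenation instead):

    import hashlib
    import os

    def mix_sources(a, b):
        # XOR two equal-length byte strings; with independent inputs the
        # result is no weaker than the stronger of the two.
        assert len(a) == len(b)
        return bytes(x ^ y for x, y in zip(a, b))

    key = mix_sources(os.urandom(32),
                      hashlib.sha256(b"sample from a second source").digest())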
4 votes, 1 answer

Why am I getting a negative information gain?

[SOLVED] My mistake was that I did not realise that entropy is 0 if all examples are of one type. Thus if all are positive, entropy is 0, and if all are negative it is 0 as well. Entropy will be 1 if equal amounts are positive and negative. It does not make…
Letholdrus (1,261 rep; badges: 3/20/36)
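The underlying identity, for a binary split with positive-class fraction p (taking 0 log 0 = 0):

    H(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)

so H(0) = H(1) = 0 (a pure node) and H(1/2) = 1; information gain computed from it is never legitimately negative.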
4 votes, 2 answers

BigQuery: compute entropy of a column

I have a suggestion for the BQ folks: I think it would be very useful if there were a built-in function that would return the entropy of a column. A column of discrete categories or values would be relatively easy. Thoughts? Does this already…
SheRey (305 rep; badges: 1/5/15)
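Pending a built-in, the aggregate is easy to spell out. A client-side sketch in Python/pandas of the same quantity (in SQL it would be the analogous SUM over per-value frequencies p of -p * LOG(p, 2)):

    import numpy as np
    import pandas as pd

    def column_entropy(col):
        # Shannon entropy (bits) of a discrete column.
        p = col.value_counts(normalize=True)
        return float(-(p * np.log2(p)).sum())

    print(column_entropy(pd.Series(["a", "b", "b", "c"])))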
4 votes, 2 answers

Log base 2 in Java for doubles

I'm trying to calculate the entropy of English using the following Java function

    public static void calculateEntropy() {
        for (int i = 0; i < letterFrequencies[i]; i++) {
            entropy += letterFrequencies[i] *…
Nick Gilbert (4,159 rep; badges: 8/43/90)
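Java has no Math.log2, so the usual fix is the change of base Math.log(x) / Math.log(2); the loop bound letterFrequencies[i] also looks like a typo for letterFrequencies.length. The same computation sketched in Python, with a few approximate English letter frequencies standing in for the full table:

    import math

    letter_freqs = {"e": 0.127, "t": 0.091, "a": 0.082}   # illustrative subset
    total = sum(letter_freqs.values())
    entropy = -sum((p / total) * math.log2(p / total)
                   for p in letter_freqs.values())
    print(f"{entropy:.3f} bits per letter over this subset")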
4 votes, 3 answers

Does arithmetic on a random number reduce its entropy?

For instance, as I was reading this post on betterexplained.com, the author mentioned transforming a random number that lies in the range (0,1) to a random number that lies in the range (5,10) by multiplying by 5 and then adding 5. Do operations such…
bmuk (121 rep; badges: 5)
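The identity usually cited here: an invertible transform loses no information. For a discrete variable, any injective map leaves H(X) unchanged; for a continuous one,

    h(aX + b) = h(X) + \log_2 |a|

so mapping a uniform (0,1) value to (5,10) via 5x + 5 shifts the differential entropy by \log_2 5 but, being reversible, discards none of the randomness.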
4 votes, 1 answer

Efficient calculation of entropy in Spark

Given an RDD (data) and a list of index fields to calculate entropy on: when executing the following flow, it takes approximately 5 s to calculate a single entropy value on a 2 MB (16k-row) source. def entropy(data: RDD[Array[String]], colIdx:…
rapidninja (281 rep; badges: 1/5/10)
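The question's code is Scala; a minimal PySpark sketch of the same aggregation, assuming one reduceByKey pass per column and a small distinct-value set collected to the driver:

    import math
    from pyspark import SparkContext

    def column_entropy(rdd, col_idx):
        # Count occurrences of each value in one column, then compute
        # entropy (bits) on the small count vector at the driver.
        counts = (rdd.map(lambda row: (row[col_idx], 1))
                     .reduceByKey(lambda a, b: a + b)
                     .values()
                     .collect())
        n = sum(counts)
        return -sum((c / n) * math.log2(c / n) for c in counts)

    sc = SparkContext.getOrCreate()
    rdd = sc.parallelize([["a", "x"], ["a", "y"], ["b", "x"]])
    print(column_entropy(rdd, 0))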
4 votes, 2 answers

DPAPI + Entropy

We have a WPF app that allows our users to download encrypted content, and we want to provide the ability to decrypt this content offline. The idea is to download the keys and store them using DPAPI, but I'm having trouble with the entropy…
TWith2Sugars (3,384 rep; badges: 2/26/43)
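A hedged sketch of the moving parts with pywin32 (win32crypt wraps DPAPI; the signatures below are my assumption of its usual form): the optional entropy acts as a second secret, and the exact same bytes must be presented again to unprotect, which is where offline scenarios usually go wrong.

    import win32crypt

    secret = b"content key bytes"
    extra_entropy = b"app-specific secret"   # hypothetical value; must match on decrypt

    blob = win32crypt.CryptProtectData(secret, None, extra_entropy, None, None, 0)
    desc, plain = win32crypt.CryptUnprotectData(blob, extra_entropy, None, None, 0)
    assert plain == secret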