Questions tagged [frequency-distribution]

A frequency distribution is an arrangement of the values that one or more variables take in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way, the table summarizes the distribution of values in the sample.
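
As a rough illustration of the description above, here is a minimal Python sketch that builds a frequency table two ways: once per distinct value and once per interval. The function names `frequency_table` and `binned_frequency_table`, the sample data, and the interval width are all made up for this example; they do not come from any question on this page.

```python
from collections import Counter


def frequency_table(values):
    """Count how often each distinct value occurs in the sample."""
    return Counter(values)


def binned_frequency_table(values, width):
    """Count values per half-open interval [lo, lo + width).

    Assumes numeric values and a positive integer width (illustrative only).
    """
    counts = Counter((v // width) * width for v in values)
    # Key each count by its (lower, upper) interval bounds, in ascending order.
    return {(lo, lo + width): n for lo, n in sorted(counts.items())}


# Hypothetical sample, chosen only to show both kinds of table.
sample = [1, 2, 2, 3, 3, 3, 7, 8, 12]

by_value = frequency_table(sample)
by_interval = binned_frequency_table(sample, width=5)

for value, count in sorted(by_value.items()):
    print(value, count)
for (lo, hi), count in by_interval.items():
    print(f"[{lo}, {hi})", count)
```

The per-value table suits categorical or low-cardinality data, while the interval table is the form usually drawn as a histogram.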

220 questions
2
votes
3 answers

Frequency Distribution Comparison Python

I'm using Python and NLTK to study some texts and I want to compare the frequency distributions of parts of speech across the different texts. I can do it for one text: from nltk import * X_tagged =…
2
votes
1 answer

NLTK FreqDist no longer working

I'm relatively new to Python and NLTK, but I wrote a program that used FreqDist from NLTK. It's been working as intended for the last week but today it returned: 'FreqDist' object has no attribute 'most_common' Does anyone have any idea as to why…
stackex
  • 21
  • 3
2
votes
1 answer

Remove missing values from frequency distributions in ggplot

My data dsL<-readRDS("./Data/Derived/dsL.rds") # color palette for the outcome attcol8<-c("Never"="#4575b4", "Once or Twice"="#74add1", "Less than once/month"="#abd9e9", "About once/month"="#e0f3f8", …
andrey
  • 2,029
  • 2
  • 18
  • 23
2
votes
1 answer

Get a spatial frequency breakdown for greyscale image (2d array)

I would like to get a plot of how much each spatial frequency is present in a grayscale image. I have been told to try np.fft.fft2 but apparently this is not what I need (according to this question). I was then told to look into np.fft.fftfreq - and…
TheChymera
  • 17,004
  • 14
  • 56
  • 86
2
votes
2 answers

R: how to get frequency count by each date and hour

I have a dataframe with four variables: "Period", "cell_id", "daterank", and "timerank". I would like to get a frequency of the cell id's (there are 115 unique levels (or cell_id's)) for each date and each hour by individual Period. "Period" is a…
phish_researcher
  • 73
  • 1
  • 1
  • 5
2
votes
3 answers

Calculating frequency distribution of a collection with .Net/C#

Is there a fast/simple way to calculate the frequency distribution of a .Net collection using Linq or otherwise? For example: An arbitrarily long List contains many repetitions. What's a clever way of walking the list and counting/tracking…
Paul Sasik
  • 79,492
  • 20
  • 149
  • 189
2
votes
1 answer

Median of a frequency distribution

I want to calculate the median of a frequency distribution for a large number of samples. Each of the samples has a number of classes (3 in the example) and their respective frequencies. Each of the classes is associated with a different value data…
user12975
  • 121
  • 1
  • 2
  • 9
2
votes
1 answer

Applying functions from histograms - in R

I have a very basic grasp of stats, and a very basic grasp of R so please bear with me. I have survey data which shows the weekly expenditure of a number of respondents. I have put this into a histogram, and have plotted a density function as well.…
2
votes
1 answer

R compute percentage values in data frame

My question today refers to a data frame I'm working on in R. The header of the data frame looks like the following: String(unique), Integer N[0-23] Those 24 Integer values represent the frequency of the String associated with each hour of the day.…
deemel
  • 1,006
  • 5
  • 17
  • 34
2
votes
1 answer

Efficient way to get frequency distribution of values in a large MySQL table

I have two tables viz. Total_Data and Distinct_S1. Total_Data has 3.5 million rows. Fields: "S1", "S2", "S3", "S4" Distinct_S1 has 1 million rows. Fields: "S1", "frequency". "S1" of Distinct_S1 consists of all distinct values which occur in "S1" of…
yang5
  • 1,125
  • 11
  • 16
2
votes
2 answers

Python NLTK FreqDist() Reduce Memory Usage By Writing k,v to disk?

I have a small program that uses NLTK to get the frequency distribution of a rather large dataset. The problem is that after a few million words I start to eat up all the RAM on my system. Here's what I believe to be the relevant lines of…
secumind
  • 1,141
  • 1
  • 17
  • 38
2
votes
1 answer

Python Frequency Distribution (FreqDist / NLTK) Issue

I'm attempting to break a list of words (a tokenized string) into each possible substring. I'd then like to run a FreqDist on each substring, to find the most common substring. The first part works fine. However, when I run the FreqDist, I get the…
Adam_G
  • 7,337
  • 20
  • 86
  • 148
2
votes
1 answer

Frequency and cumulative frequency curve on the same graph in R

Is there a way (in R with ggplot or otherwise) to draw frequency and cumulative frequency curves in a single column (two rows), i.e. one on top of the other, such that a given quartile can be shown on both curves using straight lines? I hope I am…
Stat-R
  • 5,040
  • 8
  • 42
  • 68
1
vote
2 answers

Extracting time at which frequencies occur

I take a song sample and perform the FFT (fast Fourier transformation) on the sample. I am able to get the frequencies of the song, but I am not able to get the time at which those frequencies occur. So, it basically becomes useless as I have to…
1
vote
1 answer

Drawing frequency distribution of the daily maximum urban heat island intensity as a function of the time of day

I am pretty new to Python. I am currently working with urban heat island intensity. The first 10 rows of the 3-hourly UHI intensity data are as follows: uhii = array([[ 1.9, 1.4, 1. , 0.6, 1.9, 0.6, 0.5, 2. ], [-2.1, -1.3, -3. , …
Abeda
  • 23
  • 5