Plot log(n over k)

Question

I've never used Matlab before and I really don't know how to fix the code. I need to plot log(1000 over k) with k going from 1 to 1000.

y = @(x) log(nchoosek(1000,x));

fplot(y,[1 1000]);

Error:

Warning: Function behaves unexpectedly on array inputs. To improve performance, properly
vectorize your function to return an output with the same size and shape as the input
arguments. 
In matlab.graphics.function.FunctionLine>getFunction
In matlab.graphics.function.FunctionLine/updateFunction
In matlab.graphics.function.FunctionLine/set.Function_I
In matlab.graphics.function.FunctionLine/set.Function
In matlab.graphics.function.FunctionLine
In fplot>singleFplot (line 241)
In fplot>@(f)singleFplot(cax,{f},limits,extraOpts,args) (line 196)
In fplot>vectorizeFplot (line 196)
In fplot (line 166)
In P1 (line 5)

`nchoosek` needs a scalar input. You should use a loop to compute its result for each `k` separately. But then again, I think your exercise is meant for you to not use `nchoosek`, as the intermediate values are humongous and will likely not fit in double-precision floating point numbers (or at least give wrong results). What is the equation for log(n over k)? — Cris Luengo, May 23 '19 at 15:10
this will help you https://www.tutorialspoint.com/matlab/matlab_plotting.htm — imbr, May 23 '19 at 15:12
Possible duplicate of [fitting and plotting a parabola, matlab](https://stackoverflow.com/questions/7550217/fitting-and-plotting-a-parabola-matlab) — imbr, May 23 '19 at 15:15
Not a duplicate at all. This question uses `fplot` (which takes an anonymous function as input), whereas the other one uses `plot` (which takes numeric vectors). Also, this question has potential numerical precision issues, due to the large numbers involved in the OP's approach — Luis Mendo, May 23 '19 at 15:20
@LuisMendo Considering that his data is all discrete, it would probably make more sense for him to use plot than fplot for this application. — John, May 23 '19 at 16:54

Luis Mendo · Accepted Answer · 2019-05-24T12:30:28.997

There are several problems with the code:

nchoosek does not vectorize on the second input, that is, it does not accept an array as input. fplot works faster for vectorized functions. Otherwise it can be used, but it issues a warning.
The result of nchoosek is close to overflowing for such large values of the first input. For example, nchoosek(1000,500) gives 2.702882409454366e+299, and issues a warning.
nchoosek expects integer inputs. fplot uses in general non-integer values within the specified limits, and so nchoosek issues an error.

You can solve these three issues exploiting the relationship between the factorial and the gamma function and the fact that Matlab has gammaln, which directly computes the logarithm of the gamma function:

n = 1000;
y = @(x) gammaln(n+1)-gammaln(x+1)-gammaln(n-x+1);
fplot(y,[1 1000]);

Note that you get a plot with y values for all x in the specified range, but actually the binomial coefficient is only defined for non-negative integers.

Excellent and clever exploitation of factorial and gamma function (+1). Also, I wasn't aware of the `gammaln` function. Thanks! — SecretAgentMan, May 23 '19 at 15:22

score 2 · Answer 2 · answered May 23 '19 at 15:23

2

OK, since you've gotten spoilers for your homework exercise anyway now, I'll post an answer that I think is easier to understand.

The multiplicative formula for the binomial coefficient says that

n over k = product_{i=1 to k}( (n+1-i)/i )

(sorry, no way to write proper formulas on SO, see the Wikipedia link if that was not clear).

To compute the logarithm of a product, we can compute the sum of the logarithms:

log(product(x_i)) = sum(log(x_i))

Thus, we can compute the values of (n+1-i)/i for all i, take the logarithm, and then sum up the first k values to get the result for a given k.

This code accomplishes that using cumsum, the cumulative sum. Its output at array element k is the sum over all input array elements from 1 to k.

n = 1000;
i = 1:1000;
f = (n+1-i)./i;
f = cumsum(log(f));
plot(i,f)

Note also ./, the element-wise division. / performs a matrix division in MATLAB, and is not what you need here.

answered May 23 '19 at 15:23

Cris Luengo

55,762
10
62
120

This is clever. I hadn't considered the *sum of logs* approach but makes perfect since now that you spell it out. Very crisp, clean approach. (+1) – SecretAgentMan May 23 '19 at 15:27
Other SE site support using MathJax to render LaTex. This seems like it would be a very useful feature on SO as well. – John May 23 '19 at 17:00
@John: That suggestion has been discussed and shot down on Meta multiple times. They don't want to do it, unfortunately. – Cris Luengo May 23 '19 at 17:01
1

@CrisLuengo Yeah I just went down a rabbit hole investigating this. Seems like to comes up a lot and we just live with ugly typesetting and inefficient communication because reasons... – John May 23 '19 at 17:06

score 2 · Answer 3 · answered May 23 '19 at 16:22

2

syms function type reproduces exactly what you want

syms x

y = log(nchoosek(1000,x));
fplot(y,[1 1000]);

answered May 23 '19 at 16:22

Adam

2,726
1
9
22

SecretAgentMan · Answer 4 · 2019-05-23T16:34:06.970

This solution uses arrayfun to deal with the fact that nchoosek(n,k) requires k to be a scalar. This approach requires no toolboxes.

Also, this uses plot instead of fplot since this clever answer already addresses how to do with fplot.

% MATLAB R2017a
n = 1000;
fh=@(k) log(nchoosek(n,k));  
K = 1:1000;
V = arrayfun(fh,K);    % calls fh on each element of K and return all results in vector V

plot(K,V)

Note that for some values of k greater than or equal to 500, you will receive the warning

Warning: Result may not be exact. Coefficient is greater than 9.007199e+15 and is only accurate to 15 digits

because nchoosek(1000,500) = 2.7029e+299. As pointed out by @Luis Mendo, this is due to realmax = 1.7977e+308 which is the largest real floating-point supported. See here for more info.

Plot log(n over k)

4 Answers4

Linked