3

I attached image: alt text
(source: piccy.info)

The image shows a plot of a function that is defined only at discrete points, for example at x = 1..N.

The second plot, drawn as a semitransparent curve, is what I want to obtain from the original one, i.e. I want to approximate the original function so that it becomes smooth.

Are there any methods for doing that?

I have heard about the least squares method, which can be used to approximate a function by a straight line or by a parabola. But I do not need a parabolic approximation; I probably need to approximate the data by a trigonometric function. Are there any methods for doing that? And is it possible to use the least squares method for this problem, if it can be derived for trigonometric functions?

One more question: if I use the discrete Fourier transform and treat the function as a sum of waves, maybe the noise has special features by which it can be identified. Then I could set the corresponding frequencies to zero and perform the inverse Fourier transform. If you think this is possible, what would you suggest for identifying the frequencies of the noise?

Glorfindel
maximus
  • I don't know which is better to use, sine, cosine or polynomial. Yes I know about splines, but I used them for parabolic functions only. I didn't try approximating by trigonometric function, yet. – maximus Jan 07 '10 at 14:53
  • Actually I want to binarize the image line. Without using any binarization methods like Otsu, Bernsen and so on - which use grayscale image data and some threshold value for binarization. – maximus Jan 07 '10 at 14:55
  • And one more to say - I want a fast method. Time is important. – maximus Jan 07 '10 at 14:57
  • IMHO, this looks like a (badly :( drawn square wave function with a few harmonics. So, I'd go with looking out on google on how fourier coefficients for f. series are determined. – Rook Jan 07 '10 at 15:55
  • No, I don't mean that the output curve must have only the values 0 and 1. I want a smooth curve as the output. After getting that smooth curve, I can for example identify its local minima and maxima, and calculate for each pixel of the image line containing that curve a local threshold value, which must be the average of some local minima and maxima around the pixel. – maximus Jan 07 '10 at 16:32
  • If image contains some shadows this method must help. And it is more accurate I think, than making binarization of the whole image. – maximus Jan 07 '10 at 16:37
  • No, I don't want to fit the curve to the original image; I want to approximate the curve, get a smoothed curve and use it to binarize the image line. If I don't do so, I will have many local maxima and minima, which is not good I think, but I can smooth the image using dilation and erosion. And then I need a smooth curve to construct a good threshold line. I attach one more picture; it shows an image line that has some shadows, and also a threshold line. I marked the local maxima and minima with black points. – maximus Jan 08 '10 at 17:27
  • So to construct the threshold line, I need to take a few local maximums and minimums around that point and calculate the average value. It will be the threshold value for each pixel. – maximus Jan 08 '10 at 17:28
  • There are some problems, if we have an abrupt change in wave height, the average value among the local extrema will not be accurate. It will be overvalued or undervalued – maximus Jan 08 '10 at 17:39

5 Answers

9

Unfortunately, many of the solutions presented here don't solve the problem and/or are plain wrong. There are many approaches, and each is built for specific conditions and requirements that you must be aware of!

a) Approximation theory: If you have a sharply defined function without errors (given either by definition or by data) and you want to trace it as exactly as possible, you use polynomial or rational approximation with Chebyshev or Legendre polynomials, meaning that you approximate the function by a polynomial or, if it is periodic, by a Fourier series.

b) Interpolation: If you have a function where some points (but not the whole curve!) are given and you need a function that passes through these points, you can use several methods:

Newton-Gregory, Newton with divided differences, Lagrange, Hermite, Spline

c) Curve fitting: You have a function with given points and you want to draw a curve with a given (!) function which approximates the curve as closely as possible. There are linear and nonlinear algorithms for this case.

Your drawing implies:

  • It is not remotely like a mathematical function.
  • It is not sharply defined by data or by a function.
  • You need to fit the curve, not some points.

What you want and need is

d) Smoothing: Given a curve or datapoints with noise or rapidly changing elements, you only want to see the slow changes over time.

You can do that with LOESS as Jacob suggested (but I find that overkill, especially because choosing a reasonable span needs some experience). For your problem, I simply recommend the running average as suggested by Jim C.

http://en.wikipedia.org/wiki/Running_average
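A running average is short enough to write by hand. The following is only a minimal sketch, not a tuned implementation: the function name `running_average`, the `window` parameter and the sample data are made up for illustration, and the choice to shrink the window at both ends of the line (instead of padding) is an assumption.

```python
def running_average(values, window=5):
    """Centered moving average; the window shrinks at both ends of the data."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)                 # first index inside the window
        hi = min(len(values), i + half + 1)   # one past the last index
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical noisy scan line: a slow ramp with alternating +/-1 noise.
noisy = [i + (1 if i % 2 else -1) for i in range(10)]
smooth = running_average(noisy, window=3)
print(smooth)
```

A wider window gives a smoother curve but also flattens real peaks and valleys, so the window should stay well below the width of the features you want to keep.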

Sorry, cdonner and Orendorff, your proposals are well-meant, but completely wrong because you are applying the right tools to the wrong problem.

These guys used a sixth-degree polynomial to fit climate data and embarrassed themselves completely.

http://scienceblogs.com/deltoid/2009/01/the_australians_war_on_science_32.php

http://network.nationalpost.com/np/blogs/fullcomment/archive/2008/10/20/lorne-gunter-thirty-years-of-warmer-temperatures-go-poof.aspx

Thorsten S.
  • Excellent, accurate response! – Alex Budovski Jan 09 '10 at 03:21
  • I will try every thing step by step! Now I want to try to calculate at each pixel of the image line the threshold value. Which will be calculated by average value of the left extreme value and right extreme value. This is not good if image line is not a smooth curve. But I made smoothing to the image line, using smooth filter, and line now seems to be a smooth curve, but what if we don't get the smooth curve after using smooth filter? In this case this algorithm will be good if only we provide a good image line smoothing algorithm, i mean getting a good smoothed line. – maximus Jan 11 '10 at 14:19
  • If we take the luminance value of each pixel of the source image, we get Y(x)=a*R(x)+b*G(x)+c*B(x), where a, b and c can be set differently. Then, after we apply a smoothing filter to the image line and calculate all of its extreme values, we can keep only those extreme values that have a difference |L(x') - L(x'')| > epsilon, where epsilon is set to some small value. So we will ignore unimportant (noisy) wave changes. I got that from an algorithm which is also used to construct a threshold line, but I didn't check it yet, and I don't know if this algorithm will always give good results. – maximus Jan 11 '10 at 14:29
  • +1 In my defense, the question says, "I probably need to approximate it by trigonometric function. So are there any methods for doing that?" But I think your answer is right: curve fitting is not really what maximus needs, even if it is what he initially asked for... – Jason Orendorff Jan 16 '10 at 21:52
3

Use loess in R (free).

E.g. here the loess function approximates a noisy sine curve.

sine
(source: stowers-institute.org)

As you can see, you can tweak the smoothness of your curve with the span parameter.

Here's some sample R code from here:

Step-by-Step Procedure

Let's take a sine curve, add some "noise" to it, and then see how the loess "span" parameter affects the look of the smoothed curve.

  1. Create a sine curve and add some noise:

    period <- 120
    x <- 1:120
    y <- sin(2*pi*x/period) + runif(length(x),-1,1)

  2. Plot the points on this noisy sine curve:

    plot(x,y, main="Sine Curve + 'Uniform' Noise")
    mtext("showing loess smoothing (local regression smoothing)")

  3. Apply loess smoothing using the default span value of 0.75:

    y.loess <- loess(y ~ x, span=0.75, data.frame(x=x, y=y))

  4. Compute loess smoothed values for all points along the curve:

    y.predict <- predict(y.loess, data.frame(x=x))

  5. Plot the loess smoothed curve along with the points that were already plotted:

    lines(x,y.predict)

Glorfindel
Jacob
  • What conditions must be met in order to say that the desired result, I mean the desired curve, is obtained? If the picture has a very small difference in color at some places, then the part of the line corresponding to that place of the image will be perceived as noise. I wonder how these methods work on cell phones. So far, I cannot find a solution, although the algorithm is known. If we have the ideal barcode then it is all right, but if it contains noise, then the whole thing goes to waste. – maximus Jan 13 '10 at 15:05
2

You could use a digital filter like an FIR filter. The simplest FIR filter is just a running average. For a more sophisticated treatment, look at something like an FFT.
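The FFT route can be sketched as a crude low-pass filter: transform, zero the high-frequency bins, transform back. This sketch assumes NumPy is available; the function name `fft_lowpass`, the cutoff `keep` and the test signal are hypothetical. Choosing the cutoff is the part that depends on your data; looking at `np.abs(np.fft.rfft(signal))` to see where the useful peaks end is one way to pick it.

```python
import numpy as np


def fft_lowpass(signal, keep):
    """Zero every frequency bin above index `keep`, then invert the transform."""
    spectrum = np.fft.rfft(signal)
    spectrum[keep + 1:] = 0           # discard the fast oscillations
    return np.fft.irfft(spectrum, n=len(signal))


n = 256
x = np.arange(n)
clean = np.sin(2 * np.pi * 3 * x / n)          # 3 cycles over the line
noise = 0.3 * np.sin(2 * np.pi * 60 * x / n)   # fast ripple standing in for noise
smooth = fft_lowpass(clean + noise, keep=10)

# Largest deviation of the filtered line from the noise-free one.
print(np.max(np.abs(smooth - clean)))
```

Real noise is spread over many bins rather than sitting in a single one, so in practice the result will not be this clean, and a hard cutoff can introduce ringing near sharp edges.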

Jim C
2

This is called curve fitting. The best way to do this is to find a numeric library that can do it for you. Here is a page showing how to do this using scipy. The picture on that page shows what the code does:

graph showing two noisy data sets and two best-fit sine curves
(source: scipy.org)

Now it's only 4 lines of code, but the author doesn't explain it at all. I'll try to explain briefly here.

First you have to decide what form you want the answer to be. In this example the author wants a curve of the form

f(x) = p0 cos (2π/p1 x + p2) + p3 x

You might instead want the sum of several curves. That's OK; the formula is an input to the solver.

The goal of the example, then, is to find the constants p0 through p3 to complete the formula. scipy can find this array of four constants. All you need is an error function that scipy can use to see how close its guesses are to the actual sampled data points.

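from numpy import cos, pi
# Tx holds the sample positions and tX the measured values at those positions
fitfunc = lambda p, x: p[0]*cos(2*pi/p[1]*x+p[2]) + p[3]*x # Target function
errfunc = lambda p: fitfunc(p, Tx) - tX # Distance to the target function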

errfunc takes just one parameter: an array of length 4. It plugs those constants into the formula and calculates an array of values on the candidate curve, then subtracts the array of sampled data points tX. The result is an array of error values; scipy.optimize.leastsq minimizes the sum of the squares of these values.

Then just put some initial guesses in and scipy.optimize.leastsq crunches the numbers, trying to find a set of parameters p where the error is minimized.

from scipy import optimize
p0 = [-15., 0.8, 0., -1.] # Initial guess for the parameters
p1, success = optimize.leastsq(errfunc, p0[:])

The result p1 is an array containing the four constants. success is an integer flag; it is 1, 2, 3, or 4 if the solver actually found a solution. (If the errfunc is sufficiently crazy, the solver can fail.)
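Put together, the pieces above can be run end to end. This is a sketch under stated assumptions: the sample arrays `Tx` and `tX` are synthetic data generated here (the data from the linked page is not reproduced), and `true_p`, the noise level and the initial guess `p0` are all made-up values chosen so that the solver starts near the answer.

```python
import numpy as np
from scipy import optimize

# Same model as above: cosine with amplitude, period and phase, plus a linear trend.
fitfunc = lambda p, x: p[0] * np.cos(2 * np.pi / p[1] * x + p[2]) + p[3] * x

# Hypothetical sampled data: the model evaluated at known parameters, plus noise.
rng = np.random.default_rng(0)
Tx = np.linspace(0.0, 10.0, 200)
true_p = [3.0, 2.5, 0.4, -0.5]
tX = fitfunc(true_p, Tx) + rng.normal(0.0, 0.2, Tx.size)

# Residuals between the candidate curve and the samples.
errfunc = lambda p: fitfunc(p, Tx) - tX

p0 = [2.0, 2.45, 0.0, -0.3]   # initial guess, deliberately close to true_p
p1, success = optimize.leastsq(errfunc, p0)

print("fitted parameters:", p1)
print("solver flag:", success)
```

The fit is sensitive to the initial guess for the period: if it is far off, the solver can settle on a different local minimum, which is one reason this approach suits data with a known shape better than general smoothing.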

Glorfindel
Jason Orendorff
1

This looks like a polynomial approximation. You can play with polynomials in Excel ("Add Trendline" to a chart, select Polynomial, then increase the order to the level of approximation that you need). It shouldn't be too hard to find an algorithm/code for that. Excel can show the equation that it came up with for the approximation, too.
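Outside Excel, the same experiment can be sketched with NumPy's `polyfit`. The data here is a hypothetical four-cycle sine wave standing in for the scan line, and the orders 2, 6 and 12 are arbitrary; the point is only to see how the residual changes as the order grows.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 101)
y = np.sin(4 * np.pi * x)   # hypothetical wave with four full cycles

rms_by_order = {}
for order in (2, 6, 12):
    coeffs = np.polyfit(x, y, order)        # least-squares polynomial fit
    fitted = np.polyval(coeffs, x)
    rms_by_order[order] = np.sqrt(np.mean((fitted - y) ** 2))

for order, rms in rms_by_order.items():
    print(order, round(rms, 3))
```

The residual can only shrink as the order increases, but a curve with many oscillations needs a high order before the fit becomes usable, and high-order polynomials tend to misbehave near the ends of the interval.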

cdonner
  • I tried polynomial approximation of order 6; it did not give me a very good result. Maybe I should try to increase the order... – maximus Jan 08 '10 at 17:33