I am trying to create a library. Let's say I have a model that consists of an equation producing outputs, some inputs, and a describe function. The inputs would be:

x= [1,2,3,4,5,6]
y= [5,2,4,8,9,2]

And I put it into a function:

# Returns the fitted y values
function fit(x, a, b)
    y = a .* x .+ b
end

And another that outputs a summary using a describe function:

# Describes the equation fitting
function describe(model)  # some model holding the data from the previous functions
    #= Prints the following: residuals (y - fit(x, a, b)), x and y, r^2,
       a and b
    =#
end

What is the best way of doing this in Julia? Should I use a type?

Currently I am using a very large function such as:

function Model(x,y,a,b,describe ="yes")
    ....function fit
    ....Something with if statements to controls the outputs
    ....function describe
end

But this is not very efficient or user friendly.

ccsv
  • I don't think I understand what you are trying to do. From `function describe(m::Model)` it seems that you already have a model type? Is the last function the constructor for that type? – spencerlyon2 Nov 08 '14 at 18:17
  • @spencerlyon2 That is just pseudocode. Let me rewrite it so it is more clear – ccsv Nov 09 '14 at 00:31
  • @spencerlyon2 Basically I am trying to find a way to rewrite a function so I would not have to do something like `Model(1,2,4,5,"no","false"...etc)` so there can be individual commands for describing, or implementing different functions on a dataset. – ccsv Nov 09 '14 at 00:38
  • The natural thing to do would be to define a `type` for the `Model` and then write various methods (describe, implement function f) for the type. I'm afraid without a bit more code to work off of I can't be more concrete than that. The sections on [types](http://julia.readthedocs.org/en/latest/manual/types/) and [methods](http://julia.readthedocs.org/en/latest/manual/methods/) from the manual are the relevant pieces of documentation. – spencerlyon2 Nov 09 '14 at 00:40
  • @spencerlyon2 Yes, that is true, but which elements are best to put in the type? I am not very good at this and the documentation is a bit confusing. – ccsv Nov 09 '14 at 02:36

2 Answers

It seems like you are trying to shoehorn a specific OOP style onto Julia that is not really a good fit. Julia does not have classes. Instead you use a combination of types, functions that dispatch on those types, and modules that encapsulate the whole.

As a made-up example, let's make a package that does OLS regression. In order to encapsulate the code you wrap it in a module. Let's call it OLSRegression:

module OLSRegression

end

We need a model to store the results of the regression and to dispatch on:

type OLS
    a::Real
    b::Real
end

We then need a function to fit our OLS to data. Rather than creating our own fit function we can extend the one available in StatsBase.jl:

using StatsBase

function StatsBase.fit(::Type{OLS}, x, y)
    a, b = linreg(x, y)
    OLS(a, b)
end

Then we can create a describe function to print out the fitted model:

function describe(obj::OLS)
    println("The model fit is y = $(obj.a) + $(obj.b) * x")
end

Lastly, we need to export the created types and functions from the module:

export OLS, describe, fit

The whole module put together is:

module OLSRegression

using StatsBase

export OLS, describe, fit

type OLS <: RegressionModel
    a::Real
    b::Real
end

function StatsBase.fit(::Type{OLS}, x, y)
    a, b = linreg(x, y)
    OLS(a, b)
end

function describe(obj::OLS)
    println("The model fit is y = $(obj.a) + $(obj.b) * x")
end

end

You would then use it like this:

julia> using OLSRegression

julia> m = fit(OLS, [1,2,5,4], [2,2,4,6])

julia> describe(m)
The model fit is y = 1.1000000000000005 + 0.7999999999999999 * x
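For readers on Julia 1.x, where the `type` keyword has become `struct` and `linreg` is no longer part of Base, the same design can be sketched without any external package. This is only an illustration under assumptions: the fields `x` and `y`, the helpers `predict`, `residuals` and `rsquared`, and the local `fit` (defined here rather than extending `StatsBase.fit`) are names made up for this example. It also shows one way `describe` could report the residuals and r^2 that the question asks for.

```julia
# Immutable container for the data and the fitted coefficients (y = a + b*x).
# Field names are assumptions for this sketch.
struct OLS
    x::Vector{Float64}
    y::Vector{Float64}
    a::Float64   # intercept
    b::Float64   # slope
end

# Closed-form simple linear regression; dispatches on the type OLS itself.
function fit(::Type{OLS}, x, y)
    mx, my = sum(x) / length(x), sum(y) / length(y)
    b = sum((x .- mx) .* (y .- my)) / sum((x .- mx) .^ 2)
    a = my - b * mx
    return OLS(Float64.(x), Float64.(y), a, b)
end

# Small, separate functions instead of one large function with flags.
predict(m::OLS) = m.a .+ m.b .* m.x
residuals(m::OLS) = m.y .- predict(m)

function rsquared(m::OLS)
    my = sum(m.y) / length(m.y)
    return 1 - sum(residuals(m) .^ 2) / sum((m.y .- my) .^ 2)
end

function describe(m::OLS)
    println("The model fit is y = $(m.a) + $(m.b) * x")
    println("Residuals: ", residuals(m))
    println("r^2: ", rsquared(m))
end

m = fit(OLS, [1, 2, 5, 4], [2, 2, 4, 6])
describe(m)
```

Each piece can be called on its own, for example `residuals(m)` or `rsquared(m)`, so no "yes"/"no" switches are needed to control what gets computed or printed.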

EDIT: Let me add some comments on methods, multiple dispatch and shadowing.

In a traditional OOP language you can have different objects that have methods with the same name. For example: we have the object dog and the object cat. They both have a method called run. I can call the appropriate run method with the dot syntax: dog.run() or cat.run(). This is single dispatch. The appropriate method is called based on the type of the first argument. Because of the importance of the first argument it appears before the method name instead of inside the parentheses.

Julia does not have this sort of dot syntax for calling methods, but it still has dispatch. Instead the first argument appears inside the parentheses just like all the other arguments. So you would do run(dog) or run(cat) and it still dispatches to the appropriate method for the dog or cat type.

This is also what is happening with describe(obj::OLS). I'm creating a new method describe and specifying that this method should be called when the first parameter is of type OLS.

Julia's dispatch goes beyond single dispatch to multiple dispatch. In single dispatch the calls cat.run("fast") and cat.run(5) would dispatch to the same method and it is up to the method to do different things with the different types of the second parameter. In Julia run(cat, "fast") and run(cat, 5) dispatch to separate methods.
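A minimal sketch of the difference, written for Julia 1.x (where `struct` replaces `type`). The `Dog` and `Cat` types are hypothetical, and because Base already has a `run` function for launching external commands, the sketch uses the made-up name `move` instead.

```julia
struct Dog end
struct Cat end

# Dispatch on the first argument, like dog.run() / cat.run() in an OOP language.
move(::Dog) = "the dog runs"
move(::Cat) = "the cat runs"

# Dispatch on the second argument as well: these are two separate methods.
move(::Cat, speed::AbstractString) = "the cat runs $speed"
move(::Cat, distance::Integer) = "the cat runs $distance metres"

println(move(Dog()))
println(move(Cat()))
println(move(Cat(), "fast"))
println(move(Cat(), 5))

# All four definitions are methods of the one generic function `move`.
println(length(methods(move)))
```

Calling `move(Dog(), 5)` would raise a `MethodError`, since no method was defined for that combination of argument types.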

I've seen the creators of Julia call it a verb language and traditional OOP languages noun languages. In noun languages you attach methods to objects (nouns), but in Julia you attach methods to generic functions (verbs). In the module above I'm both creating a new generic function describe (because there is no generic function of that name) and attaching a method to it that dispatches on OLS types.

What I am doing with the fit function is that, rather than creating a new generic function called fit, I am importing it from the StatsBase package and adding a new method for fitting our OLS type. Now both my fit method and any other fit methods in other packages get dispatched to when called with the right types of arguments. The reason I am doing this is that if I created a new fit function it would shadow the one in StatsBase. For functions you export in Julia it is generally better to extend an existing canonical generic function rather than create your own and risk shadowing a function in Base or some other package.
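The idea of extending rather than shadowing can be sketched in Julia 1.x without installing anything. In this illustration `FakeStats` is a hypothetical stand-in for a package such as StatsBase that owns the generic function `fit`; the closed-form regression inside the method is likewise just an assumption made to keep the example self-contained.

```julia
# Hypothetical stand-in for a package that owns the generic function.
module FakeStats
export fit
function fit end              # a generic function with no methods yet
end

module OLSRegression
import ..FakeStats            # bring the owning module into scope
export OLS

struct OLS
    a::Float64   # intercept
    b::Float64   # slope
end

# Qualifying the name adds a method to FakeStats.fit
# instead of creating a new, unrelated function called fit.
function FakeStats.fit(::Type{OLS}, x, y)
    mx, my = sum(x) / length(x), sum(y) / length(y)
    b = sum((x .- mx) .* (y .- my)) / sum((x .- mx) .^ 2)
    return OLS(my - b * mx, b)
end
end

using .FakeStats, .OLSRegression

m = fit(OLS, [1, 2, 5, 4], [2, 2, 4, 6])   # the shared fit dispatches to our method
println(m)
println(fit === FakeStats.fit)             # there is only one generic function
```

Any other module could add its own methods to `FakeStats.fit` in the same way, and all of them would coexist under the single exported name.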

If some other package exported its own describe generic function and was loaded after our OLSRegression package, the command describe(m) would error. We could still access our describe function with a fully qualified name, i.e. OLSRegression.describe.

EDIT2: Regarding the ::Type{OLS} stuff.

In the function call fit(OLS, [1,2,5,4], [2,2,4,6]) OLS is written without parentheses, which means I'm not constructing an instance of the OLS type and passing that to the function, but instead I'm passing the type itself to the method.

In obj::OLS the ::OLS part specifies that the object should be an instance of the type OLS. The obj before that is the name I am binding that instance to for use in the function body. ::Type{OLS} is different in two ways. Rather than specifying that the argument should be an instance of the type OLS, it specifies that the argument should be an instance of Type, parametrized with OLS. There is nothing before the colons because I am not binding it to any variable name, because I don't need to use it in the function body.

The reason I am doing this is simply to help disambiguate between different methods of fit. Some other package might also be extending the fit function in StatsBase. If we both used a function signature like StatsBase.fit(x, y), the two definitions would be indistinguishable to Julia: they would collide, and the one loaded last would replace the other. If instead I use a function signature like StatsBase.fit(::Type{OLS}, x, y) and the other package does something like StatsBase.fit(::Type{NLLS}, x, y), then the methods are distinct, and the user can pass the type as the first parameter to specify which method they want.
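The distinction between an instance and the type itself can be checked directly. The sketch below is for Julia 1.x; `NLLS` is a hypothetical second model type, and both `fit` methods return placeholder values rather than doing any real fitting, since only the dispatch is of interest here.

```julia
struct OLS
    a::Float64
    b::Float64
end

struct NLLS                    # hypothetical second model type
    params::Vector{Float64}
end

# obj is bound to an *instance* of OLS.
describe(obj::OLS) = "OLS with a = $(obj.a), b = $(obj.b)"

# The first argument is the *type itself*; it is unnamed because the body never uses it.
fit(::Type{OLS}, x, y) = OLS(0.0, 1.0)      # placeholder result, no real fitting
fit(::Type{NLLS}, x, y) = NLLS([0.0])       # placeholder result, no real fitting

x, y = [1, 2, 5, 4], [2, 2, 4, 6]

println(OLS(1.1, 0.8) isa OLS)     # an instance matches ::OLS
println(OLS isa Type{OLS})         # the type itself matches ::Type{OLS}
println(typeof(fit(OLS, x, y)))    # the first argument selects the OLS method
println(typeof(fit(NLLS, x, y)))   # the first argument selects the NLLS method
```

Passing an instance where the type is expected, as in `fit(OLS(1.1, 0.8), x, y)`, would raise a `MethodError`, because no method of `fit` was defined for that signature.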

Mr Alpha
  • Because of how Julia is set up, I am guessing `function describe` will only work for `obj::OLS`, so if I have a describe function from another package I would indicate that I am using this one with `OLSRegression.describe()` – ccsv Nov 09 '14 at 20:09
  • @ccsv I added some comments that should hopefully clear it up. – Mr Alpha Nov 11 '14 at 18:51
  • Nice answer. Quick question: why is it that for the `.fit` function you need to put `::Type{OLS}`, but for the describe you put `obj::OLS`? In the latter case it works like an input filter for the dispatch, but I can't figure out the syntax for the first one... OLS doesn't really work like an input there. – cd98 Nov 15 '14 at 15:45
  • @ccsv I added even more stuff to the answer. Hope it helps. – Mr Alpha Nov 15 '14 at 18:03

Matlab tends to encourage one monolithic function, but in Julia it's almost always better to break things up into smaller pieces. I'm not sure I really understand what you're trying to do, but as for documentation, do check out Docile and Lexicon (which work as a pair).

tholy