19

Aside from parsing the function file, is there a way to get the names of the input and output arguments of a function in MATLAB?

For example, given the following function file:

divide.m

function [value, remain] = divide(left, right)
     value = floor(left / right);
     remain = left / right - value;
end

From outside the function, I want to get an array of output arguments, here: ['value', 'remain'], and similarly for the input arguments: ['left', 'right'].

Is there an easy way to do this in MATLAB? MATLAB usually seems to support reflection pretty well.

EDIT Background:

The aim of this is to present the function parameters in a window for the user to enter. I'm writing a kind of signal processing program, and functions to perform operations on these signals are stored in a subfolder. I already have a list and the names of each function from which the user can select, but some functions require additional arguments (e.g. a smooth function might take window size as a parameter).

At the moment, I can add a new function to the subfolder which the program will find, and the user can select it to perform an operation. What I'm missing is a way for the user to specify the input and output parameters, and here I've hit a hurdle: I can't find the names of the functions' parameters.

Hannesh
  • isn't that the reason you have the function open command? – Rasman May 03 '12 at 12:56
  • inside the function itself, or outside? I assume outside, since that makes it trivial to use. – Gunther Struyf May 24 '12 at 15:23
  • @Hannesh Do you mean you want the variable names from the function declaration itself, as it appears in the implementation? – Eitan T May 24 '12 at 15:34
  • @EitanT Yeah. The runtime must know the names to be able to create the variables on function call, so I figure there must be a way to access them. – Hannesh May 24 '12 at 20:31
  • probably the only way is to parse the files. Try if you can get anything with checkcode. – sivann May 24 '12 at 21:40

6 Answers

12

MATLAB offers a way to get class metadata (using the meta package); however, this is only available for OOP classes, not regular functions.

One trick is to write a class definition on the fly that contains the source of the function you would like to process, and let MATLAB deal with parsing the source code (which can be tricky, as you'd imagine: the function definition line may span multiple lines, comments may precede the actual definition, etc.).

So the temporary file created in your case would look like:

classdef SomeTempClassName
    methods
        function [value, remain] = divide(left, right)
            %# ...
        end
    end
end

which can then be passed to meta.class.fromName to extract the metadata.


Here is a quick-and-dirty implementation of this hack:

function [inputNames,outputNames] = getArgNames(functionFile)
    %# get some random file name
    fname = tempname;
    [~,fname] = fileparts(fname);

    %# read input function content as string
    str = fileread(which(functionFile));

    %# build a class containing that function source, and write it to file
    fid = fopen([fname '.m'], 'w');
    fprintf(fid, 'classdef %s; methods;\n %s\n end; end', fname, str);
    fclose(fid);

    %# terminating function definition with an end statement is not
    %# always required, but now becomes required with classdef
    missingEndErrMsg = 'An END might be missing, possibly matching CLASSDEF.';
    c = checkcode([fname '.m']);     %# run mlint code analyzer on file
    if ismember(missingEndErrMsg,{c.message})
        % append "end" keyword to class file
        str = fileread([fname '.m']);
        fid = fopen([fname '.m'], 'w');
        fprintf(fid, '%s \n end', str);
        fclose(fid);
    end

    %# refresh path to force MATLAB to detect new class
    rehash

    %# introspection (deal with cases of nested/sub-function)
    m = meta.class.fromName(fname);
    idx = find(ismember({m.MethodList.Name},functionFile));
    inputNames = m.MethodList(idx).InputNames;
    outputNames = m.MethodList(idx).OutputNames;

    %# delete temp file when done
    delete([fname '.m'])
end

and simply run as:

>> [in,out] = getArgNames('divide')
in = 
    'left'
    'right'
out = 
    'value'
    'remain'
Amro
  • This looks interesting. I'll have to play around a bit with it. – gnovice May 25 '12 at 15:13
  • @Amro What if the function is declared with variadic input/output arguments, _i.e._ with varargin and/or varargout? – Eitan T May 25 '12 at 15:35
  • @EitanT: it will simply return varargin and/or varargout. Another route would be to write "doc comments" in each function that are easily distinguishable (something like javadocs' `@param` and `@return`), and parse those using regexp as gnovice showed in his answer – Amro May 25 '12 at 22:14
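The "doc comments" route mentioned in the last comment could be sketched roughly as follows. The `@param`/`@return` tag format and the sample source text are assumptions made for illustration, not an existing MATLAB convention; the source is held in a string so the sketch is self-contained (in practice it would come from `fileread`).

```matlab
% Hypothetical function source with javadoc-style tags (assumed format).
src = sprintf('%s\n', ...
    'function [value, remain] = divide(left, right)', ...
    '% @param  left   numerator', ...
    '% @param  right  denominator', ...
    '% @return value  integer part of the quotient', ...
    '% @return remain fractional part of the quotient');

% Each match yields a 1x1 cell holding the captured name.
inTok  = regexp(src, '@param\s+(\w+)',  'tokens');
outTok = regexp(src, '@return\s+(\w+)', 'tokens');

% Flatten the nested cells into plain cell arrays of names.
inputNames  = [inTok{:}];
outputNames = [outTok{:}];
```

This only works for functions whose authors follow the tag convention, which is the trade-off against parsing the declaration line itself.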
11

If your problem is limited to the simple case where you want to parse the function declaration line of a primary function in a file (i.e. you won't be dealing with local functions, nested functions, or anonymous functions), then you can extract the input and output argument names as they appear in the file using some standard string operations and regular expressions. The function declaration line has a standard format, but you have to account for a few variations due to varying amounts of whitespace or blank lines, the presence of single-line or block comments, and a declaration that is broken across more than one line.

(It turns out that accounting for a block comment was the trickiest part...)

I've put together a function get_arg_names that will handle all the above. If you give it a path to the function file, it will return two cell arrays containing your input and output parameter strings (or empty cell arrays if there are none). Note that functions with variable input or output lists will simply list 'varargin' or 'varargout', respectively, for the variable names. Here's the function:

function [inputNames, outputNames] = get_arg_names(filePath)

    % Open the file:
    fid = fopen(filePath);

    % Skip leading comments and empty lines:
    defLine = '';
    while all(isspace(defLine))
        defLine = strip_comments(fgets(fid));
    end

    % Collect all lines if the definition is on multiple lines:
    index = strfind(defLine, '...');
    while ~isempty(index)
        defLine = [defLine(1:index-1) strip_comments(fgets(fid))];
        index = strfind(defLine, '...');
    end

    % Close the file:
    fclose(fid);

    % Create the regular expression to match:
    matchStr = '\s*function\s+';
    if any(defLine == '=')
        matchStr = strcat(matchStr, '\[?(?<outArgs>[\w, ]*)\]?\s*=\s*');
    end
    matchStr = strcat(matchStr, '\w+\s*\(?(?<inArgs>[\w, ]*)\)?');

    % Parse the definition line (case insensitive):
    argStruct = regexpi(defLine, matchStr, 'names');

    % Format the input argument names:
    if isfield(argStruct, 'inArgs') && ~isempty(argStruct.inArgs)
        inputNames = strtrim(textscan(argStruct.inArgs, '%s', ...
                                      'Delimiter', ','));
    else
        inputNames = {};
    end

    % Format the output argument names:
    if isfield(argStruct, 'outArgs') && ~isempty(argStruct.outArgs)
        outputNames = strtrim(textscan(argStruct.outArgs, '%s', ...
                                       'Delimiter', ','));
    else
        outputNames = {};
    end

% Nested functions:

    function str = strip_comments(str)
        if strcmp(strtrim(str), '%{')
            strip_comment_block;
            str = strip_comments(fgets(fid));
        else
            str = strtok([' ' str], '%');
        end
    end

    function strip_comment_block
        str = strtrim(fgets(fid));
        while ~strcmp(str, '%}')
            if strcmp(str, '%{')
                strip_comment_block;
            end
            str = strtrim(fgets(fid));
        end
    end

end
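To see what the regular-expression step above extracts, here is a minimal sketch that applies the same pattern to a literal declaration line. The sample line is taken from the question; using `strsplit` (available from R2013a) to split the names is a substitution made for this sketch, not what the function above does.

```matlab
% Declaration line from the question, already stripped of comments.
defLine = 'function [value, remain] = divide(left, right)';

% Same pattern as in get_arg_names, for a declaration that has outputs.
pattern = ['\s*function\s+' ...
           '\[?(?<outArgs>[\w, ]*)\]?\s*=\s*' ...
           '\w+\s*\(?(?<inArgs>[\w, ]*)\)?'];

% Named tokens come back as fields of a struct.
s = regexpi(defLine, pattern, 'names');

% Split the comma-separated lists and trim surrounding whitespace.
outNames = strtrim(strsplit(s.outArgs, ','));
inNames  = strtrim(strsplit(s.inArgs,  ','));
```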
gnovice
  • A couple of reasons why this wouldn't work in general: 1. For starters, the one you mentioned yourself. 2. There could also be whitespace before the header. 3. The function to parse could be a function in another function file (or even nested). 4. Or an anonymous function, which hasn't got output argument names. But sure, nice effort. I was doing the same thing until I stumbled across the above problems :p – Gunther Struyf May 25 '12 at 06:18
  • 1
    @GuntherStruyf: I corrected many of the limitations. It still only operates on just the primary function, but I don't think that's too big a deal since you can't call subfunctions or nested functions from *outside* the file anyway (unless you start messing around with function handles). – gnovice May 25 '12 at 15:12
  • What if the function is declared with [variadic input/output arguments](http://www.mathworks.com/help/techdoc/ref/varargin.html), _i.e._ with `varargin` and/or `varargout`? – Eitan T May 25 '12 at 15:33
  • @EitanT: I mention that in the second paragraph. Since there is no variable name in that case, it just lists `'varargin'` or `'varargout'`. – gnovice May 25 '12 at 15:41
  • @gnovice Oh, sorry. I missed that. – Eitan T May 25 '12 at 15:43
  • @gnovice You could further improve your function to receive the function name as a string (say `functionNameStr`) and converting it into a filename using `filePath = which(functionNameStr);`. Seems to me a bit more "natural" this way... – Eitan T May 25 '12 at 16:09
  • @EitanT: That has one drawback: if the function is overloaded. The call to [WHICH](http://www.mathworks.com/help/techdoc/ref/which.html) might not pick the one you want, so it's safer to get the path first (using WHICH with arguments, or some other method) then pass it to the above function. – gnovice May 25 '12 at 16:19
  • 1
    @gnovice I actually think it is a desired behavior for `which` to pick the default function (if it's overloaded). And if not, you'll have to specify a more accurate name. For instance, `conv` is overloaded with `gf/conv`, so if you want to parse the latter you'll have to specify `gf/conv` instead of just `conv`. IMHO, this is not a problem. – Eitan T May 25 '12 at 16:25
  • 1
    If the OP has control over the functions, it might be simpler to have custom delimiters in the comments and include the in/out variables in an easy to parse way. Something like `%#!@ input: foo bar output: baz @!#`, where `#!@` denotes start and `@!#` (reversed) denotes the end. The input and output tokens end with `:` and the space delimited strings in between are the variables... – abcd May 26 '12 at 00:32
  • So I guess the easiest way is to parse it, thanks for this function!! – Hannesh May 30 '12 at 17:25
  • It's not that hard to parse, the hard part is dealing with built-in functions – Rick Minerich May 24 '13 at 19:17
3

This is going to be very hard (read: impossible) to do for general functions (think of things like varargin, etc). Also, in general, relying on variable names as a form of documentation might be... not what you want. I'm going to suggest a different approach.

Since you control the program, what about specifying each module not just with the m-file, but also with a table entry holding extra information? You could document the extra parameters and the function itself, note when options are booleans and present them as checkboxes, etc.

Now, where to put this? I would suggest to have the main m-file function return the structure, as sort of a module loading step, with a function handle that points to the subfunction (or nested function) that does the real work. This preserves the single-file setup that I'm sure you want to keep, and makes for a much more configurable setup for your modules.

function module = divide_load()
    module.fn = @my_divide;
    module.name = 'Divide';
    module.description = 'Divide two signals';
    module.param(1).name = 'left';
    module.param(1).description = 'left signal';
    module.param(1).required_shape = 'columnvector';
    % Etc, etc.

    function [value, remain] = my_divide(left, right)
         value = floor(left / right);
         remain = left / right - value;
    end
end
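A host program might consume such a descriptor roughly as follows. This is a sketch under assumptions: the struct is built inline with an anonymous function so the block is self-contained, and the hard-coded `args` stand in for values the user would enter in the window.

```matlab
% Hypothetical descriptor, mirroring the fields used in divide_load above.
module.fn   = @(left, right) floor(left ./ right);
module.name = 'Divide';
module.param(1).name = 'left';
module.param(2).name = 'right';

% Parameter names to show in the input window.
paramNames = {module.param.name};

% Values collected from the user (hard-coded here), passed on in order.
args   = {7, 2};
result = module.fn(args{:});
```

Because the names come from the descriptor rather than from the declaration line, the function itself never has to be parsed.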
Peter
1

When you can't get information from a programming language about its contents (e.g., "reflection"), you have to step outside the language.

Another poster suggested "regular expressions", which always fail when applied to parsing real programs because regexps cannot parse context-free languages.

To do this reliably, you need a real M language parser, that will give you access to the parse tree. Then this is fairly easy.

Our DMS Software Reengineering Toolkit has an M language parser available for it, and could do this.

Ira Baxter
  • I agree that regular expressions will generally fail when trying to parse an *entire program*, but the OP only wants to parse the *function definition line*, which has a standard format and limited number of variations. A combination of standard string operations and regex can handle that case quite well. – gnovice May 29 '12 at 20:51
  • 1
    Perhaps. The keyword 'function' is certainly a beacon that can help, and if he doesn't mind getting it wrong sometimes, that might be all he has to search for. If he wants to be more careful, he has to worry about comments containing things that look like function heads, and strings (less likely but possible in theory), as well as nested function heads (presumably he doesn't want these) and the issues of comments and line breaks stuck in various random places inside the function head. Simple regexps won't do the trick; he'll have to build most of a lexer to make sure he isn't lost. – Ira Baxter May 29 '12 at 21:14
  • ... and if he is interested in *method* arguments he'll have to keep track of which class he is in, and whether the method is overloaded. Maybe his problem is defined to be simple. – Ira Baxter May 29 '12 at 21:22
0

Have you considered using map containers?

You can write your functions along these lines . . .

function [outMAP] = divide(inMAP)
     outMAP = containers.Map();
     outMAP('value') = floor(inMAP('left') / inMAP('right'));
     outMAP('remain') = inMAP('left') / inMAP('right') - outMAP('value');
end

...and call them like this ...

inMAP  = containers.Map({'left', 'right'}, {4, 5});
outMAP = divide(inMAP);

...and then simply examine the variable names using the following syntax...

>> keys(inMAP)

ans = 

    'left'    'right'
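One caveat worth illustrating: for char keys, `keys` returns the names sorted alphabetically, not in the order they were supplied, so the result cannot be read as an argument order. A minimal sketch (the reversed key order is chosen here just to make that visible):

```matlab
% Keys supplied with 'right' first, then 'left'.
inMAP = containers.Map({'right', 'left'}, {5, 4});

names = keys(inMAP);     % char keys come back sorted alphabetically
vals  = values(inMAP);   % values follow the same sorted key order

% Look parameters up by name rather than relying on position.
hasLeft = isKey(inMAP, 'left');
```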
learnvst
-3

inputname(argnum) http://www.mathworks.com/help/techdoc/ref/inputname.html .

mwengler
  • 5
    That's the actual parameter name (in the context of the caller), he seems to want just the formal parameter name. – Ben Voigt May 03 '12 at 13:51
  • 1
    @Ben Voigt Yes, you're right. I want to get the name of the parameter as written in the function definition, from outside the function. – Hannesh May 03 '12 at 14:07
  • 1
    Is there something like this but for output arguments? To get the names of the arguments from the caller? – tim Nov 29 '13 at 09:00