0

I have been reading Clean Code by Robert C. Martin and have a basic (but fundamental) question about functions and program structure.

The book emphasizes that functions should:

  1. be brief (like 10 lines, or less)
  2. do one, and only one, thing

I am a little unclear on how to apply this in practice. For example, I am developing a program to:

  1. load a baseline text file
  2. parse baseline text file
  3. load a test text file
  4. parse test text file
  5. compare parsed test with parsed baseline
  6. aggregate results

I have tried two approaches, but neither seem to meet Martin's criteria:

APPROACH 1

setup a Main function that centrally commands other functions in the workflow. But then main() can end up being very long (violates #1), and is obviously doing many things (violates #2). Something like this:

main()
{
    // manage steps, one at a time, from start to finish
    baseFile = loadFile("baseline.txt");
    parsedBaseline = parseFile(baseFile);
    testFile = loadFile("test.txt");
    parsedTest = parseFile(testFile);
    comparisonResults = compareFiles(parsedBaseline, parsedTest);
    aggregateResults(comparisonResults);
}

APPROACH 2

use Main to trigger a function "cascade". But each function is calling a dependency, so it still seems like they are doing more than one thing (violates #2?). For example, calling the aggregation function internally calls for results comparison. The flow also seems backwards, as it starts with the end goal and calls dependencies as it goes. Something like this:

main()
{
    // trigger end result, and let functions internally manage
    aggregateResults("baseline.txt", "comparison.txt");
}

aggregateResults(baseFile, testFile)
{
    comparisonResults = compareFiles(baseFile, testFile);   

    // aggregate results here
    return theAggregatedResult;
}

compareFiles(baseFile, testFile)
{
    parsedBase = parseFile(baseFile);
    parsedTest = parseFile(testFile);

    // compare parsed files here        
    return theFileComparison;
}

parseFile(filename)
{
    loadFile(filename);

    // parse the file here
    return theParsedFile;
}

loadFile(filename)
{
    //load the file here
    return theLoadedFile;
}

Obviously functions need to call one another. So what is the right way to structure a program to meet Martin's criteria, please?

Roberto
  • 2,054
  • 4
  • 31
  • 46

1 Answers1

1

I think you are interpreting rule 2 wrong by not taking context into account. The main() function only does one thing and that is everything, i.e. running the whole program. Let's say you have a convert_abc_file_xyz_file(source_filename, target_filename) then this function should only do the one thing its name (and arguments) implies: converting a file of format abc into one of format xyz. Of course on a lower level there are many things to be done to achieve this. For instancereading the source file (read_abc_file(…)), converting the data from format abc into format xyz (convert_abc_to_xyz(…)), and then writing the converted data into a new file (write_xyz_file(…)).

The second approach is wrong as it becomes impossible to write functions that only do one thing because every functions does all the other things in the ”cascaded” calls. In the first approach it is possible to test or reuse single functions, i.e. just call read_abc_file() to read a file. If that function calls convert_abc_to_xyz() which in turn calls write_xyz_file() that is not possible anymore.

BlackJack
  • 4,476
  • 1
  • 20
  • 25
  • thanks @BlackJack. Regarding "Of course on a lower level there are many things to be done to achieve this". I think you are implying the original `main` should do all this before calling `convert` - is that right? – Roberto Jun 02 '15 at 16:09
  • @Roberto No the `convert_abc_file_to_xyz_file()` should ”do” all the things by calling the functions I've described. – BlackJack Jun 02 '15 at 16:53
  • I am sorry, but I am still not *getting* your answer. In my example, you mentioned Approach1 was good, because it allows reuse of `read_abc_file()` ("in the first aproach it is possible to test ... just call `read_abc_file()`"). But your follow up example puts `read_abc_file()` *inside* `convert()` ("should do all the things by calling the functions..."). How would we reuse / test `read_abc_file()` here? Isn't it "locked" inside `convert()`? I am sure the answer is obvious, but something fundamental is really just not connecting for me. – Roberto Jun 02 '15 at 18:33
  • @Roberto: I'm not putting the function for reading the file inside the `convert()` function but a _call_ to that function. This means the function can also be called outside the `convert()` function to read that file type for other purposes than converting the file format or to test it with different input files. – BlackJack Jun 03 '15 at 12:15
  • so why is this not "cascading" (one function calls the next), which you said was bad? Approach 2 also has separate functions shared only by calls? I guess I need to reread Clean Code, because something is just not clicking for me. Thanks again for all your help. – Roberto Jun 03 '15 at 13:05
  • @Roberto: You can't test or use the functions in approach 2 separately because it always means calling all the cascaded functions. You can't aggregate results that are not loaded, parsed, and compared with hard coded functions. You can't compare two sets of data which are not loaded and parsed with those hard coded functions, you can't parse data without using the hard coded load function. Think about how you would implement giving the user a choice to load a file or get the data from a database, and/or using different file formats and therefore need alternative parse functions. – BlackJack Jun 03 '15 at 15:09