3

I work on a multi-language software codebase (python, JS, java, PHP, C) where my previous colleagues commented everything. However, the vast majority of comments is completely unuseful. Ex :

/**
 * Get warning file info
 */
function getWarningFileInfos() {
   ...
}

/**
 * Compute the speed
 */
function computeSpeed() {
    ...
}

I want to setup linter rules to make sure such comments are not written again. Do you know linters that have such feature, or in which this feature can be added easily ? (the best would be linters compatible with non-english language comments)

Kiruahxh
  • 1,276
  • 13
  • 30

3 Answers3

5

This is an issue that requires teaching to your colleagues what comments are for and what kind of comments should be written.

If you just block comments with the same name as the method automatically, you will end up with slight variations:

/**
 * Get warning file info
 */
function getWarningFileInfos() {
   ...
}

becomes:

/**
 * Get the warning file info
 */
function getWarningFileInfos() {
   ...
}

... and the linter rule will accept it. This isn't really an issue that will get resolved with a linting rule.

If you are in position to demand proper comments from your colleagues, asking them to rewrite the comments properly is a great exercise to teach them what comments should be written.

There is no linter that can transform a useless comment into an useful one.

If you just want to delete all the lousy comments, you can do so with regex:

/\*.{1,50}\*/ will find all comments shorter than 50 characters (the editor must support the regex setting ". matches newline").

Replace with nothing and manually go through the files to check you aren't deleting anything valuable. The assumption is that most of these stupid comments are very short. 50 chars is arbitrary, change it to whatever works best for you.

Sylverdrag
  • 8,898
  • 5
  • 37
  • 54
3

There is no such linters on GitHub, So you have to write one by yourself.

The PHP_CodeSniffer is best known and powerful linter in the PHP world, So here is quick and dirty implementation based on PHP_CodeSniffer 3.5.2 for your purpose, It uses similar_text function to compare function name and comment string, If the percentage is more than 60%, It will mark this function as error with message "Contains unusual comment".

File src/Standards/StackOverflow/Sniffs/Commenting/UnusualCommentSniff.php

<?php

namespace PHP_CodeSniffer\Standards\StackOverflow\Sniffs\Commenting;

use PHP_CodeSniffer\Files\File;
use PHP_CodeSniffer\Sniffs\Sniff;
use PHP_CodeSniffer\Util\Tokens;

class UnusualCommentSniff implements Sniff
{
    public function register()
    {
        return [T_FUNCTION];
    }

    public function process(File $phpcsFile, $stackPtr)
    {
        $tokens = $phpcsFile->getTokens();
        $find   = Tokens::$methodPrefixes;
        $find[] = T_WHITESPACE;

        $commentEnd = $phpcsFile->findPrevious($find, ($stackPtr - 1), null, true);
        if ($tokens[$commentEnd]['code'] != T_DOC_COMMENT_CLOSE_TAG) {
            return;
        }
        $commentStart = $tokens[$commentEnd]['comment_opener'];

        $shortDesc = $this->findShortDescriptionInComment($phpcsFile, $commentStart, $commentEnd);
        $funcName = $phpcsFile->getDeclarationName($stackPtr);
        similar_text($funcName, $shortDesc, $percent);
        if ($percent > 60) {
            $error = 'Contains unusual comment';
            $fix = $phpcsFile->addFixableError($error, $stackPtr, 'code');
            if ($fix) {
                $fixer = $phpcsFile->fixer;
                for ($i = $commentStart; $i < $commentEnd + 1; $i++) {
                    $fixer->replaceToken($i, '');
                }
            }
        }
    }

    protected function findShortDescriptionInComment($phpcsFile, $commentStart, $commentEnd)
    {
        $tokens = $phpcsFile->getTokens();

        $empty = [
            T_DOC_COMMENT_WHITESPACE,
            T_DOC_COMMENT_STAR,
        ];

        $short = $phpcsFile->findNext($empty, $commentStart + 1, $commentEnd, true);
        if ($short === false) {
            return;
        }
        $shortContent = null;
        if ($tokens[$short]['code'] === T_DOC_COMMENT_STRING) {
            $shortContent = $tokens[$short]['content'];
            $shortEnd     = $short;
            for ($i = ($short + 1); $i < $commentEnd; $i++) {
                if ($tokens[$i]['code'] === T_DOC_COMMENT_STRING) {
                    if ($tokens[$i]['line'] === ($tokens[$shortEnd]['line'] + 1)) {
                        $shortContent .= $tokens[$i]['content'];
                        $shortEnd      = $i;
                    } else {
                        break;
                    }
                }
            }
        }

        return $shortContent;
    }
}

POC

Construct files and directories like this

$ pwd
/home/gasolwu/Code/PHP_CodeSniffer/src/Standards
$ git describe
3.5.2-89-g80ebd4a1a
$ tree -C StackOverflow/
StackOverflow/
├── ruleset.xml
└── Sniffs
    └── Commenting
        └── UnusualCommentSniff.php

2 directories, 2 files
$ cat StackOverflow/ruleset.xml
<?xml version="1.0"?>
<ruleset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" name="Zend" xsi:noNamespaceSchemaLocation="../../../phpcs.xsd">
</ruleset>

Prepare input file for test

$ cat input.php
<?php

class Foo
{

    /**
     * Get warning file info
     */
    private function getWarningFileInfos() {
    }

    /**
     * Compute the speed
     */
    public function computeSpeed() {
    }

    /**
     * This comment contains some useful information
     * So that should not be deleted
     */
    public function shoot() {
    }
}

Run with command phpcs

$ bin/phpcs --standard=Stackoverflow ./input.php

FILE: /data1/home/admin/gasolwu/Code/PHP_CodeSniffer/input.php
----------------------------------------------------------------------
FOUND 2 ERRORS AFFECTING 2 LINES
----------------------------------------------------------------------
  9 | ERROR | [x] Contains unusual comment
 15 | ERROR | [x] Contains unusual comment
----------------------------------------------------------------------
PHPCBF CAN FIX THE 2 MARKED SNIFF VIOLATIONS AUTOMATICALLY
----------------------------------------------------------------------

Time: 44ms; Memory: 3.75MB

Fixe this kind of errors automatically by command phpcbf

$ bin/phpcbf --standard=Stackoverflow ./input.php

PHPCBF RESULT SUMMARY
-----------------------------------------------------------------------------
FILE                                                         FIXED  REMAINING
-----------------------------------------------------------------------------
/data1/home/admin/gasolwu/Code/PHP_CodeSniffer/input.php     2      0
-----------------------------------------------------------------------------
A TOTAL OF 2 ERRORS WERE FIXED IN 1 FILE
-----------------------------------------------------------------------------

Time: 52ms; Memory: 3.75MB

$ cat input.php
<?php

class Foo
{


    private function getWarningFileInfos() {
    }


    public function computeSpeed() {
    }

    /**
     * This comment contains some useful information
     * So that should not be deleted
     */
    public function shoot() {
    }
}
Gasol
  • 2,319
  • 17
  • 27
  • 1
    +1 because it's nice, but unless you train the devs to write proper comments, you will get comments like "Calculate the speed" instead of "Compute the Speed", and the linter will accept those just fine. It doesn't solve the underlying issue which is that his colleagues don't know what to write in a comment. Also, the fix consists of deleting the comments, it doesn't generate proper comments, so the net result is "no comment" instead of "useless comments". – Sylverdrag Nov 25 '19 at 08:17
  • 1
    @Sylverdrag You're right, This methodology doesn't solve the problem at all. The best way I am thinking is consist of two approaches for that, First, Write some coding guideline and teach your colleagues. Second, Check it automatically on CI pipeline. – Gasol Nov 25 '19 at 08:26
  • Write some rules for your team, either linters or documentation. – Gasol Nov 25 '19 at 08:29
  • Thanks a lot ! I will test it as soon as possible. Of course the first step is to speak with my colleagues, we have planned a meeting about coding standards next week – Kiruahxh Nov 25 '19 at 08:35
1

If you really need it, you can use something crazy like this:

$c = '/**
 * Get warning file info
 */
function getWarningFileInfos() {
   ...
}

/**
 * Compute the speed
 */
function computeSpeed() {
    ...
}';

preg_match_all('~\*/[ \t]*\r?\n(?:[ \t]|\w)*function[ \t]+(\S+)[ \t]*\(~isu', $c, $matchesAll);
foreach ($matchesAll[1] as $n) {
    $words = preg_split('~(?=\p{Lu})~', $n);
    $regex = '~/\*(?:\s|\*)*' . implode('', array_map(function($w) {
        return '(?:(?:a|the)\s+)?' . preg_quote(preg_replace('~e?s$~is', '', $w), '~') . '(?:e?s)?' . '\s+';
    }, $words)) . '(?:\s|\*)*\*+/[ \t]*(?=\r?\n(?:[ \t]|\w)*function[ \t]+' . preg_quote($n, '~') . '[ \t]*\()~isu';
    var_dump($regex);
    $c = preg_replace($regex, '', $c);
}

var_dump($c);

example output:

string(231) "~/\*(?:\s|\*)*(?:(?:a|the)\s+)?get(?:e?s)?\s+(?:(?:a|the)\s+)?Warning(?:e?s)?\s+(?:(?:a|the)\s+)?File(?:e?s)?\s+(?:(?:a|the)\s+)?Info(?:e?s)?\s+(?:\s|\*)*\*+/[ \t]*(?=\r?\n(?:[ \t]|\w)*function[ \t]+getWarningFileInfos[ \t]*\()~isu"
string(162) "~/\*(?:\s|\*)*(?:(?:a|the)\s+)?compute(?:e?s)?\s+(?:(?:a|the)\s+)?Speed(?:e?s)?\s+(?:\s|\*)*\*+/[ \t]*(?=\r?\n(?:[ \t]|\w)*function[ \t]+computeSpeed[ \t]*\()~isu"
string(88) "
function getWarningFileInfos() {
   ...
}


function computeSpeed() {
    ...
}"

There is probably no tools outside educating your programmers, does the code above helped you?

mvorisek
  • 3,290
  • 2
  • 18
  • 53