
I'm looking to improve the quality of a fairly large Python project. I am happy with the types of warnings PyLint gives me; however, they are just too numerous and hard to enforce across a large organization. I also believe that some code is more critical or sensitive than the rest with respect to where the next bug may come from. For example, I would like to spend more time validating a library method that is used by 100 modules than a script that was last touched 2 years ago and may not be used in production at all. It would also be interesting to know which modules are frequently updated.

Is anyone familiar with tools for Python or otherwise that help with this type of analysis?

Kozyarchuk

6 Answers


Your problem is similar to one I answered over at SQA (https://sqa.stackexchange.com/a/3082). That question was about Java, which made the tooling a bit easier, but I have a number of suggestions below.

A number of other answers suggest that there are no good runtime tools for Python. I disagree, in several ways:

  1. Coverage tools work very well
  2. Based on my experience with tooling in Java, static and dynamic analysis tools for Python are weaker than those for a strongly typed, less dynamic language, but they will work more than well enough to give you good heuristics here. The exception is if you use a pathologically large number of dynamic features (adding and removing methods, intercepting method and property invocations, playing with import, manually modifying the namespace), in which case any problems you have may well be associated with that dynamism...
  3. PyLint picks up the simpler problems; it will not detect problems with dynamic class/instance modifications and decorators, so it doesn't matter that the metric tools don't measure these either
  4. In any case, where you can usefully focus is determined by much more than a dependency graph.

Heuristics for selecting code

I find that there are a number of different considerations for selecting code for improvement which work both individually and together. Remember that, at first, all you need to do is find a productive seam of work - you don't need to find the absolutely worst code before you start.

Use your judgement.

After a few cycles through the codebase, you will have a huge amount of information and be much better positioned to continue your work - if indeed more needs to be done.

That said, here are my suggestions:

High value to the business: For example, any code that could cost your company a lot of money if it fails. Many of these areas may be obvious or widely known (because they are important), or they may be detected by running the important use cases on a system with the run-time profiler enabled. I use Coverage.
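
For instance, if a high-value use case can be driven from a small script, a rough sketch using coverage.py's Python API looks something like this (run_important_use_case is a placeholder for driving your own business-critical path, not a real function):

import coverage

cov = coverage.Coverage()
cov.start()
run_important_use_case()   # placeholder: exercise the business-critical path
cov.stop()
cov.save()
cov.report()               # prints per-file statement coverage to stdout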

Static code metrics: There are a lot of metrics, but only a handful concern us here.

Note that metric tools are typically file-based. This is probably fine enough resolution, since you mention the project itself has hundreds of modules (files).
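
As a sketch of the kind of per-file numbers I mean, the third-party radon package (my own suggestion, not something named above) can report cyclomatic complexity and raw line counts; something like:

from pathlib import Path

from radon.complexity import cc_visit
from radon.raw import analyze

for path in Path("src").rglob("*.py"):   # "src" is a placeholder for your package root
    source = path.read_text()
    raw = analyze(source)                # loc, lloc, sloc, comments, ...
    worst_cc = max((block.complexity for block in cc_visit(source)), default=0)
    print(f"{path}: sloc={raw.sloc} max_complexity={worst_cc}")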

Changes frequently: Code that changes frequently is highly suspect. The code may:

  • Historically have had many defects, and empirically may continue to do so
  • Be undergoing changes from feature development (high number of revisions in your VCS)

Find areas of change using a VCS visualisation tool such as those discussed later in this answer.
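
If the project happens to be in Git, a crude churn count per file can tide you over while you evaluate a visualisation tool. A sketch (the limit of 20 files is arbitrary):

import subprocess
from collections import Counter

# List every file touched by every commit; blank lines separate commits.
log = subprocess.run(
    ["git", "log", "--name-only", "--pretty=format:"],
    capture_output=True, text=True, check=True,
).stdout

churn = Counter(line for line in log.splitlines() if line.endswith(".py"))
for path, commits in churn.most_common(20):
    print(f"{commits:5d}  {path}")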

Uncovered code: Code not covered by tests.

If you run (or can run) your unit tests, your other automated tests and typical user tests with coverage, take a look at the packages and files with next to no coverage. There are two logical reasons why there is no coverage:

  • The code is needed (and important) but not tested at all (at least automatically). These areas are extremely high risk
  • The code may be unused and is a candidate for removal.
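
One way to surface those files, assuming a recent coverage.py, is to run coverage json after your test and usage runs and rank files by percentage covered (the field names follow coverage.py's JSON report):

import json

# Produced by running `coverage json` after the test/usage runs.
with open("coverage.json") as fh:
    report = json.load(fh)

files = report["files"]
least_covered = sorted(files, key=lambda name: files[name]["summary"]["percent_covered"])
for name in least_covered[:20]:
    print(f'{files[name]["summary"]["percent_covered"]:5.1f}%  {name}')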

Ask other developers

You may be surprised at the 'smell' metrics you can gather by having a coffee with the longer-serving developers. I bet they will be very happy if someone cleans up a dirty area of the codebase where only the bravest souls will venture.

Visibility - detecting changes over time

I am assuming that your environment has a DVCS (such as Git or Mercurial) or at least a VCS (eg SVN). I hope that you are also using an issue or bug tracker of some kind. If so, there is a huge amount of information available. It's even better if developers have reliably checked in with comments and issue numbers. But how do you visualise it and use it?

While you can tackle the problem on a single desktop, it is probably a good idea to set up a Continuous Integration (CI) environment, perhaps using a tool like Jenkins. To keep the answer short, I will assume Jenkins from now on. Jenkins comes with a large number of plugins that really help with code analysis, and I use several of them.

This gives me visibility of changes over time, and I can drill in from there. For example, suppose PyLint violations start increasing in a module - I have evidence of the increase, and I know the package or file in which this is occurring, so I can find out who's involved and go speak with them.

If you need historic data and you have just installed Jenkins, see if you can run a few manual builds that start at the beginning of the project and take a series of jumps forward in time until the present. You can choose milestone release tags (or dates) from the VCS.

Another important area, as mentioned above, is detecting the loci of changes in the code base. I have really liked Atlassian Fisheye for this. Besides being really good at searching for commit messages (eg bug id) or file contents at any point in time, it allows me to easily see metrics:

  • Linecount by directory and subdirectory
  • Committers at any point in time or in specific directories and/or files
  • Patterns of committal, both by time and also location in the source code
Andrew Alcock

I'm afraid you are mostly on your own.

If you have a decent set of tests, look at code coverage and dead code.

If you have a decent profiling setup, use that to get a glimpse of what's used more.

In the end, it seems you are most interested in fan-in/fan-out analysis. I'm not aware of any good tools for this in Python, primarily because static analysis is horribly unreliable against such a dynamic language, and so far I haven't seen any statistical analysis tools either.

I reckon that this information is sort of available in JIT compilers: whatever (function, argument types) combinations are in the cache (compiled) are the ones used most. Whether or not you can get this data out of, e.g., PyPy, I really don't have a clue.

Dima Tisnek
  • @JonasWielicki: I just see "using"; where is the `ab` in `abusing`? – ted Oct 19 '12 at 16:09
  • @ted, from my understanding profilers are usually used/intended for performance analysis. I might not be getting your joke. – Jonas Schäfer Oct 20 '12 at 10:42
  • @JonasWielicki: I also consider profiling the tool of choice for hotspot analysis. We have profiling and tracing, and the OP only cares about the call count of the code, not the order, so I consider profiling the right tool. – ted Oct 20 '12 at 12:43
  • Obviously it is the right tool, because it does what you need in that case. I had just never thought of using profilers for that purpose before. – Jonas Schäfer Oct 20 '12 at 12:44

Source control tools can give a good indication of frequently updated modules - often indicating trouble spots.

If you don't have source control but the project is run from a shared location, delete all the __pycache__ folders or .pyc files. Then, over time and under use, watch which files get recreated as an indication of their use.
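
A small sketch of that idea (the one-week cutoff is arbitrary; on Python 3 the compiled files live under __pycache__):

import time
from pathlib import Path

CUTOFF = time.time() - 7 * 24 * 3600   # anything byte-compiled in the last week

recently_compiled = [
    p for p in Path(".").rglob("*.pyc") if p.stat().st_mtime > CUTOFF
]
for path in sorted(recently_compiled):
    print(path)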

Analysing the Python imports printed when running from particular entry points with

python -v entry_point

may give some insight into which modules are being used, although if you have known entry points you should try the coverage module instead.

For a more intrusive solution, consider setting up project-wide logging. You can log metrics easily enough, even across distributed programs.
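
A minimal sketch of what such logging could look like (the file name and the message are purely illustrative):

import logging

logging.basicConfig(
    filename="usage.log",                       # shared location visible to all runs
    format="%(asctime)s %(name)s %(message)s",
    level=logging.INFO,
)

log = logging.getLogger(__name__)
log.info("module imported")                     # e.g. at the top of suspect modules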

Hardbyte

I agree with the others, in that I have yet to come across a good runtime analysis tool for Python that will do this. There are some ways to deal with it, but none are trivial.

The most robust approach, I think, would be to get the Python interpreter's source and recompile it with some sort of built-in runtime logging. That way you could just slide it into the existing environment without any code changes to your project. Of course, that isn't exactly trivial to do, but it has the bonus that you might some day be able to get it merged back into the trunk for future generations and whatnot.

For non-recompile approaches, the first place I would look is the profile library's deterministic profiling section.
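
A rough sketch with cProfile and pstats from the standard library (main() here is a stand-in for whatever a typical run of your project looks like):

import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()
main()                      # placeholder: drive a typical run of the project
profiler.disable()

stats = pstats.Stats(profiler)
stats.sort_stats("ncalls").print_stats(30)   # which functions are actually exercised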

How you implement it will depend heavily on how your environment is set up. Do you have many individual scripts and projects that run independently of one another, or just one main script, module or package that everybody else uses, where you just want to know which parts of it can be trimmed out to make maintenance easier? Is it a load-once, run-forever kind of setup, or a situation where scripts are just run atomically on some sort of schedule?

You could implement project-wide logging (as mentioned in @Hardbyte's answer), but that would require going through the project and adding the logging lines to all of your code. If you do that, you may as well just do it using the built-in profiler, I think.

Nisan.H

Have a look at sys.setprofile: it allows you to install a profiler function.

Its usage is detailed at http://docs.python.org/library/profile.html#profile; for a jumpstart, see the sketch below.
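
This is only a rough sketch of a hook that tallies calls per module; run_the_application() is a placeholder for your real entry point:

import sys
from collections import Counter

calls_per_module = Counter()

def count_calls(frame, event, arg):
    if event == "call":
        calls_per_module[frame.f_globals.get("__name__", "?")] += 1

sys.setprofile(count_calls)
try:
    run_the_application()       # placeholder for your real entry point
finally:
    sys.setprofile(None)
    for module, count in calls_per_module.most_common(20):
        print(f"{count:8d}  {module}")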

If you cannot profile your application, you will be bound to the coverage approach.

Another thing you might look at is decorators: you can write a debugging decorator and apply it to the set of functions you suspect, or even to every function in a module (a rough sketch of both follows).
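
A rough sketch of such a decorator, plus a helper that wraps every function defined in a suspect module (the names are mine, purely illustrative):

import functools
import inspect
from collections import Counter

hits = Counter()

def counted(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        hits[func.__qualname__] += 1
        return func(*args, **kwargs)
    return wrapper

def instrument_module(module):
    # Replace every function defined in `module` with a counting wrapper.
    for name, obj in vars(module).items():
        if inspect.isfunction(obj) and obj.__module__ == module.__name__:
            setattr(module, name, counted(obj))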

You might also take a look at Python Call Graph (pycallgraph); while it will not generate quite what you want, it shows you how often one function calls another.
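
If I remember the pycallgraph 1.x API correctly, basic usage is roughly as follows (again, run_the_application() is a placeholder, and GraphViz needs to be installed):

from pycallgraph import PyCallGraph
from pycallgraph.output import GraphvizOutput

with PyCallGraph(output=GraphvizOutput(output_file="callgraph.png")):
    run_the_application()   # placeholder for your real entry point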

If your code runs on user input this will be hard, since you would have to simulate 'typical' usage.

There is not much more to tell you; just remember profiling as the keyword.

ted

PyLint sometimes gives warnings that (after careful consideration) are not justified. In that case it is useful to make use of the special # pylint: disable=X0123 comments (where X0123 is the actual error/warning message number) if the code cannot be refactored to avoid triggering the warning.
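
For example (W0212, protected-access, is just an illustrative message id):

value = client._session_cache  # pylint: disable=W0212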

I'd like to second Hardbyte's mention of using your source control logs to see which files are most often changed.

If you are working on a system that has find, grep and sort installed, the following is a way to check which file imports what:

find . -name '*.py' -exec grep -EH '^import|^from .* import' {} + | sort | less

To find the most popular imports across all files:

find . -name '*.py' -exec grep -Eh '^import|^from .* import' {} + | sort | uniq -c | sort -rn | less

These two commands should help you find the most-used modules from your project.

Roland Smith