Testing Python Linters
Based on recommendations via comments on an earlier post, my March column is a survey of a few different "lint" programs for Python (it looks like I'll stick to PyChecker, pylint, and PyFlakes for now). I need some sample code to run through all 3 programs so I can compare the output reports. I have a few ideas for common "mistakes" to include, but I'm looking for other suggestions.
So, what kinds of things are these tools good at finding, and where do they need more work? Are there any false positives I should make sure to include?

6 comments:
We use PyLint to catch the following errors (it is capable of a whole lot more of course):
Unused imports
Shadowing builtins
Unused or unknown variables
Redefining functions/methods
Whitespace rules around operators
Probably more that I have forgotten. :-)
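A minimal sketch of a sample module that plants the mistakes listed above, for feeding to each tool. Every name in it is made up for illustration, and which of these each linter actually reports is something to verify rather than assume.

```python
"""Deliberately flawed sample module for comparing lint reports."""
import os    # unused import
import sys   # unused import


def total(values):
    list = []        # shadows the builtin `list`
    unused = 42      # assigned but never used
    for v in values:
        list.append(v)
    return sum(list)


def greet(name):
    return "Hello, " + name


def greet(name):          # redefines greet() above
    return "Hi, "+name    # no whitespace around the operator


def broken():
    return undefined_name  # unknown variable; fails only when called
```

The module imports cleanly, so the problems only show up in a lint report (or, for broken(), at call time).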
PyLint is zippy enough so that PyDev for Eclipse can spot problems in Python code, with just a 10 second lag, as you type.
(As with all things in the Eclipse universe, your results may vary.)
Zope 3 is a large code base that causes problems for any kind of Python checker I've tried. Often that's because Ubuntu packages the tool for Python 2.5 and I cannot easily use it on a codebase with extension modules compiled with Python 2.4.
I use a self-written ad-hoc unused import finder. Coding style violations are usually caught by post-commit checkin reviews. Actual bugs are caught by unit tests. The one thing I can think of that I'm missing is a (working) tool to discover redefined functions/classes/methods.
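For the redefinition case, a rough sketch of what such a check could look like, assuming a current Python 3 and its standard `ast` module. The helper name find_redefinitions is hypothetical and this is not how any of the tools mentioned here work. It only looks at definitions placed directly in a module or class body, so it will miss conditional definitions and will flag legitimate patterns such as property setters.

```python
import ast


def find_redefinitions(source):
    """Return (name, first_line, later_line) for names defined twice in one scope."""
    tree = ast.parse(source)
    found = []
    # Scopes checked: the module body plus every class body.
    scopes = [tree] + [n for n in ast.walk(tree) if isinstance(n, ast.ClassDef)]
    def_types = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    for scope in scopes:
        seen = {}  # name -> line of the most recent definition
        for node in scope.body:
            if isinstance(node, def_types):
                if node.name in seen:
                    found.append((node.name, seen[node.name], node.lineno))
                seen[node.name] = node.lineno
    return found


SAMPLE = """\
def f():
    pass

class C:
    def m(self):
        pass
    def m(self):
        pass

def f():
    pass
"""

print(find_redefinitions(SAMPLE))
```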
I tried pylint, pychecker and pyflakes. My experience is that pyflakes is the most usable.
Pyflakes does not import source files, but analyses the compiled syntax tree. It has a better success rate in reading my code. Some projects are not importable unless deployed in a specific directory, or started under a certain operating system which might not be the same as the development/checker machine.
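To illustrate the difference, here is a small sketch of syntax-tree analysis that never imports the code under test. It assumes a current Python 3 and the standard `ast` module; the function unused_imports is a hypothetical helper, not Pyflakes's actual implementation, and it ignores things like `__all__` and names used only inside strings.

```python
import ast


def unused_imports(source):
    """Return sorted (line, name) pairs for imported names never referenced."""
    tree = ast.parse(source)  # parses only; nothing in `source` is executed
    imported = {}
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the name `a`
                name = alias.asname or alias.name.split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports cannot be tracked by name
                    imported[alias.asname or alias.name] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted((line, name) for name, line in imported.items() if name not in used)


# This source would fail if imported, but it can still be analysed.
SOURCE = """\
import os
import sys
from collections import OrderedDict as OD
raise RuntimeError("only works when deployed")
print(sys.argv)
"""

print(unused_imports(SOURCE))
```

Because the source is parsed rather than run, the RuntimeError at module level never fires, which is the property that makes this approach work on code that is not importable on the checker machine.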
Both pylint and pychecker check stuff I am not really interested in (like formatting or shadowing builtins). And I cannot really get them to be quiet without manually generating a long ignore list or using "grep -v".
Pyflakes does not have as many checks as the other two, but what it does check is detected reliably (unused or duplicate imports, syntax errors, local shadowing, etc.).
I can get to the point where I have zero errors with pyflakes, which was not possible with pylint or pychecker. This enables me to run pyflakes regularly/nightly, or even as an SVN commit hook. Very nice.
The downside of all these tools is performance. Try them on >100 source files (and a previous poster mentioned 10 seconds of processing time for eclipse/pylint after editing a line). This is not really good for interactive usage.
Oh, and tabnanny.py is another "linter" I use. And I like the name, who wouldn't want to have a nanny :)
I seem to remember pylint getting pretty confused by the mechanize module, claiming mechanize objects didn't have certain methods when they really did. I assume mechanize does some kind of dynamic object-creation that confused pylint, but I never looked into it.
I use pyflakes because it's easy, fast and the result is immediately usable.
The only problem is that the project seems dead: you have to check it out from SVN and apply patches found here and there to support new Python 2.5 constructs.
I like pyflakes because it doesn't load the code, it just checks the syntax, and it is fast.
All I want to know is whether I introduced syntax errors in my changes, forgot an import, or left code that is no longer used.
It saves a lot of time to detect these small errors upstream rather than during the execution of the code.
And removing obsolete constructs helps keep the code clean.