19

After reading this very informative (albeit somewhat argumentative) question I would like to know your experience with programming large projects with Python. Do things become unmanageable as the project becomes larger? This concern is one thing that keeps me attached to Java. I would therefore be particularly interested in informed comparisons of maintainability and extensibility of Java and Python for large projects.

JnBrymn
  • 3
    This is the kind of question that always confuses me. How can the typing system affect maintainability? There are two possibilities - either you can trust the people checking into your source base or you can't. In the former case, you don't have any problems, regardless of what language, system, frameworks, etc. you are using. If you can't trust them, there is no hope for you regardless of what language, system, frameworks, etc. you are using. I certainly don't see how as small a piece of the pie as the typing system can make any difference as to the overall maintainability of a project. – Carl Norum Sep 08 '10 at 21:06
  • 3
    This looks like a good candidate for CW. – nmichaels Sep 08 '10 at 21:07
  • I am no expert on Python or Java; for me, maintainability mostly depends on design. – THEn Sep 08 '10 at 22:00
  • 2
    @THEn: Good luck trying to maintain some Visual Basic or Perl or shell script... I guess this is why you said _mostly_. – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Sep 08 '10 at 22:38
  • @Longpoke: You got me. :) That is actually what I do. Even worse I have to add VBA... :) – THEn Sep 08 '10 at 23:34
  • @CarlNorum it may be possible to make an argument of static types as a medium for communication. I'm not arguing that, but I think it is not necessary to assume that the maintainability question must be centered on *safety*. Perhaps there are clarity benefits of static typing...? – Keith Pinson Feb 24 '14 at 20:52
  • I have not read it yet, but [this paper](http://pleiad.dcc.uchile.cl/papers/2012/kleinschmagerAl-icpc2012.pdf) may be of interest. – Keith Pinson Feb 24 '14 at 20:55

8 Answers

15

I work on a large-scale commercial product written in Python. My very rough estimate is 5000 files x 500 lines each, which is about 2.5 million lines of Python. Mind you, the complexity of this project is probably equivalent to 10 million+ lines of code in other languages. I have not heard a single engineer, architect, or manager complain about Python code being unmaintainable. From what I've seen in our bug tracker, there is no systemic problem that could have been avoided by static type checking. In fact, very few bugs stem from incorrect use of an object's type at all.

I think this would be a very good academic subject: an empirical study of why static, class-based languages do not seem to be as critical as one might think.

As for extensibility: we just added a second database on top of the first one in our product, both of them non-SQL. There was no issue related to type checking. First of all, we had designed an API flexible enough to anticipate different underlying implementations. I think a dynamic language is a help rather than a hindrance in this regard. When we went on to the testing and bug-fixing phase, we were working on the kind of bugs people working in any language would have to face: for example, memory usage issues, consistency and referential integrity issues, and error handling issues. I don't see static type checking helping much with any of these challenges. On the other hand, we have benefited greatly from a dynamic language by being able to inject code mid-flight or after simple patching, and we are able to test our hypotheses and demonstrate our fixes quickly.

It is safe to say most of our 100+ engineers are happy and productive using Python. It is probably unthinkable for us to build the same product using a statically typed language in the same amount of time with the same quality.

Wai Yip Tung
9

Statically typed languages tend to produce fewer bugs overall, and type hints were added to Python in version 3.5 (PEP 484), similar in spirit to TypeScript.

Dynamically typed languages are fine for rapid development, and oftentimes being able to rapidly experiment with new features is more important.

There isn't one solution here - pick the tool that is right for the job.
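The type hints mentioned above can be illustrated with a minimal sketch. The function `find_user_name` and the `names` dictionary are hypothetical, made up for this example; the hints are not enforced by the interpreter and only pay off when a separate checker (mypy is assumed here) is run over the code.

```python
from typing import Dict, Optional


def find_user_name(user_id: int, names: Dict[int, str]) -> Optional[str]:
    """Return the name registered for user_id, or None if it is unknown."""
    return names.get(user_id)


names = {1: "alice", 2: "bob"}
print(find_user_name(1, names))  # known id
print(find_user_name(3, names))  # unknown id, returns None

# A static checker such as mypy would be expected to flag the call below,
# because "1" is a str and the parameter is annotated as int. At runtime
# Python does not enforce the hint, so the call would simply return None.
# find_user_name("1", names)
```

The signature alone now tells a reader what goes in and what may come out, which is the self-documentation benefit usually credited to static typing.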

rook
  • 2
    I'm working on _the worst_ Java codebase ever created in the history of mankind at the moment. It is littered with unnecessary global state, do-nothing statements (including do-nothing anonymous class instantiations), unnecessary custom class loaders, littering the filesystem with random garbage, race conditions, who knows what else, yet it is still relatively easy to work with... Why? Because it's a simple language, although Python is too... The whole purpose of static typing is to make everything **static**; i.e: **easy to analyze**. – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Sep 08 '10 at 22:43
  • 5
    Your naming convention example doesn't make any sense: 1. In both Python and Java, you _never_ pass the name of a class to a function unless you're asking for trouble, or have a _very_ specific use-case. 2. Contrary to reflection-free Java code, Python code _cannot_ be deterministically refactored to use new method names due to dynamic typing. 3. Pythonic code follows PEP8 naming conventions, most Java follows theConventionLikeThis, why would you ever use your own convention when you already have to intermix De facto libraries which all use the De facto conventions? – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Sep 08 '10 at 22:50
  • Also, you might wanna try Haskell before you say static typing "gets in your way"... Perhaps your impression comes from the dumb (yet simple and sound) way Java implemented it. – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Sep 08 '10 at 22:55
  • 1
    @Longpoke: His point about naming is that you need the typename in the function parameter. If you have a function `doStuff(CustomClass c)` and you decide you need to rename `CustomClass`, you just hosed all your methods that take that as a parameter. Although this is a very minimal problem since most IDEs allow you to just refactor it. And python doesn't get rid of this since you still have to instantiate objects and those calls will have to change too – Falmarri Sep 08 '10 at 23:24
  • @Falmarri: Oops I misread, I thought he meant a function that takes a _name of a class_, rather than just a _class_. Agreed. – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Sep 09 '10 at 00:07
  • @Longpoke I don't think there is a lower bound on bad logic, it doesn't matter what language it is expressed in. – rook Sep 09 '10 at 01:56
  • Haskell's type system doesn't increase *EVERYONE'S* productivity. Using type inference means difficult-to-understand errors, so people often use dynamic/`-fdefer-type-errors` to get dynamic typing in Haskell and more understandable type errors. When someone tried porting Rails to Haskell they got a lot of it working, but then many functions had to be rewritten using monads because they forgot they needed to thread state through the stack. Monad transformers are a PITA. You don't see big Haskell programs often outside of finance, because static typing is NOT a productivity silver bullet. – aoeu256 Oct 13 '19 at 21:17
6

Try tracing back the source of an apparently malformed object in a large, dynamically-typed framework with lots of IoC or other design patterns where the object cannot be traced directly up the stack.

Now try doing this in a statically-typed language.

Unless the type of the object is documented close to the use-site (e.g. via type annotations, a-la Python's typesafe library) or somewhere on the stack, deducing where it came from can be virtually impossible. I speak from experience, having tried to debug parts of the BuildBot framework. It involved an immense amount of raw text searching through the framework, even using fancy IDEs such as PyDev, Komodo and Wingware.

I do not doubt that it is possible to impose some type constraints on dynamic languages, but the lack of any standardisation on this seems to be an impediment to anyone trying to debug part of a large, existing framework.

EDIT: since 2014, Python has gained PEP 484, mypy and the `typing` module. This has made my experience much, much better in terms of maintaining large projects.
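As a rough illustration of documenting an object's type close to its use-site, here is a small sketch. `BuildStep` and `run_step` are hypothetical names invented for this example and are not taken from BuildBot; the only library call used is `typing.get_type_hints`, which reads the annotations back at runtime.

```python
from dataclasses import dataclass
from typing import List, get_type_hints


@dataclass
class BuildStep:
    """Hypothetical value object handed around by a framework."""
    name: str
    command: List[str]


def run_step(step: BuildStep) -> str:
    """Describe the step; the annotation records what `step` must be."""
    return f"{step.name}: {' '.join(step.command)}"


# The expected type is discoverable from the signature itself, without
# searching the code base for every place that constructs the argument.
hints = get_type_hints(run_step)
print(hints["step"].__name__)

print(run_step(BuildStep(name="compile", command=["make", "all"])))
```

With annotations like these in place, a checker or an IDE can point at the call site that passes a malformed object instead of leaving it to raw text search.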

  • 1
    Another thing you can do is add annotations to objects about their history (where they come from). – aoeu256 Oct 13 '19 at 20:57
5

A large code base in Python without good test coverage might be an issue. But that's just one part of the picture. It's all about people and suitable approaches to do the job.

Without

  • Source Control
  • Bug Tracking
  • Unit Tests
  • Committed Team

you might fail with any kind of language.

zellus
  • 5
    My experience has been that a Python codebase is less forgiving of new developers who've been tasked with maintaining the code after the original developers have already moved on. The particular codebase I encountered did not have any unit tests, which meant that all mistakes were only caught in integration tests or (as too often happened) in the field. Statically typed languages can catch some of the stupider mistakes you can make, but it's certainly not a magic bullet. – Dan Bryant Sep 09 '10 at 00:59
  • Statically typed languages give you a more detailed method signature. If you are dealing with a code base that is poorly documented and not very self-documenting, it can be a huge help. Unfortunately, good dynamically typed languages (I'm thinking of Python here) make prototyping almost too easy - you can get a program up and running and pretty close to complete so quickly that properly naming and documenting comes to seem like an insurmountable chore. Developers have to learn the importance of *choosing good names* before static typing loses much of its advantage in maintainability. – outis nihil Oct 03 '14 at 15:39
  • For undocumented Python code, you can use `sys.settrace` or add a logging decorator to every function in `sys.modules` to log arguments and return values, then add an editor "plugin" so that you can view the sample call log above the function definition. Another idea is to edit your application while it's still running, i.e: put a breakpoint inside a function call. Both require some sort of code coverage though. – aoeu256 Oct 13 '19 at 21:03
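The logging-decorator idea from the last comment can be sketched roughly as follows. `CALL_LOG`, `log_calls` and `scale` are hypothetical names for illustration, and the decorator is applied by hand to one function rather than to everything in `sys.modules`.

```python
import functools

# Hypothetical in-memory log of (function name, args, kwargs, result).
CALL_LOG = []


def log_calls(func):
    """Wrap func so that every call records its arguments and return value."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        CALL_LOG.append((func.__name__, args, kwargs, result))
        return result
    return wrapper


@log_calls
def scale(values, factor):
    # Undocumented in the original code base: what are values and factor?
    return [v * factor for v in values]


scale([1, 2, 3], 2)

# Each entry is a concrete sample call that hints at the expected types.
for name, args, kwargs, result in CALL_LOG:
    print(name, args, kwargs, "->", result)
```

The recorded samples only cover code paths that were actually exercised, which is why the comment notes that some form of coverage is needed.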
4

I remember the days before and after the innovation of IntelliJ IDEA. There are huge differences. Before, static typing was only for compilation, and development basically treated source code as text files. After, source code became structured information, and many development tasks are much easier, thanks to static typing.

However, it's not like the old days were a living hell. We took it as it was, did whatever was necessary, used the tools available at the time, got the system built, and were satisfied. There weren't too many unhappy memories. That's probably what dynamic-typing programmers feel now. It's not that bad.

Of course, I'll never go back to the old days. If I'm forbidden to use such an IDE, I guess I'll give up programming altogether.

irreputable
2

In my experience, maintainability depends on low coupling, good documentation, good development process, and excellent testing. Static typing has very little to do with any of this.

The errors that Java will catch at compile time are only a small subset of the errors that can occur. They're also almost always the most trivial to detect by testing; there's no way you can miss calling a method on an object of the wrong class if you're testing that your code produces the right answer! In that respect you could argue that Python actually is better for ensuring quality; by forcing you to test at least a bit to ensure your code is free of simple typos, it ensures that you actually do test at least a bit.
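A small sketch of that argument, using hypothetical names (`Circle`, `total_area`, `check_total_area`): a check that the code produces the right answer necessarily exercises the method call, so passing an object of the wrong class surfaces immediately as an `AttributeError`.

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


def total_area(shapes):
    """Sum the areas of all shapes; each one must provide area()."""
    return sum(shape.area() for shape in shapes)


def check_total_area():
    """An ordinary correctness check: radii 1 and 2 give an area of 5*pi."""
    result = total_area([Circle(1.0), Circle(2.0)])
    assert abs(result - 5 * math.pi) < 1e-9


check_total_area()

# Passing an object of the wrong class fails as soon as the code runs,
# so any test that executes this path will notice it.
try:
    total_area([Circle(1.0), "circle"])
except AttributeError as exc:
    print("caught:", exc)
```

The caveat is that this only holds for code paths the tests actually execute.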

In fact Java is not even a very good example of a language with strong static checks for catching lots of bugs. Try programming in Haskell or Mercury to see what I mean, or better yet try programming in Scala and interfacing with Java libraries; the difference in how much "correctness" the compiler is able to guarantee for you is striking when you compare the normal idiomatic Scala code using Scala libraries to the code that has to deal with Java libraries (I have actually done this, since I program a bit in Scala on Android).

Your ability to write good maintainable code in large code-bases worked on by many developers over long periods of time, despite the shortcomings of Java's static error detection compared to languages like Scala, depends on exactly the same techniques Python programmers use to do the same thing in their large code-bases, despite the shortcomings of Python's static error detection compared to Java.

Ben
0

I've used Python for many projects, from a few hundred lines to several thousand lines. Dynamic typing is a great time saver and it makes OO concepts like polymorphism way easier to use. The type system does not make projects unmaintainable. If you have trouble imagining that, try writing a few things in Python and see how they go.
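To make the polymorphism point concrete, here is a minimal duck-typing sketch. The exporter classes and `export_all` are hypothetical; the point is only that no shared base class or interface declaration is required, just a method with the expected name.

```python
import json


class JsonExporter:
    def export(self, record):
        return json.dumps(record, sort_keys=True)


class CsvExporter:
    def export(self, record):
        # Emit the values in key order, separated by commas.
        return ",".join(str(record[key]) for key in sorted(record))


def export_all(exporter, records):
    """Works with any object that has an export(record) method."""
    return [exporter.export(record) for record in records]


records = [{"b": 2, "a": 1}]
print(export_all(JsonExporter(), records))
print(export_all(CsvExporter(), records))
```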

nmichaels
  • 4
    The problem under discussion is not writing a new program *ex nihilo* - which, yes, is much easier in a dynamically typed language like Python; the problem is maintainability, and particularly wrapping your head around someone else's code. Because the method signatures of a dynamically typed language don't inherently carry any type information, you lose a built-in self-documentation mechanism. – outis nihil Oct 03 '14 at 15:43
  • 1
    @outisnihil My point was not just that writing new code in Python was easy, but that in my experience Python code is quite maintainable. I guess I didn't say it explicitly, but not all the projects for which I've used Python have been solo. This is more of a testimonial than a deep answer, but I thought it might add value to the conversation. – nmichaels Oct 20 '14 at 21:40
  • 2
    For me, at least, Python makes understanding the algorithm within each method easier, but (like all dynamically typed languages) understanding how to use existing methods slightly harder (without documentation). – outis nihil Oct 21 '14 at 16:44
  • It's a great time saver for a solo dev who has a mental map in their head of what the program should do. Coming into a new codebase I cannot stand Python because I have no idea what the signatures of functions are. What types can a function accept? What types can it return? I must look at its implementation to deduce that. I'd rather the compiler enforce constraints for me before I even run the program, with all the free IntelliSense that comes from that, so I can be certain of things, as opposed to relying on some comments for the function which may not even be accurate, resulting in runtime errors. – user4779 Jul 16 '23 at 03:30
0

I work at a big-data start-up that uses Python as its main language. My project is about 30k lines of Python. From my experience, if your team adopts good programming practices, for example adding type hints and extensive unit testing, maintainability may not suffer much, since PyCharm can automatically detect some type errors when type hints are present.

The real issues are:

1. Performance. This may not be related to maintainability, but it is an issue.
2. Not every Python code base you handle is well written. Since Python is easy to learn and code in, some people without basic CS training will develop a Python project that is impossible to maintain. I worked on a Python project that had a lot of files of several thousand lines each, without type hints, and its author did not know about OOP. He basically wrote Python as if he were writing C: he used some Python language features, but the code was completely imperative. Well-written Python projects really rely on well-trained engineers. If you cannot hire good-enough engineers, it would be better to rely on the tools themselves.
3. In a Python big-data company, product managers and some non-technical people do not care about data type consistency. Those people will design a data product that is not type-safe. For example, a JSON field that is usually a str may be set to null when it is empty. This will fail at runtime.
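The JSON case described above can be reproduced with a short sketch. The field name `title` and both functions are hypothetical; the first version assumes the field is always a string and fails on `null`, while the second makes the possible `None` explicit with `Optional`.

```python
import json
from typing import Optional


def title_length_naive(payload: str) -> int:
    """Assumes "title" is always a string; breaks when it is null."""
    record = json.loads(payload)
    return len(record["title"])


def title_length(payload: str) -> int:
    """Treats a missing or null "title" as an empty string."""
    record = json.loads(payload)
    title: Optional[str] = record.get("title")
    return len(title) if title is not None else 0


print(title_length('{"title": "report"}'))
print(title_length('{"title": null}'))

# JSON null becomes Python None, and len(None) raises TypeError at runtime.
try:
    title_length_naive('{"title": null}')
except TypeError as exc:
    print("runtime failure:", exc)
```

Annotating the field as `Optional[str]` gives a type checker the chance to insist on the `None` branch before the code ever sees production data.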