8

Title says it all really.

It can sometimes end up that development and IT are at loggerheads over this sort of thing. What level of documentation do you expect when you're expected to install, patch, maintain, start, stop and diagnose a solution running across one or more servers?

HopelessN00b
cletus

7 Answers

9

All of these things should be documented in detail, although when an operation is standard for the operating system, application server, web server, etc., you may be able to assume the IT operations people know how to do it.

Installation: document everything about how it is installed and configured, including how to tell if it is operating correctly.

Tell us about the architecture, especially the communication between the various solution components (e.g. ranges of ports - RPC mechanisms often use a range of ports - we need to know what the range is and when the application might run out of ports).

Patching: document anything specific to the application - what needs to be shut down before patching, and any follow up actions after patching (caches, indexes, proxies that may need to be cleared or rebuilt).

Maintenance: document what normal and abnormal operation looks like - what queues and other things should be monitored and what the normal range of each is.

Tell us how to manage the data - especially tables and files that grow without limit (e.g. log files and transaction histories). How should these be purged and what's the impact of removing old entries? (on reporting etc).

Tell us how to carry out standard "business as usual"/in-life management actions - this might be adding or modifying user accounts, for example.

Tell us about any other regular management actions that might be required (e.g. which certificates are used and what to do when they expire).
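
As one illustration of the certificate point, here is a minimal sketch (not part of the original answer) of a check operations could run once they have been told which certificates are in use. It assumes the expiry date is available in the usual `notAfter` text format; the 30-day warning threshold and the function names are hypothetical.

```python
import ssl

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now: float) -> float:
    """Return the days remaining before a certificate's notAfter date.

    `not_after` uses the format found in certificates,
    e.g. "Jan  5 09:34:43 2018 GMT"; `now` is seconds since the epoch.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now) / SECONDS_PER_DAY


def needs_renewal(not_after: str, now: float, warn_days: int = 30) -> bool:
    """True when the certificate expires within `warn_days` (assumed threshold)."""
    return days_until_expiry(not_after, now) <= warn_days
```

A check like this is only useful if the documentation also says where each certificate lives and what the renewal procedure is.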

For all changes tell us how to roll them back (not all changes are successful). And tell us that you've tested the rollback plans!

Diagnosis: Document log file formats and locations and EVERY application error message that might turn up, saying what has gone wrong when that message appears and what might need to be changed to fix it. Never use the same error message for two different events.
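
One way to keep such an error catalogue honest is to hold it as data and check it for clashes. The sketch below is only an illustration under assumptions: the codes, messages and field names are invented, and a real application would load its own catalogue.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorEntry:
    code: str      # stable identifier that appears in the log line
    message: str   # text the application emits
    meaning: str   # what has gone wrong
    remedy: str    # what operations should check or change


# Hypothetical entries, for illustration only.
CATALOGUE = [
    ErrorEntry("APP-001", "Cannot connect to database",
               "The database host is unreachable or refused the connection",
               "Check the database service and the configured host and port"),
    ErrorEntry("APP-002", "Disk quota exceeded while writing log",
               "The log volume is full",
               "Purge old logs according to the documented retention policy"),
]


def duplicated(entries, field):
    """Return the values of `field` that appear on more than one entry."""
    counts = Counter(getattr(e, field) for e in entries)
    return sorted(value for value, n in counts.items() if n > 1)
```

Running `duplicated(CATALOGUE, "message")` as part of the build would flag any message that is reused for two different events.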

Shut down and start up: How, in what order, and any special procedures (e.g. letting servers drain connections before shutting them down).
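
The ordering part can be written down as a dependency map rather than prose. This is a rough sketch assuming Python 3.9+ (for `graphlib`); the component names are made up, and draining connections would still be a documented manual or scripted step for each component.

```python
from graphlib import TopologicalSorter

# Hypothetical components: each key depends on the components in its set.
DEPENDS_ON = {
    "database": set(),
    "message-queue": set(),
    "app-server": {"database", "message-queue"},
    "web-frontend": {"app-server"},
}


def start_order(depends_on):
    """Dependencies first, so each component starts after what it needs."""
    return list(TopologicalSorter(depends_on).static_order())


def stop_order(depends_on):
    """Reverse of the start order, so dependents drain and stop first."""
    return list(reversed(start_order(depends_on)))
```

With a map like this in the documentation, the start and stop sequences cannot drift apart, because one is derived from the other.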

I strongly disagree that the best way of doing this is to throw the application over the fence and let the IT people work out what is needed. The operational documentation (and in general, the manageability features of the application) need to be thought about up front.

  • Wow, this level of *knowledge* about the system before deployment, never mind documentation, would be amazing. Isn't this why some companies employ SREs with devs rather than rely on developers to think like this? –  May 04 '09 at 23:26
  • It's true that most developers don't think about such things (I've worked both as a software developer and later as an architect in an infrastructure management company, and the latter was an eye opener...). I think developers should know about these topics, but if they don't, then maybe having specialists working alongside them is the way forward. This is really part of a wider issue about what is important with software - the value is in the software being executed and available, providing a service, not just in being feature-complete. I may need to ask another question so I can answer that in more depth :) – The Archetypal Paul May 05 '09 at 09:17
2

My list of requirements for documentation would be (not in any specific order):

(documentation on:)

  • all command line switches
  • all exit states and return values
  • log messages (not so much the content but rather explaining fields if it is not configurable)
  • configuration syntax
  • switches in the config files
  • memory usage
  • is it threaded or forked
  • which signals the server reacts to
    • are there any signals that don't restart the server but make it re-read the config
    • how does it behave? (does it wait for existing threads/processes to finish with the old config, does it kill them, ...)
  • what happens on unclean shutdown (especially if it is some kind of persistence service/server)
  • does it log through system-provided calls or does it log with something written by itself (yuck for apache and access log - I clearly prefer on-board tools for logging)
  • IPv4 and IPv6 ready if it's a network service
  • documentation on trunk and documentation on a specific version
    • nothing is as bad as configuring something for hours just to find out it will be ignored because the config option is only available in trunk
  • which config option is valid in which version (available since: v1.0, deprecated since: v1.2 or something alike)
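
The last item in the list (which option is valid in which version) lends itself to a machine-readable table. A minimal sketch, assuming simple dotted version numbers; the option names and versions here are purely illustrative:

```python
# Hypothetical option table; names and versions are illustrative only.
OPTIONS = {
    "listen_port": {"since": (1, 0), "deprecated": None},
    "legacy_auth": {"since": (1, 0), "deprecated": (1, 2)},
    "worker_threads": {"since": (1, 3), "deprecated": None},
}


def parse_version(text):
    """Turn "v1.2" or "1.2" into a comparable tuple such as (1, 2)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def option_status(name, version):
    """Return "unknown", "unavailable", "deprecated" or "ok" for an option."""
    entry = OPTIONS.get(name)
    if entry is None:
        return "unknown"
    current = parse_version(version)
    if current < entry["since"]:
        return "unavailable"
    if entry["deprecated"] is not None and current >= entry["deprecated"]:
        return "deprecated"
    return "ok"
```

A config checker built on a table like this could warn about an option that only exists in trunk before anyone spends hours tuning it.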

Documentation like this is an example of good documentation:

I'd consider documentation like this to be full of fail:

The FreeBSD Handbook is also a great example of documentation, as is OpenBSD's approach: they kick out anything that isn't properly documented.

EDIT: this list is by no means complete; it is just the basic stuff that immediately came to mind. The documentation should also be readable, not just something that reads like someone threw up.

serverhorror
2

A follow-on question would be: what happens when (not if) the developers don't supply sufficient documentation?

I recommend that IT have the ability to enter defect reports against the software, using whatever defect tracking system the developers use. That way, if they didn't tell you, for instance, that the files in a particular folder need to be purged, and that only a week's worth should be kept, you could enter a defect saying "application fills the disk with log files", and suggest they work with IT on a documented technique for purging that folder.
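
For the purge example, the "documented technique" could end up as small as the sketch below. It assumes files are aged by modification time and that only regular files directly in the folder should be touched; the seven-day default mirrors the "week's worth" above, and the function name is hypothetical.

```python
import os
import time

SECONDS_PER_DAY = 86400


def purge_old_files(folder, keep_days=7, now=None):
    """Delete regular files in `folder` last modified more than `keep_days` ago.

    Returns the sorted names of the files removed. Subdirectories and
    symlinks are left alone.
    """
    now = time.time() if now is None else now
    cutoff = now - keep_days * SECONDS_PER_DAY
    with os.scandir(folder) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Whether deleting by age is actually safe (for reporting, audits and so on) is exactly what the developers need to confirm in the documentation.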

John Saunders
  • Yep, been there, done that. It took **four weeks** for the developers to tell us how to purge the three tables that were growing without limit. Quicker to have thought about that upfront. But I strongly agree with you that manageability issues are defects in the software... – The Archetypal Paul May 10 '09 at 20:12
  • I usually reject deploying servers (as in daemons) that are undocumented. If I really need to deploy them by force (management demands it) I clearly state how much it will cost to figure all the stuff out – serverhorror Jun 15 '09 at 16:35
1

In short, I expect the documentation I specify and contract for.

Too many times this critical detail is left out of an agreement. The end user expects it and, of course, wants it for free. Good developers will correct this oversight early in the process and set expectations, including a price and time requirement.

Jim C
0

@Spoike (I can't comment on answers yet..)

IT implementors (the role will vary by firm type and size) must work consistently to achieve the following:

  • Install/turnover Minimum Requirements - in other words, IT cannot be passive and expect developers to "know" what information is needed at install/turnover time. I have found that there is often considerable confusion/disagreement in IT as to what constitutes proper documentation of an app. Dev understands requirements (we hope) and IT must caucus to find what - at a minimum - is required.

  • An install/turnover procedure - in enterprise settings you might call this Change Control or Governance, but it is essentially a standard review cycle wherein IT sits down with Dev PRIOR to install to get a briefing on the product and its needs.

Installing an app is not unlike debuting a theatrical production. Before the curtain goes up, the director (lead developer) meets repeatedly with the stage production team (IT implementors) to make sure everything is "just so" for opening night (the public install).

You cannot change the Dev persona (why would you want to?), but you can point to your shared goal of a fantastic app that runs blazingly fast for all users. Your consensus IT doc requirements are just one of the things needed to ensure that.

Netais LLC
0

I believe IT needs to tell the developers what kind of documentation is needed. The best way to do this is for development to deliver pre-release versions (or iteration releases) of a solution for IT to play with and test, so IT can respond with what's needed.

Spoike
0

Creating adequate release notes with an application would be a good start. They should cover any changes to current behaviour in the release, any notes from QA about changes to dependencies or start/stop behaviours, changes in load on dependent servers or databases, etc.