Maybe this is kind of a pie-in-the-sky dream/question. Does anyone know of a tool that will produce XML or prefix-notation (Lispy) output, including documentation comments, from C# source code? It seems like this would be useful to a number of documentation generators, static analysis tools, or whatever.

Admittedly this is a middleware kind of tool, and it could probably be built with a parser generator like ANTLR. However, someone might have scratched an itch and produced something along these lines... eh?

Edit: To Clarify: "that includes documentation comments from C# source code"

I add this because a plain AST probably won't include comments (though ANTLR has a 'channel' concept that can stream comments).

Edit: As for what to extract: basically the AST, with comments, but in a reusable form, i.e. Lispy or XML would be fine. It would have to be complete, though, so that information one consumer doesn't need is still there for another process that could benefit from it.
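To make the request more concrete, here is a rough sketch (in Python, purely for illustration) of the kind of output meant by "AST with comments, as Lispy or XML". The `Node` type, its field names, and the node kinds (`class`, `field`) are all hypothetical; no existing tool is claimed to emit exactly this shape.

```python
import json
from dataclasses import dataclass, field
from xml.sax.saxutils import escape, quoteattr


@dataclass
class Node:
    kind: str                                     # hypothetical node name, e.g. "class"
    value: str = ""                               # identifier or literal text, if any
    comments: list = field(default_factory=list)  # comments attached to this node
    children: list = field(default_factory=list)  # child nodes, in source order


def to_sexpr(node):
    """Render a node and its subtree in prefix (Lispy) notation."""
    parts = [node.kind]
    if node.value:
        parts.append(json.dumps(node.value))  # quoted, with escapes
    parts += ["(comment " + json.dumps(c) + ")" for c in node.comments]
    parts += [to_sexpr(child) for child in node.children]
    return "(" + " ".join(parts) + ")"


def to_xml(node):
    """Render the same subtree as XML, one element per node."""
    attrs = f" value={quoteattr(node.value)}" if node.value else ""
    inner = "".join(f"<comment>{escape(c)}</comment>" for c in node.comments)
    inner += "".join(to_xml(child) for child in node.children)
    return f"<{node.kind}{attrs}>{inner}</{node.kind}>"


tree = Node("class", "MyClass", ["/* MyClass source file */"], [
    Node("field", "count", ["// counts number of class instances"]),
])

print(to_sexpr(tree))
print(to_xml(tree))
```

Either rendering carries the same information, so a documentation generator could read only the `comment` entries while an analysis tool walks the whole tree.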

Hope that helps.

L-

lucidquiet
  • If you can do such source-to-source transformations, you already have a parser and an AST. Why not just use that in the tools you intend as consumer instead of having them go through such a representation? –  Oct 06 '11 at 19:55
  • Could you clarify what information you want to extract, perhaps by providing a simple example input and output? –  Oct 06 '11 at 20:01

3 Answers

That itch was scratched long ago.

Our source-to-source program transformation tool, the DMS Software Reengineering Toolkit, can do this for many languages, including C# via DMS's C# Front End.

See the example below, in which source code is parsed by DMS into an AST with (pre- and post-) comments decorating the various AST nodes, and then dumped to an output stream in "Lispy" syntax. There's a trivial XML variant, too, just to satisfy those folks that must have XML.

As a general rule, DMS users use this dump only to inspect parse trees, because DMS itself provides a vast amount of machinery for analyzing, transforming and prettyprinting (regenerating text, including comments, from the AST).

It would be pretty trivial to use DMS to walk the AST and pick out only those comments which satisfied whatever criteria you wished, including "looks like a documentation comment". More importantly, you also get precise source location information, as well as access to the tree to which the comment is attached; you could easily build summaries of the methods from the method header subtree.

Most DMS users want to do more complicated things, which we claim are much easier to do inside DMS, using all the machinery it provides, than by exporting to some ad hoc tool in XML or Lisp format, where you have to reinvent all that machinery.

For this code:

/* MyClass source file
   Contains MyClass and its methods
*/
class MyClass {

   static int count; // counts number of class instances

   MyClass() { count++; // bump instance count
     }

   /* First Method */
   get_count() { return count; }

}

DMS's C# parser produces this output (note embedded comments):

Domain Parser for CSharp~CSharp4_0 2.3.3
Copyright (C) Semantic Designs 1996-2010; All Rights Reserved
Parsing Time: 0.001728 seconds
(compilation_unit@CSharp~CSharp4_0=1#58e66c0^0 Line 4 Column 1 File C:/temp/test.cs
 (extern_alias_directives@CSharp~CSharp4_0=2#58e3340 Line 4 Column 1 File C:/temp/test.cs)extern_alias_directives
 (using_directives@CSharp~CSharp4_0=493#58e3380 Line 4 Column 1 File C:/temp/test.cs)using_directives
 (global_attributes@CSharp~CSharp4_0=1008#58e33c0 Line 4 Column 1 File C:/temp/test.cs)global_attributes
 (namespace_member_declarations@CSharp~CSharp4_0=506#58e6440 Line 4 Column 1 File C:/temp/test.cs
  (namespace_member_declarations@CSharp~CSharp4_0=505#58e3400 Line 4 Column 1 File C:/temp/test.cs)namespace_member_declarations
  (namespace_member_declaration@CSharp~CSharp4_0=516#58e3dc0 Line 4 Column 1 File C:/temp/test.cs
   (class_declaration@CSharp~CSharp4_0=533#58e6280 Line 4 Column 1 File C:/temp/test.cs
   |(class_header@CSharp~CSharp4_0=526#58e3600 Line 4 Column 1 File C:/temp/test.cs
   | precomment 4:1 `/* MyClass source file

   Contains MyClass and its methods

*/'
   | (attributes@CSharp~CSharp4_0=1023#58e3440 Line 4 Column 1 File C:/temp/test.cs)attributes
   | (class_modifiers@CSharp~CSharp4_0=534#58e3480 Line 4 Column 1 File C:/temp/test.cs)class_modifiers
   | (optional_partial@CSharp~CSharp4_0=524#58e34c0 Line 4 Column 1 File C:/temp/test.cs)optional_partial
   | (identifier@CSharp~CSharp4_0=1106#58e3580 Line 4 Column 7 File C:/temp/test.cs
   |  (IDENTIFIER@CSharp~CSharp4_0=1171#58e3500[`MyClass'] Line 4 Column 7 File C:/temp/test.cs)IDENTIFIER
   | )identifier
   | (class_base@CSharp~CSharp4_0=552#58e35c0 Line 4 Column 15 File C:/temp/test.cs)class_base
   |)class_header
   |(class_member_declarations@CSharp~CSharp4_0=563#58e3c80 {3} Line 6 Column 4 File C:/temp/test.cs
   | (class_member_declaration@CSharp~CSharp4_0=573#58e3ce0 Line 6 Column 4 File C:/temp/test.cs
   |  (field_declaration@CSharp~CSharp4_0=591#58e3e00 Line 6 Column 4 File C:/temp/test.cs
   |   postcomment 5:1 `// counts number of class instances'
   |   (attributes@CSharp~CSharp4_0=1023#58e3300 Line 6 Column 4 File C:/temp/test.cs)attributes
   |   (field_modifiers@CSharp~CSharp4_0=593#58e3aa0 {1} Line 6 Column 4 File C:/temp/test.cs
   |   |(field_modifier@CSharp~CSharp4_0=607#58e3840 Line 6 Column 4 File C:/temp/test.cs)field_modifier
   |   )field_modifiers
   |   (integral_type@CSharp~CSharp4_0=42#58e3540 Line 6 Column 11 File C:/temp/test.cs)integral_type
   |   (variable_declarators@CSharp~CSharp4_0=104#58e3780 Line 6 Column 15 File C:/temp/test.cs
   |   |(variable_declarator@CSharp~CSharp4_0=112#58e38a0 Line 6 Column 15 File C:/temp/test.cs
   |   | (identifier@CSharp~CSharp4_0=1106#58e3d00 Line 6 Column 15 File C:/temp/test.cs
   |   |  (IDENTIFIER@CSharp~CSharp4_0=1171#58e3460[`count'] Line 6 Column 15 File C:/temp/test.cs)IDENTIFIER
   |   | )identifier
   |   |)variable_declarator
   |   )variable_declarators
   |  )field_declaration
   | )class_member_declaration
   | (class_member_declaration@CSharp~CSharp4_0=579#58e6580 Line 8 Column 4 File C:/temp/test.cs
   |  (constructor_declaration@CSharp~CSharp4_0=798#58e61a0 Line 8 Column 4 File C:/temp/test.cs
   |   (constructor_header@CSharp~CSharp4_0=792#58e6720 Line 8 Column 4 File C:/temp/test.cs
   |   |(attributes@CSharp~CSharp4_0=1023#58e38c0 Line 8 Column 4 File C:/temp/test.cs)attributes
   |   |(constructor_modifiers@CSharp~CSharp4_0=800#58e3b40 Line 8 Column 4 File C:/temp/test.cs)constructor_modifiers
   |   |(constructor_declarator@CSharp~CSharp4_0=815#58e3760 Line 8 Column 4 File C:/temp/test.cs
   |   | (identifier@CSharp~CSharp4_0=1106#58e3d40 Line 8 Column 4 File C:/temp/test.cs
   |   |  (IDENTIFIER@CSharp~CSharp4_0=1171#58e3d80[`MyClass'] Line 8 Column 4 File C:/temp/test.cs)IDENTIFIER
   |   | )identifier
   |   | (optional_formal_parameter_list@CSharp~CSharp4_0=647#58e6700 Line 8 Column 12 File C:/temp/test.cs)optional_formal_parameter_list
   |   | (constructor_initializer@CSharp~CSharp4_0=816#58e37c0 Line 8 Column 14 File C:/temp/test.cs)constructor_initializer
   |   |)constructor_declarator
   |   )constructor_header
   |   (block@CSharp~CSharp4_0=405#58e6780 Line 8 Column 14 File C:/temp/test.cs
   |   |(statement_list@CSharp~CSharp4_0=406#58e3d20 Line 8 Column 16 File C:/temp/test.cs
   |   | (non_pp_embedded_statement@CSharp~CSharp4_0=367#58e39a0 Line 8 Column 16 File C:/temp/test.cs
   |   |  postcomment 2:1 `// bump instance count'
   |   |  (statement_expression@CSharp~CSharp4_0=433#58e6540 Line 8 Column 16 File C:/temp/test.cs
   |   |   (primary_no_array_creation_expression@CSharp~CSharp4_0=160#58e6740 Line 8 Column 16 File C:/temp/test.cs
   |   |   |(identifier@CSharp~CSharp4_0=1106#58e3800 Line 8 Column 16 File C:/temp/test.cs
   |   |   | (IDENTIFIER@CSharp~CSharp4_0=1171#58e3700[`count'] Line 8 Column 16 File C:/temp/test.cs)IDENTIFIER
   |   |   |)identifier
   |   |   )primary_no_array_creation_expression
   |   |  )statement_expression
   |   | )non_pp_embedded_statement
   |   |)statement_list
   |   )block
   |  )constructor_declaration
   | )class_member_declaration
   | (class_member_declaration@CSharp~CSharp4_0=579#58e6220 Line 12 Column 4 File C:/temp/test.cs
   |  (constructor_declaration@CSharp~CSharp4_0=798#58e3a80 Line 12 Column 4 File C:/temp/test.cs
   |   (constructor_header@CSharp~CSharp4_0=792#58e6380 Line 12 Column 4 File C:/temp/test.cs
   |   |(attributes@CSharp~CSharp4_0=1023#58e3a00 Line 12 Column 4 File C:/temp/test.cs)attributes
   |   |(constructor_modifiers@CSharp~CSharp4_0=800#58e38e0 Line 12 Column 4 File C:/temp/test.cs)constructor_modifiers
   |   |(constructor_declarator@CSharp~CSharp4_0=815#58e36e0 Line 12 Column 4 File C:/temp/test.cs
   |   | (identifier@CSharp~CSharp4_0=1106#58e3a20 Line 12 Column 4 File C:/temp/test.cs
   |   |  (IDENTIFIER@CSharp~CSharp4_0=1171#58e6760[`get_count'] Line 12 Column 4 File C:/temp/test.cs
   |   |   precomment 0:1 `/* First Method */')IDENTIFIER
   |   | )identifier
   |   | (optional_formal_parameter_list@CSharp~CSharp4_0=647#58e6960 Line 12 Column 14 File C:/temp/test.cs)optional_formal_parameter_list
   |   | (constructor_initializer@CSharp~CSharp4_0=816#58e65a0 Line 12 Column 16 File C:/temp/test.cs)constructor_initializer
   |   |)constructor_declarator
   |   )constructor_header
   |   (block@CSharp~CSharp4_0=405#58e6200 Line 12 Column 16 File C:/temp/test.cs
   |   |(statement_list@CSharp~CSharp4_0=406#58e6160 Line 12 Column 18 File C:/temp/test.cs
   |   | (non_pp_embedded_statement@CSharp~CSharp4_0=373#58e3e60 Line 12 Column 18 File C:/temp/test.cs
   |   |  (null_coalescing_expression@CSharp~CSharp4_0=327#58e3e40 Line 12 Column 25 File C:/temp/test.cs
   |   |   (conditional_or_expression@CSharp~CSharp4_0=325#58e3ec0 Line 12 Column 25 File C:/temp/test.cs
   |   |   |(conditional_and_expression@CSharp~CSharp4_0=323#58e6a00 Line 12 Column 25 File C:/temp/test.cs
   |   |   | (inclusive_or_expression@CSharp~CSharp4_0=321#58e3c00 Line 12 Column 25 File C:/temp/test.cs
   |   |   |  (exclusive_or_expression@CSharp~CSharp4_0=319#58e3b20 Line 12 Column 25 File C:/temp/test.cs
   |   |   |   (and_expression@CSharp~CSharp4_0=317#58e35a0 Line 12 Column 25 File C:/temp/test.cs
   |   |   |   |(equality_expression@CSharp~CSharp4_0=314#58e6520 Line 12 Column 25 File C:/temp/test.cs
   |   |   |   | (additive_expression@CSharp~CSharp4_0=300#58e6140 Line 12 Column 25 File C:/temp/test.cs
   |   |   |   |  (multiplicative_expression@CSharp~CSharp4_0=296#58e6460 Line 12 Column 25 File C:/temp/test.cs
   |   |   |   |   (primary_no_array_creation_expression@CSharp~CSharp4_0=160#58e63a0 Line 12 Column 25 File C:/temp/test.cs
   |   |   |   |   |(identifier@CSharp~CSharp4_0=1106#58e37e0 Line 12 Column 25 File C:/temp/test.cs
   |   |   |   |   | (IDENTIFIER@CSharp~CSharp4_0=1171#58e6560[`count'] Line 12 Column 25 File C:/temp/test.cs)IDENTIFIER
   |   |   |   |   |)identifier
   |   |   |   |   )primary_no_array_creation_expression
   |   |   |   |  )multiplicative_expression
   |   |   |   | )additive_expression
   |   |   |   |)equality_expression
   |   |   |   )and_expression
   |   |   |  )exclusive_or_expression
   |   |   | )inclusive_or_expression
   |   |   |)conditional_and_expression
   |   |   )conditional_or_expression
   |   |  )null_coalescing_expression
   |   | )non_pp_embedded_statement
   |   |)statement_list
   |   )block
   |  )constructor_declaration
   | )class_member_declaration
   |)class_member_declarations
   |(optional_semicolon@CSharp~CSharp4_0=959#58e3b60 Line 16 Column 1 File C:/temp/test.cs)optional_semicolon
   )class_declaration
  )namespace_member_declaration
 )namespace_member_declarations
)compilation_unit
Exiting with final status 0
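
For readers who only have the text dump, the comment entries can also be pulled out of it with a small script. This is a minimal sketch in Python, not part of DMS: it assumes the `precomment`/`postcomment` entry layout seen in the dump above (two numbers, then the comment text between a backtick and a closing apostrophe), and `looks_like_doc_comment` is a hypothetical filter. A comment that itself contains an apostrophe at the end of a line would be cut short by this pattern.

```python
import re

# Matches entries such as:  postcomment 5:1 `// some text'
# The two numbers are skipped; their meaning is not assumed here.
# The closing apostrophe must be followed by ")" or the end of a line.
COMMENT_RE = re.compile(
    r"(precomment|postcomment)\s+\d+:\d+\s+`(.*?)'(?=\)|\s*$)",
    re.DOTALL | re.MULTILINE,
)


def extract_comments(dump_text):
    """Return (kind, text) pairs for every comment entry in the dump."""
    return [(m.group(1), m.group(2)) for m in COMMENT_RE.finditer(dump_text)]


def looks_like_doc_comment(text):
    """Hypothetical filter: treat /// and /** comments as documentation."""
    stripped = text.lstrip()
    return stripped.startswith("///") or stripped.startswith("/**")


sample = """\
   |   postcomment 5:1 `// counts number of class instances'
   |   (attributes@CSharp~CSharp4_0=1023#58e3300 Line 6 Column 4 File C:/temp/test.cs)attributes
   |   |   precomment 0:1 `/* First Method */')IDENTIFIER
"""

for kind, text in extract_comments(sample):
    print(kind, repr(text), looks_like_doc_comment(text))
```

A text scrape like this loses the link between a comment and its tree node, which is the information the AST walk described above keeps.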
Ira Baxter
  • Awesomeness. Right now I'd be interested in ASTs for JavaScript and C# if possible. Which of the **many** tools you have does this for the above example? Is it runnable from the command line? I'd rather not write ad hoc tools, but the fact is that I want to typically put something in the build-pipeline, and well it works better if that tool isn't a huge IDE or something. – lucidquiet Oct 06 '11 at 21:59
  • DMS generates tools which are basically command-line driven, because most uses, even yours, tend to be as part of a larger build. The generated C# parser for DMS was invoked using this command line: "run DomainParser ++AST C:/temp/test.cs" to produce the output you see here. Yes, there's a JavaScript front end that works exactly the same way, with exactly the same comment capture capability. – Ira Baxter Oct 06 '11 at 22:20
Okay, now that it's clearer that you want a full AST: there's nothing in the standard tools from MS yet. However, the C# compiler team are currently working on "Roslyn" aka "compiler as a service". Preview builds are due very soon - at which point it should become clearer whether or not this supports what you're after.

It's unknown at the moment whether Roslyn will be available just as part of the .NET framework or whether it'll only ship with some SKUs of Visual Studio - but it may well end up being more affordable than some other alternatives.


Original answer

This is already part of the language and normal tools - in Visual Studio, just go to Project Properties / Build / Output and enable the checkbox with "XML Documentation File". Pick the file to write the docs to, and away you go.

Building readable HTML from that is slightly trickier; Sandcastle will do this, but it needs a helper project, Sandcastle Help File Builder, to turn it into a more manageable task. It's fairly flexible in what it produces, though; for an example, you could look at the API documentation for Noda Time, which is generated with Sandcastle.

Additionally, if you're building a class library for others to use and you ship the XML alongside the DLL with the same name (just with a .xml suffix instead of .dll), Visual Studio will use it to give your users tooltips when they use your types, methods etc.
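
If you want to feed that XML file into your own tooling rather than Sandcastle, it is straightforward to read. Here is a minimal sketch in Python, assuming the usual layout of the compiler's documentation file (a `doc` root with `member` elements whose `name` attribute is an ID such as `T:MyClass` or `M:MyClass.GetCount`). The sample content and the `summaries` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape of a compiler-generated documentation file.
SAMPLE = """<?xml version="1.0"?>
<doc>
  <assembly><name>MyLibrary</name></assembly>
  <members>
    <member name="T:MyClass">
      <summary>Counts its own instances.</summary>
    </member>
    <member name="M:MyClass.GetCount">
      <summary>
        Returns the number of instances created so far.
      </summary>
      <returns>The instance count.</returns>
    </member>
  </members>
</doc>
"""


def summaries(xml_text):
    """Map each member ID to its summary text with whitespace normalised."""
    root = ET.fromstring(xml_text)
    result = {}
    for member in root.iter("member"):
        summary = member.find("summary")
        # itertext() also picks up text inside nested tags such as <see>.
        text = "".join(summary.itertext()) if summary is not None else ""
        result[member.get("name")] = " ".join(text.split())
    return result


for member_id, text in summaries(SAMPLE).items():
    print(member_id, "->", text)
```

For a real file on disk, `ET.parse(path).getroot()` would take the place of `ET.fromstring`. Note that this only sees comments written as XML documentation comments; ordinary `//` comments never reach the file.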

Jon Skeet
  • It seems this only extracts documentation/comments. It seems that OP is thinking about exporting a representation of the whole source code (see reference to static analysis tools). –  Oct 06 '11 at 19:56
  • @delnan: If that's the case, it's a very poorly worded question: "that includes documentation comments from C# source code" sounds like the XML comments to me. Even if such a thing *does* exist, I think sticking to the XML documentation would make more sense. – Jon Skeet Oct 06 '11 at 19:58
  • Well, I'm assuming the "includes" isn't there for the heck of it but implies the result should include more than "documentation comments from C# source code". But honestly I wouldn't bet on that either. Yes, the question isn't very clear, hopefully OP can clarify. –  Oct 06 '11 at 20:00
  • Added some clarification comments. – lucidquiet Oct 06 '11 at 20:44
  • @lucidquiet: It's still not clear what you'd want beyond XML documentation comments from C#. Those are the comments that the developer has decided should count as documentation. Comments such as "Got to work around bug in the method I'm calling..." don't really belong in documentation for that code, surely? – Jon Skeet Oct 06 '11 at 20:58
  • Had to upvote. The universe may implode otherwise... (and XML comments do sound like what you "really" want) – Earlz Oct 06 '11 at 21:47
  • Well, what I want is to be able to build code tools: refactoring, documentation generation, code analysis, etc. But right now I don't want to write the parser (yet). Besides, more compilers should just come with a flag to output an AST of the code. And like the above post clarifies -- Semantic is one company that has these tools (so they exist). They just aren't cheap. So I was hoping to find an open-source or trial or free version that would produce this intermediate stage, which could be used for various code analytics. – lucidquiet Oct 10 '11 at 17:38
  • @lucidquiet: Okay, if you actually want a full AST rather than just comments, then there's nothing in the *standard* tools yet. But editing... – Jon Skeet Oct 10 '11 at 17:40
The closest thing I know of is Sandcastle. It uses XML as intermediate output and processes it with XSLT to create HTML-style documentation over managed assemblies. It should be quite easy to get just the XML data out of it.

driis