
I would like to understand the purpose of the files mentioned in this article and relate that knowledge to my current COM server and COM client scenario, so that I can implement my COM client to use the COM server.

I have a COM server, an exe (or service) that runs in the background. So far, I know it exposes an interface that inherits from both IUnknown and IDispatch. I also have the following generated files:

  1. xxx_i.c defines all the CLSIDs and IIDs

  2. xxx_i.h defines all the methods the interface supports

  3. xxx_p.c ?

  4. dlldata.c ?

I am currently using the Automation approach, IDispatch -> Invoke(), to access the interface methods. Although this seems to work fine without any of the files mentioned above, I would still like to understand their purpose when using the normal approach, IUnknown -> QueryInterface(), to access the methods.

Since I am new to the COM world, any suggested reading would be appreciated! Thanks!

Nick Chang
  • https://stackoverflow.com/questions/13700266/com-include-generated-header-vs-import-generated-tlb – Hans Passant Apr 28 '19 at 18:01
  • @HansPassant Thanks! Since I don't have enough reputation to comment on that post, I need your suggestion. Based on that post, it seems that merely including _i.h requires handling "marshaling" myself, while importing the .tlb does not. Am I interpreting this correctly? – Nick Chang May 07 '19 at 12:25
  • The type library is for the programmer that uses the server. When you create the server then you use the .h file. Marshaling is the job of the proxy/stub, it is built from the two .c files. You'd be wise to use the ATL project template in Visual-C++, along with the class wizard, all of this stuff gets sorted out automagically. – Hans Passant May 07 '19 at 12:35
  • @HansPassant 1. Isn't the .h file generated by the MIDL compiler? What do you mean by "create the server"? 2. Do I have to generate the proxy/stub first for the client if I use the tlb? If so, why does using the .h file on the client still work? – Nick Chang May 07 '19 at 12:48
  • 1. Yes. "COM server and COM client", that server. 2. The client can use the tlb to marshal if the interfaces are simple enough and you are taking care of the required registry keys, it is not as efficient. You can only use the .h file in the client if it is written in C++, the type library can be used by about any language. Even if you use C++ for the client then you still tend to prefer #import, the auto-generated wrappers are pretty nice. – Hans Passant May 07 '19 at 12:54
  • I kinda understand it now. Once the IDL file is defined, MIDL generates a .h file. On the server side, I need to include the .h file and implement the interface methods. On the client side, I can either include the same .h file or import the .tlb (also generated by MIDL), depending on the language used. As for marshaling, proxy/stub files are not mandatory on the client side if the interfaces are simple(?) enough and the required registry keys(?) are taken care of. Either including the .h or importing the .tlb will be sufficient. Is this correct? – Nick Chang May 07 '19 at 13:14
  • That looks correct. – Hans Passant May 07 '19 at 13:31

2 Answers


In its simplest form, COM is only the vtable binary contract plus the mother of all interfaces: IUnknown. COM is a way to reuse code without source, through components; it is a kind of dynamic casting mechanism. Provided I know the coclasses you support (their CLSIDs), the interfaces they expose (their IIDs), and the layout of those interfaces' methods (their parameters, order, types, etc.), I can use your COM server.
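To make the "vtable contract plus IUnknown" idea concrete, here is a minimal portable sketch. It does not use the Windows headers: `Guid`, `Result`, `IUnknownLike`, `ICalculator`, `Calculator`, and `CreateCalculator` are hypothetical stand-ins for `GUID`, `HRESULT`, `IUnknown`, your own interface, your coclass, and `CoCreateInstance`. The client only knows the interface layout and its IID, which is roughly what the generated header and `xxx_i.c` give you.

```cpp
#include <cassert>
#include <cstring>

// Portable stand-ins for GUID/HRESULT; real code gets these from the Windows SDK.
struct Guid { unsigned char b[16]; };
inline bool operator==(const Guid& x, const Guid& y) {
    return std::memcmp(x.b, y.b, 16) == 0;
}
typedef long Result;
const Result kOk = 0;
const Result kNoInterface = -1;  // placeholder, not the real E_NOINTERFACE value

// Mimics IUnknown: three pure virtual methods in a fixed vtable order.
struct IUnknownLike {
    virtual Result QueryInterface(const Guid& iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

// Hypothetical IIDs; in a real project these come from xxx_i.c.
const Guid IID_IUnknownLike = {{0x00}};
const Guid IID_ICalculator  = {{0x01}};

// Hypothetical interface; in a real project this comes from the generated header.
struct ICalculator : IUnknownLike {
    virtual Result Add(int a, int b, int* sum) = 0;
};

// Server-side implementation (the "coclass"). The client never sees this type.
class Calculator final : public ICalculator {
    unsigned long refs_ = 1;
public:
    Result QueryInterface(const Guid& iid, void** out) override {
        if (iid == IID_IUnknownLike || iid == IID_ICalculator) {
            *out = static_cast<ICalculator*>(this);
            AddRef();
            return kOk;
        }
        *out = nullptr;
        return kNoInterface;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long n = --refs_;
        if (n == 0) delete this;  // the object frees itself on the last Release
        return n;
    }
    Result Add(int a, int b, int* sum) override { *sum = a + b; return kOk; }
};

// Stand-in for CoCreateInstance: hands back only a base interface pointer.
IUnknownLike* CreateCalculator() { return new Calculator(); }

// Client side: ask for the interface by IID, call through the vtable, release.
int AddViaQueryInterface(int a, int b) {
    IUnknownLike* unk = CreateCalculator();
    ICalculator* calc = nullptr;
    int sum = 0;
    if (unk->QueryInterface(IID_ICalculator, reinterpret_cast<void**>(&calc)) == kOk) {
        calc->Add(a, b, &sum);
        calc->Release();
    }
    unk->Release();
    return sum;
}
```

Calling `AddViaQueryInterface(2, 3)` goes through the same three steps a real early-bound client performs: obtain an object, `QueryInterface` for a known IID, then call a method at a known vtable slot.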

But to ease "communication" between your COM clients and your COM server, you can/should use some standard mechanisms/documentation and add tooling so that plumbing like marshaling (= serialization) is taken care of without any effort. This is crucial in the out-of-process case and less important in-process (I will skip the "apartment" concept here...).
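As a rough illustration of what marshaling means in the out-of-process case, the sketch below shows the proxy/stub idea in portable C++. It is only a model under stated assumptions: `ICalculator`, `CalculatorProxy`, `CalculatorStub`, and the byte layout are hypothetical, the "channel" is a direct function call rather than real RPC, and the code MIDL actually generates in `xxx_p.c` looks nothing like this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical interface shared by client and server.
struct ICalculator {
    virtual int Add(int a, int b) = 0;
    virtual ~ICalculator() {}
};

// The real object, living in the "server process".
struct Calculator : ICalculator {
    int Add(int a, int b) override { return a + b; }
};

typedef std::vector<unsigned char> Buffer;

// Append a 32-bit integer to the buffer (serialization).
void Put(Buffer& buf, std::int32_t v) {
    unsigned char raw[sizeof(v)];
    std::memcpy(raw, &v, sizeof(v));
    buf.insert(buf.end(), raw, raw + sizeof(v));
}

// Read a 32-bit integer back from the buffer (deserialization).
std::int32_t Get(const Buffer& buf, std::size_t offset) {
    std::int32_t v = 0;
    std::memcpy(&v, buf.data() + offset, sizeof(v));
    return v;
}

// Stub: runs next to the real object, unpacks the request, calls it, packs the reply.
struct CalculatorStub {
    ICalculator& target;
    Buffer Dispatch(const Buffer& request) {
        Buffer reply;
        if (Get(request, 0) == 0) {  // method index 0 means Add in this sketch
            Put(reply, target.Add(Get(request, 4), Get(request, 8)));
        }
        return reply;
    }
};

// Proxy: runs in the client and looks exactly like the interface to the caller.
struct CalculatorProxy : ICalculator {
    CalculatorStub& channel;  // stands in for the RPC channel between processes
    explicit CalculatorProxy(CalculatorStub& s) : channel(s) {}
    int Add(int a, int b) override {
        Buffer request;
        Put(request, 0);  // method index
        Put(request, a);
        Put(request, b);
        return Get(channel.Dispatch(request), 0);
    }
};

// The client calls an ordinary interface method; the bytes travel in between.
int AddAcrossBoundary(int a, int b) {
    Calculator real;
    CalculatorStub stub{real};
    CalculatorProxy proxy(stub);
    ICalculator& clientView = proxy;
    return clientView.Add(a, b);
}
```

The point of the design is that the client code is identical whether it holds the real object or the proxy, which is why tooling-generated proxies and stubs let you ignore marshaling most of the time.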

So, lots of things you'll find in COM (like registration, tooling, IDL, typelibs, etc.) are in fact optional, but also very useful (so they kinda become mandatory in the end). The purpose of things like IDL (for "Interface Definition Language") is to define and expose to your COM clients what your COM server supports, so tooling can generate a lot of code automatically for you and your clients (.c, .h, .tlb). Note that nothing prevents you from implementing interfaces or coclasses without defining them in IDL. Nothing obliges you to provide your .idl or your .tlb. In that case, I will only be able to use them if I know their IIDs, method layouts, etc.

Then, on top of IUnknown, Microsoft created a universal interface called IDispatch (also known as "Automation", or "late binding" as opposed to "early binding" for IUnknown). At that time it targeted VB/VBA clients, before VBScript, JScript, and lots of other COM clients even existed; .NET supports both IUnknown and IDispatch. IDispatch, if you go that route, could be the last interface you'll ever have to implement, because its semantics allow full discovery and invocation of any method, provided the method uses only a finite set of defined data types, the "Automation types": BSTR, VARIANT, etc.
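To show what "discovery and invocation" means, here is a simplified portable sketch of the late-binding pattern. It is a model only: `IDispatchLike`, `Variant`, and `ScriptableCalculator` are hypothetical, and the real `IDispatch::GetIDsOfNames` and `IDispatch::Invoke` take more parameters (locale, flags, `DISPPARAMS`, exception info) than shown here.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for VARIANT: only an integer payload.
struct Variant { int value; };

// Mimics the two core IDispatch operations: name lookup and invocation by id.
struct IDispatchLike {
    virtual bool GetIDOfName(const std::string& name, int* dispid) = 0;
    virtual bool Invoke(int dispid, const std::vector<Variant>& args, Variant* result) = 0;
    virtual ~IDispatchLike() {}
};

// Hypothetical server object exposing "Add" and "Negate" by name.
class ScriptableCalculator : public IDispatchLike {
    std::map<std::string, int> ids_;
public:
    ScriptableCalculator() { ids_["Add"] = 1; ids_["Negate"] = 2; }

    bool GetIDOfName(const std::string& name, int* dispid) override {
        std::map<std::string, int>::const_iterator it = ids_.find(name);
        if (it == ids_.end()) return false;
        *dispid = it->second;
        return true;
    }

    bool Invoke(int dispid, const std::vector<Variant>& args, Variant* result) override {
        if (dispid == 1 && args.size() == 2) {
            result->value = args[0].value + args[1].value;
            return true;
        }
        if (dispid == 2 && args.size() == 1) {
            result->value = -args[0].value;
            return true;
        }
        return false;  // unknown id or wrong argument count
    }
};

// What a script engine does: it knows the method only as text at run time.
bool CallByName(IDispatchLike& obj, const std::string& name,
                const std::vector<Variant>& args, Variant* result) {
    int dispid = 0;
    return obj.GetIDOfName(name, &dispid) && obj.Invoke(dispid, args, result);
}

// Returns the sum, or -1 if the late-bound call fails.
int LateBoundAdd(int a, int b) {
    ScriptableCalculator calc;
    std::vector<Variant> args;
    args.push_back(Variant{a});
    args.push_back(Variant{b});
    Variant result = {0};
    return CallByName(calc, "Add", args, &result) ? result.value : -1;
}
```

Note that the caller needs no header describing `Add`: the name is resolved and the arguments are checked at run time, which is what makes this slower but usable from interpreted clients.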

So, if you support IDispatch, provide a TLB (type library), and restrict all types to Automation types, then you don't need to handle marshaling and you don't need proxies and stubs. You can forget about all this, even in out-of-process scenarios, because Microsoft implements it automatically. Back in the day, we used to call "oleaut32.dll" the "universal marshaler".

Dual interfaces are interfaces that support both IUnknown (and its derivatives) and IDispatch at the same time. They mostly exist to support C/C++ clients and Automation clients at the same time. Using Automation types (BSTR, VARIANT, etc.) is a bit painful in C/C++ because they were not originally intended to be used by C/C++ clients... Note that Microsoft provides C++ smart wrapper classes: CComBSTR and CComVariant with ATL, or _variant_t and _bstr_t with the Windows SDK.

Simon Mourier
  • Thanks for the explanation! Regarding the third paragraph, assuming I use IUnknown, do I have to handle marshaling myself? If yes, is it related to proxies and stubs as well? – Nick Chang May 07 '19 at 12:12
  • In general, you rarely have to handle marshaling yourself if you use the associated tooling. For example, Visual Studio will build proxy and stub projects (or files that you include in your main project). These will do the marshaling. These proxies and stubs are unnecessary if you use a TLB, IDispatch, and restrict yourself to Automation types, but they are used in the IUnknown scenarios. If you use zero tooling (no Visual Studio, for example, and no IDL/MIDL) and want to go out-of-process, then yes, you'll have to build all that by yourself (this is not recommended). – Simon Mourier May 07 '19 at 15:07

Requests for reading material are out of scope for Stack Overflow, but I can't help but recommend the seminal work by Don Box, Essential COM, which is in print and available as an ebook elsewhere. Here's Don's description of the topic:

Box, Don. Essential COM. Addison-Wesley, 1998, p. 350:

COM is based on client programs having a priori knowledge of an interface's definition at development time. This is accomplished either through C++ header files (for C++ clients) or through type libraries (for Java and Visual Basic clients). In general, this is not a problem, as programs written in these languages typically go through some sort of compilation phase prior to being deployed. Some languages do not go through such a compilation phase at development time and instead are deployed in source code form to be interpreted at runtime.

Perhaps the most pervasive of such languages are HTML based scripting languages (e.g., Visual Basic Script, JavaScript) that execute in the context of either a Web browser or a Web server. In both of these cases, script text is stored in its raw form embedded in an HTML file, and the surrounding runtime executes the script text on the fly as the HTML is parsed. To provide a rich programming environment, these environments allow scripts to invoke methods on COM objects that may be created in the script text itself or perhaps elsewhere in the HTML stream (e.g., a control that is also part of the Web page). In these environments, it is currently impossible to use type libraries or other a priori means to provide the runtime engine with a description of the interfaces being used. This means that the objects themselves must assist the interpreter in translating the raw script text into meaningful method invocations.

To allow objects to be used from interpretive environments such as Visual Basic Script and JavaScript, COM defines an interface that expresses the functionality of interpretation.


Tl;dr: there are two ways to do everything in COM (ignoring IInspectable and dual interfaces):

  1. IUnknown
    Standard virtual method invocation. Fast, no extra code. Requires compile-time interface information (.h or .tlb) on the client side.
  2. IDispatch
    "Late Binding". Slow, lots of interpreting code. No client compilation or interface spec needed.

Practically speaking, unless you are calling from VBA or VBScript, or have some old VB6 clients, you are better off sticking with IUnknown exclusively.

Mitch
  • Thanks for the reply! My COM client now works with just xxx_i.h, but I still do not understand how this works without any coclass definition like the one in the .tlb file. – Nick Chang May 07 '19 at 11:55
  • Also, xxx_i.h lacks the UUID definitions found in xxx_i.c. How can the client send to and receive from the server correctly? – Nick Chang May 07 '19 at 12:01
  • With only the interface definition, you can call a COM server by casting the result of a call to `CoCreateInstance` or another factory or interface retrieval mechanism (the Running Object Table, for example). CoClasses are just caller-side constructs to make instantiation and QueryInterface easier. COM is a system for passing interface references, not class references. – Mitch May 07 '19 at 16:43