
We've got an incredibly frustrating situation with a CF Web Services-based API that we wrote and maintain. We had an API in place for years that was stable and working happily with Ruby, PHP, and ColdFusion clients. Then this year a .NET client came along, and we found that our web service was not interoperable with statically-typed languages due to our extensive use of structs.

We eventually realized we had to rewrite the API without structs, and we've done so. It now uses scalar values, arrays, and CFCs (which get translated to SOAP complexTypes). The .NET client is happy, and we wrote proof-of-concept clients in about 6 different languages to ensure that we'd be interoperable this time around.

To our great dismay, it appears that our ColdFusion 7 servers can't serve the new API reliably. It works for about a day or so after restarting, then the clients start getting errors like:

Error: coldfusion.xml.rpc.CFCInvocationException [java.lang.ClassNotFoundException : tafkan.remote_api.pfapi.v.trunk.rsp_pf_survey_status_array]

and

java.lang.NoClassDefFoundError: tafkan/remote_api/pfapi/v/trunk/pf_unit

Restarting the CF instances is the only way to make the problem go away. A lot of time and money was put into rebuilding the API, so everyone is really at wit's end about this.

We've noticed that the WEB-INF/cfc-skeletons directories of our CF instances eventually seem to have two copies of the classes for each of the CFCs used by the API. For example:

-rw-r--r--  Feb 17 09:15 remote_api.pfapi.v.trunk.pf_datum.class
-rw-r--r--  Feb  3 12:20 tafkan.remote_api.pfapi.v.trunk.pf_datum.class
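
For anyone who wants to check their own cfc-skeletons directory for the same symptom, here is a minimal Python sketch. It assumes the only thing that matters is whether one dotted class name is a suffix of another, as in the listing above; the function name is made up for illustration and is not part of ColdFusion or Axis.

```python
from collections import defaultdict


def find_duplicate_skeletons(filenames):
    """Map each short class name to longer names that end with it.

    Two names count as the same class when one dotted name is a suffix
    of the other, e.g. with and without a leading mapping name.
    """
    # Strip the ".class" extension and sort shortest-first, so every
    # candidate "longer" name comes after the "short" name it may extend.
    names = sorted(
        (f[: -len(".class")] for f in filenames if f.endswith(".class")),
        key=len,
    )
    groups = defaultdict(list)
    for i, short in enumerate(names):
        for longer in names[i + 1:]:
            if longer.endswith("." + short):
                groups[short].append(longer)
    return dict(groups)


listing = [
    "remote_api.pfapi.v.trunk.pf_datum.class",
    "tafkan.remote_api.pfapi.v.trunk.pf_datum.class",
]
print(find_duplicate_skeletons(listing))
```

To run it against a real instance, pass `os.listdir(...)` of the WEB-INF/cfc-skeletons directory instead of the hard-coded list.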

It seems like the errors are coming from a namespace or class search path problem, so we tried switching all CFC references to be fully-qualified (dot notation starting with a mapping) instead of just simple references to CFCs in the current directory. This seemed promising, but the problem came back within 24 hours.

Environment:

  • ColdFusion 7,0,2,142559 with hf702-70523, 2-instance cluster
  • Sun Java 1.4.2_13
  • Apache 2.0.52
  • Centos 4.5 32-bit

Maybe upgrading one of these venerable pieces of software would help? Maybe upgrading just AXIS?

Adobe support doesn't seem to be an option, as CF7 is EOL'ed and in extended-extended support (and that just for a few more days).

Update:

Thanks to all who've joined this discussion! Here's an update on where things stand at the moment.

The service just crapped out for the first time today. One of the cluster instances was still able to generate the WSDL, while the other instance said:

AXIS error
Sorry, something seems to have gone wrong... here are the details:
Exception - java.lang.NoClassDefFoundError: tafkan/remote_api/pfapi/v/trunk/rsp_pf_numeric_array
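
Since the two instances can disagree like this, it may help to probe each one separately and record when it first goes bad. The following is only a sketch in Python: it classifies the text a WSDL request returns, using the error strings quoted in this post. The function name and the "wsdl-ok" heuristic are assumptions made for illustration; fetching the URL from each instance is left out.

```python
def classify_wsdl_response(body):
    """Classify the text returned by a request for the service's WSDL."""
    # Error markers taken from the messages quoted in this thread.
    class_errors = ("NoClassDefFoundError", "ClassNotFoundException")
    if any(marker in body for marker in class_errors):
        return "class-loading-error"
    # Rough heuristic (an assumption): a WSDL document has a definitions
    # element that carries a targetNamespace attribute.
    if "definitions" in body and "targetNamespace" in body:
        return "wsdl-ok"
    return "unknown"


axis_error = (
    "AXIS error\n"
    "Exception - java.lang.NoClassDefFoundError: "
    "tafkan/remote_api/pfapi/v/trunk/rsp_pf_numeric_array"
)
print(classify_wsdl_response(axis_error))  # class-loading-error
```

Run from cron against each instance directly (bypassing the cluster connector), this would give a timestamp for the first failure on each node.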

Both cfc-skeletons directories contain a file called tafkan.remote_api.pfapi.v.trunk.rsp_pf_numeric_array.class, and did not appear to contain the otherly-named files we've sometimes seen (remote_api.pfapi.v.trunk.rsp_pf_numeric_array.class). The files in cfc-skeletons do not appear to have been modified since the servers were started yesterday.

The uptime on both instances was about 21.5 hours. I was running without JIT (-Xint).

I've now restarted both instances. They're now running on Sun Java 1.4.2_19 (instead of _13), and JIT has been re-enabled, as it clearly wasn't causing this error and things were dramatically slower without it. I've also cleared the "save class files" check boxes.

And now, we wait again...

Update 2: The problem persists. I'm not sure what else to try at this point. Arg!

FYI, this is cross-posted at http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:60922

sbleon
  • I would start with jumping to a much newer JVM and consider breaking the cluster (just round robin the request if you can). Also remember you can set up a pair of CF8 or CF9 servers for an internal test free if you can hit them from only 1 or 2 IPs – kevink Feb 20 '10 at 04:43
  • I had a somewhat similar issue with WSDL namespaces. The solution was to use a .cfm container for generating the appropriate web-service information. Maybe this can work for you too, see this QA http://stackoverflow.com/questions/1119721/duplicate-file-name-for-same-wsdl-namespace-when-using-web-service-from-differe/1126143#1126143 – Sergey Galashyn Feb 20 '10 at 10:19
  • Uh, ColdFusion just uses Apache Axis in the background (Java... totally strongly typed, last I checked). .NET should have no problem consuming structs; I do it all the time. I think you're dealing with some sub-par .NET developers and you should go back to structs – ryber Feb 20 '10 at 23:43
  • @kevink, thanks for the comment. I can try upgrading the JVM. We're running this app on Apple's Java 1.5 and 1.6 on our dev workstations, so an upgrade should be possible. I can't break the cluster, as it's the only thing preventing us from having serious downtime when the JVM crashes or when we have to restart one of the servers. – sbleon Feb 21 '10 at 18:37
  • @Sergii, thanks for the link. It seems like it could be related to my problem, but I'm not exactly sure how. Instead of two classes getting generated with the same name, I'm getting one class created with two different names! – sbleon Feb 21 '10 at 19:18
  • @ryber, thanks. I have not yet seen a .NET app that can understand CF's structs. How could it when the values have no known type? What does it treat them as? If you're actually able to do this, I wish I'd talked to you two months ago before we rewrote the whole thing! – sbleon Feb 21 '10 at 19:33
  • @sbleon the struct comes over as a Map of AnyType (including other maps of AnyType). There are rules for dealing with this under .NET XML serialization, but in the end they are all either maps or strings. A .NET developer can ALWAYS consume a web service as an XML document rather than de/serialized objects, which is what they would have to do. – ryber Feb 22 '10 at 03:01
  • Hi there, I'm interested in the comments which mention .net should have no problem consuming structs - I asked a similar question myself and got no responses http://stackoverflow.com/questions/1132536/consuming-apachesoapmap-complex-datatype-in-webservice-using-net – Loftx Feb 22 '10 at 09:06
  • Answered @Loftx's question, which may be of interest to you. I realized at the end of it that the quickest solution, rather than dealing with the struct or CFC serialization, would be to just return an XML string with the results. – ryber Feb 22 '10 at 18:53

2 Answers


I've read this thread and the CFTalk thread. My initial thoughts about workarounds appear to have already been suggested by Mark Kruger and Dave Watts. The only other workaround idea I had was to catch the error and refresh the web service stub using the Service Factory methods. (In CF8-9 there is an Admin API method to do this; I'm not sure about CF7.)

Researching the error I narrowed down possible matches to these:

http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:144821 This was a match but unresolved

http://blog.coldfusionpowered.com/?p=28 This was a very similar error, resolved by "fixing case issues" on all CFCs & invocations.

ColdFusion Google Adwords Business Component Error Resolved by rewriting code and removing cfcomments (I suspect that other factors were actually responsible for solving it here)

http://forums.crystaltech.com/index.php?topic=22364.0 We're getting closer now. Resolution involved mistakenly having two document roots

http://qaix.com/coldfusion/313-410-web-service-on-cfmx-6-1-jrun-suddenly-not-working-read.shtml Exact match for error message. Exact match for having CFC mapping to doc root. Resolution was to have only 1 mapping pointing to docroot, just "/". This could be the solution. In MX 6/6.1 and maybe 7, there was a default mapping for "/" pointing to docroot. If you have another mapping pointing to docroot, then I can see how this problem might arise. Check the physical paths for mappings and try the solution here, to use only the "/" mapping.
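
To make the "check the physical paths for mappings" step concrete, here is a small Python sketch. The mapping table in it is hypothetical (copy the real logical/physical pairs from the CF Administrator's Mappings page), and the function name is my own. It only normalizes path strings, so it will not notice two mappings that reach the same directory through a symlink.

```python
import posixpath
from collections import defaultdict


def overlapping_mappings(mappings):
    """Return physical directories that more than one logical mapping points to."""
    by_dir = defaultdict(list)
    for logical, physical in mappings.items():
        # normpath drops trailing slashes and "." / ".." segments, so
        # "/a/b/" and "/a/./b" are treated as the same directory.
        by_dir[posixpath.normpath(physical)].append(logical)
    return {d: sorted(names) for d, names in by_dir.items() if len(names) > 1}


# Hypothetical mapping table, for illustration only.
example = {
    "/": "/var/www/tafkan/htdocs",
    "/tafkan": "/var/www/tafkan/htdocs/",
    "/CFIDE": "/var/www/cfide",
}
print(overlapping_mappings(example))
```

Any directory that shows up in the result is reachable under two logical names, which is the situation described in the last link.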

Steven Erat
  • Thanks, Steven, for your detailed research. I checked and there is no "/" mapping defined. There is a /tafkan, which I'm using to find my CFCs within the API code. Since you can't use a "/" in a dot-delimited object path, it seems like my only options are to use unqualified, local object paths (E.g. "pf_unit"), which didn't work (the same problem I have now), or to use "fully-qualified" paths starting with "tafkan.", which I'm already doing. I also noticed that the last reference you gave, which sounded promising, was an every-time error, not an intermittent one. Any other ideas? – sbleon Mar 08 '10 at 20:28

How are the external clients interacting with your webservice? Just via the WSDL I presume?

Is it possible that some client app, a unit test... something, anything ... has a wrong URL... has a URL to your WSDL file with the "tafkan" in it?

If I were working on it, probably the first avenue I'd look down would be figuring out what could possibly result in that problem. Is "tafkan" a valid directory in your system? Where do the .cfc files actually live on the file system, what if any mappings are there to these paths in CF Admin, and what are the URLs that people are using to access your webservice?

The key here, I believe, is getting inside CF's head and asking it, "Why would you generate, and be looking for, a class with "tafkan" as a package?"

marc esher
  • Thanks, Marc. Everyone's just using the WSDL endpoint. "tafkan" is a CF mapping that points to the web root of our application (/var/www/tafkan/htdocs). The CFCs live at /var/www/tafkan/htdocs/remote_api/pfapi/v/trunk/ . I'd prefer not to list the full URLs here, but they're of the form (https://CLIENT_SITE_URL/remote/pfapi/v/trunk/pfapi.cfc/wsdl). Your suggestion about getting inside CF's head is a good one, but I just have no idea about how to go about doing it. – sbleon Feb 21 '10 at 19:17
  • Marc, I'm pretty sure that the class names are right, and that CF just stops being able to find the class after some time. The WSDL's got targetNamespace="http://trunk.v.pfapi.remote_api.tafkan" in its schema tag, so the class name tafkan.remote_api.pfapi.v.trunk.pf_unit seems right. – sbleon Mar 01 '10 at 15:03
  • I'm at a loss man. Maybe contact Steven Erat (talkingtree.com), who used to be a CF support engineer with Allaire, MM, and Adobe – marc esher Mar 03 '10 at 22:07
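
On the question of whether the "tafkan" package name is expected: as far as I understand the usual Axis-style convention, the host part of the WSDL targetNamespace is reversed label by label to form the Java package. The Python sketch below shows only that simplified rule, applied to the namespace quoted in the comments. The function name is made up, and any path component in the namespace URI is ignored.

```python
from urllib.parse import urlparse


def namespace_to_package(target_namespace):
    """Reverse the host labels of a namespace URI to get a package name.

    Simplified assumption: only the host part is used; any path in the
    URI is ignored.
    """
    host = urlparse(target_namespace).netloc
    return ".".join(reversed(host.split(".")))


package = namespace_to_package("http://trunk.v.pfapi.remote_api.tafkan")
print(package + ".pf_unit")  # tafkan.remote_api.pfapi.v.trunk.pf_unit
```

That matches the class named in the NoClassDefFoundError, which suggests the name itself is what CF intends and the problem is in finding the class later.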