I'm running a Linux server with Tomcat and Railo. I tried this simple code:

check = "";
jSoupClass = createObject( "java", "org.jsoup.Jsoup" );

if(IsInstanceOf(jSoupClass,"org.jsoup.Jsoup")){
 check = "ok";      
}

writeDump(check );

When I run it, the variable check is always empty. I have run the same test against many other Java classes and they all work as expected.

In my app I use jsoup with no problems, but I cannot get this simple check to work. I use it to check the doctype of a document:

jSoupClass = createObject( "java", "org.jsoup.Jsoup" );
dom = jSoupClass.connect( "http://www.mutuiinpdap.net" ).userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6").timeout(10000).execute();

doc = dom.parse();
nods = doc.childNodes();
doctype = {};
for (key in nods) {
    if(IsInstanceOf(key,"org.jsoup.nodes.DocumentType")){
        doctype.string = key.toString();
        switch(key) {
            case "<!DOCTYPE html>":
                doctype.declarations = "Html 5";
                break;
            case '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">':
                doctype.declarations = "XHTML transitional";
                break;
        }
    }
}

writeDump(doctype);

Is there a way to solve this? Can I use different code to check which doctype a document has?

This code runs perfectly on my local Windows machine, but it breaks on my production server, which runs Ubuntu.

[Edit]

I have also tested with this code:

public function getDoctype(){

    myClass = {};

    jSoupClass = createObject( "java", "org.jsoup.Jsoup" );
    whois = createObject( "java", "org.apache.commons.net.whois.WhoisClient" );

    myClass.jj = "ko";
    myClass.ww = "ko";

    writeDump(jSoupClass);
    writeDump(whois);

    if(IsInstanceOf(jSoupClass,"org.jsoup.Jsoup")){
        myClass.jj = "ok";
    }

    if(IsInstanceOf(whois,"org.apache.commons.net.whois.WhoisClient")){
        myClass.ww = "ok";
    }

    return myClass;

}

The result is myClass.jj = "ko" and myClass.ww = "ok".

Jeromy French
Tropicalista
  • I can't see why isInstanceOf isn't working - especially if it works on one install but not another - can you [raise a bug](https://issues.jboss.org/browse/RAILO) with relevant details. – Peter Boughton Feb 23 '13 at 13:07
  • I would like to raise a bug with relevant details, but there is not much more to add. I added a try/catch to my code to see what's wrong, but I get no errors. I have updated my question with a further example... I cannot be sure whether this is a Railo bug or a jsoup bug. I've posted questions on both groups, but no one seems to be interested... – Tropicalista Feb 23 '13 at 15:18
  • For reference, links to: [Railo Discussion](https://groups.google.com/forum/?fromgroups=#!topic/railo/NvylfseXN6Y) and [Jsoup Discussion](https://groups.google.com/forum/?fromgroups=#!topic/jsoup/edTpR8XtSFk). – Peter Boughton Feb 23 '13 at 15:57
  • The relevant details are (1) a clear description of what's misbehaving (i.e. isInstanceOf returning false when it possibly shouldn't), (2) the versions of Railo, Jsoup, and your JVM, and (3) a reproducible test case, i.e. the code snippets in your question and the jsoup jar. If you [raise it on Jira](https://issues.jboss.org/browse/RAILO) Micha/Igal will see and investigate further - it might be closed as not a Railo issue, but that's probably a question for Micha as to whether isInstanceOf should work for static/non-inited objects. – Peter Boughton Feb 23 '13 at 16:07
  • *whether isInstanceOf should work for static/non-inited objects.* IMO it should. This might not be the issue, but I had a similar problem with ACF8/9 when using the javaLoader. `isInstanceOf` returned false because my object was created by a different class loader (i.e. the javaLoader) than the main one used by CF. As Railo supports dynamic class loading, it might be related(?). – Leigh Feb 23 '13 at 20:05

2 Answers

For a DOCTYPE to be valid it must be the first thing in a document, so there's no need to loop through nodes checking instances.

All you have to do is examine the first tag - that is, the contents of the HTML string before the first > character (or, for documents that start with an XML declaration, as some XHTML does, between the first and second > characters).

You also don't need (or want) a long list of full doctype declarations. Aside from HTML 5, they all follow the same pattern (i.e. have a DTD), so you can simply extract the name from the doctype.

I've wrapped that logic up into the function posted below - may need some tweaking or extra work to perform how you need, but it's been tested briefly and works. Hopefully it is all self-explanatory, but let me know if any parts are not.

jSoupClass = createObject( "java" , "org.jsoup.Jsoup" , "./jsoup-1.7.1.jar" );

doc = jSoupClass
    .connect( "http://www.mutuiinpdap.net" )
    .userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
    .timeout(10000)
    .execute()
    ;

doctype = determineDoctype( doc.body() );

writeDump(doctype);



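function determineDoctype( Html )
{
    // Everything before the first '>' is the first tag of the document.
    var FirstTag = trim(ListFirst(Arguments.Html,'>'));

    // The HTML 5 doctype has no DTD, so it is matched directly.
    if ( LCase(trim(FirstTag)) EQ '<!doctype html' )
        return 'Html 5';

    // If the document starts with an XML declaration, the doctype is the second tag.
    if ( Left(FirstTag,5) EQ '<?xml' )
        FirstTag = trim(ListGetAt(Arguments.Html,2,'>'));

    if ( Left(LCase(FirstTag),14) NEQ '<!doctype html' )
        return 'Non-HTML doctype [#FirstTag#]' ;

    // Extract the DTD name, e.g. "-//W3C//DTD XHTML 1.0 Transitional".
    var dtd = rematch('-//W3C//DTD [^/]+',FirstTag);

    // Drop the leading "-//W3C//DTD" token and return the rest.
    if ( ArrayLen(dtd) )
        return ListRest(dtd[1],' ');

    return 'Unknown Doctype [#FirstTag#]';
}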
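
To try the function without making a network request, it can be called with literal strings. The snippet below is only a sketch: the sample markup is made up for illustration, and it assumes determineDoctype is defined in the same template.

```cfml
// Hypothetical sample inputs; none of these come from a real page.
samples = [
    '<!DOCTYPE html><html><head></head><body></body></html>',
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html></html>',
    '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html></html>'
];

results = [];
for ( sample in samples ) {
    // Collect the detected doctype name for each sample string.
    arrayAppend( results, determineDoctype( sample ) );
}

writeDump( results );
```

Following the logic of the function, the three samples should come back as "Html 5", "XHTML 1.0 Transitional" and "XHTML 1.0 Strict" respectively.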
Peter Boughton
You can test `object.getClass().getName() EQ 'org.jsoup.Jsoup'` to see if the class is what you expect.

Even though this would solve the issue with your code, I'd still recommend the other answer I posted for determining doctype.
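
As a rough sketch, the class-name comparison could replace isInstanceOf in the node loop from the question like this. It assumes the jsoup jar is on the engine's class path as in the question, and it parses a made-up literal string rather than fetching a page. Note that comparing names only matches the exact class, not subclasses.

```cfml
// Assumes the jsoup jar is available to Railo, as in the question.
jSoupClass = createObject( "java", "org.jsoup.Jsoup" );

// Parse a literal string (hypothetical sample markup) so no network access is needed.
doc = jSoupClass.parse( "<!DOCTYPE html><html><head></head><body></body></html>" );

doctype = {};
for ( node in doc.childNodes() ) {
    // Compare the fully qualified class name instead of calling isInstanceOf.
    if ( node.getClass().getName() EQ "org.jsoup.nodes.DocumentType" ) {
        doctype.string = node.toString();
    }
}

writeDump( doctype );
```

Whether this sidesteps the problem on the Ubuntu server has not been verified here; it just avoids isInstanceOf, which is the call that returns false in the question.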

Peter Boughton