I'm making HTTP queries to a website, and the response I get is in XML format. What I want to do is make multiple queries, parse the data, and store it in an ArrayList or some other container so I can easily access each query's data. I've spent some time playing with SAX for parsing the response. The examples I read used an XML format like this:
<?xml version="1.0"?>
<company>
<staff>
<firstname>yong</firstname>
<lastname>mook kim</lastname>
<nickname>mkyong</nickname>
<salary>100000</salary>
</staff>
<staff>
<firstname>low</firstname>
<lastname>yin fong</lastname>
<nickname>fong fong</nickname>
<salary>200000</salary>
</staff>
</company>
I managed to parse this format pretty easily just by following the examples on the internet.
But in my case I need to parse data like this:
<?xml version="1.0" encoding="UTF-8"?>
<root response="True">
<movie title="A Good Marriage" year="2014" rated="R" released="03 Oct 2014" runtime="102 min" genre="Thriller" director="Peter Askin" writer="Stephen King (short story)" actors="Joan Allen, Anthony LaPaglia, Stephen Lang, Cara Buono" plot="After 25 years of a good marriage, what will Darcy do once she discovers her husband's sinister secret?" language="English" country="USA" awards="N/A" poster="http://ia.media-imdb.com/images/M/MV5BMTk3MjY2ODgwNl5BMl5BanBnXkFtZTgwMTQ0Mjg0MjE@._V1_SX300.jpg" metascore="43" imdbRating="5.1" imdbVotes="2,016" imdbID="tt2180994" type="movie"/>
</root>
From this response I want to parse all the values into some container so they're easy to use. I'm still learning, so maybe someone can point me in the right direction? :) Making the queries is not a problem, but parsing and storing the data is.
EDIT: To be clearer, my problem is that the response from the server isn't in the neat element-based XML format of the first example. As you can see, it looks like this:
<movie title="A Good Marriage" year="2014" rated="R" released="03 Oct 2014" runtime="102 min" genre="Thriller" director="Peter Askin" writer="Stephen King (short story)" actors="Joan Allen, Anthony LaPaglia, Stephen Lang, Cara Buono" plot="After 25 years of a good marriage, what will Darcy do once she discovers her husband's sinister secret?" language="English" country="USA" awards="N/A" poster="http://ia.media-imdb.com/images/M/MV5BMTk3MjY2ODgwNl5BMl5BanBnXkFtZTgwMTQ0Mjg0MjE@._V1_SX300.jpg" metascore="43" imdbRating="5.1" imdbVotes="2,016" imdbID="tt2180994" type="movie"/>
When I run my code on it, nothing is printed. But when I manually modify the XML a bit, like this:
<?xml version="1.0" encoding="UTF-8"?>
<root response="True">
<movie> title="Oblivion" year="2013" rated="PG-13" released="19 Apr 2013" runtime="124 min" genre="Action, Adventure, Mystery" director="Joseph Kosinski" writer="Karl Gajdusek (screenplay), Michael Arndt (screenplay), Joseph Kosinski (graphic novel original story)" actors="Tom Cruise, Morgan Freeman, Olga Kurylenko, Andrea Riseborough" plot="A veteran assigned to extract Earth's remaining resources begins to question what he knows about his mission and himself." language="English" country="USA" awards="10 nominations." poster="http://ia.media-imdb.com/images/M/MV5BMTQwMDY0MTA4MF5BMl5BanBnXkFtZTcwNzI3MDgxOQ@@._V1_SX300.jpg" metascore="54" imdbRating="7.0" imdbVotes="307,845" imdbID="tt1483013" type="movie"/>
</movie>
</root>
So I added a closing > to the opening movie tag and an end tag </movie> at the end, and now my program prints this:
Movie : title="Oblivion" year="2013" rated="PG-13" released="19 Apr 2013" runtime="124 min" genre="Action, Adventure, Mystery" director="Joseph Kosinski" writer="Karl Gajdusek (screenplay), Michael Arndt (screenplay), Joseph Kosinski (graphic novel original story)" actors="Tom Cruise, Morgan Freeman, Olga Kurylenko, Andrea Riseborough" plot="A veteran assigned to extract Earth's remaining resources begins to question what he knows about his mission and himself." language="English" country="USA" awards="10 nominations." poster="http://ia.media-imdb.com/images/M/MV5BMTQwMDY0MTA4MF5BMl5BanBnXkFtZTcwNzI3MDgxOQ@@._V1_SX300.jpg" metascore="54" imdbRating="7.0" imdbVotes="307,845" imdbID="tt1483013" type="movie"/>
So basically the code I'm using at the moment reads everything between <movie> and </movie>. The problem is that the original response from the server keeps all the data inside the movie tag itself, like this: <movie title="Oblivion"... and it doesn't have a </movie> tag either (it ends with /> instead).
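To make it clearer what the parser reports for a tag like this, here is a minimal sketch that just logs the SAX callbacks in the order they arrive. The class name SelfClosingEvents, the events method, and the shortened inline XML are made up for illustration; only the standard SAX calls are real. It assumes the default, non-namespace-aware parser, so qName holds the plain tag name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records which SAX callbacks fire, and in what order.
class SelfClosingEvents {

    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                // Attribute values arrive here, not in characters()
                log.add("start:" + qName + " attrs=" + attributes.getLength());
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Only log real text content, skip whitespace between tags
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("chars:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };

        saxParser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events("<root><movie title=\"Oblivion\"/></root>")) {
            System.out.println(event);
        }
    }
}
```

For the self-closing movie element the log contains a start event (with one attribute) immediately followed by an end event, and no chars entry in between, which would match the empty output I'm seeing.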
I've been struggling with this for quite a while; hopefully someone understands my confusing explanation! At the moment my parser code looks like this:
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public void getXml() {
    try {
        // obtain and configure a SAX parser factory
        SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
        // obtain a SAX parser from the factory
        SAXParser saxParser = saxParserFactory.newSAXParser();
        // anonymous handler; all three callbacks are overridden in its body
        DefaultHandler defaultHandler = new DefaultHandler() {
            String movieTag = "close";

            // called every time the parser encounters the start of an element;
            // sets the flag to "open" when that element is movie
            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attributes) throws SAXException {
                if (qName.equalsIgnoreCase("MOVIE")) {
                    movieTag = "open";
                }
            }

            // prints the character data (text content) found inside the movie element
            @Override
            public void characters(char[] ch, int start, int length)
                    throws SAXException {
                if (movieTag.equals("open")) {
                    System.out.println("Movie : " + new String(ch, start, length));
                }
            }

            // called by the parser whenever the end of an element is reached;
            // sets the flag back to "close" when that element is movie
            @Override
            public void endElement(String uri, String localName, String qName)
                    throws SAXException {
                if (qName.equalsIgnoreCase("MOVIE")) {
                    movieTag = "close";
                }
            }
        };
        // parse the XML file at the given path with the supplied handler;
        // the parser calls startElement(), endElement() and characters()
        // as it reads the document
        saxParser.parse("xml/testi.xml", defaultHandler);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
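For reference, this is roughly the kind of result I'm after, written as a minimal sketch rather than working code from my project. It assumes the values can be read from the Attributes argument of startElement() and stored as one Map per movie in an ArrayList. The class name MovieAttributeParser, the parse method, and the shortened sample XML are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class: collects the attributes of every <movie> element.
class MovieAttributeParser {

    // Returns one attribute-name -> value map per <movie> element in the XML.
    static List<Map<String, String>> parse(String xml) throws Exception {
        final List<Map<String, String>> movies = new ArrayList<>();
        SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                if (qName.equalsIgnoreCase("movie")) {
                    // LinkedHashMap keeps the attributes in document order
                    Map<String, String> movie = new LinkedHashMap<>();
                    for (int i = 0; i < attributes.getLength(); i++) {
                        movie.put(attributes.getQName(i), attributes.getValue(i));
                    }
                    movies.add(movie);
                }
            }
        };

        saxParser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return movies;
    }

    public static void main(String[] args) throws Exception {
        // Shortened sample in the same shape as the server response
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<root response=\"True\">"
                + "<movie title=\"Oblivion\" year=\"2013\" imdbID=\"tt1483013\" type=\"movie\"/>"
                + "</root>";
        for (Map<String, String> movie : parse(xml)) {
            System.out.println(movie.get("title") + " (" + movie.get("year") + ")");
        }
    }
}
```

With several queries, each response could be passed through parse() and the returned maps added to one shared list, so that something like movies.get(0).get("title") gives the title of the first result.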
If anyone can help, it would be greatly appreciated.