17

I'm a 1st level C# programming student, though I've been dabbling in programming for a few years, and learning above and beyond what the class teaches me is just what I'm doing so that I'm thoroughly prepared once I get out into the job environment. This particular class isn't OOP at all, that's actually the next class, but for this project the teacher said he wouldn't mind if we went above and beyond and did the project in OOP (in fact you can't get an A in his class unless you go above and beyond anyways).

The project is(at this point) to read in an XML file, byte by byte, store element tags to one array, and the data values to another. I fought with him on this(given the .net frameworks dealing on XML) but that was a losing battle. He wants us to code this without using .net XML stuff.

He did provide an example of OOP for this program that he slopped together (originally written in Java, ported to C++, then ported from C++ to C#)

In his example he's got three classes. the first, XMLObject, which contains the arrays, a quasi constructor, getter and setter methods(not properties, which I plan to fix in my version), and a method for adding the < and > to tags to be stored in the arrays (and output to console if need be.)

The second class is a parseXML class. In this one he has fields that keep track of the line count, file offset, tag offset, and strings to hold elements and data. Again, he's got getter and setter methods, several parse methods that search for different things, and a general parse method that uses the other parse methods(sort of combines them here). Some of these methods make calls to the XMLObject class's methods, and send the parsed element and data values to their respective arrays.

The third class he has is one that has no fields, and has two methods, one for doing ATOI and one for dumping a portion of the file stream to the console.

I know we're essentially building a less efficient version of what's already included in the .net framework. I've pointed this out to him and was told "do not use .net's XML class, end of discussion" so let's all agree to just leave that one alone.

My question is, should those really be 3 separate classes. Shouldn't the parsing class either inherit from the XML object class, or just be coded in the XML object class, and shouldn't the ATOI and dumping methods be in one of those two classes as well?

It makes sense to me that if the parsing class's aim in life is to parse an XML file and store elements and data fields to an array, it should be in the same class rather than being isolated and having to do it through getters and setters(or properties in the version I'm going to do). I don't see why the arrays would need to be encapsulated away from the parse methods that actually give them what to store.

Any help would be appreciated, as I'm still designing this, and want to do it at least as close to "proper"(I know it's a relative term) OOP form.

xpda
  • 15,585
  • 8
  • 51
  • 82
Tyler W
  • 472
  • 1
  • 6
  • 21
  • 14
    I'm going to have to agree with the Prof on this one about not using .NET XML code. The point of the exercise isn't to parse XML, it's to process a file. Yeah, you can have .NET and other languages do it for you, but you're there to learn. I can have Excel generate lots of cool statistics on a data set, but if I don't understand statistics it's pretty much pointless – taylonr May 23 '11 at 13:36
  • 1
    Try posting some sample code and you might get more help reactoring – taylonr May 23 '11 at 13:38
  • I understand the point he's trying to do, however I've already done processing files (the last lab). My arguement was I already know how, as we've already learned it, and given that it's an 8 week class(accelerated) that is supposed to teach C# as well as delve into some .net work, that I should be able to use the xml classes now. I did the basic work (last lab was reading in and parsing a regular text file as well as creating a line index array) and would like to focus more on coding (the dean of the school agrees with me, but doesn't want to interfere). That's a bit off topic though,heh – Tyler W May 23 '11 at 13:40
  • 12
    When in doubt, follow the rules of the person who'll be grading your work. ;) – jalf May 23 '11 at 13:43
  • I assume that you and/or your professor is aware of `Int32.Parse`. – SLaks May 23 '11 at 13:44
  • 7
    I would add to the comments of others who note that the point of the exercise is not to create a useful XML parser, but rather to *teach you* how to build something. The point of pedagogic code is the *journey*, not the *destination*. Moreover: I interview people all the time who believe that the stuff in the framework is magic. It isn't. People write that stuff. **You could write that stuff if you worked for us.** If you don't know how to write stuff like what is already there, how are you going to write the next big thing? – Eric Lippert May 23 '11 at 14:10
  • The difficult part in solving a problem is to come up with its 'design'(a way to solve the problem) rather than implementing that design ('coding'). I think your professor wants you to focus on the former. – Devendra D. Chavan May 23 '11 at 14:12
  • @ Eric Lippert - Oh I fully understand the existence of the program, however in my case I can already do that coding (I did the same thing in a lab last week, using the newline character as the delimiter instead of XML's <> and >. My whole arguement was that since I know the basics, can I move on to exploring some of .net. The solution we came to was I do one version his way, one mine, and mine will count as extra credit. – Tyler W May 23 '11 at 14:28
  • Does anyone else think that a programming class that teaches you to do more than the required work is teaching terrible habits? – Gusdor Oct 26 '16 at 12:13

4 Answers4

10

The general rule is that we count the size of a class in the number of responsibilities that it has:

A class should have a single responsibility: a single reason to change.

It seems to me that your teacher did separate his responsibilities correctly. He separated the presentation from the xml parsing logic, and he separated the xml data from the xml parsing behavior.

Steven
  • 166,672
  • 24
  • 332
  • 435
  • So the storing of the actual data(elements and data values) should actually be one class, while parsing should be a separate? I was of the mindset that the parsing ability extended the functionality of the xml object. I'm just trying to understand the point of a class that only holds data while needing a completely seperate class that's only point is to change that specific data, and can't operate without it (since the parse calls the xmlobject class it's dependant on there being an xmlobject)...I'd post his code but it's looong, and confusing. – Tyler W May 23 '11 at 13:43
  • 1
    Not so sure about that. XmlObject sounds like data bag w/o any behavior and `parseXml` sounds like just a place holder for huge procedure that does everything. – Arnis Lapsa May 23 '11 at 13:44
  • 2
    Yes, you should definitely separate the data from the logic. You will start to appreciate this when you are starting to write unit tests and designing your application around the dependency injection principle. Mixing data and logic in a single class complicates many thing unnecessarily. – Steven May 23 '11 at 13:45
  • @ Arnis, That's what it looks like when I look at his code. If needed I'll edit my original question to include his 2 classes, but it's like 100 or so lines of code(if I remove his comments) – Tyler W May 23 '11 at 13:46
  • If you want to include code, just add the method signatures. Logic is irrelevant when doing architecture. If you can't figure out what's going on by just looking at the contract, something is very wrong. – jgauffin May 23 '11 at 13:50
  • @Steven I strongly disagree with Your point. proper object is defined by its behavior (primarily) **AND** data. using good old anthropomorphism - it would be like there were `raiseRightHand` "object" that would work with `Human` object (which would just describe height, weight, eye color, hair color etc.) as puppet. – Arnis Lapsa May 23 '11 at 13:52
  • 1
    @TylerW: In this case it seems like the data object is tightly coupled with the file format. But it is not too much of a step to think of a data object that could be represented multiple ways. In that case, it would make sense to have an `XmlParse` and a `JsonParser` and one data class. – unholysampler May 23 '11 at 13:53
  • @Arnis L: Yes, you are right. In OOP we do mix data and behavior, so my statement is a bit black and white. However, I think you agree with me that mixing the XML object and the parser into one object would be a bad idea. – Steven May 23 '11 at 13:57
  • @Steven, I think you meant grouping data and related behavior. `mix` is an incorrect term. – Devendra D. Chavan May 23 '11 at 14:16
  • @Steven I wouldn't mix it. I would not try to use OOP here at all. just few loops and few arrays. shouldn't be more than ~50 lines of code if not less. – Arnis Lapsa May 23 '11 at 14:26
8

First: If you're in a programming class, there may be a good reason he wants you to do this by hand: I really don't recommend arguing with your professors. You'll never win, and you can hurt your grades.

Second: His version is not (considering the fact that it is largely a re-writing of parts of the System.XML namespace) too terrible. Basically you have one class that "Is" your XML. Think of it like the XDocument or XmlDocument classes: Basically it just contains the Xml itself. Then You have your Xml Parser: think of that like XmlReader. And your last one is sort of his equivalent of XmlWriter.

Remember that with OOP, your Xml class (the one that represents the document itself) should neither know nor care how it came into possession of the information it has. Further, the Parser should know how to get the Xml, but it shouldn't much care where it gets stored. Finally, your Writer class shouldn't really care where the data is coming from, only where it's going.

I know it's over-used, but think of your program like a car- it has several parts that all have to work together, but you should be able to change any given part of it without majorly affecting the other pieces. If you lump everything in one class, you lose that flexibility.

AllenG
  • 8,112
  • 29
  • 40
  • That's a really good way to put it. That does make it a lot more clear. I spent a few days reading up on XmlDocument,reader,writer, and linq to xml, so I understood your example well. Thanks-- One question though, should the parseclass explicitly be calling the XML object class in it's code, or should that be done in main? I'll dig up an example of how he did it here in a sec. – Tyler W May 23 '11 at 13:50
  • if(!gInsideContainer) { if(String.Compare(gParseToken,XMLContainer)==0) { gInsideContainer=true; XMLObject.setAttribute("BEGPTR",gTagOffset.ToString()); } } else { gOpeningTag=gParseToken; } The indenting might of got garbled, when it's calling XMLObject.Attribute like that, wouldn't it need to know what the actual object is named to send the data there? I may be confusing myself. – Tyler W May 23 '11 at 13:57
  • "Remember that with OOP, your Xml class (the one that represents the document itself) should neither know nor care how it came into possession of the information it has." There is nothing wrong w/ knowing where information came from. Before trying to solve problem - one should understand it thoroughly in first place. In this case - what that special do You gain by making document class unaware of information source? OOP is not dividing things mechanically "to achieve greater flexibility and testability". – Arnis Lapsa May 23 '11 at 14:10
  • In this specific problem I don't see any actual need for the XMLonject class not knowing the information. Really, since it's a simple structural program turned into an OOP program for "above and beyond" there isn't any real reason to hide the data (in this case). That may change with later labs (all the labs from here on out build off of this one, in the end we'll have a program that stores student or medical personnel information in an xml file, allows different peoples data to be easily seen, and allows for add,change, and delete of the records – Tyler W May 23 '11 at 14:20
  • @Tyler for simple problems, imperative programming is superior to object-oriented programming. that's why it's so hard to artificially force OOP where there is little need for that. but I'm quite sure Your professor won't notice that and be more than happy if You will just manage to spit out few classes. :) – Arnis Lapsa May 23 '11 at 14:24
  • If the issue is that this is specifically an OOP Course, then you should use OOP Principles. That means separating this code. Moreover, the habit is a good one into which one should get- go ahead and make each class as simple as possible. Then, when all is said and done, you can re-factor to consolidate your code if it makes more sense. But don't limit yourself from the out-set. That way lies excessive re-factoring. – AllenG May 23 '11 at 14:45
  • yeah it's not an OOP class at all actually, just a structured beginner programming class. He just offered up the OOP example as a way to go "above and beyond" which is the only way to actually get an A in his class (just doing the work he assigns us the way he assigns it gets us a B) – Tyler W May 23 '11 at 23:05
2

Some points:

  • Classes are nouns; methods are verbs.
    Your class should be called XmlParser.

  • Since the XML parser is neither part of the XMLObject nor extends the XMLObject, it should be a separate class.

  • The third class has nothing to do with either of the other two; it's just an ordinary Utilities class.

  • In general, each class should be responsible for a single unit of work or storage.
    Don't try to put too much into a single class (see the "God object" anti-pattern).
    There's nothing wrong with having lots of classes. (As long as they all make sense)

SLaks
  • 868,454
  • 176
  • 1,908
  • 1,964
  • 2
    Naming something `Utilities`, `Common`, `Helper`, `Manager` is quite a strong code smell. :) – Arnis Lapsa May 23 '11 at 13:55
  • @Arnis: What would you suggest? – SLaks May 23 '11 at 13:57
  • In this case - those methods really are `Utilities` (cause they are highly irrelevant w/ problem we are trying to solve). But one thing I would do - I would tie them up a bit more using extension methods. – Arnis Lapsa May 23 '11 at 14:04
2

Let's summarize what the system must do :

  • to read in an xml file, byte by byte,
  • store element tags to one array,
  • the data values to another.

I would probably slice it up in the following way:

  • Reader : Given a file path, yields the contents byte-wise (IEnumerable<byte>)
  • Tokenizer: Given an enumeration of bytes, yields tokens relevant to the XML-Context (IEnumerable<XmlToken>)
  • XmlToken : Base class to any output that the tokenizer produces. For now you need 2 specializations :
    • Tag : An opening tag
    • Value : Contents of a tag
  • TokenDelegator : Accepts a Tokenizer and an instance of
  • IXmlTokenVisitor: (See Visitor pattern)
  • TagAndValueStore: Implements IXmlTokenVisitor. Visit(Tag tag) and Visit(Value value) are implented and the relevant content stored in arrays.

You see, I ended up with 7 classes and 1 interface. But you may notice that you have laid the foundations for a fully-fledged XML parser.

Often code that is sold to be OO just plain isn't. A class should adhere to the Single-Responsibility principle.

flq
  • 22,247
  • 8
  • 55
  • 77
  • 1
    Yes, this is a good design for a full fledged XML parser. BUT since OP is a 1st level C# programming student, I think this is somewhat over-engineered. One comment I like to make for OP is take all the advice in from this post but make a judgement yourself on how much concepts you want to incorporate into your design. After all, every "designer" should have his/her own preference/approach to consider a real "designer" and NOT follow rules after rules. – Thomas Li May 23 '11 at 13:54
  • It's tempting do down-vote cause this approach seems so complex. But that's just because problem itself is too simple for full-blown OOP approach. :) – Arnis Lapsa May 23 '11 at 13:59
  • Yeah that is a bit much for my level. I have a full on OOP class once I finish this class. I'm just trying to give myself a good familiarization for now. – Tyler W May 23 '11 at 14:00
  • @Tyler You might enjoy reading these: http://stackoverflow.com/questions/5484593/what-might-be-a-good-object-oriented-programming-book-that-can-give-a-good-solid/5484769#5484769 – Arnis Lapsa May 23 '11 at 14:28
  • Downvoting is a bit harsh, in my opinion :) This is to illustrate an example where an OOD may be heading to. Whether this is overly complex is debatable, sure it ain't easy for someone starting OOD. With time and experience, though this should be perceived as straightforward, since every part pretty much follows SRP. – flq May 23 '11 at 14:28
  • @flq I didn't down vote. actually - I think it's best answer here for same exact reason you stated. – Arnis Lapsa May 23 '11 at 14:32