33

I am designing an entity class which has a field named "documentYear", which holds unsigned integer values such as 1999, 2006, etc. This field might also be "unknown", that is, it is not known in which year the document was created.

Therefore, a nullable int type as in C# would be well suited. However, Java does not have nullable primitives the way C# does.

I have two options, but I don't like either of them:

  1. Use java.lang.Integer instead of the primitive type int;
  2. Use -1 to represent the "unknown" value

Does anyone have better options or ideas?

Update: My entity class will have tens of thousands of instances; therefore the overhead of java.lang.Integer might be too heavy for the overall performance of the system.

yinyueyouge

13 Answers

33

Using the Integer class here is probably what you want to do. The overhead associated with the object is most likely (though not necessarily) trivial to your application's overall responsiveness and performance.
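
As a minimal sketch of this approach, a null reference can stand for "unknown". The class name DocumentWithBoxedYear and its methods are hypothetical and only serve as an illustration; they do not come from the question.

```java
// Hypothetical entity; all names are illustrative only.
class DocumentWithBoxedYear {
    // null means the year is unknown.
    private Integer documentYear;

    Integer getDocumentYear() {
        return documentYear;
    }

    void setDocumentYear(Integer documentYear) {
        this.documentYear = documentYear;
    }

    boolean hasDocumentYear() {
        return documentYear != null;
    }
}
```

One thing to watch for: assigning the result of getDocumentYear() to a primitive int auto-unboxes it, which throws a NullPointerException when the year is unknown, so callers should check hasDocumentYear() first.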

Matthew Vines
27

You're going to have to either ditch the primitive type or use some arbitrary int value as your "invalid year".

A negative value is actually a good choice since there is little chance of having a valid year that would cause an integer overflow and there is no valid negative year.

Drew Noakes
hhafez
  • 20
    ...unless you want to be able to represent both BC and AD years with the schema. ;-) – Anders Lindahl Jun 12 '09 at 06:23
  • 20
    Use 0. There is no year zero, BC or AD. – Zac Thompson Nov 10 '10 at 07:52
  • 10
    Unless your domain is astronomical, where year 0 is valid. – Chadwick Apr 23 '12 at 23:33
  • 3
    The year 2147483647 is also quite unlikely and could be used as the arbitrary int value. – Simon Forsberg Aug 20 '12 at 14:38
  • 30
    @SimonAndréForsberg - Of course, that's what they said about the year 2000. – Nobody Aug 27 '12 at 16:49
  • Well, the chances of the program being deprecated and discarded for lack of use by the year 2147483647 are really high; Simon is quite right. – Felype Dec 01 '15 at 12:58
  • Until time travel gets invented, which I'm sure sounds as ridiculous to us now as cheap 1TB hard drives did to programmers 30 years ago – sh4d0w Feb 09 '16 at 06:15
  • 1
    Strongly disagree. This breaks Single Responsibility and is likely to produce bugs because you are expressing multiple (2) concerns with a single value. You should really invent a type that has the ability to express the absence of the value. And no, the **null** value of a boxed primitive still expresses two concerns within one value. – Noel Widmer Nov 16 '17 at 18:45
16

Tens of thousands of instances of Integer is not a lot. Consider expending a few hundred kilobytes rather than optimise prematurely. It's a small price to pay for correctness.

Beware of using sentinel values like null or 0. This basically amounts to lying, since 0 is not a year, and null is not an integer. A common source of bugs, especially if you at some point are not the only maintainer of the software.

Consider using a type-safe null like Option, sometimes known as Maybe. Popular in languages like Scala and Haskell, this is like a container that has one or zero elements. Your field would have the type Option<Integer>, which advertises the optional nature of your year field to the type system and forces other code to deal with possibly missing years.

Here's a library that includes the Option type.

Here's how you would call your code if you were using it:

partyLikeIts.setDocumentYear(Option.some(1999));

Option<Integer> y = doc.getDocumentYear();
if (y.isSome()) {
   // This doc has a year
} else {
   // This doc has no year
}

for (Integer year: y) {
  // This code is only executed if the document has a year.
}
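
If a third-party library is not an option, Java 8 and later ship java.util.OptionalInt. It is a different type from the Option used above, but it expresses the same idea for primitive ints. A rough sketch follows, with a hypothetical DatedDocument class that is not part of this answer:

```java
import java.util.OptionalInt;

// Hypothetical entity; shown only to illustrate OptionalInt.
class DatedDocument {
    // Empty means the year is unknown.
    private OptionalInt documentYear = OptionalInt.empty();

    void setDocumentYear(int year) {
        documentYear = OptionalInt.of(year);
    }

    void clearDocumentYear() {
        documentYear = OptionalInt.empty();
    }

    OptionalInt getDocumentYear() {
        return documentYear;
    }

    // Describes the year, or a placeholder when it is unknown.
    String describeYear() {
        return documentYear.isPresent()
                ? "year " + documentYear.getAsInt()
                : "year unknown";
    }
}
```

Note that an OptionalInt is itself an object, so this makes the missing year explicit in the type but does not address the memory concern raised in the question.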
Apocalisp
  • 1
    How is -1 not a year? 1 BC is a year, and you would write that as -1 as an integer. I'd say that Integer.MAX_VALUE is probably not a realistic value for a year for quite some time. – elmuerte Jun 12 '09 at 06:58
  • 3
    Right you are, it's 0 that isn't a year. But that's not the point. The point is that sentinel values lose meaning and defeat the type system. – Apocalisp Jun 12 '09 at 13:18
  • I agree with Apocalisp's point about sentinel values. The interpretation of '-1' as an error value needs to be documented. A programmer can inadvertently use error values in code. The program will exhibit 'garbage in, garbage out'. However, using the Option type is a mechanical way to force programmers to think about the possibility of an invalid year. – fatuhoku Jan 20 '12 at 10:33
  • I realized today that Java's `Integer` and C#'s `int?` behave differently when compared to a non-nullable. C# performs the comparison but Java throws NullPointerException. So Option is a good choice. – Simon Forsberg Feb 01 '13 at 19:37
2

Another option is to have an associated boolean flag that indicates whether or not your year value is valid. This flag being false would mean the year is "unknown." This means you have to check one primitive (boolean) to know if you have a value, and if you do, check another primitive (integer).

Sentinel values often result in fragile code, so it's worth making the effort to avoid a sentinel value unless you are very sure that it will never occur as a real value.
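
A minimal sketch of the flag approach might look like this. The class name FlaggedYearDocument and its methods are hypothetical, chosen only for illustration:

```java
// Hypothetical entity using a primitive int plus a validity flag.
class FlaggedYearDocument {
    private int documentYear;   // meaningful only when yearKnown is true
    private boolean yearKnown;  // false means the year is "unknown"

    void setDocumentYear(int year) {
        documentYear = year;
        yearKnown = true;
    }

    void clearDocumentYear() {
        yearKnown = false;
    }

    boolean isYearKnown() {
        return yearKnown;
    }

    int getDocumentYear() {
        if (!yearKnown) {
            throw new IllegalStateException("Document year is unknown");
        }
        return documentYear;
    }
}
```

Both fields stay primitive, and every int value remains usable as a year because none of them is reserved as a marker.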

Eddie
  • That won't be a primitive type – hhafez Jun 12 '09 at 05:34
  • Think outside the box. If you will not use an Object and insist on using a primitive type to hold a value, then the safest way to know whether or not that primitive type has a value is to have an associated boolean value that tells you whether or not the primitive type contains a value or not. Anything else you do relies on using a "sentinel" value which is often risky and results in fragile code. – Eddie Feb 25 '12 at 00:18
1

For completeness, another option (definitely not the most efficient), is to use a wrapper class Year.

class Year {
    public int year;
    public Year(int year) { this.year = year; }
}

Year documentYear = null;
documentYear = new Year(2013);

Or, if it is more semantic, or you want multiple types of nullable ints (Other than Years), you can imitate C# Nullable primitives like so:

class Int {
    public int value;
    public Int(int value) { this.value = value; }
    @Override
    public String toString() { return Integer.toString(value); }
}
azz
  • Oh, and somewhat relevant, Java 8 will have a [`java.time.Year`](http://javadocs.techempower.com/jdk18/api/java/time/Year.html) Class – azz Apr 15 '13 at 08:24
1

Using the int primitive vs the Integer type is a perfect example of premature optimization.

If you do the math:

  • int = N(4)
  • Integer = N(16)

So 10,000 ints will cost 40,000 bytes, or 40K. 10,000 Integers will cost 160,000 bytes, or 160K. Compared with the amount of memory required to process image, photo, or video data, that's practically negligible.

My suggestion is: quit wasting time prematurely optimizing based on variable types and look for a good data structure that'll make it easy to process all that data. Regardless of what you do, unless you define 10K primitive variables individually, it's going to end up on the heap anyway.
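
The arithmetic above can be reproduced in a few lines. The per-value sizes (4 bytes for an int, 16 bytes for an Integer) are this answer's figures and are taken here as assumptions; actual object sizes depend on the JVM. The class name YearMemoryEstimate is hypothetical.

```java
// Reproduces the estimate using the answer's assumed per-value sizes.
class YearMemoryEstimate {
    static final int BYTES_PER_INT = 4;       // size of a primitive int
    static final int BYTES_PER_INTEGER = 16;  // assumed size of one Integer object

    static long totalBytes(int count, int bytesPerValue) {
        return (long) count * bytesPerValue;
    }

    public static void main(String[] args) {
        int count = 10_000;
        System.out.println(totalBytes(count, BYTES_PER_INT));     // prints 40000
        System.out.println(totalBytes(count, BYTES_PER_INTEGER)); // prints 160000
    }
}
```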

Evan Plaice
1

You can use a regular int, but use a value such as Integer.MAX_VALUE or Integer.MIN_VALUE, which are defined constants, as your invalid date. It is also more obviously invalid than -1 or a low negative value, since it will certainly not look like the 4-digit dates we are used to seeing.

Kekoa
1

If you have an integer and are concerned that an arbitrary value for null might be confused with a real value, you could use long instead. It is more efficient than using an Integer, and Long.MIN_VALUE is nowhere near any valid int value.
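
A sketch of this idea, assuming a hypothetical LongYearDocument class and an UNKNOWN_YEAR constant (both names are made up for illustration):

```java
// Hypothetical entity that stores an int year in a long field.
class LongYearDocument {
    // Outside the int range, so it can never collide with a real int year.
    static final long UNKNOWN_YEAR = Long.MIN_VALUE;

    private long documentYear = UNKNOWN_YEAR;

    void setDocumentYear(int year) {
        documentYear = year;
    }

    void clearDocumentYear() {
        documentYear = UNKNOWN_YEAR;
    }

    boolean isYearKnown() {
        return documentYear != UNKNOWN_YEAR;
    }

    int getDocumentYear() {
        if (!isYearKnown()) {
            throw new IllegalStateException("Document year is unknown");
        }
        // Safe cast: only int values are ever stored via setDocumentYear.
        return (int) documentYear;
    }
}
```

Because the setter only accepts an int, every int value, including Integer.MIN_VALUE, remains available as a real year.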

Peter Lawrey
0

If you want to save memory, I would suggest packing several years into a single int, with 0 meaning nil. You can then make assumptions in order to optimize. If you only work with recent dates, such as the years 1970 to 2014, you can subtract 1969 from each of them and get values in the range 1 to 45. Such values can be encoded in only 6 bits. If you instead divide your int, which is always 32 bits, into 4 zones of 8 bits each, every zone can hold a value from 1 to 255, so you can pack 4 years in the range 1970 to 2224 into a single int. The narrower your range is, such as 2000 to 2014 (4 bits), the more years you can pack into a single int.
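
The packing scheme can be sketched as follows, assuming four slots of 8 bits each, a base year of 1969, and 0 reserved for "no year". The class PackedYears and all of its names are hypothetical.

```java
// Sketch: four years packed into one int, 8 bits per slot, 0 meaning "no year".
class PackedYears {
    static final int BASE_YEAR = 1969;   // stored value = year - BASE_YEAR
    static final int BITS_PER_SLOT = 8;
    static final int SLOT_MASK = 0xFF;   // eight one-bits
    static final int UNKNOWN = 0;        // slot value reserved for "no year"

    // Extracts the raw 8-bit value stored in the given slot (0..3).
    private static int slotValue(int packed, int slot) {
        if (slot < 0 || slot > 3) {
            throw new IllegalArgumentException("slot must be 0..3");
        }
        return (packed >>> (slot * BITS_PER_SLOT)) & SLOT_MASK;
    }

    // Returns a copy of packed with the slot set to year (1970..2224).
    static int setYear(int packed, int slot, int year) {
        int offset = year - BASE_YEAR;
        if (slot < 0 || slot > 3 || offset < 1 || offset > SLOT_MASK) {
            throw new IllegalArgumentException("slot or year out of range");
        }
        int shift = slot * BITS_PER_SLOT;
        // Clear the slot's old bits, then write the new offset.
        return (packed & ~(SLOT_MASK << shift)) | (offset << shift);
    }

    static boolean hasYear(int packed, int slot) {
        return slotValue(packed, slot) != UNKNOWN;
    }

    // Returns the year in the slot; throws if the slot is empty.
    static int getYear(int packed, int slot) {
        int offset = slotValue(packed, slot);
        if (offset == UNKNOWN) {
            throw new IllegalStateException("no year in slot " + slot);
        }
        return offset + BASE_YEAR;
    }
}
```

A packed int of 0 therefore means all four slots are empty. The trade-off is that each document no longer owns its own field, so the surrounding data structure has to map a document to an int and a slot.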

Aleks N.
0

You could use the @Nullable annotation if using Java 7.

user3231931
0

java.lang.Integer is reasonable for this case. It also already implements Serializable, so you can save just the year field to disk and load it back.
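
A rough sketch of that round trip using standard Java serialization is shown below. To keep it self-contained it writes to an in-memory byte array rather than a file on disk, and the class name YearSerializationSketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Round-trips a possibly-null Integer through Java serialization, in memory.
class YearSerializationSketch {
    static byte[] write(Integer year) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(year);  // a null year is written as a null reference
        }
        return bytes.toByteArray();
    }

    static Integer read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Integer) in.readObject();
        }
    }
}
```

Swapping the byte array streams for FileOutputStream and FileInputStream would give the on-disk version; an unknown year survives the round trip as null.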

Truong Ha
0

Another option might be to use a special value internally (-1 or Integer.MIN_VALUE or similar), but expose the integer as two methods:

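// True when a real value is stored, i.e. the field is not the -1 sentinel.
public boolean hasValue() {
    return (internalValue != -1);
}

// Returns the stored value; callers must check hasValue() first.
public int getValue() {
    if (internalValue == -1) {
        throw new IllegalStateException(
            "Check hasValue() before calling getValue().");
    }
    return internalValue;
}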
MB.
0

What's wrong with java.lang.Integer? It's a reasonable solution, unless perhaps you're storing a very large number of these values.

If you wanted to use primitives, a -1 value would be a good solution as well. The only other option you have is using a separate boolean flag, like someone already suggested. Choose your poison :)

PS: damn you, I was trying to get away with a little white lie on the objects vs structs. My point was that it uses more memory, similar to the boolean flag method, although syntactically the nullable type is nicer, of course. Also, I wasn't sure someone with a Java background would know what I meant by a struct.

Thorarin
  • nullable types in c# are structs. They are not boxed. – Matthew Vines Jun 12 '09 at 05:36
  • Nullable values are not boxed in C# as far as I know. Correct me if I am wrong. – yinyueyouge Jun 12 '09 at 05:37
  • It's a really common misconception. I was guilty of it myself until just recently. But check out http://msdn.microsoft.com/en-us/library/1t3y8s4s.aspx The first sentence is "Nullable types are instances of the System.Nullable struct." – Matthew Vines Jun 12 '09 at 05:40