Boxing (computer science)

From HandWiki
Short description: Programming language concept

In computer science, boxing (a.k.a. wrapping) is the transformation of placing a value of a primitive type within an object so that the value can be used as a reference. Unboxing is the reverse transformation of extracting the primitive value from its wrapper object. Autoboxing is the term for automatically applying boxing and/or unboxing transformations as needed.

Boxing

Boxing's most prominent use is in Java, where there is a distinction between reference and value types for reasons such as runtime efficiency and syntax and semantic issues. In Java, a LinkedList can only store values of type Object. One might desire to have a LinkedList of int, but this is not directly possible. Instead, Java defines primitive wrapper classes corresponding to each primitive type: Integer and int, Character and char, Float and float, etc. One can then define a LinkedList using the boxed type Integer and insert int values into the list by boxing them as Integer objects. (Using generic parameterized types introduced in J2SE 5.0, this type is represented as LinkedList<Integer>.)
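As a minimal sketch of the manual boxing described above, the following Java example wraps each int in an Integer before storing it in a LinkedList<Integer>, and unwraps it again when reading. The class and method names (ManualBoxingDemo, boxAll, sum) are hypothetical and chosen only for illustration.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; not part of any library.
final class ManualBoxingDemo {
    // Boxes each int explicitly and stores the resulting Integer objects.
    static List<Integer> boxAll(int[] values) {
        List<Integer> boxed = new LinkedList<>();
        for (int v : values) {
            Integer wrapper = Integer.valueOf(v); // explicit boxing
            boxed.add(wrapper);
        }
        return boxed;
    }

    // Unboxes each element explicitly and sums the primitive values.
    static int sum(List<Integer> boxed) {
        int total = 0;
        for (Integer wrapper : boxed) {
            total += wrapper.intValue(); // explicit unboxing
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> list = boxAll(new int[] {1, 2, 3});
        System.out.println(list);      // prints [1, 2, 3]
        System.out.println(sum(list)); // prints 6
    }
}
```

The list never holds an int directly; every element is a reference to an Integer object.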

On the other hand, C# has no primitive wrapper classes, but allows boxing of any value type, returning a generic Object reference. In Objective-C, any primitive value can be prefixed by a @ to make an NSNumber out of it (e.g. @123 or @(123)). This allows for adding them in any of the standard collections, such as an NSArray.

Haskell has little or no notion of reference type, but still uses the term "boxed" for the runtime system's uniform pointer-to-tagged union representation.[1]

The boxed object is always a copy of the value object, and is usually immutable. Unboxing the object also returns a copy of the stored value. Repeated boxing and unboxing of objects can have a severe performance impact, because boxing dynamically allocates new objects and unboxing (if the boxed value is no longer used) then makes them eligible for garbage collection. However, modern garbage collectors such as the default Java HotSpot garbage collector can more efficiently collect short-lived objects, so if the boxed objects are short-lived, the performance impact may not be severe.
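The cost of repeated boxing can be illustrated with a small Java sketch, assuming a simple summation loop. The class and method names are hypothetical. With a boxed accumulator, every addition unboxes the current value and boxes the result, so the loop may allocate a new object per iteration (a just-in-time compiler may remove some of these allocations); the primitive version allocates nothing. No timings are shown because they depend on the JVM and hardware.

```java
// Hypothetical demo class; not part of any library.
final class BoxedAccumulatorDemo {
    // Boxed accumulator: each += unboxes total, adds, and boxes the result again.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total; // unboxed on return
    }

    // Primitive accumulator: no wrapper objects are involved.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000));     // prints 499500
        System.out.println(sumPrimitive(1000)); // prints 499500
    }
}
```

Both methods compute the same result; they differ only in how many short-lived objects the loop may create.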

In some languages, there is a direct equivalence between an unboxed primitive type and a reference to an immutable, boxed object type. In fact, it is possible to substitute all the primitive types in a program with boxed object types. Whereas assignment from one primitive to another will copy its value, assignment from one reference to a boxed object to another will copy the reference value to refer to the same object as the first reference. However, this will not cause any problems, because the objects are immutable, so there is semantically no real difference between two references to the same object or to different objects (unless you look at physical equality). For all operations other than assignment, such as arithmetic, comparison, and logical operators, one can unbox the boxed type, perform the operation, and re-box the result as needed. Thus, it is possible to not store primitive types at all.
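The argument above can be sketched in Java, where Integer objects are immutable. In this hypothetical example (the names SharedBoxDemo and incrementCopy are assumptions), assigning one boxed variable to another copies only the reference, and arithmetic produces a different boxed object rather than modifying the shared one.

```java
// Hypothetical demo class; not part of any library.
final class SharedBoxDemo {
    // Returns {original, copy + 1} to show that the original is unaffected.
    static int[] incrementCopy(Integer original) {
        Integer copy = original; // copies the reference, not the value
        copy = copy + 1;         // unbox, add, re-box: copy now refers to another object
        return new int[] { original.intValue(), copy.intValue() };
    }

    public static void main(String[] args) {
        Integer a = Integer.valueOf(41);
        Integer b = a;
        System.out.println(a == b); // prints true: both names refer to one object

        int[] result = incrementCopy(a);
        System.out.println(result[0]); // prints 41
        System.out.println(result[1]); // prints 42
    }
}
```

Because the boxed object cannot change, sharing it between two references is indistinguishable from copying the value, except under reference comparison.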

Autoboxing

Autoboxing is the term for getting a reference type out of a value type just through type conversion (either implicit or explicit). The compiler automatically supplies the extra source code that creates the object.

For example, in versions of Java prior to J2SE 5.0, the following code did not compile:

Integer i = new Integer(9);
Integer j = 9; // error in versions prior to 5.0!

Compilers prior to 5.0 would not accept the last line.

Integer objects are reference objects, on the surface no different from List, Object, and so forth. To convert from an int to an Integer, one had to "manually" instantiate the Integer object. As of J2SE 5.0, the compiler will accept the last line, and automatically transform it so that an Integer object is created to store the value 9.[2] This means that, from J2SE 5.0 on, something like Integer c = a + b, where a and b are Integer themselves, will compile: a and b are unboxed, the integer values are summed, and the result is autoboxed into an Integer, which is finally stored inside variable c. The equality operators cannot be used this way, because the equality operators are already defined for reference types, for equality of the references; to test for equality of the value in a boxed type, one must still manually unbox them and compare the primitives, or use the Objects.equals method.
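The following Java sketch illustrates arithmetic and equality on boxed values, assuming Java 7 or later for java.util.Objects. The class and method names are hypothetical. It compares values either through Objects.equals or by unboxing manually, and deliberately avoids relying on == between two Integer references.

```java
import java.util.Objects;

// Hypothetical demo class; not part of any library.
final class BoxedEqualityDemo {
    // Compares two boxed values by content; tolerates null on either side.
    static boolean sameValue(Integer a, Integer b) {
        return Objects.equals(a, b);
    }

    // Manual unboxing: compares the primitive values; both arguments must be non-null.
    static boolean sameValueUnboxed(Integer a, Integer b) {
        return a.intValue() == b.intValue();
    }

    // Arithmetic on boxed operands: the compiler unboxes, adds, and boxes the result.
    static Integer add(Integer a, Integer b) {
        return a + b;
    }

    public static void main(String[] args) {
        Integer a = 1000;
        Integer b = 1000;
        System.out.println(sameValue(a, b));        // prints true
        System.out.println(sameValueUnboxed(a, b)); // prints true
        System.out.println(add(a, b));              // prints 2000
        // a == b would compare references here, so its result should not be relied on.
    }
}
```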

Another example: J2SE 5.0 allows the programmer to treat a collection (such as a LinkedList) as if it contained int values instead of Integer objects. This does not contradict what was said above: the collection still only contains references to dynamic objects, and it cannot hold primitive values. It cannot be a LinkedList<int>, but it must be a LinkedList<Integer> instead. However, the compiler automatically transforms the code so that the list will "silently" receive objects, while the source code only mentions primitive values. For example, the programmer can now write list.add(3) and think as if the int 3 were added to the list; but the compiler will have actually transformed the line into the equivalent of list.add(Integer.valueOf(3)).
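A short Java sketch of autoboxing with collections follows, assuming J2SE 5.0 or later. The class and method names are hypothetical. The source code mentions only int values, while the list itself stores Integer objects.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; not part of any library.
final class AutoboxingListDemo {
    // Fills a list with 1*1, 2*2, ..., count*count.
    static List<Integer> firstSquares(int count) {
        List<Integer> squares = new LinkedList<>();
        for (int i = 1; i <= count; i++) {
            squares.add(i * i); // the int result is boxed to Integer by the compiler
        }
        return squares;
    }

    // Sums the elements; each Integer is unboxed when assigned to the int loop variable.
    static int total(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> squares = firstSquares(3);
        System.out.println(squares);        // prints [1, 4, 9]
        System.out.println(total(squares)); // prints 14
    }
}
```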

Automatic unboxing

With automatic unboxing, the compiler automatically supplies the extra code that retrieves the value from the wrapper object, either by invoking some method on that object, or by other means.

For example, in versions of Java prior to J2SE 5.0, the following code did not compile:

Integer k = new Integer(4);
int l = k.intValue(); // always okay
int m = k;            // would have been an error, but okay now

C# doesn't support automatic unboxing in the same sense as Java, because it doesn't have a separate set of primitive types and object types. All types that have both a primitive and an object version in Java are automatically implemented by the C# compiler as either primitive (value) types or object (reference) types.

In both languages, automatic boxing does not downcast automatically, i.e. the following code won't compile:

C#:

int i = 42;
object o = i;         // box
int j = o;            // unbox (error)
Console.WriteLine(j); // unreachable line, author might have expected output "42"

Java:

int i = 42;
Object o = i;          // box
int j = o;             // unbox (error)
System.out.println(j); // unreachable line, author might have expected output "42"
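For comparison, a minimal Java sketch of a version that does compile: the Object reference is first downcast explicitly to Integer, after which the compiler can unbox it. The class and method names are hypothetical.

```java
// Hypothetical demo class; not part of any library.
final class UnboxWithCastDemo {
    // Casts the Object reference to Integer, then lets the compiler unbox it.
    // Throws ClassCastException if o does not refer to an Integer.
    static int unbox(Object o) {
        return (Integer) o;
    }

    public static void main(String[] args) {
        int i = 42;
        Object o = i;          // box
        int j = unbox(o);      // explicit downcast, then unbox
        System.out.println(j); // prints 42
    }
}
```

The cast is checked at run time, so passing an object of another type (for example a String) fails with a ClassCastException rather than a compile-time error.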

Type helpers

Modern Object Pascal has yet another way to perform operations on simple types, close to boxing, called type helpers in FreePascal or record helpers in Delphi and in FreePascal's Delphi mode. The dialects mentioned are Object Pascal compile-to-native languages, and so miss some of the features that C# and Java can implement, notably run-time type inference on strongly typed variables. Nevertheless, the feature is related to boxing. It allows the programmer to use constructs like:

{$ifdef fpc}{$mode delphi}{$endif}
uses sysutils;  // this unit contains the helpers for the simple types
var
  x:integer=100;
  s:string;
begin
  s:= x.ToString;
  writeln(s);
end.

References