Value type and reference type

From HandWiki

In certain computer programming languages, data types are classified as either value types or reference types: instances of reference types are always accessed implicitly via references, whereas variables of a value type directly contain the values themselves.[1][2]

Properties of value types and reference types

Even among languages that have this distinction, the exact properties of value and reference types vary from language to language, but typical properties include:

  • Primitive data types, such as Booleans, fixed-size integers, floating-point values, and characters, are value types.
  • Objects, in the sense of object-oriented programming, belong to reference types.
  • Assigning to a variable of reference type simply copies the reference, whereas assigning to a variable of value type copies the value. This applies to all kinds of variables, including local variables, fields of objects, and array elements. Likewise when calling a function: parameters of reference type are copies of the reference, whereas parameters of value type are copies of the value.
  • If a reference type is mutable, then mutations made via one reference are visible via any other, whereas if a value type is mutable, then mutations made to one value are not visible in another.
  • Reference types support the notion of identity — it makes sense to discuss whether two values of reference type refer to the same object, and the language provides functionality to determine whether they do — whereas value types do not.
  • Null belongs to every reference type; that is, a value of reference type may be null rather than a reference to an object.
  • Values of reference type refer to objects allocated in the heap, whereas values of value type are contained either on the call stack (in the case of local variables and function parameters) or inside their containing entities (in the case of fields of objects and array elements). (With reference types, it is only the reference itself that is contained either on the call stack or inside a containing entity.)
  • Reference types support the notion of subtyping, whereby all values of a given reference type are automatically values of a different reference type. Value types do not support subtyping, but may support other forms of implicit type conversion, e.g. automatically converting an integer to a floating-point number if needed. Additionally, there may be implicit conversions between certain value and reference types, e.g. "boxing" a primitive int (a value type) into an Integer object (an object type), or reversing this via "unboxing".
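
Several of the properties listed above can be observed directly in Java. The following is a minimal sketch rather than a definitive reference; the class name ValueVsReferenceDemo and its method names are hypothetical and chosen only for illustration.

```java
// Illustrative sketch only; all names here are hypothetical.
class ValueVsReferenceDemo {

    // Value type: assignment copies the value, so the variables are independent.
    static int copyOfValue() {
        int a = 1;
        int b = a;   // b receives a copy of the value 1
        b = 2;       // changing b leaves a untouched
        return a;
    }

    // Reference type: assignment copies the reference, so both variables
    // refer to the same mutable object.
    static String sharedObject() {
        StringBuilder first = new StringBuilder("ab");
        StringBuilder second = first;  // copies the reference, not the object
        second.append("c");            // the mutation is visible through first
        return first.toString();
    }

    // Identity: == on references asks "same object?", while equals asks
    // "same contents?". Returns { sameObject, sameContents }.
    static boolean[] identityVersusEquality() {
        String x = new String("hi");
        String y = new String("hi");   // a distinct object with equal contents
        return new boolean[] { x == y, x.equals(y) };
    }

    // Null belongs to every reference type.
    static boolean referenceMayBeNull() {
        StringBuilder none = null;
        return none == null;
    }

    // Boxing converts an int into an Integer object; unboxing reverses it.
    static int boxAndUnbox(int n) {
        Integer boxed = n;     // implicit boxing
        int unboxed = boxed;   // implicit unboxing
        return unboxed;
    }

    public static void main(String[] args) {
        System.out.println(copyOfValue());                 // 1
        System.out.println(sharedObject());                // abc
        boolean[] r = identityVersusEquality();
        System.out.println(r[0] + " " + r[1]);             // false true
        System.out.println(referenceMayBeNull());          // true
        System.out.println(boxAndUnbox(42));               // 42
    }
}
```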

Reference types and "call by sharing"

Even when function arguments are passed using "call by value" semantics (which is always the case in Java, and is the case by default in C#), a value of a reference type is intrinsically a reference; so if a parameter belongs to a reference type, the resulting behavior bears some resemblance to "call by reference" semantics. This behavior is sometimes called call by sharing.

Call by sharing resembles call by reference in the case where a function mutates an object that it received as an argument: when that happens, the mutation will be visible to the caller as well, because the caller and the function have references to the same object. It differs from call by reference in the case where a function assigns its parameter to a different reference; when that happens, this assignment will not be visible to the caller, because the caller and the function have separate references, even though both references initially point to the same object.
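
The two cases described above can be sketched in Java as follows. This is an illustrative example under the assumption that StringBuilder stands in for any mutable object; the class CallBySharingDemo and its methods are hypothetical names.

```java
// Illustrative sketch only; all names here are hypothetical.
class CallBySharingDemo {

    // Mutates the object that the parameter refers to.
    // The caller holds a reference to the same object, so it sees the change.
    static void mutate(StringBuilder text) {
        text.append("!");
    }

    // Rebinds the local parameter to a different object.
    // The caller's own reference is a separate copy and is not affected.
    static void reassign(StringBuilder text) {
        text = new StringBuilder("replaced");
        text.append("?");
    }

    static String run() {
        StringBuilder message = new StringBuilder("hello");
        mutate(message);    // message now reads "hello!"
        reassign(message);  // message still reads "hello!"
        return message.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());  // prints hello!
    }
}
```

Under true call by reference, the assignment inside reassign would also have changed the caller's variable; under call by sharing it does not.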

Reference types vs. explicit pointers

Many languages have explicit pointers or references. Reference types differ from these in that the entities they refer to are always accessed via references; for example, whereas in C++ it's possible to have either a std::string or a std::string *, where the former is a mutable string and the latter is an explicit pointer to a mutable string (unless it's a null pointer), in Java it is only possible to have a StringBuilder, which is implicitly a reference to a mutable string (unless it's a null reference).

While C++'s approach is more flexible, use of non-references can lead to problems such as object slicing, at least when inheritance is used; in languages where objects belong to reference types, these problems are automatically avoided, at the cost of removing some options from the programmer.
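
As a rough illustration of why slicing does not arise when objects belong to reference types, the Java sketch below assigns a subclass instance to a superclass variable. Only the reference is copied, so the object keeps its full state and dynamic type. The classes Shape and Circle and the enclosing NoSlicingDemo are hypothetical names introduced for this example.

```java
// Illustrative sketch only; all names here are hypothetical.
class NoSlicingDemo {

    static class Shape {
        String describe() { return "shape"; }
    }

    static class Circle extends Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        @Override
        String describe() { return "circle of radius " + radius; }
    }

    // Assigning a Circle to a Shape variable copies a reference;
    // the Circle object itself is not copied or truncated.
    static String describeThroughBase() {
        Shape s = new Circle(2.0);
        return s.describe();   // dispatches to Circle.describe
    }

    // The subclass state is still reachable through the base-typed reference.
    static boolean stillACircle() {
        Shape s = new Circle(2.0);
        return s instanceof Circle && ((Circle) s).radius == 2.0;
    }

    public static void main(String[] args) {
        System.out.println(describeThroughBase());  // circle of radius 2.0
        System.out.println(stillACircle());         // true
    }
}
```

In C++, by contrast, copying a derived object into a base-class variable by value would keep only the base-class part, which is the slicing problem mentioned above.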

Classification per language

Language | Value type | Reference type
Java[3] | all non-object types, including (e.g.) booleans and numbers | all object types, including (e.g.) arrays
C#[4] | all non-object types, including structures and enumerations as well as primitive types | all object types, including both classes and interfaces
Swift[5][6] | structures (including e.g. booleans, numbers, strings, and sets) and enumerations (including e.g. optionals) | functions, closures, classes
Python[7] | (none) | all types
JavaScript[8] | all non-objects, including booleans, floating-point numbers, and strings, among others | all objects, including functions and arrays, among others
OCaml[9][10] | immutable characters, immutable integer numbers, immutable floating-point numbers, immutable tuples, immutable enumerations (including immutable units, immutable booleans, immutable lists, immutable optionals), immutable exceptions, immutable formatting strings | arrays, immutable strings, byte strings, dictionaries

References