Dereference operator
In computer programming, the dereference operator or indirection operator, sometimes denoted by "*" (i.e. an asterisk), is a unary operator (i.e. one with a single operand) found in C-like languages that include pointer variables. It operates on a pointer variable and yields an l-value that designates the object stored at the address the pointer holds. This is called "dereferencing" the pointer. For example, the C code
    int x;
    int *p;  // * is used in the declaration:
             // p is a pointer to an integer, since (after dereferencing),
             // *p is an integer
    x = 0;   // now x == 0
    p = &x;  // & takes the address of x
             // now *p == 0, since p == &x and therefore *p == x
    *p = 1;  // equivalent to x = 1, since p == &x
             // now *p == 1 and x == 1
assigns 1 to the variable x by using the dereference operator and a pointer to x.
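Because the result of dereferencing is an l-value, it can appear on either side of an assignment. The following is a minimal sketch of that property; the function name swap_ints is a hypothetical helper introduced here for illustration, not part of any standard library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: exchange the two integers that a and b point to.
   *a and *b are l-values, so each can be both read and assigned. */
void swap_ints(int *a, int *b)
{
    assert(a != NULL && b != NULL);  /* dereferencing a null pointer is undefined */
    int tmp = *a;   /* read the int that a points to */
    *a = *b;        /* write through a, read through b */
    *b = tmp;       /* write through b */
}
```

A caller passes the addresses of its own variables, for example swap_ints(&x, &y), and the function changes x and y themselves rather than copies of them.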
Composition
The unary * operator, as defined in C and C++, can be used in compositions in cases of multiple indirection, where multiple acts of dereferencing are required. Pointers can reference other pointers, and in such cases, multiple applications of the dereference operator are needed. Similarly, the Java dot operator can be used in compositions forming quite sophisticated statements that require substantial dereferencing of pointers behind the scenes during evaluation.
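As a small illustration of multiple indirection in C, the sketch below applies the operator once to reach a pointer and twice to reach the integer behind it. Both function names are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store value in the int reached by following pp twice. */
void set_through_double_pointer(int **pp, int value)
{
    assert(pp != NULL && *pp != NULL);
    **pp = value;       /* *pp is an int *, **pp is the int itself */
}

/* Hypothetical helper: make the caller's pointer refer to a different int. */
void retarget(int **pp, int *new_target)
{
    assert(pp != NULL);
    *pp = new_target;   /* one dereference: changes the pointer, not an int */
}
```

Given int a; int *p = &a;, the call set_through_double_pointer(&p, 10) stores 10 in a, while retarget(&p, &b) leaves a alone and makes p point to b.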
A basic example of multiple pointer indirection is the argv argument to the main function in C (and C++), which is given in the prototype as char **argv. The name of the invoked program executable, as well as all command-line arguments that follow it, are stored as independent character strings. An array of pointers to char contains pointers to the first character of each of these strings, and this array of pointers is passed to the main function as the argv argument. The passed array itself "decays" to a pointer, thus argv is actually a pointer to a pointer to char, even though it stands for an array of pointers to char (similarly, the pointers in the array, while each formally pointing to a single char, actually point to what are strings of characters). The accompanying main argument, argc, provides the number of strings pointed to by the elements of the array (the array itself additionally ends with a null pointer), as the size of an (outermost) array is otherwise lost when it is passed to a function and converted to a pointer. Thus, argv is a pointer to the 0th element of an array of pointers to char, *argv, which in turn is a pointer to **argv, a character (precisely, the 0th character of the first argument string, which by convention is the name of the program).
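The relationship between argv, *argv and **argv can be sketched with two small functions that take an argv-style array. The names first_char and char_at are assumptions made for this illustration; they are not standard functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return **argv, the 0th character of the 0th string. */
char first_char(char **argv)
{
    assert(argv != NULL && *argv != NULL);
    return **argv;               /* argv -> *argv (a char *) -> **argv (a char) */
}

/* Hypothetical helper: return character j of string i using only
   pointer arithmetic and dereferencing; the caller must keep i and j in range. */
char char_at(char **argv, int i, int j)
{
    assert(argv != NULL);
    return *(*(argv + i) + j);   /* the same element as argv[i][j] */
}
```

The second function shows that the subscript form argv[i][j] is itself two dereferences: one to pick a pointer out of the array, and one to pick a character out of the string it points to.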
Other syntax
In BCPL, an ancestor of C, the equivalent operator was represented using an exclamation mark.
In C, the address of a structure (or union) s is denoted by &s. The address-of operator & is the right inverse of the dereferencing operator *, so *&s is equivalent to s. (However, note that &*s is equivalent to s only if s is a pointer; otherwise the expression is not valid.) The address of a structure (or union) s may be assigned to a pointer p:
    p = &s;  // the address of s has been assigned to p; p == &s;
             // *p is equivalent to s
The value of a member a of a structure s is denoted by s.a. Given a pointer p to s (i.e. p == &s), s.a is equivalent to (*p).a, and also to the shorthand p->a, which is syntactic sugar for accessing members of a structure (or union) through a pointer:
    p = &s;  // the address of s has been assigned to p; p == &s;
             // s.a is equivalent to (*p).a
             // s.a is equivalent to p->a
             // (*p).a is equivalent to p->a
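The equivalence of the two notations can be checked with a short sketch. The structure type point and both function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure used only for this illustration. */
struct point { int a; int b; };

/* Set member a by dereferencing first and then selecting the member. */
void set_a_explicit(struct point *p, int value)
{
    assert(p != NULL);
    (*p).a = value;   /* parentheses are required: . binds tighter than unary * */
}

/* Set member a with the -> shorthand; the effect is identical. */
void set_a_arrow(struct point *p, int value)
{
    assert(p != NULL);
    p->a = value;
}
```

Without the parentheses, *p.a would be parsed as *(p.a), which is an error when p is a pointer, and this is the main reason the -> shorthand is convenient.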
The -> operator can be chained; for example, in a linked list, one may refer to n->next->next for the second following node (assuming that n->next is not null).
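A minimal sketch of such chaining follows, including the null check that the chained expression itself does not perform. The node type and both function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked list node. */
struct node { int value; struct node *next; };

/* Return the node two links after n, or NULL if the list ends sooner. */
struct node *second_following(struct node *n)
{
    if (n == NULL || n->next == NULL) {
        return NULL;            /* n->next->next would dereference a null pointer */
    }
    return n->next->next;       /* same as (*(*n).next).next */
}

/* Read the value two links ahead; the caller must guarantee that node exists. */
int value_two_ahead(struct node *n)
{
    assert(n != NULL && n->next != NULL && n->next->next != NULL);
    return n->next->next->value;
}
```

Each -> in the chain is a separate dereference, so every intermediate pointer has to be valid before the next link is followed.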
In Unix shell scripting and in Makefiles, the dollar sign "$" is the dereference operator, used to translate the name of a variable into its contents; it is notably absent when assigning to a variable.
In Pascal, the symbol ^ is used both to declare a pointer type and to dereference a pointer, as the following example shows:
    type
      ComplexP = ^TComplex;     (* ComplexP is a pointer type *)
      TComplex = record         (* TComplex is a record type *)
        Re,
        Im: Real;
      end;
    var
      Complex1,                 (* define two pointers *)
      Complex2: ComplexP;
      Complex : TComplex;       (* define a record *)
    begin
      Complex.Re := 3.14159267;
      Complex.Im := 1.5;
      New(Complex1);
      Complex1^.Re := Complex.Re;
      Complex1^.Im := 3.5;
      New(Complex2);
      Complex2^ := Complex;
    end.
In the above example:
- On line 2, the dereference operator ^ is used to define a pointer type ComplexP.
- On lines 12 and 13, values are assigned to the Re and Im fields of the Complex record.
- On line 14, space is allocated for a TComplex record pointed to by Complex1 (New is Pascal's equivalent of C's malloc() function).
- On line 15, the dereference operator ^ is used to copy the value in the Re field of record Complex to the Re field of the TComplex record pointed to by Complex1.
- On line 16, the dereference operator ^ is used to assign a value to the Im field of the TComplex record pointed to by Complex1.
- On line 17, space is allocated for a TComplex record pointed to by Complex2.
- On line 18, the entire Complex record is copied to the TComplex record pointed to by Complex2.
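For comparison, the last two steps (allocating a record and copying a whole record through a pointer) might look as follows in C. This is only a rough counterpart under stated assumptions: the type complex_rec and the function clone_record are hypothetical names, and the fields are assumed to be floating point.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical C counterpart of the Pascal record; field types are assumed. */
struct complex_rec { double re; double im; };

/* Allocate a new record and copy *src into it, roughly what
   New(Complex2); Complex2^ := Complex; does in Pascal.
   Returns NULL if allocation fails; the caller frees the result. */
struct complex_rec *clone_record(const struct complex_rec *src)
{
    assert(src != NULL);
    struct complex_rec *copy = malloc(sizeof *copy);
    if (copy != NULL) {
        *copy = *src;   /* whole-structure assignment through the pointer */
    }
    return copy;
}
```

Here *copy plays the role of Complex2^ and copy->re the role of Complex2^.Re. Unlike the Pascal example, the sketch checks whether the allocation succeeded before dereferencing.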
In various languages, prefixes known as sigils are used in identifiers. These are not unary operators: they are lexically part of the identifier and have different semantics, such as indicating the data type of the identifier. They look similar to the dereference operator, however, and can be confused with it. For example, in a shell script $FOO is the dereference operator $ applied to the variable FOO, while in Perl $foo is a scalar variable called foo. In PHP, FOO is a constant (user-defined or built-in), $FOO is a variable named FOO, and $$FOO is a variable whose name is stored in the variable named FOO.
See also