Dereference operator

From HandWiki
Revision as of 16:34, 8 February 2024 by CodeMe (talk | contribs) (add)
Short description: Unary operator in C-like programming languages with pointers


In computer programming, the dereference operator or indirection operator, sometimes denoted by "*" (i.e. an asterisk), is a unary operator (i.e. one with a single operand) found in C-like languages that include pointer variables. It operates on a pointer and yields an l-value that refers to the object stored at the address the pointer holds. This is called "dereferencing" the pointer. For example, the C code

 int x;
 int *p;  // * is used in the declaration:
          // p is a pointer to an integer, since (after dereferencing),
          // *p is an integer
 x = 0;
 // now x == 0
 p = &x;  // & takes the address of x
 // now *p == 0, since p == &x and therefore *p == x
 *p = 1;  // equivalent to x = 1, since p == &x
 // now *p == 1 and x == 1

assigns 1 to the variable x by using the dereference operator and a pointer to x.

Composition

The unary * operator, as defined in C and C++, can be used in compositions in cases of multiple indirection, where multiple acts of dereferencing are required. Pointers can reference other pointers, and in such cases, multiple applications of the dereference operator are needed. Similarly, the Java dot operator can be used in compositions forming quite sophisticated statements that require substantial dereferencing of pointers behind the scenes during evaluation.
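
As a minimal sketch of such composition in C, the two helper functions below (their names are hypothetical, chosen only for illustration) take a pointer to a pointer to int. One dereference reaches the intermediate pointer; two dereferences reach the integer itself.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: stores value in the int reached by following
// two levels of indirection.
void set_through_double_pointer(int **pp, int value)
{
    assert(pp != NULL && *pp != NULL);
    **pp = value;       // *pp is an int *, **pp is the int it points to
}

// Hypothetical helper: makes the caller's pointer refer to another int.
void retarget(int **pp, int *new_target)
{
    assert(pp != NULL);
    *pp = new_target;   // a single dereference changes the pointer, not the int
}
```

Given int x; int *p = &x; a call such as set_through_double_pointer(&p, 7) changes x, whereas retarget(&p, &y) changes p and leaves x untouched.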

A basic example of multiple pointer indirection is the argv argument to the main function in C (and C++), which is given in the prototype as char **argv. The name of the invoked program executable and each command-line argument that follows it are stored as independent character strings. An array of pointers to char contains pointers to the first character of each of these strings, and this array of pointers is passed to the main function as the argv argument. The passed array itself "decays" to a pointer, so argv is actually a pointer to a pointer to char, even though it stands for an array of pointers to char (similarly, each pointer in the array formally points to a single char but actually points to the start of a string of characters). The accompanying main argument, argc, provides the size of the array (i.e. the number of strings pointed to by its elements), because the size of an (outermost) array is otherwise lost when it is passed to a function and converted to a pointer. Thus, argv is a pointer to the 0th element of an array of pointers to char, *argv, which in turn is a pointer to **argv, a character (precisely, the 0th character of the first argument string, which by convention is the name of the program).
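
The relationship between argv, *argv and **argv can be sketched with two small functions that receive an argv-style array. The function names are assumptions made for this illustration and are not part of the C standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: sums the lengths of the argc strings reachable
// through argv.
size_t total_arg_length(int argc, char **argv)
{
    size_t total = 0;
    for (int i = 0; i < argc; i++) {
        // *(argv + i) is the same pointer as argv[i]: a char * to string i
        total += strlen(*(argv + i));
    }
    return total;
}

// Hypothetical helper: returns **argv, the 0th character of the 0th string.
char first_char(char **argv)
{
    assert(argv != NULL && *argv != NULL);
    return **argv;
}
```

Inside main, total_arg_length(argc, argv) would count the characters of the program name and all arguments, and first_char(argv) would return the first character of the program name.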

Other syntax

In BCPL, an ancestor of C, the equivalent operator was represented using an exclamation mark.

In C, the address of a structure (or union) s is denoted by &s. The address-of operator & is the right inverse of the dereference operator *, so *&s is equivalent to s. (Note, however, that &*s is equivalent to s only if s is a pointer; otherwise the expression is not valid.) The address of a structure (or union) s may be assigned to a pointer p:

 p = &s; // the address of s has been assigned to p; p == &s;
 // *p is equivalent to s

The value of a member a of a structure s is denoted by s.a. Given a pointer p to s (i.e. p == &s), s.a is equivalent to (*p).a, and also to the shorthand p->a, which is syntactic sugar for accessing members of a structure (or union) through a pointer:

 p = &s; // the address of s has been assigned to p; p == &s;
 // s.a is equivalent to (*p).a
 // s.a is equivalent to p->a
 // (*p).a is equivalent to p->a
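
These equivalences can be checked with a small sketch. The structure type and the function names below are hypothetical and serve only to show the two spellings side by side.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical structure used only for illustration.
struct point {
    int a;
    int b;
};

// Reads member a by dereferencing first and then selecting the member.
int read_a_explicit(const struct point *p)
{
    assert(p != NULL);
    return (*p).a;   // parentheses are required: . binds tighter than unary *
}

// Reads the same member with the arrow shorthand.
int read_a_arrow(const struct point *p)
{
    assert(p != NULL);
    return p->a;
}
```

For any valid pointer p, both functions return the same value, and a write such as p->a = 10 is visible afterwards as s.a when p == &s.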

The -> operator can be chained; for example, in a linked list, one may refer to n->next->next for the second following node (assuming that n->next is not null).
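
A minimal sketch of such chaining follows, assuming a hypothetical singly linked node type. The first function checks for null links before chaining; the second relies on the same assumption the text makes, namely that the intermediate links are not null.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical singly linked list node.
struct node {
    int value;
    struct node *next;
};

// Returns the node two links after n, or NULL if n or n->next is NULL.
struct node *second_following(struct node *n)
{
    if (n == NULL || n->next == NULL) {
        return NULL;
    }
    return n->next->next;
}

// Assumes that n, n->next and n->next->next are all non-null.
int value_two_ahead(const struct node *n)
{
    assert(n != NULL && n->next != NULL && n->next->next != NULL);
    return n->next->next->value;
}
```

Evaluating n->next->next when n->next is null has undefined behavior in C, which is why the first function tests each link before following it.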

In Unix shell scripts and in Makefiles, the dollar sign "$" is the dereference operator, used to translate the name of a variable into its contents. It is notably absent when assigning to a variable.

In Pascal, the operator ^ is used both to define a pointer type and to dereference a pointer, as the following example shows:

type
    ComplexP = ^TComplex;     (* ComplexP is a pointer type *)
    TComplex = record         (* TComplex is a record type *)
       Re, Im: Real;
    end;
var
    Complex1,                  (* define two pointers *)
    Complex2: ComplexP;
    Complex : TComplex;         (* define a record *)

begin
     Complex.Re := 3.14159267;
     Complex.Im := 1.5;
     New(Complex1);
     Complex1^.Re := Complex.Re;
     Complex1^.Im := 3.5;
     New(Complex2);
     Complex2^ := Complex;
end.

In the above example:

  • On line 2, the dereference operator ^ is used to define a pointer type ComplexP.
  • On lines 12 and 13, values are assigned to the Re and Im fields of the Complex record.
  • On line 14, space is allocated for a TComplex record pointed to by Complex1 (New is Pascal's equivalent of C's malloc() function).
  • On line 15, the dereference operator ^ is used to copy the value in the Re field of record Complex to the Re field of the TComplex record pointed to by Complex1.
  • On line 16, the dereference operator ^ is used to assign a value to the Im field of the TComplex record pointed to by Complex1.
  • On line 17, space is allocated for a TComplex record pointed to by Complex2.
  • On line 18, the entire Complex record is copied to the TComplex record pointed to by Complex2.
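
For comparison, the allocate-then-copy step of the Pascal example could be written in C roughly as follows. This is only a sketch: the structure and function names are hypothetical, and double is assumed as the field type.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical C counterpart of the Pascal TComplex record.
struct complex_num {
    double re;
    double im;
};

// Allocates a new record and copies *src into it, much like
// New(Complex2); Complex2^ := Complex; in the Pascal example.
// Returns NULL if allocation fails; the caller must free() the result.
struct complex_num *clone_complex(const struct complex_num *src)
{
    assert(src != NULL);
    struct complex_num *copy = malloc(sizeof *copy);
    if (copy != NULL) {
        *copy = *src;   // whole-structure assignment through the pointer
    }
    return copy;
}
```

Here *copy plays the role of Complex2^ in Pascal, and copy->im corresponds to Complex2^.Im.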

In various languages, identifiers carry prefixes known as sigils. These are not unary operators: lexically they are part of the identifier, and they have different semantics, such as indicating the data type of the identifier. They are nevertheless syntactically similar to the dereference operator and can be confused with it. For example, in a shell script $FOO is the dereference operator $ applied to the variable FOO, while in Perl $foo is a scalar variable called foo. In PHP, FOO is a constant (user-defined or built-in), $FOO is a variable named FOO, and $$FOO is a variable whose name is stored in the variable named FOO.

See also