Sigil (computer programming)

From HandWiki
Short description: Symbol affixed to a variable name

In computer programming, a sigil (/ˈsɪdʒəl/) is a symbol affixed to a variable name that indicates the variable's datatype or scope. It is usually a prefix, as in $foo, where $ is the sigil.

Sigil, from the Latin sigillum, meaning a "little sign", means a sign or image supposedly having magical power.[1] Sigils can be used to separate and demarcate namespaces that possess different properties or behaviors.

Historical context

The use of sigils was popularized by the BASIC programming language. The best known example of a sigil in BASIC is the dollar sign ("$") appended to the names of all strings. Consequently, programmers outside America tend to pronounce $ as "string" instead of "dollar". Many BASIC dialects use other sigils (like "%") to denote integers and floating-point numbers and their precision, and sometimes other types as well.

Larry Wall adopted shell scripting's use of sigils for his Perl programming language.[citation needed] In Perl, the sigils do not specify fine-grained data types like strings and integers, but the more general categories of scalars (using a prefixed "$"), arrays (using "@"), hashes (using "%"), and subroutines (using "&"). Raku (formerly known as Perl 6) also uses secondary sigils, or twigils,[2] to indicate the scope of variables. Prominent examples of twigils in Raku include "^" (caret), used with self-declared formal parameters ("placeholder variables"), and ".", used with object attribute accessors (i.e., instance variables).
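As a rough illustration of the Perl categories described above, the following Python sketch maps a leading sigil to the category it denotes. The table PERL_SIGILS and the function perl_category are hypothetical helpers written for this example; they are not part of Perl or of any library.

```python
# Hypothetical lookup table: Perl prefix sigil -> general category,
# taken from the description in the text above.
PERL_SIGILS = {
    "$": "scalar",
    "@": "array",
    "%": "hash",
    "&": "subroutine",
}


def perl_category(name: str) -> str:
    """Return the category implied by the first character of a Perl-style name."""
    if not name or name[0] not in PERL_SIGILS:
        raise ValueError(f"no recognised sigil in {name!r}")
    return PERL_SIGILS[name[0]]


print(perl_category("$foo"))    # scalar
print(perl_category("@items"))  # array
```

The sketch inspects only the first character, so it does not model twigils or any other part of Perl's syntax.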

Sigil use in some languages

In CLIPS, scalar variables are prefixed with a "?" sigil, while multifield (e.g., a 1-level list) variables are prefixed with "$?".

In Common Lisp, special variables (with dynamic scope) are typically surrounded with * in what is called the "earmuff convention". While this is only convention, and not enforced, the language itself adopts the practice (e.g., *standard-output*). Similarly, some programmers surround constants with +.

In CycL, variables are prefixed with a "?" sigil.[3] Similarly, constant names are prefixed with "#$" (pronounced "hash-dollar").[4]

In Elixir, sigils are provided via the "~" symbol, followed by a letter to denote the type of sigil, and then delimiters. For example, ~r(foo) is a regular expression of "foo". Other sigils include ~s for strings and ~D for dates. Programmers can also create their own sigils.[5]

In the esoteric language INTERCAL, each variable is identified by a 16-bit integer prefixed with "." (called "spot") for 16-bit values, ":" (called "twospot") for 32-bit values, "," ("tail") for arrays of 16-bit values, or ";" ("hybrid") for arrays of 32-bit values.[6] The later CLC-Intercal added "@" ("whirlpool") for a variable that can contain no value (used for classes) and "_", used to store a modified compiler.[7]

In MAPPER (aka BIS), named variables are prefixed with "<" and suffixed with ">" because strings or character values do not require quotes.

In mIRC script, identifiers have a "$" sigil, while all variables are prefixed with "%" (regardless of data type or of whether they are local or global). Binary variables are prefixed with "&".

In the MUMPS programming language, "$" precedes intrinsic function names and "special variable names" (built-in variables for accessing the execution state). "$Z" precedes non-standard intrinsic function names. "$$" precedes extrinsic function names. Routines (used for procedures, subroutines, functions) and global variables (database storage) are prefixed by a caret (^). The last global variable subtree may be referenced indirectly by a caret and the last subscript; this is referred to as a "naked reference". System-wide routines and global variables (stored in certain shared database(s)) are prefixed with ^%; these are referred to as "percent routines" and "percent globals".

In Objective-C, string literals preceded with "@" are instances of the object type NSString; since clang v3.1 / LLVM v4.0, the "@" prefix can also introduce literals of type NSNumber, NSArray or NSDictionary. The prefix @ is also used on the keywords interface, implementation, and end to express the structure of class definitions. Within class declarations and definitions, a prefix of - indicates member methods and variables, while a prefix of + indicates class elements.

In the PHP language, which was largely inspired by Perl, "$" precedes any variable name. Names not prefixed by this are considered constants, functions or class names (or interface or trait names, which share the same namespace as classes).

PILOT uses "$" for buffers (string variables), "#" for integer variables, and "*" for program labels.

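Python uses a "__" (double underscore, or "dunder") prefix for "private" attributes. Within a class body, such names are mangled to include the class name rather than being made truly inaccessible.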
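A minimal sketch of how the "__" prefix behaves in practice is given below. The class Account and its attribute are invented for illustration.

```python
class Account:
    def __init__(self, balance):
        # Two leading underscores (with no trailing pair) trigger name mangling:
        # the attribute is actually stored as _Account__balance.
        self.__balance = balance

    def balance(self):
        # Inside the class body, the short name still works.
        return self.__balance


acct = Account(100)
print(acct.balance())              # 100
print(hasattr(acct, "__balance"))  # False: no attribute with the unmangled name
print(acct._Account__balance)      # 100: the mangled name remains reachable
```

As the last line shows, the prefix discourages outside access but does not prevent it, which is why "private" is usually written in quotation marks.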

In Ruby, ordinary variables lack sigils, but "$" is prefixed to global variables, "@" is prefixed to instance variables, and "@@" is prefixed to class variables. Ruby also allows (strictly conventional) suffix sigils: "?" indicates a predicate method returning a boolean or a truthy or falsy value, and "!" indicates that the method may have a potentially unexpected effect and needs to be handled with care.[8]
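The Ruby prefix rules above can be sketched as a small classifier, written here in Python. The function ruby_scope is a hypothetical helper for this example and is not part of Ruby; it looks only at the prefix and ignores the suffix conventions.

```python
def ruby_scope(name: str) -> str:
    """Classify a Ruby-style variable name by its prefix sigil."""
    # "@@" must be tested before "@", because both start with the same character.
    if name.startswith("@@"):
        return "class variable"
    if name.startswith("@"):
        return "instance variable"
    if name.startswith("$"):
        return "global variable"
    return "ordinary variable (no sigil)"


print(ruby_scope("@@count"))  # class variable
print(ruby_scope("@name"))    # instance variable
print(ruby_scope("$stdout"))  # global variable
print(ruby_scope("total"))    # ordinary variable (no sigil)
```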

In Scheme, by convention, the names of procedures that always return a boolean value usually end in "?". Likewise, the names of procedures that store values into parts of previously allocated Scheme objects (such as pairs, vectors, or strings) usually end in "!".

Standard ML uses the prefix sigil "'" on a variable that refers to a type. If the sigil is doubled, it refers to a type for which equality is defined. The "'" character may also appear within or at the end of a variable, in which case it has no special meaning.

In Transact-SQL, "@" precedes a local variable or parameter name. System functions (previously known as global variables) are distinguished by a "@@" prefix. The scope of temporary tables is indicated by the prefix "#" designating local and "##" designating global.

In Windows PowerShell, which was partly inspired by Unix shells and Perl, variable names are prefixed by the "$" sigil.

In XSLT, variables and parameters have a leading "$" sigil on use, although when defined in <xsl:param> or <xsl:variable> with the "name" attribute, the sigil is not included. Related to XSLT, XQuery uses the "$" sigil form both in definition and in use.

In MEL, variable names are prefixed by "$" to distinguish them from functions, commands, and other identifiers.

Similar phenomena

Shell scripting variables

In Unix shell scripting and in utilities such as Makefiles, the "$" is a unary operator that translates the name of a variable into its contents. While this may seem similar to a sigil, it is properly a unary operator for lexical indirection, similar to the * dereference operator for pointers in C, as can be seen from the fact that the dollar sign is omitted when assigning to a variable.
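Python's standard string.Template class uses a similar "$name" substitution syntax, which makes the distinction easy to see: the name appears without "$" where the value is supplied, and with "$" only where it is expanded. The sketch below is an analogy in Python, not a model of any particular shell; the name greeting is chosen for illustration.

```python
from string import Template

# The value is bound to the bare name "greeting" (no "$"),
# much as a shell assignment omits the dollar sign.
values = {"greeting": "hello"}

# "$greeting" and "${greeting}" both ask for the contents of that name.
template = Template("$greeting, world; ${greeting}!")
result = template.substitute(values)
print(result)  # hello, world; hello!
```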

Identifier conventions

In Fortran, sigils are not used, but all variables starting with the letters I, J, K, L, M and N are integers by default. Fortran documentation refers to this as "implicit typing". Explicit typing is also available to allow any variable to be declared with any type.
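The implicit typing rule can be sketched as follows in Python. The function implicit_fortran_type is a hypothetical helper for this example. It assumes Fortran's default rule, under which names starting with any other letter are of type REAL, and it ignores IMPLICIT statements that change the defaults.

```python
def implicit_fortran_type(name: str) -> str:
    """Default type of an undeclared Fortran variable under implicit typing."""
    if not name or not name[0].isalpha():
        raise ValueError("Fortran names must start with a letter")
    # Fortran is case-insensitive, so compare the upper-cased first letter.
    first = name[0].upper()
    return "INTEGER" if "I" <= first <= "N" else "REAL"


print(implicit_fortran_type("index"))  # INTEGER
print(implicit_fortran_type("total"))  # REAL
```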

Various programming languages including Prolog, Haskell, Ruby and Go treat identifiers beginning with a capital letter differently from identifiers beginning with a small letter, a practice related to the use of sigils.

Stropping

The many languages that target Microsoft's .NET Common Language Infrastructure (CLI) need a way to use identifiers from another language that happen to be keywords in the calling language. This is sometimes done with prefixes, which is a form of stropping rather than a use of sigils. In C#, any variable name may be prefixed with "@"; this is mainly used to allow variable names that would otherwise conflict with keywords.[9] The same is achieved in VB.Net by enclosing the name in square brackets, as in [end].[10]

The "@" prefix can also be applied to string literals; see literal affixes below.

Hungarian notation

Related to sigils is Hungarian notation, a naming convention for variables that specifies variable type by attaching certain alphabetic prefixes to the variable name. Unlike sigils, however, Hungarian notation provides no information to the compiler; as such, explicit types must be redundantly specified for the variables (unless using a language with type inference). As most standard compilers do not enforce use of the prefixes, this permits omission and also makes code prone to confusion due to accidental erroneous use.[11]
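The point that the prefixes are not enforced can be illustrated with a short Python sketch. The prefixes "s_" (string) and "n_" (number) are invented for this example, and Python is used only because it makes the mismatch easy to observe at run time.

```python
# Hungarian-style prefixes chosen for this example: "s_" string, "n_" number.
s_name = "Ada"
n_count = 3

# Nothing stops a later assignment from contradicting the prefix:
# the name still claims to be a number, but it now holds a string.
n_count = "three"
print(type(n_count).__name__)  # str
```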

Literal affixes

While sigils are applied to names (identifiers), similar prefixes and suffixes can be applied to literals, notably integer literals and string literals, specifying either how the literal should be evaluated, or what data type it is. For example, 0x10ULL evaluates to the value 16 as an unsigned long long integer in C++: the 0x prefix indicates hexadecimal, while the suffix ULL indicates unsigned long long. Similarly, prefixes are often used to indicate a raw string, such as r"C:\Windows" in Python, which represents the string with value C:\Windows; as an escaped string this would be written as "C:\\Windows".

As this affects the semantics (value) of a literal, rather than the syntax or semantics of an identifier (name), this is neither stropping (identifier syntax) nor a sigil (identifier semantics), but it is syntactically similar.
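The literal affixes discussed above can be checked directly in Python, which supports the 0x prefix and the r string prefix. Python has no equivalent of the C++ ULL suffix, so that part of the example is not reproduced here.

```python
raw = r"C:\Windows"       # raw string: the backslash is taken literally
escaped = "C:\\Windows"   # ordinary string: the backslash must be escaped

print(raw == escaped)  # True: both spellings denote the same value
print(len(raw))        # 10
print(0x10)            # 16: the 0x prefix marks a hexadecimal literal
```

In each case the affix changes how the literal is read, while the resulting value is an ordinary string or integer.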

Java annotations

Compare Java annotations such as @Override and @Deprecated.

Confusion

In some cases the same syntax can be used for distinct purposes, which can cause confusion. For example, in C#, the "@" prefix can be used either for stropping (to allow reserved words to be used as identifiers), or as a prefix to a literal (to indicate a raw string); in this case neither use is a sigil, as it affects the syntax of identifiers or the semantics of literals, not the semantics of identifiers.

References