C character classification

Short description: Operations in the C standard library that classify characters

C character classification is a group of operations in the C standard library that test whether a character belongs to a particular class, such as alphabetic or control characters. Both single-byte and wide characters are supported.^[1]

History

Early C programmers working on the Unix operating system developed programming idioms for classifying characters. For example, the following expression evaluates to true when c is an ASCII letter:

('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z')

Eventually, the interface to common character classification functionality was codified in the C standard library header ctype.h.

Implementation

For performance, the standard character classification functions are usually implemented as macros rather than as ordinary functions. However, because of the way macro arguments are expanded, they are generally no longer implemented the way they were in early versions of Linux, for example:

#define isdigit(c) ((c) >= '0' && (c) <= '9')

This definition can misbehave when the macro argument is an expression with a side effect, for example isdigit(x++). If isdigit were a function, x would be incremented exactly once. With this macro definition, the argument appears twice in the expansion, so x can be incremented twice.
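The double evaluation can be observed directly. The following is a minimal sketch, not code from any real library: the names NAIVE_ISDIGIT, isdigit_fn, increments_via_macro and increments_via_function are hypothetical and chosen only for illustration.

```c
/* Hypothetical macro with the same shape as the early definition above. */
#define NAIVE_ISDIGIT(c) ((c) >= '0' && (c) <= '9')

/* Function version: the argument is evaluated once, at the call site. */
int isdigit_fn(int c)
{
    return c >= '0' && c <= '9';
}

/* Returns how many times x is incremented by NAIVE_ISDIGIT(x++). */
int increments_via_macro(int start)
{
    int x = start;
    /* Expands to ((x++) >= '0' && (x++) <= '9'). */
    (void)NAIVE_ISDIGIT(x++);
    return x - start;
}

/* Returns how many times x is incremented by isdigit_fn(x++). */
int increments_via_function(int start)
{
    int x = start;
    (void)isdigit_fn(x++);
    return x - start;
}
```

Under these assumptions, increments_via_macro('5') returns 2 while increments_via_function('5') returns 1. For a character below '0', such as a space, the && operator short-circuits and the macro version increments only once, so the effect of the bug depends on the input.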

To eliminate this problem, a common implementation is for the macro to use a table lookup. For example, the library may provide an array of 256 integers, one for each character value, in which each bit records membership in one supported class. A macro indexes the array by character value and tests the relevant bit. For example, if the low bit indicates whether the character is a digit, then the isdigit macro could be written as:

#define isdigit(c) (TABLE[c] & 1)

The macro argument, c, appears only once in the expansion, so it is evaluated only once.
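A table-driven scheme of this kind might look like the sketch below. Everything in it is an assumption made for illustration: the table name ct_table, the flag names CT_DIGIT, CT_UPPER and CT_LOWER, the helpers ct_mark and ct_init, and the macros MY_ISDIGIT and MY_ISALPHA do not come from any particular library. The sketch also casts the index to unsigned char and does not handle EOF, which the standard functions are required to accept.

```c
#include <limits.h>

/* Hypothetical flag bits; real libraries use their own layout. */
#define CT_DIGIT 0x01
#define CT_UPPER 0x02
#define CT_LOWER 0x04

/* One entry per unsigned char value; all zero until ct_init() runs. */
unsigned char ct_table[UCHAR_MAX + 1];

/* Sets the given flag in the table entry of every character in chars. */
void ct_mark(const char *chars, unsigned char flag)
{
    for (; *chars != '\0'; chars++)
        ct_table[(unsigned char)*chars] |= flag;
}

/* Fills the table for the decimal digits and the basic Latin letters. */
void ct_init(void)
{
    ct_mark("0123456789", CT_DIGIT);
    ct_mark("ABCDEFGHIJKLMNOPQRSTUVWXYZ", CT_UPPER);
    ct_mark("abcdefghijklmnopqrstuvwxyz", CT_LOWER);
}

/* The argument appears once in each expansion, so it is evaluated once. */
#define MY_ISDIGIT(c) ((ct_table[(unsigned char)(c)] & CT_DIGIT) != 0)
#define MY_ISALPHA(c) \
    ((ct_table[(unsigned char)(c)] & (CT_UPPER | CT_LOWER)) != 0)
```

After a call to ct_init(), an expression such as MY_ISDIGIT(x++) increments x exactly once. Combining several flags in one mask, as MY_ISALPHA does, lets a single lookup answer a question about more than one class.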

Overview of functions

The functions that operate on single-byte characters are declared in the ctype.h header file (cctype in C++). The functions that operate on wide characters are declared in the wctype.h header file (cwctype in C++).

The classification is evaluated according to the current locale.

Byte character	Wide character	Description
`isalnum`	`iswalnum`	checks whether the operand is alphanumeric
`isalpha`	`iswalpha`	checks whether the operand is alphabetic
`islower`	`iswlower`	checks whether the operand is lowercase
`isupper`	`iswupper`	checks whether the operand is uppercase
`isdigit`	`iswdigit`	checks whether the operand is a digit
`isxdigit`	`iswxdigit`	checks whether the operand is a hexadecimal digit
`iscntrl`	`iswcntrl`	checks whether the operand is a control character
`isgraph`	`iswgraph`	checks whether the operand is a graphical character
`isspace`	`iswspace`	checks whether the operand is a whitespace character
`isblank`	`iswblank`	checks whether the operand is a blank character, such as a space or horizontal tab
`isprint`	`iswprint`	checks whether the operand is a printable character
`ispunct`	`iswpunct`	checks whether the operand is punctuation
`tolower`	`towlower`	converts the operand to lowercase
`toupper`	`towupper`	converts the operand to uppercase
N/A	`iswctype`	checks whether the operand falls into a specified class
N/A	`towctrans`	converts the operand using a specific mapping
N/A	`wctype`	returns a wide character class to be used with `iswctype`
N/A	`wctrans`	returns a transformation mapping to be used with `towctrans`
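The functions in the table can be combined as in the following sketch. The helper names count_digits, upcase, count_in_class and wide_lower are hypothetical. The byte-oriented calls cast their argument to unsigned char because the ctype.h functions are only defined for values representable as unsigned char or equal to EOF, and a plain char may be negative.

```c
#include <ctype.h>
#include <stddef.h>
#include <wctype.h>

/* Counts the decimal digits in a byte string. */
size_t count_digits(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}

/* Converts a byte string to uppercase in place. */
void upcase(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* Counts the wide characters in ws that belong to the class called name,
   for example "alpha" or "digit". Returns 0 for an unknown class name. */
size_t count_in_class(const wchar_t *ws, const char *name)
{
    wctype_t cls = wctype(name);
    size_t n = 0;
    if (cls == (wctype_t)0)   /* wctype returns zero for an invalid name */
        return 0;
    for (; *ws != L'\0'; ws++) {
        if (iswctype((wint_t)*ws, cls))
            n++;
    }
    return n;
}

/* Converts a wide string to lowercase in place using a named mapping. */
void wide_lower(wchar_t *ws)
{
    wctrans_t map = wctrans("tolower");   /* a mapping name the standard defines */
    for (; *ws != L'\0'; ws++)
        *ws = (wchar_t)towctrans((wint_t)*ws, map);
}
```

In the default C locale, count_digits("a1b22c") returns 3 and count_in_class(L"ab 12", "alpha") returns 2. Looking a class up by name with wctype is mainly useful when the class is chosen at run time. When the class is fixed, a direct call such as iswalpha is simpler.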

References

[1] ISO/IEC 9899:1999 specification. p. 193, § 7.4. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf.

Original source: https://en.wikipedia.org/wiki/C character classification.
