IBM hexadecimal floating point


IBM System/360 computers, and subsequent machines based on that architecture (mainframes), support a hexadecimal floating-point format (HFP).[1][2][3]

In comparison to IEEE 754 floating-point, the IBM floating-point format has a longer significand and a shorter exponent. All IBM floating-point formats have 7 bits of exponent with a bias of 64. The normalized range of representable numbers is from $16^{-65}$ to $16^{63}$ (approx. $5.39761 \times 10^{-79}$ to $7.237005 \times 10^{75}$).

The value of a number is given by the formula $(-1)^{\text{sign}} \times 0.\text{significand} \times 16^{\text{exponent} - 64}$.
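As a rough illustration of the formula, the following Python sketch decodes a 32-bit HFP word into a native float. The function name `decode_hfp32` is hypothetical, and the sketch assumes the word is supplied as an unsigned integer with the sign in the most significant bit, the 7-bit exponent next, and a 24-bit fraction in the low bits.

```python
import math

def decode_hfp32(word: int) -> float:
    """Decode a 32-bit IBM HFP word (as an unsigned integer) to a Python float."""
    sign = (word >> 31) & 0x1        # 1 sign bit
    exponent = (word >> 24) & 0x7F   # 7-bit exponent, biased by 64
    fraction = word & 0xFFFFFF       # 24-bit fraction (six hexadecimal digits)
    # value = fraction / 16**6 * 16**(exponent - 64); 16**k equals 2**(4*k),
    # so ldexp applies the scaling exactly.
    magnitude = math.ldexp(fraction, 4 * (exponent - 64 - 6))
    return -magnitude if sign else magnitude

print(decode_hfp32(0x41100000))  # 1.0
```

Here `0x41100000` has a biased exponent of 65 and a fraction of $0.1_{16}$, so the value is $16^{-1} \times 16^{1} = 1$.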

Single-precision 32-bit

A single-precision HFP number (called "short" by IBM) is stored in a 32-bit word:

1 7 24 (width in bits)
S Exp Fraction  
31 30 ... 24 23 ... 0 (bit index)*
* IBM documentation numbers the bits from left to right, so that the most significant bit is designated as bit number 0.

In this format the initial bit is not suppressed, and the radix point is set to the left of the significand (fraction in IBM documentation and the figures) in increments of 4 bits.

Since the base is 16, the exponent in this form is about twice as large as the equivalent in IEEE 754; to have a similar exponent range in binary, 9 exponent bits would be required.

Example

Consider encoding the value −118.625 as an IBM single-precision floating-point value.

The value is negative, so the sign bit is 1.

The value $118.625_{10}$ in binary is $1110110.101_2$. This value is normalized by moving the radix point left four bits (one hexadecimal digit) at a time until the integer part is zero, yielding $0.01110110101_2$. The remaining rightmost digits are padded with zeros, yielding a 24-bit fraction of $.0111\ 0110\ 1010\ 0000\ 0000\ 0000_2$.

Normalization moved the radix point two hexadecimal digits to the left, yielding a multiplier and exponent of $16^{+2}$. A bias of +64 is added to the exponent (+2), yielding +66, which is $100\ 0010_2$.

Combining the sign, exponent plus bias, and normalized fraction produces this encoding:

S Exp Fraction  
1 100 0010 0111 0110 1010 0000 0000 0000  

In other words, the number represented is $-0.76A000_{16} \times 16^{66 - 64} = -0.4633789\ldots \times 16^{+2} = -118.625$.
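The encoding steps of this example can be sketched in Python. This is an illustrative helper, not a reference implementation: the name `encode_hfp32` is hypothetical, excess fraction digits are simply truncated, and out-of-range exponents raise an error rather than modelling any hardware exception.

```python
from fractions import Fraction

def encode_hfp32(value: float) -> int:
    """Encode a finite float as a 32-bit IBM HFP word, truncating extra digits."""
    if value == 0:
        return 0  # zero is stored as all bits zero
    sign = 1 if value < 0 else 0
    mag = Fraction(abs(value))  # exact rational value of the float
    exponent = 0
    # Normalize one hexadecimal digit at a time so that 1/16 <= mag < 1.
    while mag >= 1:
        mag /= 16
        exponent += 1
    while mag < Fraction(1, 16):
        mag *= 16
        exponent -= 1
    biased = exponent + 64  # apply the bias of 64
    if not 0 <= biased <= 127:
        raise OverflowError("exponent does not fit in 7 bits")
    fraction = int(mag * 16**6)  # keep six hexadecimal digits, drop the rest
    return (sign << 31) | (biased << 24) | fraction

print(hex(encode_hfp32(-118.625)))  # 0xc276a000
```

The printed word `0xc276a000` is the bit pattern 1 100 0010 0111 0110 1010 0000 0000 0000 from the example.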

Largest representable number

S Exp Fraction  
0 111 1111 1111 1111 1111 1111 1111 1111  

The number represented is $+0.FFFFFF_{16} \times 16^{127 - 64} = (1 - 16^{-6}) \times 16^{63} \approx +7.2370051 \times 10^{75}$.

Smallest positive normalized number

S Exp Fraction  
0 000 0000 0001 0000 0000 0000 0000 0000  

The number represented is $+0.1_{16} \times 16^{0 - 64} = 16^{-1} \times 16^{-64} \approx +5.397605 \times 10^{-79}$.
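The two range limits can be checked with exact rational arithmetic. The snippet below is only a verification sketch; the names `BIAS`, `largest`, and `smallest` are chosen here for illustration.

```python
from fractions import Fraction

BIAS = 64

# Fraction field 0xFFFFFF with biased exponent 127 (largest value).
largest = Fraction(0xFFFFFF, 16**6) * Fraction(16) ** (127 - BIAS)
# Fraction field 0x100000 with biased exponent 0 (smallest normalized value).
smallest = Fraction(0x100000, 16**6) * Fraction(16) ** (0 - BIAS)

print(f"{float(largest):.7e}")   # 7.2370051e+75
print(f"{float(smallest):.6e}")  # 5.397605e-79
```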

Zero

S Exp Fraction  
0 000 0000 0000 0000 0000 0000 0000 0000  

Zero (0.0) is represented in normalized form as all zero bits, which is arithmetically the value $+0.0_{16} \times 16^{0 - 64} = +0 \times 16^{-64} \approx +0.000000 \times 10^{-79} = 0$. Given a fraction of all-bits zero, any combination of positive or negative sign bit and a non-zero biased exponent will yield a value arithmetically equal to zero. However, the normalized form generated for zero by CPU hardware is all-bits zero. This is true for all three floating-point precision formats.

Precision issues

Since the base is 16, there can be up to three leading zero bits in the binary significand. That means when the number is converted into binary, there can be as few as 21 bits of precision. Because of the "wobbling precision" effect, this can cause some calculations to be very inaccurate.

A good example of the inaccuracy is the representation of the decimal value 0.1. It has no exact binary or hexadecimal representation. In hexadecimal format, it is represented as $0.19999999\ldots_{16}$ or $0.0001\ 1001\ 1001\ 1001\ 1001\ 1001\ 1001\ldots_2$, that is:

S Exp Fraction  
0 100 0000 0001 1001 1001 1001 1001 1010  

This fraction carries only 21 significant bits, whereas the IEEE 754 binary single-precision version has 24 bits of precision.

Six hexadecimal digits of precision is roughly equivalent to six decimal digits (i.e. $(6 - 1) \log_{10}(16) \approx 6.02$). A conversion of single precision hexadecimal float to decimal string would require at least 9 significant digits (i.e. $6 \log_{10}(16) + 1 \approx 8.22$) in order to convert back to the same hexadecimal float value.
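The figures in this section can be reproduced with a few lines of Python. This is a small check rather than a general tool; the variable names are illustrative, and `0x19999A` is the 24-bit fraction from the encoding of 0.1 shown above.

```python
import math

fraction = 0x19999A  # 24-bit fraction field of the 0.1 example

# bit_length() ignores leading zero bits, so it counts the significant bits.
significant_bits = fraction.bit_length()
leading_zero_bits = 24 - significant_bits
print(significant_bits, leading_zero_bits)  # 21 3

hex_digits = 6
decimal_equivalent = (hex_digits - 1) * math.log10(16)
round_trip_digits = hex_digits * math.log10(16) + 1
print(round(decimal_equivalent, 2))   # 6.02
print(round(round_trip_digits, 2))    # 8.22
print(math.ceil(round_trip_digits))   # 9
```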

Double-precision 64-bit

The double-precision floating-point format (called "long" by IBM) is the same as the "short" format except that the fraction field is wider and the double-precision number is stored in a double word (8 bytes):

1 7 56 (width in bits)
S Exp Fraction  
63 62 ... 56 55 ... 0 (bit index)*
* IBM documentation numbers the bits from left to right, so that the most significant bit is designated as bit number 0.

The exponent for this format covers only about a quarter of the range of the corresponding IEEE binary format.

14 hexadecimal digits of precision is roughly equivalent to 17 decimal digits. A conversion of double precision hexadecimal float to decimal string would require at least 18 significant digits in order to convert back to the same hexadecimal float value.
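A decoder for the long format differs from the short one only in the width of the fraction. The sketch below assumes the value arrives as 8 bytes in big-endian order (most significant byte first); the function name `decode_hfp64` is hypothetical. Because a Python float holds 53 significant bits, a full 56-bit fraction may be rounded in its lowest bits.

```python
import math
import struct

def decode_hfp64(data: bytes) -> float:
    """Decode 8 big-endian bytes holding an IBM long (64-bit) HFP value."""
    (word,) = struct.unpack(">Q", data)
    sign = word >> 63
    exponent = (word >> 56) & 0x7F          # 7-bit exponent, biased by 64
    fraction = word & 0x00FFFFFFFFFFFFFF    # 56-bit fraction (14 hex digits)
    # fraction / 16**14 * 16**(exponent - 64), expressed as a power of two.
    magnitude = math.ldexp(fraction, 4 * (exponent - 64) - 56)
    return -magnitude if sign else magnitude

print(decode_hfp64(bytes.fromhex("C276A00000000000")))  # -118.625
```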

Extended-precision 128-bit

Called extended-precision by IBM, a quadruple-precision floating-point format was added to the System/370 series and was available on some S/360 models (S/360-85, -195, and others by special request or simulated by OS software). The extended-precision fraction field is wider, and the extended-precision number is stored as two double words (16 bytes):

High-order part
1 7 56 (width in bits)
S Exp Fraction (high-order 14 digits)  
127 126 ... 120 119 ... 64 (bit index)*
Low-order part
8 56 (width in bits)
Unused Fraction (low-order 14 digits)  
63 ... 56 55 ... 0 (bit index)*
* IBM documentation numbers the bits from left to right, so that the most significant bit is designated as bit number 0.

28 hexadecimal digits of precision is roughly equivalent to 32 decimal digits. A conversion of extended precision hexadecimal float to decimal string would require at least 35 significant digits in order to convert back to the same hexadecimal float value.
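Since 28 hexadecimal digits exceed what a native double can hold, a sketch of an extended-precision decoder is easier to write with exact rationals. The function below is illustrative only: the name `decode_hfp128` is hypothetical, the input is assumed to be 16 big-endian bytes, and the first byte of the low-order part is ignored, following the "Unused" field in the layout above.

```python
from fractions import Fraction

def decode_hfp128(data: bytes) -> Fraction:
    """Decode 16 big-endian bytes of IBM extended HFP into an exact Fraction."""
    if len(data) != 16:
        raise ValueError("extended HFP values occupy 16 bytes")
    high = int.from_bytes(data[:8], "big")
    low = int.from_bytes(data[8:], "big")
    sign = high >> 63
    exponent = (high >> 56) & 0x7F
    high_fraction = high & 0x00FFFFFFFFFFFFFF  # high-order 14 hex digits
    low_fraction = low & 0x00FFFFFFFFFFFFFF    # low-order 14 hex digits
    fraction = (high_fraction << 56) | low_fraction  # 112 bits in total
    value = Fraction(fraction, 16**28) * Fraction(16) ** (exponent - 64)
    return -value if sign else value

raw = bytes.fromhex("C276A00000000000" "0000000000000000")
print(decode_hfp128(raw))  # -949/8
```

The result $-949/8$ equals $-118.625$.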

Arithmetic operations

Most arithmetic operations truncate like simple pocket calculators. Therefore, $1 - 16^{-7} = 1$. In this case, the result is rounded away from zero.[4]
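The effect can be imitated with a toy model of truncating subtraction on six-digit hexadecimal fractions. This is not a description of the actual hardware data path: the single guard digit kept during alignment (`GUARD_DIGITS`) is an assumption of this sketch, and all function names are hypothetical.

```python
from fractions import Fraction

DIGITS = 6        # hexadecimal digits in a short-format fraction
GUARD_DIGITS = 1  # assumption: one extra digit is kept while aligning operands

def hex_exponent(x: Fraction) -> int:
    """Return e such that x = f * 16**e with 1/16 <= f < 1 (x must be positive)."""
    e = 0
    while x >= 1:
        x /= 16
        e += 1
    while x < Fraction(1, 16):
        x *= 16
        e -= 1
    return e

def chop(x: Fraction, unit: Fraction) -> Fraction:
    """Truncate a non-negative x down to a multiple of unit."""
    return (x // unit) * unit

def truncating_subtract(a: Fraction, b: Fraction) -> Fraction:
    """Toy model of a - b for a >= b > 0, truncating instead of rounding."""
    align_unit = Fraction(16) ** (hex_exponent(a) - DIGITS - GUARD_DIGITS)
    difference = a - chop(b, align_unit)  # digits of b below the guard digit are lost
    if difference == 0:
        return Fraction(0)
    result_unit = Fraction(16) ** (hex_exponent(difference) - DIGITS)
    return chop(difference, result_unit)  # keep six hex digits of the result

one = Fraction(1)
print(truncating_subtract(one, Fraction(1, 16**7)) == one)  # True
print(truncating_subtract(one, Fraction(1, 16**6)) == one)  # False
```

In this model $16^{-7}$ falls below the last digit retained during alignment, so it vanishes and the difference stays at 1, which is larger than the exact result.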

IEEE 754 on IBM mainframes

Starting with the S/390 G5 in 1998,[5] IBM mainframes have also included IEEE binary floating-point units which conform to the IEEE 754 Standard for Floating-Point Arithmetic. IEEE decimal floating-point was added to IBM System z9 GA2[6] in 2007 using millicode[7] and in 2008 to the IBM System z10 in hardware.[8]

Modern IBM mainframes support three floating-point radices with 3 hexadecimal (HFP) formats, 3 binary (BFP) formats, and 3 decimal (DFP) formats. There are two floating-point units per core; one supporting HFP and BFP, and one supporting DFP; there is one register file, FPRs, which holds all 3 formats. Starting with the z13 in 2015, processors have added a vector facility that includes 32 vector registers, each 128 bits wide; a vector register can contain two 64-bit or four 32-bit floating-point numbers.[9] The traditional 16 floating-point registers are overlaid on the new vector registers so some data can be manipulated with traditional floating-point instructions or with the newer vector instructions.

Special uses

The IBM floating-point format is used in:

  • SAS 5 Transport files (.XPT) as required by the Food and Drug Administration (FDA) for New Drug Application (NDA) study submissions,[10]
  • GRIB (GRIdded Binary) data files to exchange the output of weather prediction models (IEEE single-precision floating-point format in current version),
  • GDS II (Graphic Database System II) format files (OASIS is the replacement), and
  • SEG Y (Society of Exploration Geophysicists Y) format files (IEEE single-precision floating-point was added to the format in 2002).[11]

As IBM is the only remaining provider of hardware (and only in its mainframes) using this non-standard floating-point format, no popular file format requires it, with one exception: the FDA requires the SAS transport file format, whose documentation states: "All floating-point numbers in the file are stored using the IBM mainframe representation. [...] Most platforms use the IEEE representation for floating-point numbers. [...] To assist you in reading and/or writing transport files, we are providing routines to convert from IEEE representation (either big endian or little endian) to transport representation and back again."[10] Code for IBM's format is also available under LGPLv2.1.[12]
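A conversion from IEEE double precision to the 64-bit IBM representation can be sketched as follows. This is not the routine supplied by SAS or by any package cited here; the name `float_to_hfp64_bytes` is hypothetical, the output byte order is assumed to be big-endian, and values too small for the format are flushed to zero as a simplifying assumption. Special markers that a file format may define on top of the number format are not handled.

```python
import struct
from fractions import Fraction

def float_to_hfp64_bytes(value: float) -> bytes:
    """Convert a finite Python float to 8 big-endian bytes in IBM long HFP format."""
    if value == 0:
        return bytes(8)  # all-zero bytes represent zero
    sign = 1 if value < 0 else 0
    mag = Fraction(abs(value))  # exact rational value of the IEEE double
    exponent = 0
    # Normalize one hexadecimal digit at a time so that 1/16 <= mag < 1.
    while mag >= 1:
        mag /= 16
        exponent += 1
    while mag < Fraction(1, 16):
        mag *= 16
        exponent -= 1
    biased = exponent + 64
    if biased > 127:
        raise OverflowError("value too large for the HFP exponent range")
    if biased < 0:
        return bytes(8)  # assumption of this sketch: flush underflow to zero
    fraction = int(mag * 16**14)  # 14 hexadecimal digits, truncated
    return struct.pack(">Q", (sign << 63) | (biased << 56) | fraction)

print(float_to_hfp64_bytes(-118.625).hex())  # c276a00000000000
```

For doubles inside the HFP exponent range the conversion loses nothing: even with three leading zero bits, the 56-bit fraction leaves 53 bits, which matches the precision of an IEEE double.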

References

  1. IBM System/360 Principles of Operation, IBM Publication A22-6821-6, Seventh Edition (January 13, 1967), pp.41-50
  2. IBM System/370 Principles of Operation, IBM Publication GA22-7000-4, Fifth Edition (September 1, 1975), pp.157-170
  3. z/Architecture Principles of Operation, IBM Publication SA22-7832-01, Second Edition (October, 2001), chapter 9 ff.
  4. ESA/390 Enhanced Floating Point Support: An Overview
  5. Schwarz, E. M.; Krygowski, C. A. (September 1999). "The S/390 G5 floating-point unit". IBM Journal of Research and Development 43 (5.6): 707–721. doi:10.1147/rd.435.0707. 
  6. Duale, A. Y.; Decker, M. H.; Zipperer, H.-G.; Aharoni, M.; Bohizic, T. J. (January 2007). "Decimal floating-point in z9: An implementation and testing perspective". IBM Journal of Research and Development 51 (1.2): 217–227. doi:10.1147/rd.511.0217. 
  7. Heller, L. C.; Farrell, M. S. (May 2004). "Millicode in an IBM zSeries processor". IBM Journal of Research and Development 48 (3.4): 425–434. doi:10.1147/rd.483.0425. 
  8. Schwarz, E. M.; Kapernick, J. S.; Cowlishaw, M. F. (January 2009). "Decimal floating-point support on the IBM System z10 processor". IBM Journal of Research and Development 53 (1): 4:1–4:10. doi:10.1147/JRD.2009.5388585. 
  9. z/Architecture Principles of Operation
  10. "The Record Layout of a Data Set in SAS Transport (XPORT) Format". http://support.sas.com/techsup/technote/ts140.pdf.
  11. http://www.seg.org/documents/10161/77915/seg_y_rev1.pdf
  12. https://cran.r-project.org/web/packages/SASxport/SASxport.pdf
