Programming Fundamentals/Floating-Point Data Type

From Wikibooks, open books for an open world

Overview

A floating-point data type represents real numbers approximately, using a format that is essentially a trade-off between range and precision. For this reason, floating-point computation is often found in systems that must handle both very small and very large real numbers and that require fast processing times. A number is, in general, represented approximately to a fixed number of significant digits and scaled using an exponent in some fixed base such as 2 or 10.[1]
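The approximation described above can be observed directly. The following is a minimal Python sketch, assuming a standard Python 3 interpreter; the variable names are illustrative only. It shows that a sum of decimal fractions is only approximately equal to the expected result, and it uses the standard library function math.frexp to split a value into a fraction and a base-2 exponent.

```python
import math

# 0.1 and 0.2 have no exact binary representation, so their sum is only
# approximately equal to 0.3.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)            # False
close_enough = math.isclose(total, 0.3)   # True

# math.frexp splits a float into a fraction m and an integer exponent e
# such that value == m * 2**e, with 0.5 <= abs(m) < 1.
fraction, exponent = math.frexp(6.0)      # 6.0 == 0.75 * 2**3

print(exactly_equal, close_enough)        # False True
print(fraction, exponent)                 # 0.75 3
```

Because results are approximate, comparing floating-point values with a tolerance (as math.isclose does) is usually safer than testing for exact equality.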

Discussion

The floating-point data type is a family of data types that act alike and differ only in the size of their domains (the allowable values). The floating-point family of data types represents number values with fractional parts. Conceptually, each value is stored as a sign together with two integer fields: a mantissa (the significant digits) and an exponent (the scale). The floating-point family has the same attributes and behaves similarly in most programming languages. Floating-point values can always be negative or positive, so they are always signed, unlike the integer data type, which may be unsigned. The domain for floating-point data types varies because they can represent very large or very small numbers. Rather than talk about the actual values, we mention the precision. The more bytes of storage, the larger the mantissa and exponent: a larger mantissa gives more precision, and a larger exponent gives more range.
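To make the mantissa and exponent fields concrete, here is a small Python sketch that extracts them from a 32-bit value. It assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits), which the text above does not name explicitly; the helper name float32_fields is hypothetical.

```python
import struct

def float32_fields(value):
    """Return (sign, biased_exponent, fraction_bits) for value,
    assuming the IEEE 754 single-precision layout."""
    # Pack as a big-endian 32-bit float, then reinterpret the same
    # four bytes as an unsigned 32-bit integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                 # 1 bit: 0 = positive, 1 = negative
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 bits of the mantissa
    return sign, exponent, fraction

print(float32_fields(1.0))    # (0, 127, 0)
print(float32_fields(-2.5))   # (1, 128, 2097152)
```

In the second call, -2.5 is stored as -1.25 × 2^1: the sign bit is 1, the biased exponent is 1 + 127 = 128, and the fraction field holds 0.25 × 2^23 = 2097152.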

| Language | Reserved Word | Size | Precision | Range (largest magnitude) |
|---|---|---|---|---|
| C++ | float | 32 bits / 4 bytes | 7 decimal digits | ±3.40282347E+38 |
| C++ | double | 64 bits / 8 bytes | 15 decimal digits | ±1.79769313486231570E+308 |
| C# | float | 32 bits / 4 bytes | 7 decimal digits | ±3.40282347E+38 |
| C# | double | 64 bits / 8 bytes | 15 decimal digits | ±1.79769313486231570E+308 |
| Java | float | 32 bits / 4 bytes | 7 decimal digits | ±3.40282347E+38 |
| Java | double | 64 bits / 8 bytes | 15 decimal digits | ±1.79769313486231570E+308 |
| JavaScript | Number | 64 bits / 8 bytes | 15 decimal digits | ±1.79769313486231570E+308 |
| Python | float | 64 bits / 8 bytes | 15 decimal digits | ±1.79769313486231570E+308 |
| Swift | Float | 32 bits / 4 bytes | 7 decimal digits | ±3.40282347E+38 |
| Swift | Double | 64 bits / 8 bytes | 15 decimal digits | ±1.79769313486231570E+308 |
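The Python row of the table can be checked from within the language itself. The sketch below assumes a typical platform where Python's float is a 64-bit double; it reads the limits from sys.float_info and uses struct.calcsize with standard sizes to report the byte widths of the 32-bit and 64-bit formats.

```python
import struct
import sys

# Standard sizes (the "<" prefix) of the 32-bit and 64-bit formats, in bytes.
float_bytes = struct.calcsize("<f")    # 4
double_bytes = struct.calcsize("<d")   # 8

# Limits of Python's built-in float on this platform.
digits = sys.float_info.dig            # decimal digits that survive a round trip
largest = sys.float_info.max           # largest finite value

# Multiplying past the largest finite value overflows to infinity.
overflowed = largest * 10

print(float_bytes, double_bytes)
print(digits, largest)
print(overflowed)
```

On a typical platform this reports 15 digits and a largest value of about 1.7976931348623157e+308, matching the table, and the overflowed product prints as inf.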

When performing operations with floating-point values, the result may have more decimal places than you want. We can use the round function to limit the number of decimal places displayed. For example, round(1.12356, 2) gives 1.12.[2]
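The rounding example above can be tried in Python, where round is a built-in function. This short sketch also shows string formatting as an alternative for display purposes; the variable names are illustrative only.

```python
value = 1.12356

# round returns a new number limited to the given number of decimal places.
rounded = round(value, 2)      # 1.12

# A format specification produces text with exactly two decimal places.
text = f"{value:.2f}"          # '1.12'

# The two approaches differ for trailing zeros: the number 2.5 has no
# trailing zero to keep, but the formatted text always shows two places.
price_text = f"{2.5:.2f}"      # '2.50'

print(rounded)
print(text)
print(price_text)
```

Rounding changes the stored value, while formatting only changes how the value is displayed, so formatting is often the better choice when the full precision is still needed for later calculations.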

Key Terms

double
The most commonly used data type in the floating-point family.
mantissa, exponent
The two integer parts of a floating-point value: the mantissa holds the significant digits and the exponent scales them.
precision
The number of significant digits a floating-point value can hold, which depends on the size of its storage area in bytes.

References
