Data Storage Converters — bytes, KB, MB, GB, TB

Data-storage conversions are the most modern category in everyday unit work because the underlying units did not exist before the 1950s, but they are also the most quietly contentious, because two competing conventions — decimal SI (1 GB = 1,000 MB) and binary IEC (1 GiB = 1,024 MiB) — produce different numeric results from the same byte count. The category covers six core units: the bit (the fundamental binary digit, the smallest unit of digital information), the byte (8 bits, the standard addressable unit in modern computer architectures), and the four scale prefixes — kilobyte, megabyte, gigabyte, and terabyte — that carry the storage figures everyday users actually see on cloud-storage dashboards, hard-drive labels, and operating-system file browsers. Which convention a prefix follows depends on context: cloud-storage vendors and HDD/SSD manufacturers use decimal SI throughout, while many operating-system file-size displays use binary IEC. The discrepancy between the two conventions compounds with each scale step, reaching roughly 7% at the gigabyte level and 10% at the terabyte level, which is why a "1 TB" drive shows as 931 GB on most computer desktops — the most common consumer confusion in the category. Reconciling bytes-on-disk against unit-display conventions arises in every storage purchase, every backup-sizing exercise, and every cloud-tier upgrade.

Units in this category

Bytes (B)

One byte equals exactly 8 bits, encoding 2⁸ = 256 distinct values (0–255 unsigned, or −128 to +127 in two's-complement signed representation). The byte is the smallest individually-addressable unit of memory in essentially all modern computer architectures, and is the unit in which file sizes, memory capacities, and storage media capacities are reported. The IEC formal name for the 8-bit byte is the "octet" (IEC 60027-2:2005, IEC 80000-13:2008), used in international standards documents and in networking RFCs to disambiguate from historical machines where "byte" meant other widths.
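The value ranges quoted above follow directly from 8 bits per byte; a trivial Python check (illustrative only):

```python
BITS_PER_BYTE = 8

distinct_values = 2 ** BITS_PER_BYTE              # 2^8 = 256 distinct values
unsigned_range = (0, 2 ** BITS_PER_BYTE - 1)      # 0 to 255 unsigned
signed_range = (-(2 ** 7), 2 ** 7 - 1)            # -128 to +127, two's complement
print(distinct_values, unsigned_range, signed_range)
```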

Kilobytes (KB)

One kilobyte (KB) equals 1,000 bytes under the SI decimal convention or 1,024 bytes (= 2¹⁰) under the historical binary convention used by operating systems and embedded-firmware tooling. The IEC 80000-13:2008 standard introduced the kibibyte (KiB) as the unambiguous name for 1,024 bytes, leaving "kilobyte" formally restricted to the decimal 1,000-byte meaning, but adoption outside Linux distributions and standards-conscious documentation has been limited. In practice, storage manufacturers and network engineers use 1 KB = 1,000 bytes; Microsoft Windows file managers and most legacy desktop software use 1 KB = 1,024 bytes; the resulting 2.4% gap is small at the kilobyte level but propagates upward — at the megabyte level the gap reaches 4.9%, at gigabyte 7.4%, and at terabyte 10.0%, the source of the famous "my 1 TB drive only shows 931 GB" complaint covered in the gigabyte and terabyte entries below.
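The way the gap compounds at each prefix level can be verified with a few lines of Python (a quick illustrative check, not part of any particular tool):

```python
# Percentage divergence between the binary (1024**n) and decimal (1000**n)
# interpretations at each prefix level.
for name, n in [("KB/KiB", 1), ("MB/MiB", 2), ("GB/GiB", 3), ("TB/TiB", 4)]:
    gap_pct = (1024 ** n / 1000 ** n - 1) * 100
    print(f"{name}: {gap_pct:.1f}%")   # 2.4%, 4.9%, 7.4%, 10.0%
```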

Megabytes (MB)

One megabyte (MB) equals 1,000,000 bytes under the SI decimal convention or 1,048,576 bytes (= 2²⁰) under the historical binary convention. The IEC 80000-13:2008 standard names the binary 1,048,576-byte quantity the mebibyte (MiB), reserving "megabyte" for the decimal value, but consumer software, file managers, and most desktop operating systems before 2009 reported 1 MB = 1,048,576 bytes. The 4.9% gap between the two conventions is roughly twice the kilobyte-level gap and noticeable on any storage label: a 700 MB CD-ROM holds 734,003,200 bytes if "MB" is read as binary mebibytes, or 700,000,000 bytes if read as decimal megabytes — and CD-ROM capacities were originally specified in binary mebibytes, the source of every "but my disc shows 698 MB free" report from the CD-burning era.
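The CD-ROM figure can be reproduced directly; this short Python sketch reads the disc's "700 MB" label under each convention:

```python
MEBIBYTE = 1024 ** 2   # binary MiB
MEGABYTE = 1000 ** 2   # decimal MB

cd_label = 700
binary_bytes = cd_label * MEBIBYTE       # 734,003,200 bytes if "MB" means MiB
decimal_bytes = cd_label * MEGABYTE      # 700,000,000 bytes if "MB" means MB
print(binary_bytes, decimal_bytes)
print(binary_bytes / MEGABYTE)           # 734.0032 decimal MB in a "700 MB" disc
```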

Gigabytes (GB)

One gigabyte (GB) equals 1,000,000,000 bytes (= 10⁹) under the SI decimal convention or 1,073,741,824 bytes (= 2³⁰) under the historical binary convention used by Microsoft Windows file managers and most pre-2009 operating-system tooling. The IEC 80000-13:2008 standard names the binary 1,073,741,824-byte quantity the gibibyte (GiB), reserving "gigabyte" for the decimal 10⁹ value. The 7.4% gap between the two conventions is now the consumer-visible source of the "my 128 GB iPhone only shows 119 GB available" pattern — Apple labels device capacity in decimal GB matching the SSD vendor's marketed capacity, and a 128 × 10⁹-byte drive read under binary GiB conventions reports 128,000,000,000 ÷ 1,073,741,824 ≈ 119.2 GiB.
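The iPhone arithmetic in the paragraph above is easy to confirm (illustrative Python, using the marketed decimal capacity):

```python
GIB = 1024 ** 3                    # one binary gibibyte

marketed_bytes = 128 * 10 ** 9     # "128 GB" as labelled, decimal SI
print(marketed_bytes / GIB)        # ≈ 119.2 GiB, what a binary display reports
```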

Terabytes (TB)

One terabyte (TB) equals 1,000,000,000,000 bytes (= 10¹²) under the SI decimal convention or 1,099,511,627,776 bytes (= 2⁴⁰) under the historical binary convention. The IEC 80000-13:2008 standard names the binary 2⁴⁰-byte quantity the tebibyte (TiB), reserving "terabyte" for the decimal 10¹² value, and the gap between the two is now 9.95% — the largest at any prefix level the consumer encounters routinely. The terabyte is the dominant unit for consumer secondary storage (mechanical hard drives, internal and external SSDs, network-attached storage), for cloud-storage paid tiers above the gigabyte free-tier ceiling, and for video-production and surveillance-archival capacity planning.

Bits (bit)

One bit is the information content of a single binary digit — equivalently, the Shannon entropy of an outcome with probability ½, such as a fair coin flip whose result is then revealed. Formally, for a discrete random variable X with probability mass function p(x), the Shannon entropy is H(X) = −Σ p(x) log₂ p(x) bits. The choice of log base 2 fixes the unit as the bit; log base e gives the nat (natural unit, ≈ 1.443 bits); log base 10 gives the hartley or ban (≈ 3.322 bits).
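The entropy formula can be sketched in a few lines of Python; `shannon_entropy_bits` is an illustrative helper, not a standard-library function:

```python
import math

def shannon_entropy_bits(probs):
    """H(X) = -sum p(x) * log2 p(x), in bits; zero-probability terms contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy_bits([0.5, 0.5]))    # fair coin: 1.0 bit
print(shannon_entropy_bits([0.25] * 4))    # fair four-sided die: 2.0 bits
print(1 / math.log(2))                     # bits per nat: ≈ 1.443
```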

History of data storage measurement

The bit was named by John Tukey in 1948 (as a contraction of "binary digit") and adopted as a unit by Claude Shannon in his foundational 1948 paper on information theory. The byte as 8 bits was standardised by IBM's 1964 System/360 architecture, which set the modern convention; earlier computer architectures used variable-width bytes ranging from 6 to 9 bits depending on the machine. The kilobyte was originally ambiguous: early computer engineers used "kilo" to mean the binary 1,024 (2^10) because it was close to the decimal 1,000 and matched the binary structure of memory addressing, while marketing and external documentation gradually shifted toward the strict decimal 1,000. The IEC introduced the binary prefixes (kibibyte/KiB, mebibyte/MiB, gibibyte/GiB, tebibyte/TiB) in 1999 to disambiguate, and the modern best-practice convention is to use SI prefixes (KB, MB, GB) strictly for decimal and IEC prefixes (KiB, MiB, GiB) for binary, though informal usage still mixes them. Modern HDDs label capacity in decimal SI; modern SSDs vary by manufacturer; cloud-storage dashboards uniformly use decimal SI.

Where data storage conversions matter

Data-storage conversions appear at every layer of the modern computing stack. Cloud-storage services (AWS S3, Azure Blob, Google Cloud Storage, iCloud, Google Drive, Dropbox, OneDrive) sell capacity in decimal-SI tiers (50 GB, 200 GB, 2 TB, 10 TB) and bill against decimal-SI usage figures, with the conversion running constantly in dashboard displays and per-tenant audit reports. Hard-drive and SSD manufacturers (Seagate, WD, Samsung, Crucial) label product capacity in decimal SI on the box (1 TB, 4 TB, 16 TB) but the operating system displays the same byte count in binary IEC, producing the well-known consumer confusion. Email systems allocate per-mailbox storage in GB tiers but accumulate per-message MB-level storage, with email-archive aggregation running constantly in IT operations dashboards. Software deployment teams roll up per-package MB into per-image GB to size endpoint backups and disk-imaging logistics. Photo and video professionals manage per-shoot GB folders in multi-TB archive arrays, and database administrators track per-table GB against per-database TB capacity plans. Streaming-media services aggregate per-asset GB encodes against TB-scale CDN inventories with per-region capacity tracking. NAS and SAN administrators provision TB-rated arrays into GB-tier user shares, with backup engineers planning TB-scale tape inventories against GB-scale per-restore chunks. The conversion is universal across enterprise IT, consumer SaaS, creative-professional workflows, and modern data-engineering pipelines.

How to convert data storage units

In the decimal SI interpretation used by cloud vendors and most modern dashboards, each scale step is exactly 1000: 1 KB = 1000 bytes, 1 MB = 1000 KB = 1,000,000 bytes, 1 GB = 1,000,000,000 bytes, 1 TB = 1,000,000,000,000 bytes. In the binary IEC interpretation, each scale step is 1024 (2^10): 1 KiB = 1024 bytes, 1 MiB = 1,048,576 bytes, 1 GiB = 1,073,741,824 bytes, 1 TiB = 1,099,511,627,776 bytes. The two conventions diverge by a factor of 1.024 at each scale step, compounding to about 10% by the TB scale — a 1 TB drive contains 1 trillion bytes (decimal SI), which the OS reports as 0.909 TiB or 931 GiB after binary conversion. The bit-to-byte conversion is fixed at 8 bits per byte. Mixing decimal and binary in the same calculation introduces percent-level errors; modern best practice is to use decimal SI throughout for storage capacity and to mark binary explicitly with the IEC prefix when the binary interpretation is intended.
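The rules above can be collected into a small helper; the function name and unit tables here are illustrative, not from any particular library:

```python
# Bytes per unit under each convention.
DECIMAL_SI = {"KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}
BINARY_IEC = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

def convert_bytes(n_bytes, unit):
    """Convert a raw byte count to the named unit (decimal SI or binary IEC)."""
    factor = DECIMAL_SI.get(unit) or BINARY_IEC.get(unit)
    if factor is None:
        raise ValueError(f"unknown unit: {unit!r}")
    return n_bytes / factor

one_tb_drive = 10 ** 12                      # a "1 TB" drive as labelled
print(convert_bytes(one_tb_drive, "GiB"))    # ≈ 931.3, what the OS shows
print(convert_bytes(one_tb_drive, "TiB"))    # ≈ 0.909
print(convert_bytes(one_tb_drive, "GB"))     # 1000.0, the label convention
```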

Frequently asked questions

Why does my 1 TB hard drive show as 931 GB on my computer?

Hard-drive manufacturers label capacity in decimal SI (1 TB = 1,000,000,000,000 bytes), while many operating systems including Windows and earlier macOS versions display the same byte count using the binary IEC interpretation (1 GiB = 1,073,741,824 bytes). A 1 TB drive contains 1 trillion bytes, which the OS displays as 0.909 TiB or 931 GiB after binary conversion. The drive is not missing capacity; the two displays use different unit conventions for the same underlying bytes.

What is the difference between MB and MiB?

MB (megabyte) is a decimal SI unit equal to exactly 1,000,000 bytes, while MiB (mebibyte) is a binary IEC unit equal to exactly 1,048,576 bytes (2^20). The two units differ by about 4.9% — a 100 MB file is about 95.4 MiB. The IEC introduced the MiB notation in 1999 specifically to disambiguate the decimal and binary interpretations; modern best practice uses MB for decimal and MiB for binary, though informal usage still mixes them.

How many bits in a byte?

A byte is exactly 8 bits in modern computer architecture, a convention set by IBM's 1964 System/360 design and universal since. Earlier computer systems used variable-width bytes ranging from 6 to 9 bits, but the 8-bit byte became standard because it efficiently encodes a single character of extended ASCII text, two decimal digits in BCD, or a single 8-bit binary value. The 8-bit byte underlies all modern memory, file-system, and network-protocol specifications.

Why do cloud-storage vendors use decimal SI instead of binary?

Decimal SI gives cleaner numbers in marketing and billing displays — "2 TB" is more readable than "1.86 TiB" for the same capacity tier, and decimal-SI prefixes align with the SI system used elsewhere in scientific and commercial measurement. Cloud providers including AWS, Azure, Google Cloud, and consumer services like iCloud and Google One have all adopted decimal SI throughout their dashboards, billing, and product naming. The decision is also consistent with HDD and SSD manufacturer labelling, simplifying capacity-planning conversations.

What is a kilobit and how is it different from a kilobyte?

A kilobit is 1000 bits in decimal SI (or 1024 bits in older binary usage), abbreviated as "kb" with a lowercase b. A kilobyte is 1000 bytes in decimal SI, abbreviated as "KB" with an uppercase B. The two units differ by a factor of 8 because there are 8 bits per byte. Kilobits typically appear in network-bandwidth contexts (where bits-per-second is the historical unit), while kilobytes appear in storage and file-size contexts.
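The factor-of-8 relationship shows up whenever a bandwidth figure in bits is compared against a file size in bytes; a quick illustrative check in Python:

```python
BITS_PER_BYTE = 8

kilobit_in_bits = 1000                    # 1 kb (decimal SI)
kilobyte_in_bits = 1000 * BITS_PER_BYTE   # 1 KB = 8,000 bits

# A 100 Mbps link moves 100,000,000 bits/s, i.e. 12.5 decimal MB/s.
print(100_000_000 / BITS_PER_BYTE / 1_000_000)   # 12.5
```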

How precise should storage-capacity conversions be?

For consumer dashboards and casual reporting, two significant figures are sufficient (e.g. "2.0 TB" or "750 GB"), and most user interfaces round automatically. For enterprise capacity planning, three significant figures preserve the precision needed for tier-upgrade decisions and contract negotiations. For technical metrology and storage-system testing, the underlying byte count is the authoritative figure, with the unit display being a presentation choice rather than a precision constraint.

When should I use binary IEC prefixes (KiB, MiB, GiB)?

Use binary IEC prefixes whenever the underlying figure is genuinely measured in powers of 1024 — RAM capacity (which is allocated in 2^n increments by hardware design), some operating-system file-size displays, and SSD-interface bandwidth specs in some technical documents. Use decimal SI prefixes (KB, MB, GB, TB) for HDD and SSD capacity, cloud-storage tiers, network bandwidth, and any context where the round-number figure is the marketing or contract reference. When in doubt, the IEC prefix is the safer choice for binary contexts because it removes ambiguity.
