Demystifying Cache–From Bytes to Tags

8 minute read

Published: May 02, 2025

Breaking down cache‐address bits is one of the trickiest parts of PS 6. In this review, we’ll step through in detail how to partition a 32-bit address into different fields. Then we’ll work through a sample address problem and cap things off with a challenge problem to test your knowledge :)

There are multiple ways to talk about this. Below I’ll assemble those pieces step by step. If you’re already comfortable with the address breakdown, feel free to skip ahead to the practice problems to test your knowledge!

Cache and Address Space
Cache Structure and Address Breakdown
- Byte Offset: bytes in a word
- Block Offset: words in a block
- Set index: blocks in a way
- Mapping One-way Cache to Memory
- Multi-way Cache and Associativity
Practice Dividing Address Bits
Challenge Problem
Cache and Address Space

Be careful with units! They may appear as bit (b), byte (B), or word!

Our MIPS memory uses 32-bit addresses (0x00000000 through 0xFFFFFFFF in HEX). Each byte in memory has its own unique 32‑bit address.
A word in MIPS is 4 bytes (32 bits). We address by byte, but instructions and loads/stores transfer entire words.

Cache Structure

A typical cache has the following structure:

Whole Cache
  └── Ways (if associative)
        └── Blocks
              └── Words
                    └── Bytes

Let’s examine each level bottom-up

Don’t confuse this nested structure with the cache “hierarchy” of L1/L2/L3 caches, where multiple caches of different sizes and speeds hold different subsets of data.

Byte Offset

The byte offset selects bytes within a 4-byte word.

It takes $\log_2(4) = 2$ bits to represent the 4-byte addresses: 00, 01, 10, 11

Block Offset

Caches read/write in blocks, which may contain multiple words. To locate bytes within a block, we break the low-order address bits into two parts.

For example, if each block is 8 bytes, it holds 2 words. To index within a block of size $8$ bytes, we need $$\log_2(8) = 3$ bits total.

Block offset (which word in the block, navy): $log_2\frac{\mathrm{block\,size}}{\mathrm{word\,size}} = log_2\frac{8B}{4B} = 1 \mathrm{bit}$
Byte offset (which byte in the word, blue): $log_2\frac{\mathrm{word\, size}}{\mathrm{byte\, size}} = log_2\frac{4B}{2B} = 2 \mathrm{bits}$
- Since transfers are word-aligned, those two bits are always 00

Some texts combine the block-offset and byte-offset bits into a single “offset” field.

Set index

In a cache way, each block lives in exactly one “set.” If each cache way has W bytes and each block is B bytes, then we have

Number of blocks $= \frac{W}{B}$
Set index bits $= log_2\frac{W}{B}$

Suppose our cache way has 4 total blocks (8 words, 32 bytes). The address range is 00000 - 11111. The set index will be the two leading bits.

For instance, the highlighted word has address 01 1 00 in the cache, where

01 (2 bits, red): set index
1 (1 bit, navy): block offset
00 (2 bits, blue): byte offset

Mapping One-way Cache to Memory

Main memory is huge (32-bit addresses), but caches are small (ours has a 5-bit address space). A cache acts like a small set of “drawers” that hold recent data. To cache a memory address:

Take the lower 5 bits (set index, block offset, byte offset) to map to the appropriate drawer
- Since we cache an entire block (spatial locality, fill the entire drawer), the byte and block offsets within a block won’t matter – We always know it’s cached
- Therefore, we only need to compare the set index
The remaining 27 bits form the tag, identifying which region of main memory is stored in that drawer.

When looking up a cache:

Use the set index bits to open the correct drawer (set)
Compare the stored tag in that drawer to the high-order 27 bits of the address.
- If they match, it’s a hit, and you read the byte/word.
- If they don’t match or the drawer is empty, it’s a miss: fetch the entire block from main memory into that drawer and update its tag

Multi-way Cache and Associativity

You notice there is only one way. In a direct-mapped cache. If a new memory address maps to the same set, it overwrites that entry

To reduce these misses, we use multiple ways (set associativity)

A k-way associative cache has k ways instead of one.
On a miss, the new block can go into any free way in that set; overwriting only happens when all k ways are occupied

Making our 32-byte direct-mapped cache into a 2-way set-associative cache would need 64 bytes

Lookup is similar:

Set index selects the “drawer” (set).
All k tag fields in that set are to the incoming tag.
If one matches, a multiplexer selects the correct block’s data (a hit).
If none match, it’s a miss: fetch the block from memory into one of the k ways and update that way’s tag.

The exact circuits of tag comparison and data selection can be found in the lecture slides and Harris & Harris section 8.3.2: “How Is Data Found”

Practice: Dividing Address Bits

Let’s turn this configuration into a Fundies problem and peel the onion backwards!

Given a 64-byte 2-way associative cache with a 64-bit block size, divide the address bits into tags, set index, block offset, and byte offset

Find the size of each way: 64 bytes / 2 ways = 32 bytes/way
Find the number of sets (blocks in each way): 32 bytes / (8 byte/block) = 4 sets
- 2 bits of set index (take the log)
Find the number of words in a block: (8 byte/block) / (4 byte/word) = 2 words/block
- 1 bit of block offset
Find the number of bytes in a word: always 4 bytes/word
- 2 bits of byte offset
The rest of the address bits are the tag: 32-2-1-2=27 bits

Tag    Set idx    Block off    Byte off   |   Total
27     2          1            2          |   32

easy!

Challenge Problem

Here’s a challenge problem taken from last year’s PS, which really tests your understanding of caches and address splitting:

A CPU can sometimes change its cache parameters—block size or associativity—mid-execution. In some cases, the existing blocks in the cache might be invalid because the blocks stored under the old caching policy are either misplaced or incomplete according to the new policy. For each scenario below, state whether all existing lines remain valid, or if some must be invalidated.

Hint: sketch the old and new address partitions side-by-side. Write out how many tag, index, and offset bits of each configuration before and after the change.

Direct-mapped: block size doubles from 1 word / block to 2 words / block

Answer
Invalid: Doubling requires merging two adjacent blocks into one 2-word block. You can’t guarantee both halves are already present, or they have the same tag (continuous), so those larger blocks would be incomplete or mis‐tagged.
Direct-mapped: block size halves from 2 words / block to 1 word / block

Answer
Valid: Each new 1-word block is a contiguous sub‐block of the old 2-word block. All bytes of each smaller block are already in cache, and their tag remains the same. You simply reinterpret the old tag/index bits: one former block‐offset bit becomes a set‐index bit, but no data moves or invalidations are needed.
How would your answers to 1 and 2 change if the cache were fully associative?

Answer
Same logic: merging blocks in different ways may lead to discontinuous or incomplete blocks, but splitting is fine.
Set associative, goes from 4-way to 8-way associative

Answer
Valid: With no change to block size, complete old blocks are complete new blocks. Doubling the associativity while keeping the cache and block size constant implies a halving of the number of sets. With this change, what had been the MSB of the set index in the old configuration becomes the LSB of the tag in the new configuration. This is a safe change that does not break block placement, as tag bits can take any value anywhere in the cache.
Set associative, goes from 4-way to 2-way associative

Answer
Invalid: As above, complete old blocks are complete new blocks here. However, halving the associativity while keeping cache and block size constant implies a doubling of the number of sets. With this change, what had been the LSB of the tag field becomes the MSB of the set index (the reverse of the previous scenario). Because tag bits are not involved in the set mapping, and can take any value anywhere in the cache, there is no guarantee that all of the old blocks whose old tag LSB had the same value will fall in the same new set.

Share on

Bluesky Facebook LinkedIn X (formerly Twitter)

Ming Gong

Demystifying Cache–From Bytes to Tags

Contents

Cache and Address Space

Cache Structure

Byte Offset

Block Offset

Set index

Mapping One-way Cache to Memory

Multi-way Cache and Associativity

Practice: Dividing Address Bits

Challenge Problem

Share on

You May Also Enjoy

42030青春版–一套刚刚好的乐高

Embedded Project: CircuitSim

So What Is Carry Look Ahead

FPGA Button UI