I started to reverse engineer APFS and want to share what I found out so far. You can send me feedback and ideas on this post via Twitter.
Notice: I created a test image with macOS Sierra 10.12.3 (16D32). All results are guesses and the reverse engineering is work in progress. Also newer versions of APFS might change structures. The information below is neither complete nor proven to be correct.
APFS is structured as a single container that can hold multiple APFS volumes. A container needs to be larger than 512 MB to contain more than one volume, larger than 1024 MB to contain more than two volumes, and so on. The following image shows an overview of the APFS structure.
Each element of this structure (except for the allocation file) starts with a 32-byte block header, which contains some general information about the block. The body of the structure follows the header. The following types exist:
- 0x01: Container Superblock
- 0x02: Node
- 0x05: Spacemanager
- 0x07: Allocation Info File
- 0x11: Unknown
- 0x0B: B-Tree
- 0x0C: Checkpoint
- 0x0D: Volume Superblock
Each of these structures is described in detail below. A more detailed version of the APFS structure is available as a Kaitai struct file: apfs.ksy. You can use it to examine APFS dumps in the Kaitai IDE or create parsers for various languages. This .ksy file must be considered experimental.
General information:
- The file system uses little-endian values for storing information.
- Timestamps are 64-bit nanoseconds (1⁄1,000,000,000 seconds!) starting from 1.1.1970 UTC (Unix epoch). The current timestamp is around 0x14b11800f375e000.
- Standard block size seems to be 4096 bytes per block.
- APFS is a copy-on-write filesystem, so each block is copied before changes are applied. As a result, a history exists of all files and filesystem structures that have not yet been overwritten. This might result in a huge amount of forensic artefacts.
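The timestamp format from the list above can be checked with a few lines of Python. This is only a sketch: the helper name `apfs_time_to_datetime` is made up for illustration, and because Python's `datetime` stores microseconds, the last three decimal digits of the nanosecond value are dropped.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def apfs_time_to_datetime(raw_ns: int) -> datetime:
    """Convert a 64-bit nanosecond count since the Unix epoch to a UTC datetime."""
    # datetime has microsecond resolution, so sub-microsecond digits are discarded.
    return UNIX_EPOCH + timedelta(microseconds=raw_ns // 1000)


# The example value quoted in the list above.
print(apfs_time_to_datetime(0x14B11800F375E000))
```

The example value lands in 2017, which fits the macOS Sierra 10.12.3 test image.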
Structures
Each file system structure in APFS starts with a block header. This header starts with a checksum for the whole block. According to the Apple docs, Fletcher’s checksum algorithm is used. Other information in the header includes the copy-on-write version of the block, the block ID and the block type.
pos | size | type | id |
0 | 8 | uint64 | checksum |
8 | 8 | uint64 | block_id |
16 | 8 | uint64 | version |
24 | 2 | uint16 | block_type |
26 | 2 | uint16 | flags |
28 | 4 | uint32 | padding |
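The header table maps directly onto a little-endian `struct` format string. The following is a minimal sketch, not a verified parser: the names `BlockHeader`, `BLOCK_TYPES` and `parse_block_header` are made up for illustration, and the sample block is synthetic rather than taken from a real image.

```python
import struct
from collections import namedtuple

# Field names follow the table above.
BlockHeader = namedtuple(
    "BlockHeader", "checksum block_id version block_type flags padding"
)

# "<" = little-endian, Q = uint64, H = uint16, I = uint32 (32 bytes in total).
BLOCK_HEADER = struct.Struct("<QQQHHI")

# Type IDs as listed earlier in this post.
BLOCK_TYPES = {
    0x01: "Container Superblock",
    0x02: "Node",
    0x05: "Spacemanager",
    0x07: "Allocation Info File",
    0x11: "Unknown",
    0x0B: "B-Tree",
    0x0C: "Checkpoint",
    0x0D: "Volume Superblock",
}


def parse_block_header(block: bytes) -> BlockHeader:
    """Decode the first 32 bytes of a block into named fields."""
    return BlockHeader._make(BLOCK_HEADER.unpack_from(block, 0))


# Synthetic 4096-byte block: checksum 0, block_id 1, version 2, type 0x01.
sample = BLOCK_HEADER.pack(0, 1, 2, 0x01, 0, 0) + bytes(4096 - BLOCK_HEADER.size)
header = parse_block_header(sample)
print(header.block_id, BLOCK_TYPES.get(header.block_type, "not listed"))
```

The checksum is only read here, not verified, since the exact Fletcher variant is not pinned down above.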
Container Superblock
The container superblock is the entry point to the filesystem. Because of the structure with containers and flexible volumes, allocation needs to be handled on the container level. The container superblock contains information on the block size, the number of blocks and pointers to the spacemanager for this task. Additionally, the block IDs of all volumes are stored in the superblock. To map block IDs to block offsets, a pointer to a block map B-tree is stored. This B-tree contains entries for each volume with its ID and offset.
pos | size | type | id |
0 | 4 | byte | magic “NXSB” |
4 | 4 | uint32 | blocksize |
8 | 8 | uint64 | totalblocks |
40 | 16 | byte | guid |
56 | 8 | uint64 | next_free_block_id |
64 | 8 | uint64 | next_version |
104 | 4 | uint32 | previous_containersuperblock_block |
120 | 8 | uint64 | spaceman_id |
128 | 8 | uint64 | block_map_block |
136 | 8 | uint64 | unknown_id |
144 | 4 | uint32 | padding2 |
148 | 4 | uint32 | apfs_count |
152 | 8 | uint64 | offset_apfs (repeat apfs_count times) |
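The fields in this table can be read with a short Python sketch. It assumes that the `pos` column counts from the end of the 32-byte block header (the table starts with the magic at position 0, so this seems likely, but it is an assumption). The function name `parse_container_superblock` is made up, and the demo block is synthetic.

```python
import struct
import uuid

HEADER_SIZE = 32  # assumption: "pos" in the table counts from the end of the block header


def parse_container_superblock(block: bytes) -> dict:
    """Read the documented container superblock fields from one raw block."""
    body = memoryview(block)[HEADER_SIZE:]
    if bytes(body[0:4]) != b"NXSB":
        raise ValueError("magic mismatch: not a container superblock")
    blocksize, totalblocks = struct.unpack_from("<IQ", body, 4)
    next_free_block_id, next_version = struct.unpack_from("<QQ", body, 56)
    spaceman_id, block_map_block = struct.unpack_from("<QQ", body, 120)
    (apfs_count,) = struct.unpack_from("<I", body, 148)
    # offset_apfs repeats apfs_count times, one uint64 per volume.
    offset_apfs = struct.unpack_from(f"<{apfs_count}Q", body, 152)
    return {
        "blocksize": blocksize,
        "totalblocks": totalblocks,
        "guid": uuid.UUID(bytes=bytes(body[40:56])),
        "next_free_block_id": next_free_block_id,
        "next_version": next_version,
        "spaceman_id": spaceman_id,
        "block_map_block": block_map_block,
        "offset_apfs": offset_apfs,
    }


# Synthetic block with two volume entries, for demonstration only.
raw = bytearray(4096)
struct.pack_into("<4sIQ", raw, HEADER_SIZE, b"NXSB", 4096, 256)
struct.pack_into("<I2Q", raw, HEADER_SIZE + 148, 2, 1026, 1028)
sb = parse_container_superblock(bytes(raw))
print(sb["blocksize"], sb["totalblocks"], sb["offset_apfs"])
```

Fields at positions not listed in the table are skipped rather than guessed.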
Node
Nodes are flexible containers that are used for storing different kinds of entries. Nodes can contain either flexible or fixed-size entries. A node starts with a list of pointers to the entry keys and entry records. This way, for each entry, the node contains an entry header at the beginning of the node, an entry key in the middle of the node and an entry record at the end of the node.
pos | size | type | id |
0 | 4 | uint32 | alignment |
4 | 4 | uint32 | entry_count |
10 | 2 | uint16 | head_size |
16 | 8 | entry | meta_entry |
24 | … | entry | entries (repeat entry_count times) |
Spacemanager
The spacemanager (sometimes spaceman) is used to manage allocated blocks in the APFS container. The number of free blocks and a pointer to the allocation info file(s?) are stored here.
pos | size | type | id |
0 | 4 | uint32 | blocksize |
16 | 8 | uint64 | totalblocks |
40 | 8 | uint64 | freeblocks |
144 | 8 | uint64 | prev_allocationinfofile_block |
352 | 8 | uint64 | allocationinfofile_block |
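As a rough illustration, the spacemanager fields are enough to estimate how much of the container is in use. The sketch below again assumes that positions count from the end of the 32-byte block header. `parse_spacemanager` and `used_bytes` are hypothetical helper names, and the numbers in the demo are invented.

```python
import struct

HEADER_SIZE = 32  # assumption: "pos" in the table counts from the end of the block header


def parse_spacemanager(block: bytes) -> dict:
    """Read the documented spacemanager fields from one raw block."""
    body = memoryview(block)[HEADER_SIZE:]
    (blocksize,) = struct.unpack_from("<I", body, 0)
    (totalblocks,) = struct.unpack_from("<Q", body, 16)
    (freeblocks,) = struct.unpack_from("<Q", body, 40)
    (prev_info_block,) = struct.unpack_from("<Q", body, 144)
    (info_block,) = struct.unpack_from("<Q", body, 352)
    return {
        "blocksize": blocksize,
        "totalblocks": totalblocks,
        "freeblocks": freeblocks,
        "prev_allocationinfofile_block": prev_info_block,
        "allocationinfofile_block": info_block,
    }


def used_bytes(spaceman: dict) -> int:
    """Allocated space in bytes, derived from the total and free block counts."""
    return (spaceman["totalblocks"] - spaceman["freeblocks"]) * spaceman["blocksize"]


# Synthetic block, for demonstration only.
raw = bytearray(4096)
struct.pack_into("<I", raw, HEADER_SIZE + 0, 4096)
struct.pack_into("<Q", raw, HEADER_SIZE + 16, 1000)
struct.pack_into("<Q", raw, HEADER_SIZE + 40, 250)
struct.pack_into("<Q", raw, HEADER_SIZE + 352, 77)
sm = parse_spacemanager(bytes(raw))
print(used_bytes(sm), sm["allocationinfofile_block"])
```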
Allocation Info File
The allocation info file serves as the header that the allocation file itself lacks. The allocation file's length, version and offset are stored here.
pos | size | type | id |
4 | 4 | uint32 | alloc_file_length |
8 | 4 | uint32 | alloc_file_version |
24 | 4 | uint32 | total_blocks |
28 | 4 | uint32 | free_blocks |
32 | 4 | uint32 | allocationfile_block |
Unknown
The structure with type 0x11 is quite empty and seems to be related to the spacemanager, as it occurs adjacent to it. Its purpose is unknown.
B-Tree
B-trees manage multiple nodes. They contain the offset of the root node.
pos | size | type | id |
16 | 8 | uint64 | root |
Checkpoint
A checkpoint structure exists for every container superblock. But I have no clue what it is good for.
Volume Superblock
A volume superblock exists for each volume in the file system. It contains the name of the volume, an ID and a timestamp. Similarly to the container superblock, it contains a pointer to a block map which maps block IDs to block offsets. Additionally, a pointer to the root directory, which is stored as a node, is stored in the volume superblock.
pos | size | type | id |
0 | 4 | byte | magic “APSB” |
96 | 8 | uint64 | block_map |
104 | 8 | uint64 | root_dir_id |
112 | 8 | uint64 | pointer3 |
120 | 8 | uint64 | pointer4 |
208 | 16 | byte | guid |
224 | 8 | uint64 | time1 |
272 | 8 | uint64 | time2 |
672 | 8 | str(ASCII) | name |
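A volume superblock can be read the same way. The sketch below assumes that positions count from the end of the 32-byte block header and that the name is NUL-padded within the 8 bytes given in the table; both are assumptions. `parse_volume_superblock` is a made-up name, and the two timestamps are returned as raw nanosecond integers.

```python
import struct
import uuid

HEADER_SIZE = 32  # assumption: "pos" in the table counts from the end of the block header


def parse_volume_superblock(block: bytes) -> dict:
    """Read the documented volume superblock fields from one raw block."""
    body = memoryview(block)[HEADER_SIZE:]
    if bytes(body[0:4]) != b"APSB":
        raise ValueError("magic mismatch: not a volume superblock")
    block_map, root_dir_id = struct.unpack_from("<QQ", body, 96)
    (time1,) = struct.unpack_from("<Q", body, 224)
    (time2,) = struct.unpack_from("<Q", body, 272)
    # Assumption: the name is NUL-padded, so cut at the first zero byte.
    raw_name = bytes(body[672:680]).split(b"\x00", 1)[0]
    return {
        "block_map": block_map,
        "root_dir_id": root_dir_id,
        "guid": uuid.UUID(bytes=bytes(body[208:224])),
        "time1": time1,
        "time2": time2,
        "name": raw_name.decode("ascii", errors="replace"),
    }


# Synthetic block, for demonstration only.
raw = bytearray(4096)
raw[HEADER_SIZE:HEADER_SIZE + 4] = b"APSB"
struct.pack_into("<QQ", raw, HEADER_SIZE + 96, 500, 501)
raw[HEADER_SIZE + 672:HEADER_SIZE + 676] = b"Test"
vol = parse_volume_superblock(bytes(raw))
print(vol["name"], vol["block_map"], vol["root_dir_id"])
```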
Allocation File
Allocation files are simple bitmaps. They do not have a block header and therefore no type id.
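Querying such a bitmap could look like the sketch below. Two things here are assumptions and not confirmed by the analysis above: that a set bit marks an allocated block, and that bits are counted from the least significant bit of each byte. The function names are made up for illustration.

```python
def is_allocated(bitmap: bytes, block_no: int) -> bool:
    """Return True if the bit for block_no is set.

    Assumption: set bit = allocated, bit 0 of each byte = lowest block number.
    """
    byte_index, bit_index = divmod(block_no, 8)
    return bool((bitmap[byte_index] >> bit_index) & 1)


def count_allocated(bitmap: bytes) -> int:
    """Count all set bits in the bitmap."""
    return sum(bin(byte).count("1") for byte in bitmap)


# Two bytes covering 16 blocks; blocks 0 and 2 are marked.
bitmap = bytes([0b00000101, 0b00000000])
print([n for n in range(16) if is_allocated(bitmap, n)], count_allocated(bitmap))
```

If the real bit order turns out to be most-significant-bit first, only the shift in `is_allocated` needs to change.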