The man page for du ('disk usage') states that it ‘summarize[s] disk usage of the set of FILES [(emphasis theirs)], recursively for directories.’ In other words, it walks a directory tree and reports how much disk space the objects in it occupy (see ‘How du operates’ below for a deeper explanation of this process). Du is a GNU coreutil ('core utility'), has 29 flags, and can break down results by size, inode information and location. At first glance du seems to overlap in function with df ('disk free'), but this guide will explain the differences between the two and how to get the most out of these powerful commands.
It is important to note that the same principles that facilitate the use of du also apply to df: for instance, being able to limit results by type and block size. (In data management, a block is 'a group of records on a storage device'; see 'Additional Resources'.)
A Linux or macOS terminal. Linux Academy and hypervisor virtual machines (e.g., VirtualBox or VMware), or PuTTY on Windows connected to a Linux shell environment, will provide access as well. Cygwin provides a Bash terminal on Windows, but lacks the functionality of a full Linux shell.
Optional, but recommended for this tutorial
Basic knowledge of ls, tail, head, wc, less and pipes in Linux
The standard syntax for du is 'du [OPTION] [directory or file path]'. With GNU du, one can type the path before the flag and still produce the same result.
Running 'du' in a terminal normally generates many pages of output. Results will vary depending on which directory is searched and how many installations and user-created files exist, but most systems will generate at least 2000 lines. By default du prints one line per directory; adding the '-a' (i.e., '--all') flag prints a line for every single file as well, making the results much larger than du alone. Searching the root (/) directory will scan the entire filesystem, but one can restrict the search to a location by navigating to it and running du, or by including the full path (e.g., du /home/user).
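As a minimal sketch of the difference between plain du and du -a, the commands below build a small scratch tree and run both on it. The directory and file names (docs, notes.txt, todo.txt) are made up for illustration, and the last command assumes GNU du, which accepts options after the path.

```shell
# Build a throwaway directory tree (all names here are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/docs"
echo "hello" > "$demo/docs/notes.txt"
echo "world" > "$demo/todo.txt"

# Plain du prints one line per directory.
plain=$(du "$demo")
echo "$plain"

# du -a also prints one line for every file.
all=$(du -a "$demo")
echo "$all"

# GNU du accepts the path before the flag as well.
swapped=$(du "$demo" -a)
echo "$swapped"

# Clean up the scratch tree.
rm -rf "$demo"
```

Passing the path as an argument, as done here, has the same effect as navigating to the directory first and running du there.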
To demonstrate how large the results can be, pipe du to a 'word count' as root or sudo (provided the user has root privileges in the /etc/sudoers file). Either method will avoid most 'permission denied' errors:
su
then type in the root password, or
sudo du | wc
sudo du -a | wc
Each command prints three numbers that describe the output that would otherwise print to the screen: its newline, word and byte counts (the original example was run in /etc on a Linux Academy Red Hat Enterprise 7 VM).
One can also run both commands on one line, as in 'sudo du | wc && sudo du -a | wc'.
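The same comparison can be tried without root privileges on a scratch directory. This is only a sketch: the tree below is invented for the example, and 'wc -l' is used at the end to keep just the line count.

```shell
# Scratch tree with three directories and three files (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/a/one" "$demo/a/b/two" "$demo/three"

# wc prints three numbers: newlines, words and bytes.
du "$demo" | wc
du -a "$demo" | wc

# wc -l keeps only the number of lines, i.e. the number of entries listed.
dir_lines=$(du "$demo" | wc -l)
all_lines=$(du -a "$demo" | wc -l)
echo "directories only: $dir_lines"
echo "directories and files: $all_lines"

# Clean up the scratch tree.
rm -rf "$demo"
```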
Piping du through 'less' makes the output easier to navigate, but one can also narrow down the results using a few flags and pipes. Doing so will speed up the search and use fewer system resources.
There are 29 du parameters, but two are ‘help’ and ‘version’ and several will likely not be used on a regular basis. Many Linux command flags have a short name/long name format, and du is no exception. For instance:
-a or --all
-L or --dereference (dereferences, or follows, all symbolic links, so the target is measured rather than the link itself; think ‘L’ for ‘link’)
-c or --total (produces a grand total of the disk usage of everything searched; the total will be the same with or without combining it with the -a flag)
-h or --human-readable (prints sizes with unit suffixes such as K, M and G for kilobytes, megabytes and gigabytes, instead of raw block counts)
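To see a few of these flags together, here is a hedged sketch on a scratch directory. The file name data.bin and its size are arbitrary choices for the example.

```shell
# Scratch directory holding roughly 3 MB of random data (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/sub"
head -c 3000000 /dev/urandom > "$demo/sub/data.bin"

# -h prints sizes with unit suffixes; -c appends a final "total" line.
summary=$(du -ch "$demo")
echo "$summary"

# The grand total is the same with or without -a.
total_dirs=$(du -c "$demo" | tail -n 1)
total_all=$(du -ac "$demo" | tail -n 1)
echo "$total_dirs"
echo "$total_all"

# Clean up the scratch directory.
rm -rf "$demo"
```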
There are other ways to sort through du’s massive output. A common method for any command is to use 'head' and 'tail' to narrow it down to the first or last lines.
'head' or 'tail' alone will produce the first or last ten lines, respectively, while 'head -n [number]' or 'tail -n [number]' will produce the number of lines specified by [number].
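A short sketch of head and tail applied to du output follows. The twelve empty directories are invented so that the listing is longer than ten lines.

```shell
# Scratch tree with twelve subdirectories (hypothetical names),
# so du prints thirteen lines including the top level.
demo=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8 9 10 11 12; do
    mkdir "$demo/dir$i"
done

# head with no -n stops after ten lines.
first_ten=$(du "$demo" | head | wc -l)
echo "lines kept by head: $first_ten"

# tail -n 3 keeps only the last three lines.
last_three=$(du "$demo" | tail -n 3)
echo "$last_three"

# du lists a directory after its contents, so the last line is the top level.
last_line=$(du "$demo" | tail -n 1)
echo "$last_line"

# Clean up the scratch tree.
rm -rf "$demo"
```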
The simplest way to confine results is to examine them by size.
The -t, or --threshold=SIZE, flag outputs only entries of SIZE or larger when SIZE is positive, and only entries of SIZE or smaller when SIZE is negative.
du -t 2G
will only output objects of two gigabytes or more, while
du --threshold=-1G
will only output objects of one gigabyte or less.
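Filtering by size can be tried on a small scale with GNU du's -t (--threshold) option. This is a sketch under assumptions: the file names and the 1M cut-off are arbitrary, and a reasonably recent coreutils is assumed because older releases lack this option.

```shell
# Scratch directory with one 2 MiB file and one tiny file (hypothetical names).
demo=$(mktemp -d)
head -c 2097152 /dev/urandom > "$demo/big.bin"
echo "x" > "$demo/small.txt"

# Positive SIZE: keep only entries that use at least 1M.
large=$(du -a -t 1M "$demo")
echo "$large"

# Negative SIZE: keep only entries that use at most 1M.
small=$(du -a --threshold=-1M "$demo")
echo "$small"

# Clean up the scratch directory.
rm -rf "$demo"
```

The directory itself shows up in the first listing because its total includes big.bin.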
-d, or --max-depth=N
will limit results to N or fewer levels below the level specified
-k is equivalent to --block-size=1K
--exclude=PATTERN
will not print files that match PATTERN, and
-X, or --exclude-from=FILE
will not print files that match any pattern listed in FILE.
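The flags above can be combined on a scratch tree as a quick check. The directory names, file names and patterns below are assumptions made for this sketch.

```shell
# Scratch tree three levels deep with two files (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/level1/level2/level3"
touch "$demo/level1/keep.txt" "$demo/level1/skip.log"

# -d 1: print the starting directory and one level below it only.
shallow=$(du -d 1 "$demo")
echo "$shallow"

# -k: report sizes in 1K blocks, the same as --block-size=1K.
du -k "$demo"

# --exclude: leave out anything matching the shell pattern.
no_logs=$(du -a --exclude='*.log' "$demo")
echo "$no_logs"

# -X: read the patterns to exclude from a file, one per line.
patterns=$(mktemp)
printf '%s\n' '*.log' '*.txt' > "$patterns"
no_logs_or_txt=$(du -a -X "$patterns" "$demo")
echo "$no_logs_or_txt"

# Clean up the scratch tree and the pattern file.
rm -rf "$demo" "$patterns"
```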
How du operates
So how does du get its information? It reads inodes, or ‘index nodes’ (in a *Nix system, these are data structures used to represent a filesystem object. Each inode contains information such as a file’s permissions, modification date and the number of blocks it occupies). Since everything in a Linux system is a file and has an inode, du can look up each file’s inode, which is identified by its inode number (or ‘i-number,’ essentially a unique identifier for each file within a filesystem). To further illustrate this, run 'ls -i' on a directory, which will list all objects and their i-numbers in that directory.
'du --inodes' is similar in that it reads the inodes of everything in the location that was searched, but it reports how many inodes each directory uses rather than listing i-numbers, and will not display file permissions, modification date or other data that ls provides (NB: one can add --time or --time=WORD to du for timestamps).
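The relationship between files and inodes can be inspected with a few commands. This sketch assumes GNU tools: 'stat -c %i' and 'du --inodes' are GNU options, and the file names are invented.

```shell
# Scratch directory with two empty files (hypothetical names).
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt"

# ls -i prefixes each name with its inode number.
ls -i "$demo"

# GNU stat reports the same number for a single file.
inode_ls=$(ls -i "$demo/a.txt" | awk '{print $1}')
inode_stat=$(stat -c %i "$demo/a.txt")
echo "ls: $inode_ls, stat: $inode_stat"

# du --inodes counts the inodes used: the directory itself plus its files.
inode_count=$(du --inodes "$demo" | awk '{print $1}')
echo "inodes used: $inode_count"

# Clean up the scratch directory.
rm -rf "$demo"
```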
While there are several ways to parse through objects in a *Nix system or estimate the amount of file space utilized, du is an efficient tool for examining individual objects. Feel free to try this out on any Linux Academy virtual machine or by using other options listed in the ‘Requirements’ section.
Determine disk space usage with df
Df, or 'disk free,' is another coreutil and possibly the most common way to determine how much storage has been used on a particular filesystem. The man page definition is 'report file system disk space usage.' Df can only show information for mounted filesystems; the man page link in 'Additional Resources' provides a brief explanation for why searching unmounted filesystems is not possible for 'this version of df,' the stated reason being '…[it] requires very nonportable intimate knowledge of file system structures'.
Df basic usage
Let's try out the basics. Running 'df' outputs disk usage for all mounted filesystems, and displays the data in six columns: Filesystem, 1K-blocks, Used, Available, Use%, and Mounted on. 'df -a' prints the same columns.
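A minimal sketch of the basic invocations is shown below. The output depends entirely on the machine, so only the header and the row counts are captured; LC_ALL=C is set on one call so the header appears in English.

```shell
# Plain df: one row per mounted filesystem.
df

# The first line is the column header.
header=$(LC_ALL=C df | head -n 1)
echo "$header"

# df -a adds pseudo, duplicate and inaccessible filesystems,
# so it should never list fewer rows than plain df.
rows=$(df | wc -l)
rows_all=$(df -a | wc -l)
echo "df: $rows rows, df -a: $rows_all rows"
```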
Df advanced usage
While it has fewer flags than du, df is more useful for drive partitioning and also reports on network filesystems that are mounted on the local system. Unlike du, it will produce the same results regardless of the working directory, since it does not scan directories for files but asks each mounted filesystem for its totals (running it in / will yield the same output as in /home, for example). Below are four commonly-used flags that will serve the purposes of most Linux users.
-a, --all: 'includes pseudo, duplicate and inaccessible file systems'
-h, --human-readable, which users may want to run by default
-t, --type=TYPE: df will only list filesystem(s) specified by TYPE;
e.g., 'df --type=tmpfs' will output only tmpfs:
-x, --exclude-type=TYPE: df will not list filesystem(s) specified by TYPE:
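The type filters can be tried on any machine by first asking df which type the root filesystem has. This is a sketch with one addition that is not covered in this guide: GNU df's --output=fstype option, used here only to discover a valid TYPE.

```shell
# -h prints sizes with unit suffixes for the root filesystem.
df -h /

# Find out the type of the root filesystem (ext4, xfs, overlay, ...).
fstype=$(df --output=fstype / | tail -n 1 | tr -d ' ')
echo "root filesystem type: $fstype"

# -t lists only filesystems of that type.
only=$(df -t "$fstype")
echo "$only"

# -x lists everything except that type; df reports an error
# if nothing is left, so errors are discarded here.
types_without=$(df --output=fstype -x "$fstype" 2>/dev/null | tail -n +2)
echo "$types_without"
```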
As a side note, the '-v' flag does not produce verbose output as in many Linux commands. According to the GNU coreutils documentation it is ignored, and exists only for compatibility with System V versions of df.
'df --inodes' (or 'df -i') is similar to du's --inodes flag in that it lists inode information instead of block usage.
Finally, '--sync' and '--no-sync' will likely not come into play unless filesystems have files that are queued to be written to disk. --no-sync is the default, but run df with --sync to receive updated information on disk usage.
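The inode and sync options can be checked against the root filesystem as follows. This is only a sketch; the numbers printed will differ on every system.

```shell
# -i (--inodes) swaps the block columns for inode columns.
inode_header=$(LC_ALL=C df -i / | head -n 1)
echo "$inode_header"
df --inodes /

# --sync flushes pending writes before the usage figures are read.
synced=$(df --sync /)
echo "$synced"
```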
Both du and df are essential Linux commands that facilitate system administration for professionals and casual users alike. Remember that though they are similar, they are responsible for different tasks: du operates on objects, while df looks at filesystems, which is why the two are often cited in the Linux community for producing persistently different results. Their man pages have an exhaustive list of parameters, but simply using the commands is often the fastest way to grow your Linux expertise.
A Linux Academy tutorial on du, df (and mount) in Linux+ LPIC Level 1 Exam 1:
Linux man pages for du and df:
The GNU coreutils guides to du and df: