Binary Tools#
Introduction#
Binary inspection tools examine compiled executables, object files, and shared libraries without running them. They reveal symbol tables, section layouts, dependencies, and embedded strings—essential for debugging linker errors, understanding library dependencies, and reverse engineering binaries.
On Linux, binaries use the ELF (Executable and Linkable Format) and tools like
readelf, objdump, and ldd are the standard way to inspect them. On
macOS, binaries use the Mach-O format and the equivalent tool is otool.
Some tools like nm, strings, size, file, strip, and
c++filt work on both platforms. When debugging “undefined reference” linker
errors, checking shared library dependencies, understanding why a binary is
large, or verifying that symbols are exported correctly, these tools are
indispensable.
- Source:
All examples below use the binary-inspect_test executable built from
src/debug/binary-inspect/. The companion script inspect.sh exercises
every tool covered in this note against the compiled binary—run it standalone
or through ctest to verify the commands work on your system.
# Build and run
$ cmake -B build -S . && cmake --build build --target binary-inspect_test
$ ./src/debug/binary-inspect/inspect.sh ./build/src/debug/binary-inspect/binary-inspect_test
Cheatsheet#
# --- file: identify binary type & arch ---
file ./binary
file /usr/lib/x86_64-linux-gnu/libc.so.6
# --- nm: list symbols ---
nm ./binary # all symbols (mangled)
nm -C ./binary # demangle C++ names
nm -u ./binary # undefined symbols only
nm -g ./binary # global (external) symbols only
nm -n ./binary # sort by address
nm -S ./binary # show symbol sizes
nm -DC /usr/lib/libm.so.6 # dynamic symbols in shared lib
# --- readelf: inspect ELF structure ---
readelf -h ./binary # ELF header (arch, entry point)
readelf -S ./binary # section headers (.text .data .bss ...)
readelf -s ./binary # symbol table
readelf -l ./binary # program headers (LOAD segments)
readelf -d ./binary # dynamic section (NEEDED libs)
readelf -r ./binary # relocations
readelf -n ./binary # notes (build-id, ABI)
readelf -a ./binary # all of the above
readelf --debug-dump=info ./binary # DWARF debug info
readelf -p .rodata ./binary # dump .rodata as strings
# --- objdump: disassemble ---
objdump -d -C ./binary # disassemble with demangling
objdump -d -S -C ./binary # interleave source (needs -g)
objdump -s -j .rodata ./binary # hex dump of .rodata section
objdump -t -C ./binary # symbol table
objdump -R ./binary # dynamic relocations
# --- ldd: shared library dependencies ---
ldd ./binary # list required .so files
# WARNING: ldd runs the dynamic linker — don't use on untrusted binaries
# Use readelf -d instead for untrusted files
# --- strings: extract printable strings ---
strings ./binary # default min length 4
strings -n 10 ./binary # min length 10
strings -t x ./binary # show hex offset of each string
strings ./binary | grep -i version
# --- size: section sizes ---
size ./binary # text, data, bss totals
size -A ./binary # all sections with sizes
# --- strip: remove symbols ---
strip ./binary # strip in place
strip -o stripped ./binary # strip to new file
strip --strip-debug ./binary # remove debug info only, keep symbols
# --- c++filt: demangle ---
echo "_ZN4demo6Widget5valueEv" | c++filt
nm ./binary | c++filt | grep demo
# --- ar: static library inspection ---
ar -t libfoo.a # list object files in archive
ar -x libfoo.a # extract all object files
nm -C libfoo.a # list symbols across all members
# --- otool: macOS equivalent of readelf/objdump/ldd ---
otool -h ./binary # Mach-O header (magic, cputype, filetype)
otool -l ./binary # load commands (segments, sections)
otool -L ./binary # shared library dependencies (like ldd)
otool -tV ./binary # disassemble __TEXT,__text
otool -s __TEXT __cstring ./binary # dump a specific section
otool -p _add -tV ./binary # disassemble starting at symbol _add
# --- install_name_tool: macOS rpath/dylib management ---
install_name_tool -change old.dylib new.dylib ./binary
install_name_tool -add_rpath /usr/local/lib ./binary
nm: Symbol Table#
nm lists symbols in object files and executables. Each line shows an address,
a symbol type letter, and the symbol name. The type letter tells you where the
symbol lives: T for code in .text, D for initialized data in
.data, B for zero-initialized data in .bss, R for read-only
data in .rodata, and U for undefined symbols that must be resolved by
the linker. Uppercase means the symbol is global (externally visible), lowercase
means it is local to the translation unit. This is the first tool to reach for
when debugging linker errors like “undefined reference” or “multiple definition”.
Common symbol types:
T/t Code (.text) — global/local
D/d Initialized data (.data) — global/local
B/b Uninitialized data (.bss) — global/local
R/r Read-only data (.rodata) — global/local
U Undefined (needs linking)
W/w Weak symbol
List all symbols:
$ nm build/src/debug/binary-inspect/binary-inspect_test
C++ symbols are mangled. Use c++filt or -C to demangle:
$ nm -C build/src/debug/binary-inspect/binary-inspect_test
Filter by symbol type:
# Only undefined symbols (external dependencies)
$ nm -u build/src/debug/binary-inspect/binary-inspect_test
# Only defined symbols, sorted by address
$ nm -n build/src/debug/binary-inspect/binary-inspect_test
# Only global (external) symbols
$ nm -g build/src/debug/binary-inspect/binary-inspect_test
Inspect an object file to see what it exports:
$ nm -C build/src/debug/binary-inspect/CMakeFiles/binary-inspect_test.dir/helper.cc.o
# Expected output:
# 0000000000000000 D global_initialized
# 0000000000000004 C global_uninitialized
# 0000000000000000 T add(int, int)
readelf: ELF Structure#
readelf displays detailed information about ELF (Executable and Linkable
Format) files—the standard binary format on Linux. Unlike objdump, it does
not rely on the BFD library and can parse corrupted or partially linked files.
It is the most comprehensive tool for inspecting ELF metadata: the file header
reveals architecture and entry point, section headers show how the binary is
organized (.text, .data, .bss, .rodata, etc.), program headers
describe how the kernel loads segments into memory, and the dynamic section
lists shared library dependencies. It also reads DWARF debug information, GNU
notes (build-id), and relocation entries. This is a Linux-only tool; on macOS,
use otool -l for similar load command inspection.
ELF header (architecture, entry point, type):
$ readelf -h build/src/debug/binary-inspect/binary-inspect_test
# Shows: Class (32/64-bit), OS/ABI, Type (EXEC/DYN), Entry point
Section headers (all sections in the binary):
$ readelf -S build/src/debug/binary-inspect/binary-inspect_test
# Shows: .text, .data, .bss, .rodata, .symtab, .strtab, etc.
Symbol table:
$ readelf -s build/src/debug/binary-inspect/binary-inspect_test
# With demangling (readelf doesn't have -C, pipe through c++filt)
$ readelf -s build/src/debug/binary-inspect/binary-inspect_test | c++filt
Program headers (segments loaded at runtime):
$ readelf -l build/src/debug/binary-inspect/binary-inspect_test
# Shows LOAD, DYNAMIC, INTERP segments
Dynamic section (shared library dependencies):
$ readelf -d build/src/debug/binary-inspect/binary-inspect_test
# Shows NEEDED entries (libstdc++, libgtest, libc, etc.)
DWARF debug info (if compiled with -g):
$ readelf --debug-dump=info build/src/debug/binary-inspect/binary-inspect_test | head -50
objdump: Disassembly#
objdump disassembles machine code back to assembly and displays section
contents. It is the go-to tool when you need to see what the compiler actually
generated—whether to verify an optimization kicked in, understand a crash at a
specific instruction address, or compare codegen between compiler flags. With
-S and debug info (-g), it interleaves source lines with assembly,
making it easy to correlate C/C++ code with the generated instructions. It can
also dump raw section contents as hex (-s -j .rodata), which is useful for
inspecting string literals, vtables, or constant data embedded in the binary.
On macOS, otool -tV provides similar disassembly for Mach-O files, and
the LLVM objdump (shipped with Xcode) supports both formats.
Disassemble all code:
$ objdump -d -C build/src/debug/binary-inspect/binary-inspect_test
Disassemble a specific function:
$ objdump -d -C build/src/debug/binary-inspect/binary-inspect_test | \
sed -n '/<add(int, int)>:/,/^$/p'
Interleave source with assembly (requires -g):
$ objdump -d -S -C build/src/debug/binary-inspect/binary-inspect_test
Display section contents as hex:
# Show .rodata (read-only data like string literals)
$ objdump -s -j .rodata build/src/debug/binary-inspect/binary-inspect_test
file: File Type#
file identifies file type, architecture, and linking mode by reading magic
bytes and metadata from the file header. It instantly tells you whether a binary
is 32-bit or 64-bit, statically or dynamically linked, stripped or not stripped,
and what architecture it targets (x86-64, ARM64, etc.). This is often the first
command to run when you receive an unfamiliar binary or need to confirm that a
cross-compilation produced the correct target format. It works on any file type,
not just executables—object files, shared libraries, core dumps, and even
scripts are all recognized:
$ file build/src/debug/binary-inspect/binary-inspect_test
# ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux),
# dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2,
# with debug_info, not stripped
Key information: 32/64-bit, static/dynamic linking, stripped/not stripped.
strings: Embedded Strings#
strings scans a binary file and extracts sequences of printable characters
(by default, 4 or more consecutive printable bytes). This is surprisingly useful
in practice: you can find hardcoded file paths, configuration values, error
messages, version strings, URLs, SQL queries, and even accidentally embedded
credentials. It works on any file—not just executables—so you can also use it
on core dumps, firmware images, or unknown binary blobs to get a quick sense of
what they contain:
$ strings build/src/debug/binary-inspect/binary-inspect_test | grep "binary-inspect"
# binary-inspect v1.0
# Minimum string length of 10 characters
$ strings -n 10 build/src/debug/binary-inspect/binary-inspect_test
# Show file offset of each string
$ strings -t x build/src/debug/binary-inspect/binary-inspect_test
Useful for finding hardcoded paths, version strings, error messages, and configuration values embedded in binaries.
size: Section Sizes#
size shows the size of each major section in an object file or executable,
giving a quick overview of how binary space is distributed between code
(.text), initialized data (.data), and zero-initialized data (.bss).
This is the fastest way to answer “why is my binary so large?”—if .text is
huge, you have a lot of code (possibly from heavy template instantiation or
inlining); if .data is large, you have big initialized globals or static
arrays; if .bss is large, you have large uninitialized buffers. The -A
flag shows every section individually, which reveals additional contributors
like debug info, exception handling tables, and string tables:
$ size build/src/debug/binary-inspect/binary-inspect_test
# text data bss dec hex filename
# ... ... ... ... ... binary-inspect_test
# Berkeley format (default) vs System V format
$ size -A build/src/debug/binary-inspect/binary-inspect_test
# Shows every section with size and address
Compare object files to see code size impact:
$ size build/src/debug/binary-inspect/CMakeFiles/binary-inspect_test.dir/helper.cc.o
$ size build/src/debug/binary-inspect/CMakeFiles/binary-inspect_test.dir/main.cc.o
strip: Remove Symbols#
strip removes symbol table entries and debug information from a binary,
significantly reducing its size. Production deployments typically strip binaries
to save disk space and memory, and to make reverse engineering harder. A debug
build with -g can easily be 5–10x larger than its stripped counterpart.
You can strip everything (strip binary), strip only debug info while keeping
the symbol table (strip --strip-debug), or output to a separate file
(strip -o). A common workflow is to keep an unstripped copy for debugging
and deploy the stripped version—if a crash occurs, you can use the unstripped
binary with GDB to get full symbol and line information from the core dump:
# Check size before
$ ls -la build/src/debug/binary-inspect/binary-inspect_test
# Strip (use a copy to preserve the original)
$ cp build/src/debug/binary-inspect/binary-inspect_test /tmp/stripped_test
$ strip /tmp/stripped_test
# Check size after
$ ls -la /tmp/stripped_test
# Verify symbols are gone
$ nm /tmp/stripped_test
# nm: /tmp/stripped_test: no symbols
c++filt: Demangle C++ Symbols#
c++filt translates mangled C++ symbol names back to human-readable form.
C++ compilers encode function signatures into symbol names to support
overloading, namespaces, and templates—so _ZN4demo6Widget5valueEv becomes
demo::Widget::value(). Most tools like nm and objdump support -C
for built-in demangling, but c++filt is useful as a standalone filter when
piping output from tools that don’t have a demangle flag (like readelf) or
when you encounter a mangled name in a log file or crash report:
$ echo "_ZN4demo6Widget5valueEv" | c++filt
# demo::Widget::value()
$ nm build/src/debug/binary-inspect/binary-inspect_test | c++filt | grep demo
# ... T demo::Widget::Widget(int)
# ... T demo::Widget::value() const
Practical Examples#
Debugging “undefined reference” linker errors:
# Find which object file defines a symbol
$ nm -C build/src/debug/binary-inspect/CMakeFiles/binary-inspect_test.dir/*.o | grep "T add"
# Find which library provides a symbol
$ nm -DC /usr/lib/x86_64-linux-gnu/libm.so.6 | grep " T sin"
Checking if a binary is stripped:
$ file build/src/debug/binary-inspect/binary-inspect_test | grep -c "not stripped"
Finding why a binary is large:
$ size -A build/src/debug/binary-inspect/binary-inspect_test | sort -k2 -n -r | head