Socket
Introduction
Socket programming is the foundation of network communication in Unix-like operating systems. The Berkeley sockets API, introduced in 4.2BSD (1983), has become the de facto standard for network programming on all major platforms, including Linux, macOS, Windows (through Winsock), and embedded systems.
At its core, a socket is an endpoint for communication, exposed to the program as a file descriptor. The peer can be another process on the same machine or one across a network. The API provides a unified interface for multiple protocols (TCP, UDP) and address families (IPv4, IPv6, Unix domain).
DNS Lookup with gethostbyname
The gethostbyname function resolves a hostname to one or more IPv4 addresses.
Although the function is obsolete (POSIX.1-2001 marks it obsolescent and
POSIX.1-2008 removes it) in favor of getaddrinfo, it remains widely used in
legacy codebases and is simpler for basic IPv4-only lookups. The function
returns a pointer to a hostent structure containing the canonical hostname,
alias names, address type, address length, and a NULL-terminated list of
network addresses. Because that structure lives in static storage that later
calls may overwrite, gethostbyname is not thread-safe; use getaddrinfo in
concurrent applications.
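#include <stdio.h>
#include <netdb.h>
#include <arpa/inet.h>

int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s hostname\n", argv[0]);
        return 1;
    }
    /* The result points into static storage; NULL means the lookup failed. */
    struct hostent *h = gethostbyname(argv[1]);
    if (!h) return 1;
    printf("Host: %s\n", h->h_name);
    /* h_addr_list is a NULL-terminated array of in_addr pointers. */
    for (struct in_addr **p = (struct in_addr **)h->h_addr_list; *p; p++)
        printf("IP: %s\n", inet_ntoa(**p));
}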
$ ./gethostbyname www.google.com
Host: www.google.com
IP: 142.250.80.100
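Because the hostent returned by gethostbyname lives in static storage, callers usually copy what they need right away. The following is a minimal sketch of that pattern, not part of the original example: first_ipv4 is a hypothetical helper name, and it uses inet_ntop rather than inet_ntoa so the text lands in a caller-supplied buffer.

```c
#include <stddef.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper (not a library function): resolve `host` and write
 * its first IPv4 address, as dotted-quad text, into `out`.
 * Returns 0 on success, -1 on failure. */
int first_ipv4(const char *host, char *out, size_t outlen) {
    struct hostent *h = gethostbyname(host);
    if (!h || h->h_addrtype != AF_INET || !h->h_addr_list[0])
        return -1;
    /* Copy out of the static hostent before any other resolver call. */
    return inet_ntop(AF_INET, h->h_addr_list[0], out, (socklen_t)outlen) ? 0 : -1;
}
```

A caller would pass a buffer of at least INET_ADDRSTRLEN bytes, for example char ip[INET_ADDRSTRLEN].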
Byte Order Conversion
The Internet protocols transmit multi-byte integers in big-endian (network)
byte order, while most modern processors (x86, and ARM in its usual
configuration) use little-endian host byte order. The htons
(host-to-network-short) and htonl (host-to-network-long) functions convert
16-bit and 32-bit values from host to network byte order respectively. The
inverse functions ntohs and ntohl convert from network to host byte order.
These conversions are essential for writing portable network code that works
correctly on both big-endian and little-endian machines. On big-endian
systems, these functions are no-ops; the sample output shown here comes from
a little-endian host.
#include <stdio.h>
#include <stdint.h>      /* uint16_t, uint32_t */
#include <arpa/inet.h>

int main(void) {
    uint16_t port = 8080;
    uint32_t addr = 0x7f000001; // 127.0.0.1
    printf("host port: 0x%x -> network: 0x%x\n", port, htons(port));
    printf("host addr: 0x%x -> network: 0x%x\n", addr, htonl(addr));
}
$ ./byteorder
host port: 0x1f90 -> network: 0x901f
host addr: 0x7f000001 -> network: 0x100007f
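One way to see what the conversion guarantees is to look at the bytes in memory rather than the printed integer. The sketch below assumes two hypothetical helpers, put_be32 and get_be32 (names chosen for illustration), that store and load a 32-bit value in network order; the byte layout is the same on any host.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helper: store a host-order value into `out` in network
 * (big-endian) byte order, most significant byte first. */
void put_be32(uint32_t host_value, unsigned char out[4]) {
    uint32_t wire = htonl(host_value);
    memcpy(out, &wire, sizeof(wire));
}

/* Hypothetical helper: the inverse, reading four network-order bytes
 * back into a host-order value. */
uint32_t get_be32(const unsigned char in[4]) {
    uint32_t wire;
    memcpy(&wire, in, sizeof(wire));
    return ntohl(wire);
}
```

For 0x7f000001 the buffer holds 7f 00 00 01 regardless of the host's endianness, which is exactly the order the address 127.0.0.1 appears on the wire.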
Basic TCP Server
A TCP server follows the classic socket-bind-listen-accept pattern. First,
create a socket with socket(), specifying AF_INET for IPv4 and
SOCK_STREAM for TCP. Setting SO_REUSEADDR allows the server to
restart immediately without waiting for the TIME_WAIT state to expire.
The bind() call associates the socket with a local address and port,
while listen() marks it as a passive socket ready to accept connections.
The accept() call blocks until a client connects, returning a new socket
descriptor for that specific connection. This simple echo server receives
data and sends it back to the client before closing the connection.
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0) { perror("socket"); return 1; }
    int on = 1;
    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(5566),
        .sin_addr.s_addr = INADDR_ANY
    };
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(s, 10) < 0) {
        perror("bind/listen");
        return 1;
    }
    for (;;) {
        int c = accept(s, NULL, NULL);
        if (c < 0) continue;
        char buf[1024];
        /* Echo exactly the bytes received; they need not be NUL-terminated. */
        ssize_t n = recv(c, buf, sizeof(buf), 0);
        if (n > 0) send(c, buf, (size_t)n, 0);
        close(c);
    }
}
$ ./tcp-server &
$ echo "Hello" | nc localhost 5566
Hello
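The socket-bind-listen sequence is often wrapped in a helper that checks every step. The sketch below is one way to do that, under the assumption that returning -1 on any failure is acceptable; tcp_listen and bound_port are hypothetical names introduced for illustration.

```c
#include <stdint.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: create a listening IPv4 TCP socket on `port`
 * (0 lets the kernel pick a free port). Returns the descriptor or -1. */
int tcp_listen(uint16_t port, int backlog) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0) return -1;
    int on = 1;
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(port),
        .sin_addr.s_addr = htonl(INADDR_ANY)
    };
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0 ||
        bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(s, backlog) < 0) {
        close(s);   /* do not leak the descriptor on failure */
        return -1;
    }
    return s;
}

/* Hypothetical helper: report the port a socket is bound to (0 on error). */
uint16_t bound_port(int s) {
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    if (getsockname(s, (struct sockaddr *)&addr, &len) < 0) return 0;
    return ntohs(addr.sin_port);
}
```

With this in place, the accept loop of the echo server could start from int s = tcp_listen(5566, 10); and bail out if the result is negative.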
Basic UDP Server
UDP (User Datagram Protocol) is a connectionless protocol, meaning there is
no handshake or persistent connection between client and server. Instead of
accept(), recv(), and send(), UDP servers use recvfrom() and
sendto() which include the remote address as a parameter. Each datagram
is independent and contains the sender’s address, allowing the server to
respond without maintaining any connection state. This makes UDP ideal for
applications where low latency is more important than guaranteed delivery,
such as DNS queries, streaming media, or online gaming. The trade-off is
that UDP provides no built-in reliability, ordering, or flow control.
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void) {
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(5566),
        .sin_addr.s_addr = INADDR_ANY
    };
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }
    for (;;) {
        char buf[1024];
        struct sockaddr_in client;
        socklen_t len = sizeof(client);
        ssize_t n = recvfrom(s, buf, sizeof(buf), 0,
                             (struct sockaddr *)&client, &len);
        if (n < 0) continue;   /* never pass a failed length on to sendto() */
        sendto(s, buf, (size_t)n, 0, (struct sockaddr *)&client, len);
    }
}
$ ./udp-server &
$ echo "Hello" | nc -u localhost 5566
Hello
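The recvfrom/sendto pairing can be isolated into a single step that handles one datagram. The following sketch assumes a hypothetical helper named udp_echo_once; it illustrates how the address filled in by recvfrom is the only state needed to reply.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: receive one datagram on `s` and send it straight
 * back to whoever sent it. Returns the number of bytes echoed, or -1. */
ssize_t udp_echo_once(int s) {
    char buf[1024];
    struct sockaddr_in peer;
    socklen_t len = sizeof(peer);   /* in: buffer size, out: address size */
    ssize_t n = recvfrom(s, buf, sizeof(buf), 0, (struct sockaddr *)&peer, &len);
    if (n < 0) return -1;
    return sendto(s, buf, (size_t)n, 0, (struct sockaddr *)&peer, len);
}
```

Calling this in a loop on a bound SOCK_DGRAM socket gives the same behavior as the server above.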
I/O Multiplexing with select
The select system call enables a single-threaded server to monitor multiple
file descriptors simultaneously, waiting until one or more become ready for
I/O operations. This is the oldest and most portable I/O multiplexing mechanism,
available on virtually all Unix-like systems and Windows. The function takes
three sets of file descriptors (read, write, exception) implemented as bitmasks,
and blocks until at least one descriptor is ready or a timeout occurs. While
select has limitations—notably the FD_SETSIZE limit (typically 1024)
and O(n) scanning overhead—it remains useful for applications with a moderate
number of connections and where portability is paramount.
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>

int main(void) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    int on = 1;
    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(5566),
        .sin_addr.s_addr = INADDR_ANY
    };
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(s, 10) < 0) {
        perror("bind/listen");
        return 1;
    }
    fd_set master, readfds;
    FD_ZERO(&master);
    FD_SET(s, &master);
    for (;;) {
        readfds = master;   /* select() overwrites its sets, so pass a copy */
        if (select(FD_SETSIZE, &readfds, NULL, NULL, NULL) < 0) {
            perror("select");
            return 1;
        }
        for (int i = 0; i < FD_SETSIZE; i++) {
            if (!FD_ISSET(i, &readfds)) continue;
            if (i == s) {
                int c = accept(s, NULL, NULL);
                /* An fd_set cannot hold descriptors >= FD_SETSIZE. */
                if (c >= 0 && c < FD_SETSIZE) FD_SET(c, &master);
                else if (c >= 0) close(c);
            } else {
                char buf[1024];
                ssize_t n = read(i, buf, sizeof(buf));
                if (n <= 0) { close(i); FD_CLR(i, &master); }
                else write(i, buf, (size_t)n);
            }
        }
    }
}
# Terminal 1
$ ./select-server &
# Terminal 2
$ nc localhost 5566
Hello
Hello
# Terminal 3 (concurrent)
$ nc localhost 5566
World
World
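The server above blocks indefinitely, but the same call also accepts a timeout. As a small illustration, the sketch below waits on a single descriptor with a deadline; wait_readable is a hypothetical helper name, and the millisecond parameter is a convenience chosen for this example.

```c
#include <unistd.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: wait up to `timeout_ms` for `fd` to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error (including a descriptor
 * that is too large to fit in an fd_set). */
int wait_readable(int fd, int timeout_ms) {
    if (fd < 0 || fd >= FD_SETSIZE) return -1;
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    struct timeval tv = {
        .tv_sec = timeout_ms / 1000,
        .tv_usec = (timeout_ms % 1000) * 1000
    };
    /* The first argument is the highest descriptor plus one, not a count. */
    int n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0) return -1;
    return n > 0 && FD_ISSET(fd, &readfds);
}
```

Passing fd + 1 instead of FD_SETSIZE keeps the kernel from scanning bits that are known to be unset.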
I/O Multiplexing with poll
The poll system call addresses some limitations of select by using an
array of pollfd structures instead of fixed-size bitmasks. This removes
the FD_SETSIZE limitation and allows monitoring any number of file
descriptors (limited only by system resources). Each pollfd structure
contains the file descriptor, requested events (POLLIN for readable,
POLLOUT for writable), and returned events filled in by the kernel.
While poll still requires O(n) scanning of all descriptors, it provides
a cleaner interface and better scalability than select for applications
with many connections. It is available on all POSIX-compliant systems.
#include <stdio.h>
#include <unistd.h>
#include <poll.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define MAX_FDS 1024   /* capacity of this example's array, not a poll() limit */

int main(void) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    int on = 1;
    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(5566),
        .sin_addr.s_addr = INADDR_ANY
    };
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(s, 10) < 0) {
        perror("bind/listen");
        return 1;
    }
    struct pollfd fds[MAX_FDS];
    fds[0].fd = s;
    fds[0].events = POLLIN;
    int nfds = 1;
    for (;;) {
        if (poll(fds, nfds, -1) < 0) { perror("poll"); return 1; }
        /* Walk backwards: a removal swaps in the last entry, which has
           already been handled, and new clients are appended past i. */
        for (int i = nfds - 1; i >= 0; i--) {
            if (!(fds[i].revents & (POLLIN | POLLHUP | POLLERR))) continue;
            if (fds[i].fd == s) {
                int c = accept(s, NULL, NULL);
                if (c < 0) continue;
                if (nfds == MAX_FDS) { close(c); continue; }   /* array full */
                fds[nfds].fd = c;
                fds[nfds].events = POLLIN;
                fds[nfds++].revents = 0;
            } else {
                char buf[1024];
                ssize_t n = read(fds[i].fd, buf, sizeof(buf));
                if (n <= 0) { close(fds[i].fd); fds[i] = fds[--nfds]; }
                else write(fds[i].fd, buf, (size_t)n);
            }
        }
    }
}
$ ./poll-server &
$ nc localhost 5566
Hello poll
Hello poll
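To make the events/revents split concrete, the sketch below polls a single descriptor and hands back whatever the kernel reported. poll_events is a hypothetical helper name introduced for illustration.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to `timeout_ms` for any of the `want`
 * events on `fd`. Returns the revents mask (0 on timeout) or -1 on error. */
int poll_events(int fd, short want, int timeout_ms) {
    struct pollfd p = {.fd = fd, .events = want, .revents = 0};
    int n = poll(&p, 1, timeout_ms);
    if (n < 0) return -1;
    /* revents may also carry POLLHUP or POLLERR even when not requested. */
    return n == 0 ? 0 : p.revents;
}
```

A caller tests the result with a mask, for example poll_events(fd, POLLIN, 100) & POLLIN, since several bits can be set at once.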
Multithreaded TCP Server
For CPU-bound workloads or when simpler programming models are preferred, spawning a dedicated thread for each client connection is a straightforward approach. The main thread continuously accepts new connections and creates a detached thread to handle each one independently. This model allows true parallel execution on multi-core systems and simplifies the code since each thread can use blocking I/O without affecting other clients. However, this approach has scalability limits: each thread consumes memory for its stack (typically 1-8 MB), and context switching overhead increases with many concurrent connections. Thread-per-connection works well for servers with moderate concurrency (hundreds of clients) or long-lived connections.
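#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/socket.h>
#include <netinet/in.h>

void *handle(void *arg) {
    int c = *(int *)arg;
    free(arg);   /* the accept loop allocated this slot for this thread only */
    char buf[1024];
    ssize_t n;
    while ((n = recv(c, buf, sizeof(buf), 0)) > 0)
        send(c, buf, (size_t)n, 0);
    close(c);
    return NULL;
}

int main(void) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    int on = 1;
    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(5566),
        .sin_addr.s_addr = INADDR_ANY
    };
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(s, 10) < 0) {
        perror("bind/listen");
        return 1;
    }
    for (;;) {
        int c = accept(s, NULL, NULL);
        if (c < 0) continue;
        /* Give each thread its own copy of the descriptor. Sharing one
           variable would race with the next accept(). */
        int *arg = malloc(sizeof(*arg));
        if (!arg) { close(c); continue; }
        *arg = c;
        pthread_t t;
        if (pthread_create(&t, NULL, handle, arg) != 0) {
            close(c);
            free(arg);
            continue;
        }
        pthread_detach(t);
    }
}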
$ ./pthread-server &
$ nc localhost 5566
Hello
Hello
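Since per-thread stack memory is one of the scalability limits mentioned above, a server can request a smaller stack when creating each worker. The sketch below is one possible approach, not the original example's method: spawn_detached and echo_worker are hypothetical names, and the descriptor is carried inside the pointer value (via intptr_t) as an alternative to allocating an argument.

```c
#include <stddef.h>
#include <stdint.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/socket.h>

/* Hypothetical worker in the style of handle(): the descriptor travels in
 * the pointer value itself, so no memory is shared between threads. */
void *echo_worker(void *arg) {
    int c = (int)(intptr_t)arg;
    char buf[1024];
    ssize_t n;
    while ((n = recv(c, buf, sizeof(buf), 0)) > 0)
        send(c, buf, (size_t)n, 0);
    close(c);
    return NULL;
}

/* Hypothetical helper: start fn(arg) on a detached thread with a stack of
 * `stack_bytes`. Returns 0 on success or a pthread error number. */
int spawn_detached(void *(*fn)(void *), void *arg, size_t stack_bytes) {
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0) rc = pthread_attr_setstacksize(&attr, stack_bytes);
    pthread_t t;
    if (rc == 0) rc = pthread_create(&t, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return rc;
}
```

In an accept loop this would be called as spawn_detached(echo_worker, (void *)(intptr_t)c, 256 * 1024); the 256 KiB figure is only an example and must stay above the platform minimum (PTHREAD_STACK_MIN).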
I/O Multiplexing with epoll/kqueue
For high-performance servers handling thousands of concurrent connections,
Linux provides epoll and BSD/macOS provides kqueue. Unlike select
and poll, which hand the full descriptor set to the kernel on every call,
these mechanisms keep the interest set inside the kernel: descriptors are
registered once, and each wait returns only the ready ones, so the cost of a
wait scales with the number of ready events rather than the number of
monitored descriptors. With epoll, you create an
epoll instance with epoll_create1(), register interest in file descriptors
using epoll_ctl(), and wait for events with epoll_wait(). Similarly,
kqueue uses kqueue() to create the queue and kevent() for both
registration and waiting. This example uses preprocessor directives to compile
on both Linux (epoll) and macOS/BSD (kqueue), demonstrating portable
high-performance I/O multiplexing.
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#ifdef __linux__
#include <sys/epoll.h>
#define MAX_EVENTS 64
#elif defined(__APPLE__) || defined(__FreeBSD__)
#include <sys/event.h>
#define MAX_EVENTS 64
#else
#error "this example needs epoll (Linux) or kqueue (macOS/FreeBSD)"
#endif

int main(void) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    int on = 1;
    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(5566),
        .sin_addr.s_addr = INADDR_ANY
    };
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(s, 10) < 0) {
        perror("bind/listen");
        return 1;
    }
#ifdef __linux__
    int eq = epoll_create1(0);
    struct epoll_event ev = {.events = EPOLLIN, .data.fd = s};
    epoll_ctl(eq, EPOLL_CTL_ADD, s, &ev);
    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        /* Only ready descriptors come back, at most MAX_EVENTS per call. */
        int n = epoll_wait(eq, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == s) {
                int c = accept(s, NULL, NULL);
                if (c < 0) continue;
                ev.events = EPOLLIN;
                ev.data.fd = c;
                epoll_ctl(eq, EPOLL_CTL_ADD, c, &ev);
            } else {
                char buf[1024] = {0};
                ssize_t len = read(fd, buf, sizeof(buf) - 1);
                /* Closing the descriptor also drops it from the epoll set. */
                if (len <= 0) { close(fd); }
                else write(fd, buf, (size_t)len);
            }
        }
    }
#elif defined(__APPLE__) || defined(__FreeBSD__)
    int kq = kqueue();
    struct kevent ev;
    EV_SET(&ev, s, EVFILT_READ, EV_ADD, 0, 0, NULL);
    kevent(kq, &ev, 1, NULL, 0, NULL);
    struct kevent events[MAX_EVENTS];
    for (;;) {
        int n = kevent(kq, NULL, 0, events, MAX_EVENTS, NULL);
        for (int i = 0; i < n; i++) {
            int fd = (int)events[i].ident;
            if (fd == s) {
                int c = accept(s, NULL, NULL);
                if (c < 0) continue;
                EV_SET(&ev, c, EVFILT_READ, EV_ADD, 0, 0, NULL);
                kevent(kq, &ev, 1, NULL, 0, NULL);
            } else {
                char buf[1024] = {0};
                ssize_t len = read(fd, buf, sizeof(buf) - 1);
                /* Closing the descriptor also removes its kqueue filters. */
                if (len <= 0) { close(fd); }
                else write(fd, buf, (size_t)len);
            }
        }
    }
#endif
}
# Linux
$ ./epoll-kqueue-server &
$ nc localhost 5566
Hello epoll
Hello epoll
# macOS/BSD
$ ./epoll-kqueue-server &
$ nc localhost 5566
Hello kqueue
Hello kqueue
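The create, register, wait sequence can be seen in isolation by watching a single descriptor with a timeout. The sketch below assumes a hypothetical helper named event_wait_readable and creates a fresh epoll or kqueue instance on every call purely to keep the example short; a real server would create the instance once and reuse it.

```c
#include <stddef.h>
#include <unistd.h>
#ifdef __linux__
#include <sys/epoll.h>
#elif defined(__APPLE__) || defined(__FreeBSD__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#else
#error "this sketch needs epoll or kqueue"
#endif

/* Hypothetical helper: returns 1 if `fd` becomes readable within
 * `timeout_ms`, 0 on timeout, -1 on error. */
int event_wait_readable(int fd, int timeout_ms) {
#ifdef __linux__
    int eq = epoll_create1(0);                       /* 1. create */
    if (eq < 0) return -1;
    struct epoll_event ev = {.events = EPOLLIN, .data.fd = fd};
    struct epoll_event out;
    int n = -1;
    if (epoll_ctl(eq, EPOLL_CTL_ADD, fd, &ev) == 0)  /* 2. register */
        n = epoll_wait(eq, &out, 1, timeout_ms);     /* 3. wait */
    close(eq);
    return n;
#else
    int kq = kqueue();                               /* 1. create */
    if (kq < 0) return -1;
    struct kevent ev, out;
    EV_SET(&ev, fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
    struct timespec ts = {timeout_ms / 1000, (timeout_ms % 1000) * 1000000L};
    /* 2 and 3: a single kevent() call both registers and waits. */
    int n = kevent(kq, &ev, 1, &out, 1, &ts);
    if (n > 0 && (out.flags & EV_ERROR)) n = -1;
    close(kq);
    return n;
#endif
}
```

Because only one event slot is supplied, the return value is at most 1, which doubles as the readable/timeout indicator.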
Unix Domain Socket
Unix domain sockets (also called local sockets or IPC sockets) provide
inter-process communication between processes running on the same host.
They use the filesystem namespace for addressing—the socket is represented
as a special file that must be removed before binding. Unix domain sockets
are significantly faster than TCP/IP loopback (localhost) connections because
they bypass the entire network stack: no routing, no checksums, no packet
fragmentation. They also support passing file descriptors and credentials
between processes, enabling powerful IPC patterns. Use AF_UNIX instead
of AF_INET and sockaddr_un instead of sockaddr_in for addressing.
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void) {
    unlink("/tmp/echo.sock");   /* remove a stale socket file from a previous run */
    int s = socket(AF_UNIX, SOCK_STREAM, 0);
    struct sockaddr_un addr = {.sun_family = AF_UNIX};
    strncpy(addr.sun_path, "/tmp/echo.sock", sizeof(addr.sun_path) - 1);
    bind(s, (struct sockaddr *)&addr, sizeof(addr));
    listen(s, 10);
    for (;;) {
        int c = accept(s, NULL, NULL);
        char buf[1024] = {0};
        ssize_t n = recv(c, buf, sizeof(buf) - 1, 0);
        if (n > 0) send(c, buf, n, 0);
        close(c);
    }
}
$ ./unix-socket &
$ nc -U /tmp/echo.sock
Hello Unix
Hello Unix
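One detail the filesystem addressing brings with it is that sun_path is a small fixed-size array (roughly 100 bytes on common systems), so long paths need an explicit check instead of silent truncation. The sketch below wraps the unlink-bind-listen steps with that check; unix_listen is a hypothetical helper name.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: create a listening Unix domain stream socket at
 * `path`. Returns the descriptor, or -1 on failure. */
int unix_listen(const char *path, int backlog) {
    struct sockaddr_un addr = {.sun_family = AF_UNIX};
    /* Reject paths that do not fit rather than binding a truncated name. */
    if (strlen(path) >= sizeof(addr.sun_path)) return -1;
    strcpy(addr.sun_path, path);
    int s = socket(AF_UNIX, SOCK_STREAM, 0);
    if (s < 0) return -1;
    unlink(path);   /* remove a stale socket file left by a previous run */
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(s, backlog) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

The server above could then begin with int s = unix_listen("/tmp/echo.sock", 10); and should also unlink the path on shutdown so the file does not linger.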
Modern DNS with getaddrinfo
The getaddrinfo function is the modern, protocol-independent replacement
for gethostbyname. It handles both IPv4 and IPv6 addresses transparently,
resolves service names to port numbers, and is fully thread-safe. The function
takes hints specifying the desired address family (AF_INET, AF_INET6,
or AF_UNSPEC for both), socket type, and protocol. It returns a linked
list of addrinfo structures, each carrying the family, socket type, and
protocol to pass to socket(), along with a ready-to-use sockaddr and its
length for bind() or connect().
Always call freeaddrinfo() to release the allocated memory. This function
is the recommended way to write network code that works seamlessly with both
IPv4 and IPv6.
#include <stdio.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s hostname\n", argv[0]);
        return 1;
    }
    struct addrinfo hints = {.ai_family = AF_UNSPEC, .ai_socktype = SOCK_STREAM};
    struct addrinfo *res;
    int rc = getaddrinfo(argv[1], NULL, &hints, &res);
    if (rc != 0) {
        /* getaddrinfo reports its own error codes, not errno. */
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return 1;
    }
    for (struct addrinfo *p = res; p; p = p->ai_next) {
        char ip[INET6_ADDRSTRLEN];
        /* Pick the address field that matches the entry's family. */
        void *addr = p->ai_family == AF_INET
            ? (void *)&((struct sockaddr_in *)p->ai_addr)->sin_addr
            : (void *)&((struct sockaddr_in6 *)p->ai_addr)->sin6_addr;
        inet_ntop(p->ai_family, addr, ip, sizeof(ip));
        printf("%s: %s\n", p->ai_family == AF_INET ? "IPv4" : "IPv6", ip);
    }
    freeaddrinfo(res);
}
$ ./getaddrinfo www.google.com
IPv6: 2607:f8b0:4004:800::2004
IPv4: 142.250.80.100
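The usual way to consume the returned list is to try each entry in turn until one connects, which lets the same code work over IPv4 or IPv6. The following is a minimal sketch of that loop; tcp_connect is a hypothetical helper name, and returning -1 for every kind of failure is a simplification chosen for this example.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: connect to host:service over TCP, trying every
 * address getaddrinfo returns. Returns a connected descriptor or -1. */
int tcp_connect(const char *host, const char *service) {
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* accept IPv4 and IPv6 results */
    hints.ai_socktype = SOCK_STREAM;
    int rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }
    int fd = -1;
    for (struct addrinfo *p = res; p; p = p->ai_next) {
        /* Each entry supplies the arguments for socket() and connect(). */
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0) continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break;
        close(fd);                     /* this address failed; try the next */
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

For instance, tcp_connect("localhost", "5566") would reach any of the echo servers shown earlier; the service argument may be a port number or a service name such as "http".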
IP Address Conversion
The inet_pton (presentation to network) and inet_ntop (network to
presentation) functions convert IP addresses between human-readable text
strings and binary network format. Unlike the older inet_aton and
inet_ntoa functions, these support both IPv4 and IPv6 addresses through
the address family parameter. The inet_pton function returns 1 on success,
0 if the input is not a valid address in the specified family, or -1 on error.
The inet_ntop function returns a pointer to the destination buffer on
success or NULL on error. These functions are essential for parsing IP
addresses from configuration files or user input and for displaying addresses
in log messages or user interfaces.
#include <stdio.h>
#include <arpa/inet.h>

int main(void) {
    // IPv4: text -> binary -> text
    struct in_addr addr4;
    if (inet_pton(AF_INET, "192.168.1.1", &addr4) != 1) return 1;
    char str4[INET_ADDRSTRLEN];
    if (!inet_ntop(AF_INET, &addr4, str4, sizeof(str4))) return 1;
    printf("IPv4: %s (0x%x)\n", str4, ntohl(addr4.s_addr));

    // IPv6: text -> binary -> text
    struct in6_addr addr6;
    if (inet_pton(AF_INET6, "::1", &addr6) != 1) return 1;
    char str6[INET6_ADDRSTRLEN];
    if (!inet_ntop(AF_INET6, &addr6, str6, sizeof(str6))) return 1;
    printf("IPv6: %s\n", str6);
}
$ ./inet-conv
IPv4: 192.168.1.1 (0xc0a80101)
IPv6: ::1
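The return-value convention of inet_pton makes it usable as a validator for untrusted input. As a small illustration, the sketch below classifies a string by trying each family in turn; ip_family is a hypothetical helper name.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: returns AF_INET or AF_INET6 when `text` is a valid
 * literal address of that family, or -1 when it is neither. */
int ip_family(const char *text) {
    struct in_addr a4;
    struct in6_addr a6;
    /* inet_pton returns 1 only for a well-formed address in that family. */
    if (inet_pton(AF_INET, text, &a4) == 1) return AF_INET;
    if (inet_pton(AF_INET6, text, &a6) == 1) return AF_INET6;
    return -1;
}
```

Hostnames such as "localhost" are rejected because inet_pton performs no name resolution; that job belongs to getaddrinfo.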
Pipe Communication
Pipes provide unidirectional inter-process communication. The pipe() system
call creates a pair of file descriptors: fd[0] for reading and fd[1]
for writing. Data written to the write end can be read from the read end in
FIFO order. Pipes are commonly used after fork() to establish communication
between parent and child processes. The typical pattern is for one process to
close the read end and write, while the other closes the write end and reads.
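#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];
    if (pipe(fd) < 0) { perror("pipe"); return 1; }
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return 1; }
    if (pid == 0) {
        close(fd[1]);   /* the child only reads */
        char buf[128];
        ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
        if (n > 0) {
            buf[n] = '\0';
            printf("Child received: %s\n", buf);
        }
        close(fd[0]);
    } else {
        close(fd[0]);   /* the parent only writes */
        const char *msg = "Hello from parent";
        write(fd[1], msg, strlen(msg) + 1);
        close(fd[1]);
        wait(NULL);     /* reap the child so its output precedes the shell prompt */
    }
}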
$ ./pipe
Child received: Hello from parent
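Closing the unused ends matters because a reader only sees end-of-file once every write end is closed, and a single read may return less than what was written. The sketch below shows a read loop built on that rule; read_until_eof is a hypothetical helper name.

```c
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>

/* Hypothetical helper: read from `fd` until EOF or until `cap` bytes have
 * been stored. Returns the number of bytes read, or -1 on error. */
ssize_t read_until_eof(int fd, char *buf, size_t cap) {
    size_t total = 0;
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n == 0) break;                 /* EOF: all write ends are closed */
        if (n < 0) {
            if (errno == EINTR) continue;  /* interrupted by a signal; retry */
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

If the reading process forgets to close its own copy of the write end, this loop never sees EOF and blocks forever, which is the usual symptom of a missed close().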
Bidirectional IPC with socketpair
Unlike pipe() which is unidirectional, socketpair() creates a pair of
connected Unix domain sockets that support bidirectional communication. Each
end can both read and write, making it ideal for two-way IPC between related
processes. This is simpler than creating two pipes for full-duplex communication
and provides socket semantics including the ability to pass file descriptors
between processes using SCM_RIGHTS.
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];
    socketpair(AF_UNIX, SOCK_STREAM, 0, fd);
    pid_t pid = fork();
    if (pid == 0) {
        close(fd[0]);   /* the child keeps fd[1] */
        char buf[128];
        read(fd[1], buf, sizeof(buf));
        printf("Child got: %s\n", buf);
        write(fd[1], "Hi parent", 10);
        close(fd[1]);
    } else {
        close(fd[1]);   /* the parent keeps fd[0] */
        write(fd[0], "Hi child", 9);
        char buf[128];
        read(fd[0], buf, sizeof(buf));
        printf("Parent got: %s\n", buf);
        close(fd[0]);
        wait(NULL);
    }
}
$ ./socketpair
Child got: Hi child
Parent got: Hi parent
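Descriptor passing with SCM_RIGHTS goes through sendmsg() and recvmsg() with a control message attached to ordinary data. The sketch below follows the pattern documented for cmsg on Linux and the BSDs; send_fd and recv_fd are hypothetical helper names, and sending a single dummy byte alongside the descriptor is an assumption made to keep the example portable across stream sockets.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical helper: pass `fd_to_send` across Unix socket `sock`. */
int send_fd(int sock, int fd_to_send) {
    char byte = 0;   /* one byte of ordinary data carries the control message */
    struct iovec iov = {.iov_base = &byte, .iov_len = 1};
    union {          /* the union keeps the buffer aligned for cmsghdr */
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctl;
    memset(&ctl, 0, sizeof(ctl));
    struct msghdr msg = {0};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);
    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive a descriptor sent by send_fd().
 * Returns the new descriptor in this process, or -1 on failure. */
int recv_fd(int sock) {
    char byte;
    struct iovec iov = {.iov_base = &byte, .iov_len = 1};
    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctl;
    struct msghdr msg = {0};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);
    if (recvmsg(sock, &msg, 0) <= 0) return -1;
    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    if (!cm || cm->cmsg_level != SOL_SOCKET || cm->cmsg_type != SCM_RIGHTS)
        return -1;
    int fd;
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}
```

The received descriptor has a different number from the one sent but refers to the same open file, so after a fork() the parent can hand the child a file or socket that was opened only after the fork.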