Pointers, Functions, Structures, and Dynamic Memory in C

In this post I explain pointers, functions, the preprocessor, structures, and dynamic memory in C, from core rules to practical safe-code patterns.

This material targets C23 while keeping compatibility with the older C11 and C17 standards. To verify the examples, compile with -std=c23 -Wall -Wextra -Wpedantic -Werror, so that every warning is treated as a compilation error.

Pointers

Pointers let you store the address of an object in memory and work with that object indirectly. This is the key mechanism behind arrays, strings, pass-by-address patterns, and dynamic memory.

Formal rules
  • You may dereference only a valid pointer to an existing object of a suitable type.
  • Pointer arithmetic is defined only within a single array and the one-past position just beyond its end.
  • Ordering comparison of pointers is valid only for elements of the same array.
  • After free, the pointer becomes dangling, and reading or dereferencing it is invalid.

What pointers are

A pointer is a variable whose value is the address of another object. The pointer type defines how memory at that address is interpreted: int * reads an integer, double * reads a floating-point value, and char * reads character bytes.

The most common beginner mistake is dereferencing an uninitialized or null pointer. The safe pattern is simple: initialize pointers immediately, check for NULL before access, and do not keep an address longer than the pointed object lives.

#include <stdio.h>

int main(void) {
  int value = 42;
  int *ptr = &value;

  printf("value = %d\n", value);
  printf("*ptr = %d\n", *ptr);

  *ptr = 100;
  printf("value after pointer write = %d\n", value);
  return 0;
}

Expected output:

value = 42
*ptr = 42
value after pointer write = 100

Pointer operations

Core pointer operations are taking an address with &, dereferencing with *, assigning an address, checking against NULL, and comparing for equality or inequality. In real code, most pointer work reduces to these primitives.

It is important not to confuse the address with the value. If you write int *p, then p stores an address, while *p is the value at that address.

#include <stddef.h>
#include <stdio.h>

int main(void) {
  int x = 10;
  int y = 20;
  int *p = &x;

  printf("p points to x: %d\n", *p);
  p = &y;
  *p += 5;
  printf("y after update: %d\n", y);

  p = NULL;
  printf("p == NULL -> %d\n", p == NULL);
  return 0;
}

Expected output:

p points to x: 10
y after update: 25
p == NULL -> 1

Pointer arithmetic

When you add one to a pointer, it moves by the size of one element of its type. For int * that is usually 4 bytes; for double * it is usually 8. The difference between two pointers is the number of elements between them, not the number of bytes.

Going outside array bounds is undefined behavior. The one-past position is allowed for comparisons and distance calculations, but dereferencing a one-past pointer is forbidden.

one-past means the position immediately after the last element of the same array. For an array of 5 elements arr, that is arr + 5 (equivalently &arr[5]). Such a pointer does not refer to a real object, so *one_past is invalid, but comparisons and distance calculations are still safe.

#include <stddef.h>
#include <stdio.h>

int main(void) {
  int arr[] = {10, 20, 30, 40, 50};
  int *a = &arr[1];
  int *b = &arr[4];
  int *one_past = arr + 5; // position immediately after arr[4]

  printf("*a = %d\n", *a);
  printf("*(a + 2) = %d\n", *(a + 2));
  printf("b - a = %td\n", b - a);
  printf("one_past == &arr[5] -> %d\n", one_past == &arr[5]);
  printf("distance(one_past - a) = %td\n", one_past - a);

  for (int *p = arr; p != one_past; p++) {
    printf("%d ", *p);
  }
  printf("\n");

  // printf("%d\n", *one_past); // UB: you cannot dereference one-past
  return 0;
}

Expected output:

*a = 20
*(a + 2) = 40
b - a = 3
one_past == &arr[5] -> 1
distance(one_past - a) = 4
10 20 30 40 50 

const and pointers

Three forms matter most: const int *p means the object cannot be modified through the pointer, int *const p means the pointer itself cannot be rebound, and const int *const p means both restrictions apply.

Mistakes in const placement produce incorrect API contracts. If a function must not modify input data, use const in the signature. That helps both the reader and the compiler.

#include <stdio.h>

int main(void) {
  int value = 7;
  int other = 11;

  const int *ptr_to_const = &value;
  int *const const_ptr = &value;
  const int *const const_both = &value;

  ptr_to_const = &other;
  *const_ptr = 8;

  printf("*ptr_to_const = %d\n", *ptr_to_const);
  printf("value via const_ptr = %d\n", value);
  printf("*const_both = %d\n", *const_both);
  return 0;
}

Expected output:

*ptr_to_const = 11
value via const_ptr = 8
*const_both = 8

Pointers and arrays

In most expressions, an array name implicitly converts to a pointer to its first element. That is why both arr[i] and *(arr + i) read the same element.

The critical nuance is that inside a function, an array parameter is already a pointer. That is why sizeof(arr) in a parameter gives the pointer size, not the original array length. Length must be passed separately.

#include <stddef.h>
#include <stdio.h>

int sum_array(const int *arr, size_t n) {
  int sum = 0;
  for (size_t i = 0; i < n; i++) {
    sum += *(arr + i);
  }
  return sum;
}

int main(void) {
  int numbers[] = {3, 6, 9, 12};
  size_t n = sizeof(numbers) / sizeof(numbers[0]);
  printf("sum = %d\n", sum_array(numbers, n));
  return 0;
}

Expected output:

sum = 30

Pointers and strings

A string in C is an array of characters terminated by the null byte '\0'. A string literal is normally stored in read-only memory, so it should be kept as const char *.

Trying to modify a string literal is undefined behavior. If the string must be mutable, create a writable array such as char text[] = "..." instead of a pointer to a literal.

#include <stdio.h>

int main(void) {
  char mutable_text[] = "hello";
  const char *literal = "world";

  char *p = mutable_text;
  p[0] = 'H';

  printf("mutable_text = %s\n", mutable_text);
  printf("literal = %s\n", literal);
  return 0;
}

Expected output:

mutable_text = Hello
literal = world

Arrays of pointers and multilevel indirection

An array of pointers stores addresses of several objects of the same type. This is the basic tool for string tables, tables of function addresses, and interfaces like char **argv.

Multilevel indirection such as int ** or char *** is useful when a function needs to modify the caller's pointer itself. At every level of indirection, you need to understand clearly what address the current variable stores.

#include <stdio.h>

int main(void) {
  const char *colors[] = {"red", "green", "blue"};
  const char **pp = colors;

  int value = 123;
  int *p = &value;
  int **pp_int = &p;

  printf("colors[1] = %s\n", colors[1]);
  printf("*(pp + 2) = %s\n", *(pp + 2));
  printf("**pp_int = %d\n", **pp_int);
  return 0;
}

Expected output:

colors[1] = green
*(pp + 2) = blue
**pp_int = 123

Functions

Functions are the basis of modularity in C. They isolate responsibility, define parameter and result contracts, and create extension points through function pointers.

Formal rules
  • A function must be declared before the first call, via a prototype or a full definition.
  • Arguments are passed by value; pass a pointer when you need a by-reference effect.
  • Recursion needs a correct base case, otherwise the call stack will overflow.
  • Calling through an incompatible function-pointer type is undefined behavior.

Declaring and defining functions

A declaration, or prototype, fixes the signature: the name, result type, and parameters. A definition contains the function body. Separating the two is useful for interface headers and implementations in separate files.

If the prototype and the definition disagree, you get compilation errors or dangerous conversions at the call site. That is why prototypes should always stay in sync with the implementation.

#include <stdio.h>

int square(int x);

int square(int x) {
  return x * x;
}

int main(void) {
  printf("square(9) = %d\n", square(9));
  return 0;
}

Expected output:

square(9) = 81

Function parameters

Function parameters are local variables initialized from the argument values. In C, arguments are always passed by value, even when the argument itself is a pointer.

That is why modifying a normal parameter inside the function does not change the caller's original variable. To modify external data, pass the address of that object.

#include <stdio.h>

void inc_copy(int x) {
  x++;
  printf("inside inc_copy: x = %d\n", x);
}

int main(void) {
  int n = 10;
  inc_copy(n);
  printf("outside after call: n = %d\n", n);
  return 0;
}

Expected output:

inside inc_copy: x = 11
outside after call: n = 10

Function result

The result type defines what the function returns via return. That can be a scalar, a pointer, or even a structure if the copy cost is acceptable.

You must not return the address of an automatic local variable: once the function exits, it is destroyed and the pointer becomes dangling. For more complex results, use a structure or an external buffer.

#include <stdio.h>

typedef struct {
  int sum;
  int diff;
} PairResult;

PairResult calc(int a, int b) {
  PairResult r;
  r.sum = a + b;
  r.diff = a - b;
  return r;
}

int main(void) {
  PairResult r = calc(11, 4);
  printf("sum = %d, diff = %d\n", r.sum, r.diff);
  return 0;
}

Expected output:

sum = 15, diff = 7

Recursive functions

Recursion means a function calling itself to solve a smaller subproblem. That approach is useful for trees, graph traversals, and tasks that naturally break down into repeated steps.

The main risk is a missing stopping condition or too much call depth. In that case the program may terminate with a stack overflow, so always define the base case explicitly.

#include <stdio.h>

unsigned long long factorial(unsigned int n) {
  if (n <= 1) {
    return 1ULL;
  }
  return n * factorial(n - 1);
}

int main(void) {
  printf("factorial(5) = %llu\n", factorial(5));
  return 0;
}

Expected output:

factorial(5) = 120

Variable scope

Scope defines where a variable name is visible. Function-local variables are visible only inside the function, while block-local variables are visible only inside the corresponding block { ... }.

Shadowing, where an inner name hides an outer one, often makes code harder to read and masks bugs. In critical code, prefer distinct names and short blocks.

#include <stdio.h>

int g = 100;

int main(void) {
  int x = 10;
  {
    int x = 20;
    printf("inner x = %d\n", x);
  }
  printf("outer x = %d\n", x);
  printf("global g = %d\n", g);
  return 0;
}

Expected output:

inner x = 20
outer x = 10
global g = 100

External objects

External objects are global variables with static storage duration, so they exist for the whole lifetime of the program. They can be declared in one translation unit with extern and defined in another.

Mutable global state makes maintenance and testing harder. The practical approach is to minimize the amount of external state and document who is allowed to change it.

#include <stdio.h>

extern int g_counter;

void tick(void) {
  g_counter++;
}

int g_counter = 0;

int main(void) {
  tick();
  tick();
  printf("g_counter = %d\n", g_counter);
  return 0;
}

Expected output:

g_counter = 2

Storage classes

Storage classes in C define three things: the lifetime of the object, its scope, and for file-scope objects, its linkage.

  • auto: automatic storage for block-local variables. Lifetime lasts from entry to exit of the block.
  • register: also automatic storage, historically suggesting the compiler keep the variable in a register. Modern compilers decide optimization themselves.
  • extern: declaration of an external object or function defined elsewhere.
  • static: for a local variable, static storage duration across calls; for a file-scope object, internal linkage limited to the current translation unit.

The example below shows the four storage classes listed above in one place: auto, register, extern, and two forms of static.

#include <stdio.h>

extern int shared_counter;           // extern: declaration of an external object
static int file_local_total = 100;   // static (file scope): internal linkage
int shared_counter = 10;             // definition of the external object

void use_static_local(void) {
  static int calls = 0;
  calls++;
  printf("static local calls = %d\n", calls);
}

void bump_extern(void) {
  shared_counter++;
  printf("extern shared_counter = %d\n", shared_counter);
}

int main(void) {
  auto int local_auto = 5;
  register int i;
  register int sum = 0;

  printf("auto local_auto = %d\n", local_auto);
  printf("static file_local_total = %d\n", file_local_total);

  for (i = 1; i <= 3; i++) {
    sum += i;
  }
  printf("register sum = %d\n", sum);

  use_static_local();
  use_static_local();
  bump_extern();

  return 0;
}

Expected output:

auto local_auto = 5
static file_local_total = 100
register sum = 6
static local calls = 1
static local calls = 2
extern shared_counter = 11

Pointers in function parameters

If a function needs to modify data owned by the caller, it must receive the address of that object. That pattern is used for swap operations, output parameters, and buffer manipulation.

Before dereferencing a pointer parameter, check it against NULL. For APIs, that is a simple guard against crashes, especially when functions are called from multiple modules or layers.

#include <stdio.h>

void swap_int(int *a, int *b) {
  if (a == NULL || b == NULL) {
    return;
  }

  int tmp = *a;
  *a = *b;
  *b = tmp;
}

int main(void) {
  int x = 3;
  int y = 9;
  swap_int(&x, &y);
  printf("x = %d, y = %d\n", x, y);
  return 0;
}

Expected output:

x = 9, y = 3

Function pointers

A function pointer stores the address of executable code for a function with a specific signature. This is the basis for callbacks, handler tables, and parameterized algorithms.

The basic syntax is return_type (*name)(param_types). Parentheses around *name are required: without them, return_type *name(param_types) declares a function that returns a pointer rather than a pointer variable.

Assignment works both with a bare function name and with an explicit address: fn = add; and fn = &add; are equivalent. Calls are equivalent too: fn(2, 5) and (*fn)(2, 5).

Formal rules
  • The function-pointer type must be compatible with the type of the called function.
  • p(args) and (*p)(args) are equivalent for a function pointer.
  • Pointer arithmetic does not apply to function pointers.
  • Use typedef for complex signatures to reduce declaration mistakes.

In practice, it helps to read such declarations inside out: first the name, then the pointer level, then the parameter list, and finally the result type.

#include <stdio.h>

int add(int a, int b) {
  return a + b;
}

int mul(int a, int b) {
  return a * b;
}

int apply(int (*op)(int, int), int x, int y) {
  if (op == NULL) {
    return 0;
  }
  return op(x, y);
}

int main(void) {
  int (*fn)(int, int) = add;
  int (*table[2])(int, int) = {add, mul};

  printf("fn(add): %d\n", fn(2, 5));
  fn = &mul;
  printf("fn(mul): %d\n", (*fn)(2, 5));
  printf("table[0]: %d\n", apply(table[0], 3, 4));
  printf("table[1]: %d\n", apply(table[1], 3, 4));
  return 0;
}

Expected output:

fn(add): 7
fn(mul): 10
table[0]: 7
table[1]: 12

Function type

In C, a function type and a pointer-to-function type are different things. For readability, complex signatures often get a typedef so a callback type can have a short name.

A function type is useful as an API contract. If the signature changes, the compiler immediately shows every place where the contract was broken.

#include <stdio.h>

typedef int operation_fn(int, int);

int sub(int a, int b) {
  return a - b;
}

int run(operation_fn *op, int x, int y) {
  return op(x, y);
}

int main(void) {
  printf("sub: %d\n", run(sub, 10, 3));
  return 0;
}

Expected output:

sub: 7

Functions as parameters of other functions

Passing a function as a parameter lets you separate the data traversal algorithm from the specific operation applied to each element. That reduces duplication and improves reuse.

A callback parameter usually looks like return_type (*name)(param_types). For more complex APIs, a typedef often makes the contract easier to read.

Formal rules
  • The callback type must be compatible with the actual function that is passed.
  • fn(x) and (*fn)(x) are equivalent for a function pointer.
  • If the callback may be absent, the API should handle NULL explicitly before calling it.
  • Function pointers do not support pointer arithmetic.

As a reading trick, first look at the parameter name, then the pointer level, then the arguments and result.

#include <stddef.h>
#include <stdio.h>

void map_int(int *arr, size_t n, int (*fn)(int)) {
  for (size_t i = 0; i < n; i++) {
    arr[i] = fn(arr[i]);
  }
}

int square_value(int x) {
  return x * x;
}

int main(void) {
  int a[] = {1, 2, 3, 4};
  map_int(a, 4, square_value);
  printf("%d %d %d %d\n", a[0], a[1], a[2], a[3]);
  return 0;
}

Expected output:

1 4 9 16

Function as the result of another function

A function cannot return another function directly, but it can return a pointer to a function. This is a common way to choose a computation strategy based on configuration.

Using a typedef is usually the most readable variant. Without typedef, the syntax is harder to parse, but the rule is the same: the function result is a function pointer.

Formal rules
  • The returned function pointer must be compatible with the actual target function.
  • NULL is the usual way to signal that no suitable function exists.
  • op(args) and (*op)(args) are equivalent for a valid function pointer.
  • Calling through an incompatible function-pointer type is undefined behavior.

#include <stdio.h>

typedef int (*binary_op)(int, int);

int add_op(int a, int b) {
  return a + b;
}

int max_op(int a, int b) {
  return (a > b) ? a : b;
}

binary_op select_op(char mode) {
  if (mode == '+') {
    return add_op;
  }
  if (mode == 'm') {
    return max_op;
  }
  return NULL;
}

int (*select_op_raw(char mode))(int, int) {
  return select_op(mode);
}

int main(void) {
  binary_op op1 = select_op('m');
  int (*op2)(int, int) = select_op_raw('+');

  if (op1 != NULL) {
    printf("select_op('m') -> %d\n", op1(8, 5));
  }
  if (op2 != NULL) {
    printf("select_op_raw('+') -> %d\n", op2(8, 5));
  }
  return 0;
}

Expected output:

select_op('m') -> 8
select_op_raw('+') -> 13

Functions with a variable number of parameters

Variadic functions work through stdarg.h: va_list, va_start, va_arg, and va_end. This is the pattern used by APIs like printf and many logging systems.

How the core varargs pieces are used step by step:

  • va_list: stores the current state of iteration over the unnamed arguments.
  • va_start(ap, last_named_param): initializes the traversal.
  • va_arg(ap, T): reads the next argument as type T and advances the internal state.
  • va_end(ap): closes the traversal and must be called before leaving the function.

If you need to traverse the arguments twice, use va_copy. Because varargs have no static type checking, reading an argument with the wrong type, or reading more arguments than were passed, is undefined behavior that usually shows up as garbage values.

#include <stdarg.h>
#include <stdio.h>

int sum_ints(size_t n, ...) {
  va_list ap;
  va_start(ap, n);

  int sum = 0;
  for (size_t i = 0; i < n; i++) {
    sum += va_arg(ap, int);
  }

  va_end(ap);
  return sum;
}

int main(void) {
  printf("sum = %d\n", sum_ints(5, 10, 20, 30, 40, 50));
  return 0;
}

Expected output:

sum = 150

Command-line arguments

In the signature main(int argc, char *argv[]), the parameter argc stores the number of arguments, while argv is an array of strings. The element argv[0] usually contains the program name.

When processing arguments, always check bounds and expected formats. The valid indices are 0 through argc - 1; argv[argc] is guaranteed to be a null pointer, and reading anything past it steps outside the array of pointers.

#include <stdio.h>

int main(int argc, char *argv[]) {
  printf("argc = %d\n", argc);

  for (int i = 0; i < argc; i++) {
    printf("argv[%d] = %s\n", i, argv[i]);
  }

  return 0;
}

Expected output:

# example run: ./app Alice 42
argc = 3
argv[0] = ./app
argv[1] = Alice
argv[2] = 42

Preprocessor

The preprocessor runs before compilation: it inserts headers, expands macros, and controls conditional compilation. It is powerful, but it is also a textual stage, so safety rules matter even more here.

#include directive

The #include directive inserts the contents of a header file at the inclusion point. Angle brackets are usually used for system headers, while quotes are used for project-local headers.

In large codebases, include guards or #pragma once are critical to avoid redefinition problems. Cyclic includes and unnecessary heavy dependencies in public headers are also best avoided.

#include <stdio.h>
#include <string.h>

int main(void) {
  const char *text = "include";
  printf("length = %zu\n", strlen(text));
  return 0;
}

Expected output:

length = 7

#define directive

#define creates textual substitution that the preprocessor performs before C syntax is analyzed. Constant-like macros are often used for buffer sizes, build flags, and compile-time configuration.

Macros do not know types, so for type-safe constants it is often better to use const or enum. For public macro names, project-specific prefixes help avoid collisions.

#include <stdio.h>

#define APP_NAME "c-demo"
#define BUFFER_SIZE 64

int main(void) {
  printf("app = %s\n", APP_NAME);
  printf("buffer = %d\n", BUFFER_SIZE);
  return 0;
}

Expected output:

app = c-demo
buffer = 64

Macros

Function-like macros are used for short templates that must work at preprocessing time. A safe style is to wrap arguments and the entire expression in parentheses, and to use do { ... } while (0) for multi-line macros.

Macros with side effects are dangerous. For example, SQUARE(i++) expands to ((i++) * (i++)), which modifies the variable twice without sequencing and is therefore undefined behavior. In such cases a normal function is safer.

Because the preprocessor performs raw textual substitution, it does not respect operator precedence for you. That is why macros like #define BAD_SQUARE(x) x * x are unsafe and why parentheses are mandatory.

#include <stdio.h>

#define SQUARE(x) ((x) * (x))
#define SWAP_INT(a, b) \
  do { \
    int tmp = (a); \
    (a) = (b); \
    (b) = tmp; \
  } while (0)

int main(void) {
  int x = 2;
  int y = 9;
  int v = SQUARE(x + 1);

  SWAP_INT(x, y);

  printf("v = %d\n", v);
  printf("x = %d, y = %d\n", x, y);
  return 0;
}

Expected output:

v = 9
x = 9, y = 2

Conditional compilation

Conditional compilation with #if, #ifdef, and #ifndef lets you include platform-specific or debug-only code without runtime branching.

It is important to test both branches regularly in CI. Otherwise the rarely used branch degrades quickly and breaks as soon as the flag changes.

#include <stdio.h>

#define DEBUG 1

int main(void) {
  int value = 42;

#if DEBUG
  printf("DEBUG: value = %d\n", value);
#else
  printf("release mode\n");
#endif

  printf("result = %d\n", value * 2);
  return 0;
}

Expected output:

DEBUG: value = 42
result = 84

Builtin macros

Standard builtin macros such as __FILE__, __LINE__, __DATE__, and __TIME__ are useful for diagnostics and simple logging.

These values depend on the build environment and should not participate in critical business logic. They are best kept in debug output and crash logs.

#include <stdio.h>

int main(void) {
  printf("file: %s\n", __FILE__);
  printf("line: %d\n", __LINE__);
  printf("date: %s\n", __DATE__);
  printf("time: %s\n", __TIME__);
  return 0;
}

Expected output:

file: demo.c
line: 5
date: Feb 18 2026
time: 12:34:56
# line/date/time depend on the file and build moment

Structures

Structures group related data into one type and define an explicit model of the domain. This is a core tool for designing interfaces between C modules.

Formal rules
  • Structure fields are laid out in declaration order, though padding may appear between them.
  • . works on a structure object, while -> works on a pointer.
  • Fields in a union share one memory region and overlap each other.
  • The size and placement of bit fields have implementation-defined aspects.

Defining structures

A structure is defined with struct and a list of fields. For convenience, code often adds an alias through typedef so the type can be used without the struct prefix.

Mistakes in field names and field types tend to spread through the whole codebase, so a structure should be treated as a stable contract, especially in serialization formats and inter-module APIs.

#include <stdio.h>

typedef struct {
  double x;
  double y;
} Point;

int main(void) {
  Point p = {3.0, 4.0};
  printf("point = (%.1f, %.1f)\n", p.x, p.y);
  return 0;
}

Expected output:

point = (3.0, 4.0)

Structures inside structures

A structure field can itself be another structure. This is useful for hierarchies: for example, a user may contain an address, and the address may contain a ZIP code and string fields.

With nested structures, it is worth controlling initialization on every level. Partial initialization is allowed: members without an initializer are zero-initialized, which is easy to overlook when the zero value was not intended.

#include <stdio.h>

typedef struct {
  const char *city;
  int zip;
} Address;

typedef struct {
  const char *name;
  Address address;
} User;

int main(void) {
  User u = {"Alex", {"Moscow", 101000}};
  printf("%s -> %s, %d\n", u.name, u.address.city, u.address.zip);
  return 0;
}

Expected output:

Alex -> Moscow, 101000

Pointers to structures

Structure pointers are needed when the object is large, must be modified in a function, or lives in dynamic memory. Field access through a pointer is usually written with ->, which is shorthand for (*ptr).field.

When passing a structure pointer, the object lifetime must remain valid. Access through a dangling pointer causes the same sort of undefined behavior as with any other type.

#include <stdio.h>

typedef struct {
  int id;
  double score;
} Record;

void update_score(Record *r, double score) {
  if (r == NULL) {
    return;
  }
  r->score = score;
}

int main(void) {
  Record rec = {1, 0.0};
  Record *p = &rec;

  update_score(p, 98.5);
  printf("via -> : id=%d score=%.1f\n", p->id, p->score);
  printf("via (*p). : id=%d score=%.1f\n", (*p).id, (*p).score);
  return 0;
}

Expected output:

via -> : id=1 score=98.5
via (*p). : id=1 score=98.5

Arrays of structures

An array of structures is convenient for storing homogeneous records: users, points, transactions, or queue tasks. Indexing works the same way as with arrays of primitive values.

With a linear scan over an array of structures, the complexity is still O(n). As data grows, it may be worth designing indexes or sorting with binary search.

#include <stdio.h>

typedef struct {
  const char *name;
  int age;
} Person;

int main(void) {
  Person team[] = {{"Ann", 23}, {"Bob", 29}, {"Kate", 31}};
  size_t n = sizeof(team) / sizeof(team[0]);

  for (size_t i = 0; i < n; i++) {
    printf("%s: %d\n", team[i].name, team[i].age);
  }

  return 0;
}

Expected output:

Ann: 23
Bob: 29
Kate: 31

Structures and functions

A structure can be passed to a function by value or by pointer. Passing by value copies the whole object, while passing by pointer works with the original data without a copy.

For small immutable structures, a copy is often fine. For large structures or frequent calls, it is more efficient to pass const T * for reading and T * for mutation.

#include <stdio.h>

typedef struct {
  int w;
  int h;
} Size;

int area_by_value(Size s) {
  return s.w * s.h;
}

void scale(Size *s, int k) {
  if (s == NULL) {
    return;
  }
  s->w *= k;
  s->h *= k;
}

int main(void) {
  Size s = {3, 4};
  printf("area = %d\n", area_by_value(s));
  scale(&s, 2);
  printf("scaled = %d x %d\n", s.w, s.h);
  return 0;
}

Expected output:

area = 12
scaled = 6 x 8

Layout of structures and their fields in memory

The size of a structure can be larger than the sum of field sizes because of alignment and padding. This affects memory usage, binary protocols, and layout compatibility across compilers.

The key alignment idea is that each field usually needs to begin at an address that is a multiple of its alignment. If the current offset does not match, the compiler inserts padding bytes before the next field.

  • Internal padding: bytes inserted between fields to align the next field.
  • Tail padding: bytes inserted at the end so sizeof(struct) is a multiple of the maximum field alignment.
  • Field order affects the final size, and placing wider fields earlier often reduces padding.

You should not serialize a structure as raw bytes without explicit layout control. For external formats, use explicit field packing, fixed-width types, and validate offsets with offsetof.

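In C23, alignof(T) returns the alignment requirement of type T in bytes, which helps explain why the compiler inserts padding. alignof is a keyword in C23; under C11 and C17 the same spelling is provided as a macro by <stdalign.h>, so the example includes that header to build in either mode.

#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>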

typedef struct {
  char tag;
  int value;
  short code;
} ItemA;

typedef struct {
  int value;
  short code;
  char tag;
} ItemB;

int main(void) {
  printf("alignof(char) = %zu\n", alignof(char));
  printf("alignof(short) = %zu\n", alignof(short));
  printf("alignof(int) = %zu\n", alignof(int));

  printf("sizeof(ItemA) = %zu\n", sizeof(ItemA));
  printf("A offsets: tag=%zu value=%zu code=%zu\n",
         offsetof(ItemA, tag), offsetof(ItemA, value), offsetof(ItemA, code));

  printf("sizeof(ItemB) = %zu\n", sizeof(ItemB));
  printf("B offsets: value=%zu code=%zu tag=%zu\n",
         offsetof(ItemB, value), offsetof(ItemB, code), offsetof(ItemB, tag));
  return 0;
}

Expected output:

alignof(char) = 1
alignof(short) = 2
alignof(int) = 4
sizeof(ItemA) = 12
A offsets: tag=0 value=4 code=8
sizeof(ItemB) = 8
B offsets: value=0 code=4 tag=6
# exact values depend on the ABI and compiler

Compound literals

A compound literal lets you create a temporary object of a given type directly inside an expression, for example (Point){1.0, 2.0}. This reduces boilerplate and keeps function calls compact.

The lifetime of a compound literal depends on context. A literal with block scope has automatic storage, so its address must not be used after leaving the block.

#include <stdio.h>

typedef struct {
  double x;
  double y;
} Point;

void print_point(Point p) {
  printf("(%.1f, %.1f)\n", p.x, p.y);
}

int main(void) {
  print_point((Point){1.5, 2.5});

  int *arr = (int[]){10, 20, 30};
  printf("arr[2] = %d\n", arr[2]);
  return 0;
}

Expected output:

(1.5, 2.5)
arr[2] = 30

Enumerations

Enumerations, enum, define a named set of integer constants. They improve readability and reduce the number of magic numbers in conditions and state machines.

The basic syntax is enum Tag { A, B, C };. Values can be set explicitly or left implicit, in which case each next enumerator is one larger than the previous one.

Even though enumerators are integer in nature, it is still worth validating values converted from external data so the program does not drift into an unhandled state.

Formal rules
  • Enumerator names within a single enumeration must be unique.
  • If no value is given, the enumerator becomes the previous value plus one, with the first defaulting to 0.
  • Explicit enumerator values must be integer constant expressions.
  • Enumerators can be used in switch/case and other contexts that expect integer constants.

Practical style: use explicit values for stable external protocols and keep unknown-value handling in default.

#include <stdio.h>

typedef enum StatusTag {
  STATUS_NEW = 0,
  STATUS_READY = 10,
  STATUS_DONE // 11
} Status;

int main(void) {
  Status st = STATUS_READY;

  printf("STATUS_NEW = %d\n", STATUS_NEW);
  printf("STATUS_READY = %d\n", STATUS_READY);
  printf("STATUS_DONE = %d\n", STATUS_DONE);

  if (st == STATUS_READY) {
    printf("ready\n");
  }

  printf("numeric = %d\n", st);
  return 0;
}

Expected output:

STATUS_NEW = 0
STATUS_READY = 10
STATUS_DONE = 11
ready
numeric = 10

Unions

In a union, all fields share the same memory area. That is useful for memory savings and for representing variant data together with an explicit tag telling you which field is active.

Reading a union field other than the one last written reinterprets the stored bytes as the new type. The result depends on the object representation (size, byte order, padding) and can even be a trap representation, so it is not portable. The safe route is to store an explicit discriminant and read only the active field.

Formal rules
  • All union fields share the same memory and start at the same address.
  • The size of a union is at least the size of its largest field, rounded up where needed to satisfy the strictest alignment among its fields.
  • It is always valid to read the field that was most recently assigned.
  • Reading a different field requires careful documentation and special-case reasoning.

#include <stdint.h>
#include <stdio.h>

typedef union {
  uint32_t u32;
  unsigned char bytes[4];
} NumberView;

int main(void) {
  NumberView v;
  v.u32 = 0x12345678u;

  printf("u32 = 0x%08x\n", v.u32);
  printf("bytes: %u %u %u %u\n", v.bytes[0], v.bytes[1], v.bytes[2], v.bytes[3]);
  return 0;
}

Expected output:

u32 = 0x12345678
bytes: 120 86 52 18
# byte order depends on platform endianness

Bit fields

Bit fields let you pack a set of flags into one structure object and describe the width of each field in bits. This is useful for device registers and compact internal flag states.

Bit ordering and some alignment details depend on the compiler implementation. For network and file formats, it is usually safer not to rely on bit fields without extra normalization.

The syntax is type name : width;. For example, unsigned read : 1; allocates 1 bit for the flag read.

Formal rules
  • The width of a bit field must be an integer constant expression that is non-negative and no larger than the width of the declared type. A width of 0 is allowed only for an unnamed bit field.
  • Portable bit-field types include bool (_Bool), signed int, and unsigned int. Whether a plain int bit field is signed is implementation-defined.
  • Bit packing order, alignment, and placement between storage units are implementation-defined.
  • Bit fields are convenient for internal flags, but explicit masks and shifts are safer for portable binary protocols.

#include <stdio.h>

typedef struct {
  unsigned read : 1;
  unsigned write : 1;
  unsigned exec : 1;
  unsigned reserved : 5;
} Flags;

int main(void) {
  Flags f = {1, 0, 1, 0};
  printf("read=%u write=%u exec=%u\n", f.read, f.write, f.exec);
  printf("sizeof(Flags) = %zu\n", sizeof(Flags));
  return 0;
}

Expected output:

read=1 write=0 exec=1
sizeof(Flags) = 4
# size may differ on another platform

Dynamic memory

Dynamic memory in C is managed manually: the programmer allocates heap blocks through allocator APIs and is responsible for freeing them on time. This gives control and performance, but requires discipline.

The allocator API is the set of standard library functions for heap management: malloc, calloc, realloc, and free. In practice it is a contract between your program and the allocator.

Heap memory is the process area used for dynamic allocations at runtime. Unlike the stack, where objects usually live until the current block or function ends, a heap block stays alive until you explicitly free it.

Formal rules
  • malloc, calloc, and realloc return a pointer to a block or NULL on failure.
  • free(NULL) is valid and safe.
  • Freeing the same block twice is undefined behavior.
  • After a successful realloc, the old pointer value must not be used, whether or not the block moved. Continue with the returned pointer.

Allocating and freeing memory

malloc allocates an uninitialized block, calloc allocates and zero-initializes, and realloc changes the size of an existing block. Release is always done with free.

Allocator API syntax from <stdlib.h>:

  • void *malloc(size_t size); allocates size bytes and returns the block start or NULL.
  • void *calloc(size_t nmemb, size_t size); allocates space for nmemb elements and fills it with zeros.
  • void *realloc(void *ptr, size_t new_size); changes the size of a previously allocated block.
  • void free(void *ptr); releases a block obtained from the allocator API.

The safe pattern for realloc is to use a temporary pointer. If you assign directly into the original pointer and get NULL, you can lose the only reference to the allocated memory and create a leak.

Formal rules
  • Allocated memory belongs to the caller and must be freed exactly once.
  • free(NULL) is valid and safe.
  • Double free and use-after-free are undefined behavior.
  • If realloc fails, it returns NULL and the original block remains valid.
  • Only pointers returned by the allocator API, or NULL, may be passed to realloc or free.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
  size_t n = 3;
  int *arr = malloc(n * sizeof(*arr));
  if (arr == NULL) {
    return 1;
  }

  for (size_t i = 0; i < n; i++) {
    arr[i] = (int)(i + 1) * 10;
  }

  int *tmp = realloc(arr, 5 * sizeof(*arr));
  if (tmp == NULL) {
    free(arr);
    return 1;
  }
  arr = tmp;
  arr[3] = 40;
  arr[4] = 50;

  int *zeroed = calloc(4, sizeof(*zeroed));
  if (zeroed == NULL) {
    free(arr);
    return 1;
  }

  printf("arr: %d %d %d %d %d\n", arr[0], arr[1], arr[2], arr[3], arr[4]);
  printf("zeroed: %d %d %d %d\n", zeroed[0], zeroed[1], zeroed[2], zeroed[3]);

  free(zeroed);
  zeroed = NULL;
  free(arr);
  arr = NULL;
  return 0;
}

Expected output:

arr: 10 20 30 40 50
zeroed: 0 0 0 0

Allocating memory for a two-dimensional array of arbitrary size

For a matrix of runtime-defined size, a common approach is to allocate an array of row pointers and then allocate each row separately. That gives flexibility in the number of rows and columns.

If allocation of a later row fails, you must free all already allocated rows correctly. Otherwise you get a partial memory leak and leave the program in an unstable state.

#include <stdio.h>
#include <stdlib.h>

int **alloc_matrix(size_t rows, size_t cols) {
  int **m = malloc(rows * sizeof(*m));
  if (m == NULL) {
    return NULL;
  }

  for (size_t i = 0; i < rows; i++) {
    m[i] = malloc(cols * sizeof(*m[i]));
    if (m[i] == NULL) {
      for (size_t k = 0; k < i; k++) {
        free(m[k]);
      }
      free(m);
      return NULL;
    }
  }

  return m;
}

void free_matrix(int **m, size_t rows) {
  if (m == NULL) {
    return;
  }
  for (size_t i = 0; i < rows; i++) {
    free(m[i]);
  }
  free(m);
}

int main(void) {
  size_t rows = 3;
  size_t cols = 4;
  int **m = alloc_matrix(rows, cols);
  if (m == NULL) {
    return 1;
  }

  for (size_t i = 0; i < rows; i++) {
    for (size_t j = 0; j < cols; j++) {
      m[i][j] = (int)(i * 10 + j);
    }
  }

  for (size_t i = 0; i < rows; i++) {
    for (size_t j = 0; j < cols; j++) {
      printf("%d ", m[i][j]);
    }
    printf("\n");
  }

  free_matrix(m, rows);
  return 0;
}

Expected output:

0 1 2 3
10 11 12 13
20 21 22 23

Managing dynamic memory

Memory management is not just allocator calls, but also ownership discipline: who creates the block, who must free it, and whether shared ownership between modules is allowed.

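Practical rules are to document ownership in the API, release resources in one exit path, null out pointers after free, and guard against double free with a helper such as the SAFE_FREE macro below. A macro is used instead of a function taking void ** because void ** is not a generic pointer-to-pointer: writing to an int * object through a void * lvalue violates the aliasing rules.

#include <stdio.h>
#include <stdlib.h>

/* Frees the block and resets the pointer variable.
   The argument is evaluated twice, so pass a plain variable. */
#define SAFE_FREE(p) \
  do {               \
    free(p);         \
    (p) = NULL;      \
  } while (0)

int main(void) {
  int *data = malloc(3 * sizeof(*data));
  if (data == NULL) {
    return 1;
  }

  data[0] = 7;
  data[1] = 8;
  data[2] = 9;
  printf("data[1] = %d\n", data[1]);

  SAFE_FREE(data);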
  printf("data == NULL -> %d\n", data == NULL);
  return 0;
}

Expected output:

data[1] = 8
data == NULL -> 1

Pointer as a function result

A function can return a pointer to dynamically allocated memory if the caller takes responsibility for freeing that block. This is a common interface pattern for string builders and buffer factories.

You must not return the address of an automatic local array. A correct return value is either a dynamic block or a pointer to a static object with a clearly documented contract.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *build_message(const char *name) {
  const char *prefix = "Hello, ";
  size_t n = strlen(prefix) + strlen(name) + 1;

  char *msg = malloc(n);
  if (msg == NULL) {
    return NULL;
  }

  snprintf(msg, n, "%s%s", prefix, name);
  return msg;
}

int main(void) {
  char *msg = build_message("C programmer");
  if (msg == NULL) {
    return 1;
  }

  printf("%s\n", msg);
  free(msg);
  return 0;
}

Expected output:

Hello, C programmer

Program memory organization and segment layout

A process address space is usually described in four main areas: the code segment (text), the segment for global and static data (data/bss), the heap, and the stack. This model helps explain why some objects live for the whole program while others exist only until a function returns.

The code/text segment stores machine instructions. data holds initialized globals and static objects, and bss holds those that are zero-initialized; string literals usually live in a read-only data section. The heap holds blocks obtained through malloc, calloc, and realloc. The stack holds call frames, parameters, automatic locals, and call bookkeeping.

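On many platforms the heap and stack grow toward each other: the heap toward larger addresses and the stack toward smaller ones. The reason you cannot return the address of an automatic local variable is lifetime, not growth direction: the variable's stack frame is released when the function returns, and later calls reuse that memory.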

The program below illustrates the idea by printing addresses: &local_auto_1 belongs to the stack, heap_value_1 points into the heap, and &global_value, &global_static, and &local_static point into the data segment.

Exact addresses depend on the OS, runtime, and memory protection such as ASLR, so the important part is not the numbers themselves but the relative roles of the memory regions.

#include <stdio.h>
#include <stdlib.h>

int global_value = 1;
static int global_static = 2;

int main(void) {
  static int local_static = 3;
  int local_auto_1 = 1;
  int local_auto_2 = 2;
  int *heap_value_1 = malloc(sizeof(*heap_value_1));
  if (heap_value_1 == NULL) {
    return 1;
  }
  int *heap_value_2 = malloc(sizeof(*heap_value_2));
  if (heap_value_2 == NULL) {
    free(heap_value_1); /* do not leak the first block on failure */
    return 1;
  }

  *heap_value_1 = 1;
  *heap_value_2 = 2;

  printf("&global_value  = %p\n", (void *)&global_value);
  printf("&global_static = %p\n", (void *)&global_static);
  printf("&local_static  = %p\n", (void *)&local_static);
  printf("&local_auto_1    = %p\n", (void *)&local_auto_1);
  printf("&local_auto_2    = %p\n", (void *)&local_auto_2);
  printf("heap_value_1     = %p\n", (void *)heap_value_1);
  printf("heap_value_2     = %p\n", (void *)heap_value_2);

  free(heap_value_1);
  free(heap_value_2);
  return 0;
}

Expected output (the exact addresses depend on the OS, ASLR, and the concrete run):

&global_value  = 0x104ca4000
&global_static = 0x104ca4008
&local_static  = 0x104ca4004
&local_auto_1    = 0x16b16288c
&local_auto_2    = 0x16b162888
heap_value_1     = 0x1052b5ec0
heap_value_2     = 0x1052b5ed0
Diagram of object placement in process memory: a vertical memory map with high addresses at the top and low addresses at the bottom. From top to bottom it shows the stack (&local_auto_1 = 0x16b16288c, &local_auto_2 = 0x16b162888) with its growth direction, the heap (heap_value_1 = 0x1052b5ec0, heap_value_2 = 0x1052b5ed0) with its growth direction, the static data region, labeled bss in the figure (&global_value = 0x104ca4000, &global_static = 0x104ca4008, &local_static = 0x104ca4004), and the code/text segment with the instructions for main(), malloc(), printf(), and free().

The diagram reflects this specific run and shows the relative placement of stack, heap, static data, and code/text for the printed addresses. Because all three static objects have non-zero initializers, a typical toolchain places them in data rather than bss.