C Learning Project: Key-Value Store

An educational project to learn C development from first principles, working toward building an RDBMS.

Project Goals

  • Learn C development fundamentals: memory management, pointers, build systems, modularity
  • Build a simple key-value store CLI application
  • Incrementally work toward understanding relational database concepts

Current Status

Phase 1: Design & Foundation

  • Build system (Makefile with clang)
  • Core data structures designed (kvstore, entries)
  • CLI module foundation started
  • String utility library interface designed
  • Implementation in progress

Project Structure

kvstore/
├── Makefile              # Build configuration
├── README.md             # This file
├── ARCHITECTURE.md       # Design decisions and module overview
├── include/
│   ├── kv_store.h       # Key-value store interface
│   ├── cli.h            # CLI module interface
│   └── string.h         # String utility functions
├── src/
│   ├── main.c           # Entry point
│   ├── cli.c            # CLI implementation (in progress)
│   ├── kv_store.c       # Key-value store implementation (not started)
│   └── string.c         # String utilities (not started)
├── build/               # Object files (generated)
└── bin/                 # Binary output (generated)

Building

make        # Compile the project
make run    # Build and run
make clean  # Remove build artifacts

Modules

KV Store (include/kv_store.h)

Core key-value store with:

  • Entry management: Create, copy, and free individual key-value pairs
  • Store lifecycle: Initialize and free stores with configurable capacity
  • CRUD operations: Get, set, delete, and list entries
  • Error handling: Consistent return codes (1=success, 0=not found, -1=error)

CLI (include/cli.h)

Command-line interface for batch operations:

  • Parse command-line arguments
  • Execute kvstore commands
  • Display help and results

String Utilities (include/string.h)

Helper functions for string operations:

  • Copy, compare, trim, and search strings
  • Safe memory management for dynamic strings

Design Principles

  • Ownership is explicit: Every allocation has exactly one owner, and that owner is responsible for freeing it
  • Separation of concerns: kvstore provides data/status; CLI formats and displays
  • Batch mode: One command per program run; data will persist between runs through files (future)
  • Error handling: Consistent, simple return codes rather than exceptions
  • Learning-focused: Prioritize clarity and understanding over optimization

Learning Focus Areas

  1. Memory management: malloc, free, ownership, pointers
  2. C idioms: Out parameters, return codes, struct lifecycle
  3. Modularity: Clear interfaces, separation of concerns
  4. Build systems: Makefiles, compilation, linking
  5. String handling: C strings, pointer semantics

See ARCHITECTURE.md for detailed design decisions and implementation notes.



Architecture & Design

Data Structures

kv_store_entry_t

typedef struct {
  char *key;
  char *value;
} kv_store_entry_t;

A single key-value pair. Both key and value are dynamically allocated strings (C's char *). This structure is relatively simple and will evolve as we add persistence and type support.
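
As a minimal sketch of what "dynamically allocated strings" implies for the entry lifecycle, the block below shows one possible kv_store_entry_init() and kv_store_entry_free(). The signatures are assumptions inferred from the usage example later in this file, and dup_string() is a hypothetical helper, not part of the designed interface.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  char *key;
  char *value;
} kv_store_entry_t;

/* Hypothetical helper: heap-allocated duplicate of s, or NULL on failure. */
static char *dup_string(const char *s) {
  assert(s != NULL);                 /* callers validate their arguments first */
  size_t n = strlen(s) + 1;          /* +1 for the terminating '\0' */
  char *copy = malloc(n);
  if (copy != NULL) {
    memcpy(copy, s, n);
  }
  return copy;
}

/* Free an entry and both strings it owns. Passing NULL is a no-op. */
void kv_store_entry_free(kv_store_entry_t *entry) {
  if (entry == NULL) {
    return;
  }
  free(entry->key);                  /* free(NULL) is allowed, so a half-built entry is fine */
  free(entry->value);
  free(entry);
}

/* Allocate an entry that owns private copies of key and value.
   Returns NULL if an argument is NULL or any allocation fails. */
kv_store_entry_t *kv_store_entry_init(const char *key, const char *value) {
  if (key == NULL || value == NULL) {
    return NULL;
  }
  kv_store_entry_t *entry = malloc(sizeof *entry);
  if (entry == NULL) {
    return NULL;
  }
  entry->key = dup_string(key);
  entry->value = dup_string(value);
  if (entry->key == NULL || entry->value == NULL) {
    kv_store_entry_free(entry);      /* releases whichever copy succeeded */
    return NULL;
  }
  return entry;
}
```

Because the entry copies its inputs, the caller can pass string literals or stack buffers and reuse them afterwards; the entry never points into memory it does not own.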

kv_store_t

typedef struct {
  kv_store_entry_t *entries;  // Dynamic array of entries
  int length;                  // Current number of entries
  int capacity;                // Allocated capacity
} kv_store_t;

The main store container. Uses a dynamic array (vector-like) for storage. Tracks both used entries and available capacity.
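
The length/capacity split can be sketched as follows. This is one possible approach, not the project's implementation: the kv_store_init() signature is an assumption, kv_store_reserve_one() is a hypothetical helper, and doubling the capacity is a common growth policy chosen here for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

typedef struct {
  char *key;
  char *value;
} kv_store_entry_t;

typedef struct {
  kv_store_entry_t *entries;  /* Dynamic array of entries */
  int length;                 /* Current number of entries */
  int capacity;               /* Allocated capacity */
} kv_store_t;

/* Assumed signature: create an empty store, or return NULL on failure. */
kv_store_t *kv_store_init(int initial_capacity) {
  if (initial_capacity <= 0) {
    return NULL;
  }
  kv_store_t *store = malloc(sizeof *store);
  if (store == NULL) {
    return NULL;
  }
  store->entries = malloc((size_t)initial_capacity * sizeof *store->entries);
  if (store->entries == NULL) {
    free(store);
    return NULL;
  }
  store->length = 0;
  store->capacity = initial_capacity;
  return store;
}

/* Hypothetical helper: ensure there is room for one more entry.
   Returns 1 on success, -1 on failure; the store is unchanged on failure. */
int kv_store_reserve_one(kv_store_t *store) {
  assert(store != NULL);
  if (store->length < store->capacity) {
    return 1;                              /* a free slot already exists */
  }
  if (store->capacity > INT_MAX / 2) {
    return -1;                             /* doubling would overflow int */
  }
  int new_capacity = store->capacity * 2;
  kv_store_entry_t *grown =
      realloc(store->entries, (size_t)new_capacity * sizeof *grown);
  if (grown == NULL) {
    return -1;                             /* the old array is still valid */
  }
  store->entries = grown;
  store->capacity = new_capacity;
  return 1;
}

/* Free the store, its array, and the strings owned by each used slot. */
void kv_store_free(kv_store_t *store) {
  if (store == NULL) {
    return;
  }
  for (int i = 0; i < store->length; i++) {
    free(store->entries[i].key);
    free(store->entries[i].value);
  }
  free(store->entries);
  free(store);
}
```

Assigning the result of realloc() to a temporary first matters: writing it straight into store->entries would lose the only pointer to the old array if the call failed.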

Design note: Uses simple array storage for learning purposes. Later evolution might include:

  • Hash tables for O(1) lookup
  • B-trees for sorted iteration and range queries
  • Disk persistence

Modules

kv_store (Core Data Structure)

Status: Interface designed, implementation pending

Key functions:

  • kv_store_entry_init(): Allocate and initialize an entry
  • kv_store_entry_copy(): Create a deep copy of an entry
  • kv_store_entry_free(): Free an entry's memory
  • kv_store_init(): Create an empty store with initial capacity
  • kv_store_free(): Free a store and all its entries
  • kv_store_get_entry(): Retrieve an entry (returns allocated copy)
  • kv_store_set_entry(): Add or update an entry
  • kv_store_delete_entry(): Remove an entry

Design decisions:

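  1. Copying on get: kv_store_get_entry() returns a newly allocated copy of the entry, which the caller owns and must release with kv_store_entry_free(). The caller therefore cannot modify the store's internal state, and the copy stays valid even if the store later reallocates its array or deletes the entry.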
  2. Copying on set: When storing an entry, we deep-copy the key/value strings. This prevents external modifications and clarifies ownership.
  3. Return codes:
    • 1 = success (key found; for set, an existing entry was updated)
    • 0 = key not found (for set, a new entry was created)
    • -1 = error
  4. Pointer parameters: Store operations that modify take kv_store_t * (not const). Read-only operations take const kv_store_t *.
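
The four decisions above can be illustrated together with a linear-search sketch. Everything here is an assumption layered on the design: the exact signatures are inferred from the usage example in this file, dup_string() and find_index() are hypothetical helpers, and the mapping of 1 to "updated" and 0 to "created" is one reading of the return codes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char *key; char *value; } kv_store_entry_t;
typedef struct { kv_store_entry_t *entries; int length; int capacity; } kv_store_t;

/* Hypothetical helper: heap copy of s, or NULL on allocation failure. */
static char *dup_string(const char *s) {
  size_t n = strlen(s) + 1;
  char *copy = malloc(n);
  if (copy != NULL) memcpy(copy, s, n);
  return copy;
}

/* Hypothetical helper: index of key, or -1. Read-only, so the store is const. */
static int find_index(const kv_store_t *store, const char *key) {
  assert(store != NULL && key != NULL);
  for (int i = 0; i < store->length; i++) {
    if (strcmp(store->entries[i].key, key) == 0) return i;
  }
  return -1;
}

/* Returns 1 if an existing key was updated, 0 if a new entry was created,
   -1 on error. The store keeps its own copies of the strings. */
int kv_store_set_entry(kv_store_t *store, const kv_store_entry_t *entry) {
  if (store == NULL || entry == NULL || entry->key == NULL || entry->value == NULL) {
    return -1;
  }
  char *value = dup_string(entry->value);
  if (value == NULL) return -1;

  int i = find_index(store, entry->key);
  if (i >= 0) {                              /* update: swap in the new value */
    free(store->entries[i].value);
    store->entries[i].value = value;
    return 1;
  }

  char *key = dup_string(entry->key);
  if (key == NULL) { free(value); return -1; }
  if (store->length == store->capacity) {    /* grow the array when it is full */
    int new_capacity = store->capacity > 0 ? store->capacity * 2 : 4;
    kv_store_entry_t *grown =
        realloc(store->entries, (size_t)new_capacity * sizeof *grown);
    if (grown == NULL) { free(key); free(value); return -1; }
    store->entries = grown;
    store->capacity = new_capacity;
  }
  store->entries[store->length].key = key;
  store->entries[store->length].value = value;
  store->length++;
  return 0;
}

/* Returns a caller-owned deep copy, or NULL if the key is missing or an
   allocation fails. */
kv_store_entry_t *kv_store_get_entry(const kv_store_t *store, const char *key) {
  if (store == NULL || key == NULL) return NULL;
  int i = find_index(store, key);
  if (i < 0) return NULL;
  kv_store_entry_t *copy = malloc(sizeof *copy);
  if (copy == NULL) return NULL;
  copy->key = dup_string(store->entries[i].key);
  copy->value = dup_string(store->entries[i].value);
  if (copy->key == NULL || copy->value == NULL) {
    free(copy->key); free(copy->value); free(copy);
    return NULL;
  }
  return copy;
}
```

One trade-off worth noticing: with a pointer return, kv_store_get_entry() cannot distinguish "not found" from "allocation failed". An out parameter plus a return code would separate the two cases, at the cost of a slightly heavier call site.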

CLI (Command-line Interface)

Status: Interface designed, implementation pending

Key functions:

  • cli_print_help(): Display usage information and commands
  • cli_execute(): Parse and execute a command
  • cli_print_result(): Format and display results

Design:

  • Batch mode: Single execution per program invocation
  • Help handling: Main checks for --help or -h before passing to cli_execute
  • GNU-style: Follow standard CLI conventions for help text and error messages

Supported commands (planned):

  • set <key> <value>: Store a value
  • get <key>: Retrieve a value
  • delete <key>: Remove an entry
  • list: Show all entries
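
A table-driven sketch of how the planned commands might be recognized is shown below. The cli_command_t enum, cli_wants_help(), and cli_parse_command() are hypothetical names for illustration; the designed interface only lists cli_print_help(), cli_execute(), and cli_print_result().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical command identifiers for the planned commands. */
typedef enum {
  CLI_CMD_SET,
  CLI_CMD_GET,
  CLI_CMD_DELETE,
  CLI_CMD_LIST,
  CLI_CMD_INVALID
} cli_command_t;

/* Returns 1 if any argument after the program name is --help or -h.
   main() would call this before handing the arguments to cli_execute(). */
int cli_wants_help(int argc, char *argv[]) {
  assert(argv != NULL);
  for (int i = 1; i < argc; i++) {
    if (strcmp(argv[i], "--help") == 0 || strcmp(argv[i], "-h") == 0) {
      return 1;
    }
  }
  return 0;
}

/* Maps argv[1] to a command and checks the operand count.
   Unknown commands and wrong operand counts yield CLI_CMD_INVALID. */
cli_command_t cli_parse_command(int argc, char *argv[]) {
  static const struct {
    const char *name;
    int operands;            /* arguments expected after the command name */
    cli_command_t command;
  } table[] = {
    {"set", 2, CLI_CMD_SET},
    {"get", 1, CLI_CMD_GET},
    {"delete", 1, CLI_CMD_DELETE},
    {"list", 0, CLI_CMD_LIST},
  };

  assert(argv != NULL);
  if (argc < 2) {
    return CLI_CMD_INVALID;  /* no command given */
  }
  for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
    if (strcmp(argv[1], table[i].name) == 0) {
      return (argc - 2 == table[i].operands) ? table[i].command
                                             : CLI_CMD_INVALID;
    }
  }
  return CLI_CMD_INVALID;
}
```

Keeping the command names and operand counts in one table means a new command is a one-line addition, and the help text could later be generated from the same table.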

String Utilities

Status: Interface designed, implementation pending

Simple helpers for string operations with safe memory management:

  • string_copy(): Allocate and copy a string
  • string_compare(): Compare two strings
  • string_trim(): Copy with whitespace trimming
  • string_search(): Find substring
  • string_free(): Safe free (NULL-safe)
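
Three of these helpers could look roughly like the sketch below. The signatures are assumptions, since only the function names are fixed by the design. The sketch relies on the C standard library's <string.h>; because the project header shares that filename, an -Iinclude flag can make the compiler pick up the project header instead, so a distinct filename may be worth considering.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Allocate and return a copy of s. Returns NULL if s is NULL or the
   allocation fails. The caller owns the result. */
char *string_copy(const char *s) {
  if (s == NULL) {
    return NULL;
  }
  size_t n = strlen(s) + 1;          /* +1 for the terminating '\0' */
  char *copy = malloc(n);
  if (copy != NULL) {
    memcpy(copy, s, n);
  }
  return copy;
}

/* Return a newly allocated copy of s without leading or trailing
   whitespace. The input is not modified. Returns NULL if s is NULL or
   the allocation fails. */
char *string_trim(const char *s) {
  if (s == NULL) {
    return NULL;
  }
  const char *start = s;
  /* The cast avoids undefined behavior for negative char values. */
  while (isspace((unsigned char)*start)) {
    start++;
  }
  const char *end = start + strlen(start);
  while (end > start && isspace((unsigned char)end[-1])) {
    end--;
  }
  assert(end >= start);
  size_t n = (size_t)(end - start);
  char *trimmed = malloc(n + 1);
  if (trimmed == NULL) {
    return NULL;
  }
  memcpy(trimmed, start, n);
  trimmed[n] = '\0';
  return trimmed;
}

/* Release a string returned by the helpers above. NULL is accepted. */
void string_free(char *s) {
  free(s);                           /* free(NULL) is a no-op */
}
```

Returning a new string from string_trim() instead of trimming in place fits the ownership rule used elsewhere in the project: the input may be a string literal or a buffer the caller still needs.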

Implementation Notes

Memory Management Pattern

The project uses this consistent pattern:

  1. Allocation functions return pointers and document that the caller owns the memory
  2. Free functions take pointers and handle NULL safely
  3. Read operations return allocated copies, not internal references
  4. Modification operations deep-copy input data to maintain ownership

Example:

// Caller allocates and owns
kv_store_entry_t *entry = kv_store_entry_init("key", "value");

// Store makes its own copy when storing
// (store is assumed to have been created earlier with kv_store_init())
kv_store_set_entry(store, entry);

// Caller must free their copy
kv_store_entry_free(entry);

// When reading, get a new copy to work with
kv_store_entry_t *retrieved = kv_store_get_entry(store, "key");
// ... use retrieved ...
kv_store_entry_free(retrieved);

Error Handling

C doesn't have exceptions. We use:

  • Return codes for operational success/failure
  • NULL pointers to indicate allocation failures
  • Documentation to clarify what each code means

The library itself prints no error messages; turning return codes into user-facing text is the CLI's job.
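
To make that split concrete, here is a sketch of a CLI-side translation layer. The enum constants, cli_describe_status(), and cli_report() are hypothetical names, and the mapping covers the get and delete meaning of the codes (for set, 0 means a new entry was created, which is not a failure).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names for the library's documented return codes. */
enum { KV_ERROR = -1, KV_NOT_FOUND = 0, KV_OK = 1 };

/* Translate a return code from a get or delete into short text. */
const char *cli_describe_status(int code) {
  switch (code) {
    case KV_OK:        return "ok";
    case KV_NOT_FOUND: return "key not found";
    case KV_ERROR:     return "internal error";
    default:           return "unknown status";
  }
}

/* Report the outcome of a command and return the process exit status.
   Success prints nothing; failures go to err as "kvstore: <command>: <text>". */
int cli_report(FILE *err, const char *command, int code) {
  assert(err != NULL && command != NULL);
  if (code == KV_OK) {
    return EXIT_SUCCESS;
  }
  fprintf(err, "kvstore: %s: %s\n", command, cli_describe_status(code));
  return EXIT_FAILURE;
}
```

With this layout, a call such as cli_report(stderr, "delete", KV_NOT_FOUND) writes "kvstore: delete: key not found" and yields EXIT_FAILURE, while the store functions stay free of any printing.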

Next Implementation Steps

Phase 1: Core Store (High Priority)

  1. Implement string.c - string utilities
  2. Implement kv_store.c - core store operations
  3. Write basic tests (using Unity framework)
  4. Test with simple program

Phase 2: CLI (Medium Priority)

  1. Implement cli.c - help display
  2. Implement cli_execute() - command parsing and routing
  3. Wire commands to store operations
  4. Test each command

Phase 3: Persistence (Future)

  1. Add file I/O to load/save stores
  2. Consider simple serialization format
  3. Handle startup with existing data

Phase 4: Advanced Features (Future)

  1. Internal data structure improvements (hash table, B-tree)
  2. Type support (int, float, blob)
  3. Transactions or multiple stores
  4. Performance optimization

Testing Strategy

Future: Use Unity testing framework

  • Unit tests for each module
  • Integration tests for CLI commands
  • Edge cases: empty store, duplicate keys, NULL inputs

Lessons Learned & Teaching Points

As you implement, pay attention to:

  1. Pointers and ownership: Who allocates, who frees?
  2. const correctness: What can and cannot be modified?
  3. Error propagation: How do errors bubble up from library to CLI?
  4. Interface design: How do you make it easy to use correctly and hard to use incorrectly?
  5. Memory safety: Are there ways this could leak or crash?

This is learning code—clarity and correctness matter more than optimization.
