diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
new file mode 100644
index 0000000..d73bf97
--- /dev/null
+++ b/DEVELOPMENT.md
@@ -0,0 +1,297 @@
+# Development Guide
+
+This document captures the design decisions, architecture, and implementation strategy for the kvstore project. Use this when returning to the project or working with an AI assistant.
+
+## Project Philosophy
+
+**Learning First**: This is an educational project prioritizing understanding over optimization.
+
+**Explicit Over Implicit**: C makes you manage memory explicitly. We lean into this: every allocation and free is intentional and documented.
+
+**Batch Processing**: The application runs, executes one command, and exits. No persistent state between invocations (yet).
+
+**String-Only Storage**: Keys and values are always strings. Simple but sufficient for the learning phase.
+
+## Current Design State
+
+### Data Model
+
+```
+Entry: key (char*) -> value (char*)
+Store: Dynamic array of entries with length and capacity tracking
+```
+
+Key insight: The store is not a hash table yet; it's a simple array. This means O(n) lookups, but it's easier to understand and implement. Future evolution: hash table or B-tree.
+
+### Memory Ownership Rules
+
+**Rule 1: Allocation implies ownership**
+- Whoever calls `malloc` (directly or indirectly) is responsible for calling `free`
+- Functions document what they allocate and return
+
+**Rule 2: Deep copying on boundaries**
+- When you pass data to the store, it makes its own copy
+- When you retrieve data from the store, you get a copy
+- This prevents external code from accidentally modifying internal state
+
+**Rule 3: Pointer parameters indicate mutability**
+- `const kv_store_t *store` → function only reads, cannot modify
+- `kv_store_t *const store` → can modify store, cannot change the pointer itself
+- `kv_store_t *store` → can modify store and reassign the local pointer (avoid in read-only functions)
+
+### Error Handling Convention
+
+kvstore operations that return a status code use:
+```
+ 1 = success / found
+ 0 = not found / created new / no error (but different result)
+-1 = error (allocation failed, etc.)
+```
+
+C has no exceptions. Use return codes.
+
+Allocation failures return NULL pointers. Always check before dereferencing.
+
+### Function Patterns
+
+**Creating allocations:**
+```c
+TYPE *kv_store_entry_init(...)
+// Returns pointer to newly allocated struct
+// Caller must call kv_store_entry_free() when done
+```
+
+**Freeing:**
+```c
+void kv_store_entry_free(TYPE *ptr)
+// Takes pointer to free
+// Should be NULL-safe (check for NULL before free)
+```
+
+**Reading:**
+```c
+TYPE *kv_store_get_entry(...)
+// Returns newly allocated COPY of entry
+// Caller must free the returned pointer
+```
+
+**Modifying:**
+```c
+int kv_store_set_entry(TYPE *store, const TYPE *input)
+// Takes pointer to store and const input
+// Makes its own copy of input data
+// Returns status code (1/0/-1)
+```
+
+## Module Breakdown
+
+### kv_store (src/kv_store.c / include/kv_store.h)
+
+**Current State**: Interface complete, implementation needed
+
+**Core operations**:
+1. `kv_store_init(capacity)` - Create empty store
+2. `kv_store_set_entry(store, entry)` - Add or update
+3. `kv_store_get_entry(store, key)` - Retrieve (returns copy)
+4. `kv_store_delete_entry(store, key)` - Remove
+5. `kv_store_free(store)` - Clean up
+
+**Entry operations**:
+1. `kv_store_entry_init(key, value)` - Create entry
+2. `kv_store_entry_copy(entry)` - Deep copy
+3. `kv_store_entry_free(entry)` - Cleanup
+
+**Implementation considerations**:
+- Entries array may need resizing (when length reaches capacity)
+- Need a way to find entries by key (linear search for now: O(n))
+- Deletion might use swap-and-pop or mark-as-deleted
+- All strings must be copied, not merely assigned by pointer
+
+### cli (src/cli.c / include/cli.h)
+
+**Current State**: Interface complete, minimal implementation
+
+**Key functions**:
+1. `cli_print_help()` - Display GNU-style help
+2. `cli_execute(argc, argv)` - Parse and execute command
+3. `cli_print_result(result)` - Format output
+
+**Behavior**:
+- `main.c` checks for `--help` or `-h` first, before calling `cli_execute`
+- If a help flag is found, call `cli_print_help()` and exit(0)
+- Otherwise, pass full argv to `cli_execute()`
+- `cli_execute()` parses the first argument (after the program name) as the command
+- Routes to appropriate store operation
+- Returns `cli_result_t` with status and message
+
+**Future commands** (not yet implemented):
+```
+kvstore get          # Retrieve a value
+kvstore set          # Store a value
+kvstore delete       # Remove a value
+kvstore list         # Show all entries
+kvstore --help, -h   # Show help
+```
+
+### string (src/string.c / include/string.h)
+
+**Current State**: Interface complete, implementation needed
+
+**Functions**:
+1. `string_copy(src)` - Allocate and copy
+2. `string_compare(s1, s2)` - Like strcmp
+3. `string_trim(str)` - Copy with whitespace trimmed
+4. `string_search(src, search)` - Find substring (returns index or -1)
+5. `string_free(str)` - Safe free (NULL-safe)
+
+**Use case**: These are helpers for CLI parsing and kvstore string operations.
+
+## Implementation Roadmap
+
+### Week 1: Foundation
+1. Implement `string.c` functions
+   - `string_copy()` - basic malloc + strcpy wrapper
+   - `string_compare()` - wrapper for strcmp
+   - `string_free()` - free with NULL check
+   - Test manually in main
+
+2. Implement `kv_store.c` basics
+   - `kv_store_entry_init()` - allocate entry, copy strings
+   - `kv_store_entry_free()` - free both key and value
+   - `kv_store_entry_copy()` - allocate and copy
+   - `kv_store_init()` - allocate store, allocate empty entries array
+   - `kv_store_free()` - free all entries, then store
+
+3. Test with simple program in main.c
+
+### Week 2: Core CRUD
+1. `kv_store_get_entry()` - Linear search, return copy
+2. `kv_store_set_entry()` - Find or append, copy input
+3. `kv_store_delete_entry()` - Find and remove
+4. Add array resizing when capacity is reached
+
+5. Test each operation manually
+
+### Week 3: CLI Integration
+1. Implement `cli_print_help()` - GNU-style output
+2. Implement `cli_execute()` - Command parsing
+3. Wire commands to store operations
+4. Update `main.c` to check for help flag
+5. Test: `./kvstore --help`, `./kvstore set x y`, `./kvstore get x`
+
+### Week 4+: Polish & Future
+1. Add `string_trim()` and `string_search()` if needed
+2. Consider testing framework (Unity)
+3. File persistence (load/save)
+4. Performance improvements (hash table)
+
+## Common Patterns & Idioms
+
+### NULL-safe free functions
+```c
+void kv_store_entry_free(kv_store_entry_t *entry) {
+    if (entry == NULL) return; // NULL-safe
+    string_free(entry->key);
+    string_free(entry->value);
+    free(entry);
+}
+```
+
+### Linear search pattern
+```c
+for (int i = 0; i < store->length; i++) {
+    if (string_compare(store->entries[i].key, key) == 0) {
+        // Found at index i
+        return i;
+    }
+}
+// Not found
+return -1;
+```
+
+### Array resizing pattern (for future)
+```c
+if (store->length == store->capacity) {
+    void *grown = realloc(store->entries, 2 * store->capacity * sizeof(kv_store_entry_t));
+    if (grown == NULL) return -1;                 // Old array is still valid, nothing leaks
+    store->entries = grown; store->capacity *= 2; // Commit only after realloc succeeds
+}
+```
+
+### Returning allocated struct
+```c
+kv_store_entry_t *copy = malloc(sizeof(kv_store_entry_t)); // Check for NULL before use
+copy->key = string_copy(entry->key);     // string_copy() may also return NULL
+copy->value = string_copy(entry->value);
+return copy; // Caller must free this
+```
+
+## Debugging Tips
+
+1. **Valgrind** for memory leaks: `valgrind ./bin/kvstore`
+2. **GDB** for stepping: `gdb ./bin/kvstore`
+3. **Add printf statements** to trace execution
+4. **Check return codes** - most functions signal errors via -1 or NULL
+5. **Test edge cases**: NULL inputs, empty store, duplicate keys
+
+## Common Mistakes to Avoid
+
+1. **Forgetting to copy strings** - Just assigning pointers will cause use-after-free
+2. **Not NULL-checking allocations** - malloc can fail
+3. **Confusion on ownership** - Who allocated this? Who must free it?
+4. **Buffer overruns** - When resizing array, check capacity is adequate
+5. **Off-by-one in loops** - Easy to miss the last entry or go past array bounds
+6. **const correctness** - If you can't modify, declare const
+
+## Testing Strategy (Future)
+
+When you add Unity tests:
+```c
+// Test entry lifecycle
+TEST_ASSERT_NOT_NULL(entry);
+TEST_ASSERT_EQUAL_STRING("key", entry->key);
+
+// Test store operations (per kv_store.h, set returns 0 when it adds a new key)
+TEST_ASSERT_EQUAL_INT(0, kv_store_set_entry(store, entry));
+TEST_ASSERT_NOT_NULL(kv_store_get_entry(store, "key"));
+
+// Test edge cases
+TEST_ASSERT_NULL(kv_store_get_entry(NULL, "key"));
+```
+
+## Quick Reference: Function Checklist
+
+### string.c
+- [ ] `string_copy()` - Return malloc'd copy or NULL
+- [ ] `string_compare()` - 0 if equal, negative/positive if different
+- [ ] `string_trim()` - Return malloc'd trimmed copy or NULL
+- [ ] `string_search()` - Return index or -1
+- [ ] `string_free()` - Free with NULL check
+
+### kv_store.c
+- [ ] `kv_store_entry_init()` - Allocate entry, copy strings
+- [ ] `kv_store_entry_copy()` - Allocate copy, copy strings
+- [ ] `kv_store_entry_free()` - Free key, value, then entry
+- [ ] `kv_store_init()` - Allocate store and entries array
+- [ ] `kv_store_free()` - Free all entries, then store
+- [ ] `kv_store_get_entry()` - Find and return copy (or NULL)
+- [ ] `kv_store_set_entry()` - Find/add and copy, return 1/0/-1
+- [ ] `kv_store_delete_entry()` - Find/remove, return 1/0/-1
+
+### cli.c
+- [ ] `cli_print_help()` - Print usage to stdout
+- [ ] `cli_execute()` - Parse argv, call store ops, return result
+- [ ] `cli_print_result()` - Format and print result
+
+## Notes for AI Assistants
+
+If you're helping with this project:
+
+1. **Focus on learning**: Provide hints and clarifications, not complete solutions
+2. **Respect the design**: Use the patterns and conventions established
+3. **Memory safety**: Always think about allocation, ownership, and freeing
+4. **Testing mindset**: Suggest test cases and edge cases
+5. **Documentation**: Functions should have clear comments
+6. **Error handling**: Follow the 1/0/-1 pattern consistently
+
+The goal is building understanding, not just shipping code.
\ No newline at end of file
diff --git a/README.md b/README.md
index e230b85..7c64e55 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,253 @@
-# C Learning Project: Key-Value store
+# C Learning Project: Key-Value Store
 
-The purpose of this project is to learn c development from first principles using clang and make with the loose goal of creating an RDBMS.
+An educational project to learn C development from first principles, working toward building an RDBMS.
+
+## Project Goals
+
+- Learn C development fundamentals: memory management, pointers, build systems, modularity
+- Build a simple key-value store CLI application
+- Incrementally work toward understanding relational database concepts
+
+## Current Status
+
+**Phase 1: Design & Foundation**
+- ✅ Build system (Makefile with clang)
+- ✅ Core data structures designed (kvstore, entries)
+- ✅ CLI module foundation started
+- ✅ String utility library interface designed
+- ⏳ Implementation in progress
+
+## Project Structure
+
+```
+kvstore/
+├── Makefile              # Build configuration
+├── README.md             # This file
+├── ARCHITECTURE.md       # Design decisions and module overview
+├── include/
+│   ├── kv_store.h        # Key-value store interface
+│   ├── cli.h             # CLI module interface
+│   └── string.h          # String utility functions
+├── src/
+│   ├── main.c            # Entry point
+│   ├── cli.c             # CLI implementation (in progress)
+│   ├── kv_store.c        # Key-value store implementation (not started)
+│   └── string.c          # String utilities (not started)
+├── build/                # Object files (generated)
+└── bin/                  # Binary output (generated)
+```
+
+## Building
+
+```bash
+make          # Compile the project
+make run      # Build and run
+make clean    # Remove build artifacts
+```
+
+## Modules
+
+### KV Store (`include/kv_store.h`)
+Core key-value store with:
+- **Entry management**: Create, copy, and free individual key-value pairs
+- **Store lifecycle**: Initialize and free stores with configurable capacity
+- **CRUD operations**: Get, set, delete, and list entries
+- **Error handling**: Consistent return codes (1=success, 0=not found, -1=error)
+
+### CLI (`include/cli.h`)
+Command-line interface for batch operations:
+- Parse command-line arguments
+- Execute kvstore commands
+- Display help and results
+
+### String Utilities (`include/string.h`)
+Helper functions for string operations:
+- Copy, compare, trim, and search strings
+- Safe memory management for dynamic strings
+
+## Design Principles
+
+- **Ownership is explicit**: Every allocated pointer is owned by someone who must free it
+- **Separation of concerns**: kvstore provides data/status; CLI formats and displays
+- **Batch mode**: Single execution per program run; persistence through files (future)
+- **Error handling**: Consistent, simple return codes rather than exceptions
+- **Learning-focused**: Prioritize clarity and understanding over optimization
+
+## Learning Focus Areas
+
+1. **Memory management**: malloc, free, ownership, pointers
+2. **C idioms**: Out parameters, return codes, struct lifecycle
+3. **Modularity**: Clear interfaces, separation of concerns
+4. **Build systems**: Makefiles, compilation, linking
+5. **String handling**: C strings, pointer semantics
+
+See `ARCHITECTURE.md` for detailed design decisions and implementation notes.
+```
+
+Now for the architecture file:
+
+```
+# Architecture & Design
+
+## Data Structures
+
+### kv_store_entry_t
+```c
+typedef struct {
+    char *key;
+    char *value;
+} kv_store_entry_t;
+```
+
+A single key-value pair. Both key and value are dynamically allocated strings (C's `char *`). This structure is relatively simple and will evolve as we add persistence and type support.
+
+### kv_store_t
+```c
+typedef struct {
+    kv_store_entry_t *entries; // Dynamic array of entries
+    int length;                // Current number of entries
+    int capacity;              // Allocated capacity
+} kv_store_t;
+```
+
+The main store container. Uses a dynamic array (vector-like) for storage. Tracks both used entries and available capacity.
+
+**Design note:** Uses simple array storage for learning purposes. Later evolution might include:
+- Hash tables for average O(1) lookup
+- B-trees for sorted iteration and range queries
+- Disk persistence
+
+## Modules
+
+### kv_store (Core Data Structure)
+
+**Status:** Interface designed, implementation pending
+
+**Key functions:**
+- `kv_store_entry_init()`: Allocate and initialize an entry
+- `kv_store_entry_copy()`: Create a deep copy of an entry
+- `kv_store_entry_free()`: Free an entry's memory
+- `kv_store_init()`: Create an empty store with initial capacity
+- `kv_store_free()`: Free a store and all its entries
+- `kv_store_get_entry()`: Retrieve an entry (returns allocated copy)
+- `kv_store_set_entry()`: Add or update an entry
+- `kv_store_delete_entry()`: Remove an entry
+
+**Design decisions:**
+1. **Copying on get**: `get_entry()` returns an allocated copy of the entry. This ensures the caller cannot modify the store's internal state and protects against use-after-free if the store changes.
+2. **Copying on set**: When storing an entry, we deep-copy the key/value strings. This prevents external modifications and clarifies ownership.
+3. **Error codes**:
+   - `1` = success/found
+   - `0` = not found/created new
+   - `-1` = error
+4. **Pointer parameters**: Store operations that modify take `kv_store_t *` (not const). Read-only operations take `const kv_store_t *`.
+
+### CLI (Command-line Interface)
+
+**Status:** Interface designed, implementation pending
+
+**Key functions:**
+- `cli_print_help()`: Display usage information and commands
+- `cli_execute()`: Parse and execute a command
+- `cli_print_result()`: Format and display results
+
+**Design:**
+- **Batch mode**: Single execution per program invocation
+- **Help handling**: `main` checks for `--help` or `-h` before calling `cli_execute()`
+- **GNU-style**: Follow standard CLI conventions for help text and error messages
+
+**Supported commands (planned):**
+- `set `: Store a value
+- `get `: Retrieve a value
+- `delete `: Remove an entry
+- `list`: Show all entries
+
+### String Utilities
+
+**Status:** Interface designed, implementation pending
+
+Simple helpers for string operations with safe memory management:
+- `string_copy()`: Allocate and copy a string
+- `string_compare()`: Compare two strings
+- `string_trim()`: Copy with whitespace trimming
+- `string_search()`: Find substring
+- `string_free()`: Safe free (NULL-safe)
+
+## Implementation Notes
+
+### Memory Management Pattern
+
+The project uses this consistent pattern:
+1. **Allocation functions** return pointers and document that the caller owns the memory
+2. **Free functions** take pointers and handle NULL safely
+3. **Read operations** return allocated copies, not internal references
+4. **Modification operations** deep-copy input data to maintain ownership
+
+Example:
+```c
+// Caller allocates and owns
+kv_store_entry_t *entry = kv_store_entry_init("key", "value");
+
+// Store makes its own copy when storing
+kv_store_set_entry(store, entry);
+
+// Caller must free their copy
+kv_store_entry_free(entry);
+
+// When reading, get a new copy to work with
+kv_store_entry_t *retrieved = kv_store_get_entry(store, "key");
+// ... use retrieved ...
+kv_store_entry_free(retrieved);
+```
+
+### Error Handling
+
+C doesn't have exceptions. We use:
+- **Return codes** for operational success/failure
+- **NULL pointers** to indicate allocation failures (or, for `kv_store_get_entry()`, a missing key)
+- **Documentation** to clarify what each code means
+
+The library itself prints no error messages; reporting is a CLI concern.
+
+## Next Implementation Steps
+
+### Phase 1: Core Store (High Priority)
+1. Implement `string.c` - string utilities
+2. Implement `kv_store.c` - core store operations
+3. Write basic tests (using Unity framework)
+4. Test with simple program
+
+### Phase 2: CLI (Medium Priority)
+1. Implement `cli.c` - help display
+2. Implement `cli_execute()` - command parsing and routing
+3. Wire commands to store operations
+4. Test each command
+
+### Phase 3: Persistence (Future)
+1. Add file I/O to load/save stores
+2. Consider simple serialization format
+3. Handle startup with existing data
+
+### Phase 4: Advanced Features (Future)
+1. Internal data structure improvements (hash table, B-tree)
+2. Type support (int, float, blob)
+3. Transactions or multiple stores
+4. Performance optimization
+
+## Testing Strategy
+
+**Future:** Use Unity testing framework
+- Unit tests for each module
+- Integration tests for CLI commands
+- Edge cases: empty store, duplicate keys, NULL inputs
+
+## Lessons Learned & Teaching Points
+
+As you implement, pay attention to:
+1. **Pointers and ownership**: Who allocates, who frees?
+2. **const correctness**: What can and cannot be modified?
+3. **Error propagation**: How do errors bubble up from library to CLI?
+4. **Interface design**: How do you make it easy to use correctly and hard to use incorrectly?
+5. **Memory safety**: Are there ways this could leak or crash?
+
+This is learning code: clarity and correctness matter more than optimization.
\ No newline at end of file
diff --git a/include/kv_store.h b/include/kv_store.h
index e69de29..9e884ba 100644
--- a/include/kv_store.h
+++ b/include/kv_store.h
@@ -0,0 +1,91 @@
+#ifndef KV_STORE_H
+#define KV_STORE_H
+
+/*
+ * Key/Value structure containing a key and associated value.
+ */
+typedef struct {
+    char *key;
+    char *value;
+} kv_store_entry_t;
+
+/*
+ * Collection structure containing the entries of a Key/Value store.
+ */
+typedef struct {
+    kv_store_entry_t *entries;
+    int length;   // Current length of store
+    int capacity; // Capacity of store
+} kv_store_t;
+
+/*
+ * kv_store_entry_init - Initialize a Key/Value Store Entry
+ * @key: Key for the entry
+ * @value: Value for the entry
+ *
+ * Returns: Pointer to newly allocated kv_store_entry_t, or NULL on failure.
+ * Caller must free the returned pointer with kv_store_entry_free().
+ */
+kv_store_entry_t *kv_store_entry_init(const char *key, const char *value);
+
+/*
+ * kv_store_entry_copy - Copy a Key/Value Store Entry
+ * @entry: Entry to copy
+ *
+ * Returns: Pointer to newly allocated copy of kv_store_entry_t, or NULL on
+ * failure. Caller must free the returned pointer with kv_store_entry_free().
+ */
+kv_store_entry_t *kv_store_entry_copy(const kv_store_entry_t *entry);
+
+/*
+ * kv_store_entry_free - Free a Key/Value Store Entry
+ * @entry: Entry to free
+ */
+void kv_store_entry_free(kv_store_entry_t *entry);
+
+/*
+ * kv_store_init - Initialize a Key/Value Store
+ * @capacity: Initial capacity of the store
+ *
+ * Returns: Pointer to newly allocated kv_store_t, or NULL on failure. Caller
+ * must free the returned pointer with kv_store_free().
+ */
+kv_store_t *kv_store_init(int capacity);
+
+/*
+ * kv_store_free - Free a Key/Value Store
+ * @store: Store to free
+ */
+void kv_store_free(kv_store_t *store);
+
+/*
+ * kv_store_get_entry - Get Key/Value Store Entry from Store
+ * @store: KV Store
+ * @key: Key to get from store.
+ *
+ * Returns: Pointer to newly allocated copy of the entry, or NULL if not found
+ * or on failure. Caller must free the returned pointer with kv_store_entry_free().
+ */
+kv_store_entry_t *kv_store_get_entry(const kv_store_t *store, const char *key);
+
+/*
+ * kv_store_set_entry - Set Key/Value Store Entry to Store
+ * @store: KV Store
+ * @entry: Entry to add or update in store; the store makes its own copy.
+ *
+ * Returns: 1 if value is updated, 0 if the value was added, or -1 on
+ * failure.
+ */
+int kv_store_set_entry(kv_store_t *const store, const kv_store_entry_t *entry);
+
+/*
+ * kv_store_delete_entry - Delete Key/Value Store Entry from Store
+ * @store: KV Store
+ * @key: Key to delete from store.
+ *
+ * Returns: 1 if the value was deleted, 0 if the value was not found, or -1
+ * on failure.
+ */
+int kv_store_delete_entry(kv_store_t *const store, const char *key);
+
+#endif
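
The header above documents `kv_store_set_entry()` as returning 1 on update, 0 on add, and -1 on failure, but the diff does not include an implementation. The following is a minimal sketch of one way that contract might be met, not the project's actual code. The helpers `dup_string()` and `find_index()` are hypothetical names introduced here (standing in for the planned `string_copy()` and the linear search pattern), and the starting capacity of 4 for an empty store is an assumption.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *key;
    char *value;
} kv_store_entry_t;

typedef struct {
    kv_store_entry_t *entries;
    int length;
    int capacity;
} kv_store_t;

/* Hypothetical helper: heap-allocated copy of s, or NULL if malloc fails. */
static char *dup_string(const char *s) {
    size_t size = strlen(s) + 1;
    char *copy = malloc(size);
    if (copy != NULL) memcpy(copy, s, size);
    return copy;
}

/* Hypothetical helper: index of key in the store, or -1 if absent. */
static int find_index(const kv_store_t *store, const char *key) {
    for (int i = 0; i < store->length; i++) {
        if (strcmp(store->entries[i].key, key) == 0) return i;
    }
    return -1;
}

int kv_store_set_entry(kv_store_t *const store, const kv_store_entry_t *entry) {
    if (store == NULL || entry == NULL || entry->key == NULL || entry->value == NULL)
        return -1;

    int index = find_index(store, entry->key);
    if (index >= 0) {
        /* Update: copy first, so a failed allocation keeps the old value. */
        char *value = dup_string(entry->value);
        if (value == NULL) return -1;
        free(store->entries[index].value);
        store->entries[index].value = value;
        return 1;
    }

    if (store->length == store->capacity) {
        /* Grow through a temporary: on failure the old array stays valid. */
        int new_capacity = store->capacity > 0 ? store->capacity * 2 : 4;
        kv_store_entry_t *grown =
            realloc(store->entries, (size_t)new_capacity * sizeof(kv_store_entry_t));
        if (grown == NULL) return -1;
        store->entries = grown;
        store->capacity = new_capacity;
    }

    /* Add: the store keeps its own deep copies of both strings. */
    char *key = dup_string(entry->key);
    char *value = dup_string(entry->value);
    if (key == NULL || value == NULL) {
        free(key);
        free(value);
        return -1;
    }
    store->entries[store->length].key = key;
    store->entries[store->length].value = value;
    store->length++;
    return 0;
}
```

Because every allocation is checked before the store is modified, a failure at any step returns -1 and leaves the existing entries, length, and capacity untouched.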