Add Development Guide to Project and implement kv_store.h.

2025-12-19 22:52:11 -07:00
parent 023228e9e8
commit 0c65490a1a
3 changed files with 640 additions and 2 deletions

297
DEVELOPMENT.md Normal file

@@ -0,0 +1,297 @@
# Development Guide
This document captures the design decisions, architecture, and implementation strategy for the kvstore project. Use this when returning to the project or working with an AI assistant.
## Project Philosophy
**Learning First**: This is an educational project prioritizing understanding over optimization.
**Explicit Over Implicit**: C makes you manage memory explicitly. We lean into this—every allocation and free is intentional and documented.
**Batch Processing**: The application runs, executes one command, and exits. No persistent state between invocations (yet).
**String-Only Storage**: Keys and values are always strings. Simple but sufficient for the learning phase.
## Current Design State
### Data Model
```
Entry: key (char*) -> value (char*)
Store: Dynamic array of entries with length and capacity tracking
```
Key insight: The store is not a hash table yet—it's a simple array. This means O(n) lookups, but it's easier to understand and implement. Future evolution: hash table or B-tree.
### Memory Ownership Rules
**Rule 1: Allocation implies ownership**
- Whoever calls `malloc` (directly or indirectly) is responsible for calling `free`
- Functions document what they allocate and return
**Rule 2: Deep copying on boundaries**
- When you pass data to the store, it makes its own copy
- When you retrieve data from the store, you get a copy
- This prevents external code from accidentally modifying internal state
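A minimal sketch of the copy that Rule 2 relies on, assuming `string_copy()` takes a `const char *` and returns a heap-allocated duplicate (the exact signature is not shown in this guide, so treat it as an assumption):

```c
#include <stdlib.h>
#include <string.h>

/* Assumed shape of string_copy(): returns a heap-allocated duplicate of src,
 * or NULL if src is NULL or the allocation fails. Caller must free it. */
char *string_copy(const char *src) {
    if (src == NULL) return NULL;
    size_t size = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *dst = malloc(size);
    if (dst == NULL) return NULL;    /* allocation failed */
    memcpy(dst, src, size);
    return dst;
}
```

Because the store would hold the duplicate rather than the caller's pointer, the caller can later change or free its own buffer without affecting stored data.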
**Rule 3: Pointer parameters indicate mutability**
- `const kv_store_t *store` → function only reads, cannot modify
- `kv_store_t *const store` → can modify store, cannot change the pointer itself
- `kv_store_t *store` → can modify store and reassign its local pointer (should not appear in read-only functions)
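The difference between these qualifiers is easiest to see in function definitions. The sketch below uses two hypothetical helpers, `kv_store_is_full()` and `kv_store_swap_entries()`, which are not part of the project's interface; the struct definitions mirror `kv_store.h`:

```c
#include <stddef.h>

typedef struct { char *key; char *value; } kv_store_entry_t;
typedef struct { kv_store_entry_t *entries; int length; int capacity; } kv_store_t;

/* Read-only: the pointee is const, so `store->length = 0;` would be a
 * compile error inside this function. Returns 1, 0, or -1 for a NULL store. */
int kv_store_is_full(const kv_store_t *store) {
    if (store == NULL) return -1;
    return store->length == store->capacity ? 1 : 0;
}

/* Mutating: the pointee is writable, but the parameter itself is const,
 * so `store = NULL;` would be a compile error inside this function. */
int kv_store_swap_entries(kv_store_t *const store, int i, int j) {
    if (store == NULL || i < 0 || j < 0 ||
        i >= store->length || j >= store->length) {
        return -1;
    }
    kv_store_entry_t tmp = store->entries[i];
    store->entries[i] = store->entries[j];
    store->entries[j] = tmp;
    return 1;
}
```

Note that the `*const` qualifier only constrains the function body. Callers see no difference between `kv_store_t *const store` and `kv_store_t *store`.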
### Error Handling Convention
Kvstore operations that report a status return an `int`:
```
1 = success / found
0 = no error, but a different outcome (not found / created new)
-1 = error (allocation failed, etc.)
```
C has no exceptions, so every failure is reported through a return value.
Functions that return a pointer signal failure, including allocation failure, with NULL. Always check before dereferencing.
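One way a caller might turn these codes into text, sketched with hypothetical names (`KV_OK`, `KV_NOT_FOUND`, `KV_ERROR`, and `kv_status_describe()` are illustrative and not part of the project):

```c
/* Hypothetical named constants for the 1/0/-1 convention. */
enum { KV_ERROR = -1, KV_NOT_FOUND = 0, KV_OK = 1 };

/* Maps a status code to a short, static description.
 * The returned string is not allocated and must not be freed. */
const char *kv_status_describe(int status) {
    switch (status) {
    case KV_OK:        return "success";
    case KV_NOT_FOUND: return "not found or newly created";
    case KV_ERROR:     return "error";
    default:           return "unknown status";
    }
}
```

Naming the codes keeps call sites readable, for example `if (rc == KV_ERROR)` instead of `if (rc == -1)`.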
### Function Patterns
**Creating allocations:**
```c
TYPE *kv_store_entry_init(...)
// Returns pointer to newly allocated struct
// Caller must call kv_store_entry_free() when done
```
**Freeing:**
```c
void kv_store_entry_free(TYPE *ptr)
// Takes pointer to free
// Should be NULL-safe (check for NULL before free)
```
**Reading:**
```c
TYPE *kv_store_get_entry(...)
// Returns newly allocated COPY of entry
// Caller must free the returned pointer
```
**Modifying:**
```c
int kv_store_set_entry(TYPE *store, const TYPE *input)
// Takes pointer to store and const input
// Makes its own copy of input data
// Returns status code (1/0/-1)
```
## Module Breakdown
### kv_store (src/kv_store.c / include/kv_store.h)
**Current State**: Interface complete, implementation needed
**Core operations**:
1. `kv_store_init(capacity)` - Create empty store
2. `kv_store_set_entry(store, entry)` - Add or update
3. `kv_store_get_entry(store, key)` - Retrieve (returns copy)
4. `kv_store_delete_entry(store, key)` - Remove
5. `kv_store_free(store)` - Clean up
**Entry operations**:
1. `kv_store_entry_init(key, value)` - Create entry
2. `kv_store_entry_copy(entry)` - Deep copy
3. `kv_store_entry_free(entry)` - Cleanup
**Implementation considerations**:
- Entries array may need resizing (when length reaches capacity)
- Need way to find entries by key (linear search for now: O(n))
- Deletion might use swap-and-pop or mark-as-deleted
- All strings must be copied, not just pointer assignments
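As an illustration of the swap-and-pop option, here is one possible shape for `kv_store_delete_entry()`. It assumes the store owns each key and value string and uses `strcmp` directly in place of the project's `string_compare()`; it is a sketch, not the project's implementation:

```c
#include <stdlib.h>
#include <string.h>

typedef struct { char *key; char *value; } kv_store_entry_t;
typedef struct { kv_store_entry_t *entries; int length; int capacity; } kv_store_t;

/* Swap-and-pop: overwrite the removed slot with the last entry, then shrink.
 * Removal itself is O(1) after the O(n) search, but entry order changes.
 * Returns 1 if deleted, 0 if not found, -1 on invalid arguments. */
int kv_store_delete_entry(kv_store_t *const store, const char *key) {
    if (store == NULL || key == NULL) return -1;
    for (int i = 0; i < store->length; i++) {
        if (strcmp(store->entries[i].key, key) != 0) continue;
        free(store->entries[i].key);        /* the store owns both strings */
        free(store->entries[i].value);
        int last = store->length - 1;
        if (i != last) store->entries[i] = store->entries[last];
        store->entries[last].key = NULL;    /* avoid a stale duplicate pointer */
        store->entries[last].value = NULL;
        store->length = last;
        return 1;
    }
    return 0;
}
```

Mark-as-deleted would keep the order stable at the cost of skipping dead slots during every search, which is why swap-and-pop is often the simpler choice when order does not matter.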
### cli (src/cli.c / include/cli.h)
**Current State**: Interface complete, minimal implementation
**Key functions**:
1. `cli_print_help()` - Display GNU-style help
2. `cli_execute(argc, argv)` - Parse and execute command
3. `cli_print_result(result)` - Format output
**Behavior**:
- `main.c` checks for `--help` or `-h` first, before calling `cli_execute`
- If help found, call `cli_print_help()` and exit(0)
- Otherwise, pass full argv to `cli_execute()`
- `cli_execute()` parses `argv[1]` (the first argument after the program name) as the command
- Routes to appropriate store operation
- Returns `cli_result_t` with status and message
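The help check described above might look like the following. `cli_wants_help()` is a hypothetical helper name, and scanning every argument (rather than only `argv[1]`) is an assumption:

```c
#include <string.h>

/* Hypothetical helper for main.c: returns 1 if any argument after the
 * program name is exactly "--help" or "-h", otherwise 0. */
int cli_wants_help(int argc, char *argv[]) {
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--help") == 0 || strcmp(argv[i], "-h") == 0) {
            return 1;
        }
    }
    return 0;
}
```

One trade-off of scanning all arguments: `kvstore set key -h` would print help instead of storing the value `-h`. Checking only `argv[1]` avoids that if it matters.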
**Future commands** (not yet implemented):
```
kvstore get <key> # Retrieve a value
kvstore set <key> <value> # Store a value
kvstore delete <key> # Remove a value
kvstore list # Show all entries
kvstore --help, -h # Show help
```
### string (src/string.c / include/string.h)
**Current State**: Interface complete, implementation needed
**Functions**:
1. `string_copy(src)` - Allocate and copy
2. `string_compare(s1, s2)` - Like strcmp
3. `string_trim(str)` - Copy with whitespace trimmed
4. `string_search(src, search)` - Find substring (returns index or -1)
5. `string_free(str)` - Safe free (NULL-safe)
**Use case**: These are helpers for CLI parsing and kvstore string operations.
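Possible implementations of the two less obvious helpers, assuming `string_search()` returns an `int` index and `string_trim()` takes a `const char *` and returns a new allocation (both signatures are inferred from the descriptions above, not confirmed):

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns the index of the first occurrence of search in src, or -1 if it
 * is absent or either argument is NULL. */
int string_search(const char *src, const char *search) {
    if (src == NULL || search == NULL) return -1;
    const char *hit = strstr(src, search);
    return hit == NULL ? -1 : (int)(hit - src);
}

/* Returns a heap-allocated copy of str without leading or trailing
 * whitespace, or NULL on NULL input or allocation failure. */
char *string_trim(const char *str) {
    if (str == NULL) return NULL;
    /* isspace() requires a value representable as unsigned char */
    while (*str != '\0' && isspace((unsigned char)*str)) str++;
    size_t len = strlen(str);
    while (len > 0 && isspace((unsigned char)str[len - 1])) len--;
    char *out = malloc(len + 1);
    if (out == NULL) return NULL;
    memcpy(out, str, len);
    out[len] = '\0';
    return out;
}
```

Returning a copy from `string_trim()` keeps it consistent with the ownership rules: the input is never modified, and the caller frees the result with `string_free()`.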
## Implementation Roadmap
### Week 1: Foundation
1. Implement `string.c` functions
- `string_copy()` - basic malloc + strcpy wrapper
- `string_compare()` - wrapper for strcmp
- `string_free()` - free with NULL check
- Test manually in main
2. Implement `kv_store.c` basics
- `kv_store_entry_init()` - allocate entry, copy strings
- `kv_store_entry_free()` - free both key and value
- `kv_store_entry_copy()` - allocate and copy
- `kv_store_init()` - allocate store, allocate empty entries array
- `kv_store_free()` - free all entries, then store
3. Test with simple program in main.c
### Week 2: Core CRUD
1. `kv_store_get_entry()` - Linear search, return copy
2. `kv_store_set_entry()` - Find or append, copy input
3. `kv_store_delete_entry()` - Find and remove
4. Add array resizing when capacity is reached
5. Test each operation manually
### Week 3: CLI Integration
1. Implement `cli_print_help()` - GNU-style output
2. Implement `cli_execute()` - Command parsing
3. Wire commands to store operations
4. Update `main.c` to check for help flag
5. Test: `./kvstore --help`, `./kvstore set x y`, `./kvstore get x`
### Week 4+: Polish & Future
1. Add `string_trim()` and `string_search()` if needed
2. Consider testing framework (Unity)
3. File persistence (load/save)
4. Performance improvements (hash table)
## Common Patterns & Idioms
### NULL-safe free functions
```c
void kv_store_entry_free(kv_store_entry_t *entry) {
if (entry == NULL) return; // NULL-safe
string_free(entry->key);
string_free(entry->value);
free(entry);
}
```
### Linear search pattern
```c
for (int i = 0; i < store->length; i++) {
if (string_compare(store->entries[i].key, key) == 0) {
// Found at index i
return i;
}
}
// Not found
return -1;
```
### Array resizing pattern (for future)
```c
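if (store->length == store->capacity) {
    // Start at 1 when capacity is 0, since 0 * 2 would never grow
    int new_capacity = store->capacity > 0 ? store->capacity * 2 : 1;
    // Keep the old pointer until realloc succeeds; on failure it is still valid
    kv_store_entry_t *grown = realloc(store->entries,
        (size_t)new_capacity * sizeof(kv_store_entry_t));
    if (grown == NULL) return -1;
    store->entries = grown;
    store->capacity = new_capacity;
}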
```
### Returning allocated struct
```c
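kv_store_entry_t *copy = malloc(sizeof(kv_store_entry_t));
if (copy == NULL) return NULL; // Allocation failed
copy->key = string_copy(entry->key);
copy->value = string_copy(entry->value);
// If either copy failed, release the partial entry (the free is NULL-safe)
if (copy->key == NULL || copy->value == NULL) { kv_store_entry_free(copy); return NULL; }
return copy; // Caller must free this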
```
## Debugging Tips
1. **Valgrind** for memory leaks: `valgrind ./bin/kvstore`
2. **GDB** for stepping: `gdb ./bin/kvstore`
3. **Add printf statements** to trace execution
4. **Check return codes** - most functions signal errors via -1 or NULL
5. **Test edge cases**: NULL inputs, empty store, duplicate keys
## Common Mistakes to Avoid
1. **Forgetting to copy strings** - Storing the caller's pointer instead of a copy leaves a dangling pointer (and a use-after-free) once the caller frees or reuses it
2. **Not NULL-checking allocations** - malloc can fail
3. **Confusion on ownership** - Who allocated this? Who must free it?
4. **Buffer overruns** - When resizing array, check capacity is adequate
5. **Off-by-one in loops** - Easy to miss the last entry or go past array bounds
6. **Skipping const** - If a function does not modify a parameter, declare it const
## Testing Strategy (Future)
When you add Unity tests:
```c
// Test entry lifecycle
TEST_ASSERT_NOT_NULL(entry);
TEST_ASSERT_EQUAL_STRING("key", entry->key);
// Test store operations (set returns 0 when the key is new, 1 on update)
TEST_ASSERT_EQUAL_INT(0, kv_store_set_entry(store, entry));
kv_store_entry_t *out = kv_store_get_entry(store, "key");
TEST_ASSERT_NOT_NULL(out);
kv_store_entry_free(out);
// Test edge cases
TEST_ASSERT_NULL(kv_store_get_entry(NULL, "key"));
```
## Quick Reference: Function Checklist
### string.c
- [ ] `string_copy()` - Return malloc'd copy or NULL
- [ ] `string_compare()` - 0 if equal, negative/positive if different
- [ ] `string_trim()` - Return malloc'd trimmed copy or NULL
- [ ] `string_search()` - Return index or -1
- [ ] `string_free()` - Free with NULL check
### kv_store.c
- [ ] `kv_store_entry_init()` - Allocate entry, copy strings
- [ ] `kv_store_entry_copy()` - Allocate copy, copy strings
- [ ] `kv_store_entry_free()` - Free key, value, then entry
- [ ] `kv_store_init()` - Allocate store and entries array
- [ ] `kv_store_free()` - Free all entries, then store
- [ ] `kv_store_get_entry()` - Find and return copy (or NULL)
- [ ] `kv_store_set_entry()` - Find/add and copy, return 1/0/-1 (updated/added/error)
- [ ] `kv_store_delete_entry()` - Find/remove, return 1/0/-1
### cli.c
- [ ] `cli_print_help()` - Print usage to stdout
- [ ] `cli_execute()` - Parse argv, call store ops, return result
- [ ] `cli_print_result()` - Format and print result
## Notes for AI Assistants
If you're helping with this project:
1. **Focus on learning**: Provide hints and clarifications, not complete solutions
2. **Respect the design**: Use the patterns and conventions established
3. **Memory safety**: Always think about allocation, ownership, and freeing
4. **Testing mindset**: Suggest test cases and edge cases
5. **Documentation**: Functions should have clear comments
6. **Error handling**: Follow the 1/0/-1 pattern consistently
The goal is building understanding, not just shipping code.

254
README.md

@@ -1,3 +1,253 @@
-# C Learning Project: Key-Value store
-The purpose of this project is to learn c development from first principles using clang and make with the loose goal of creating an RDBMS.
+# C Learning Project: Key-Value Store
+An educational project to learn C development from first principles, working toward building an RDBMS.
## Project Goals
- Learn C development fundamentals: memory management, pointers, build systems, modularity
- Build a simple key-value store CLI application
- Incrementally work toward understanding relational database concepts
## Current Status
**Phase 1: Design & Foundation**
- ✅ Build system (Makefile with clang)
- ✅ Core data structures designed (kvstore, entries)
- ✅ CLI module foundation started
- ✅ String utility library interface designed
- ⏳ Implementation in progress
## Project Structure
```
kvstore/
├── Makefile # Build configuration
├── README.md # This file
├── ARCHITECTURE.md # Design decisions and module overview
├── include/
│ ├── kv_store.h # Key-value store interface
│ ├── cli.h # CLI module interface
│ └── string.h # String utility functions
├── src/
│ ├── main.c # Entry point
│ ├── cli.c # CLI implementation (in progress)
│ ├── kv_store.c # Key-value store implementation (not started)
│ └── string.c # String utilities (not started)
├── build/ # Object files (generated)
└── bin/ # Binary output (generated)
```
## Building
```bash
make # Compile the project
make run # Build and run
make clean # Remove build artifacts
```
## Modules
### KV Store (`include/kv_store.h`)
Core key-value store with:
- **Entry management**: Create, copy, and free individual key-value pairs
- **Store lifecycle**: Initialize and free stores with configurable capacity
- **CRUD operations**: Get, set, delete, and list entries
- **Error handling**: Consistent return codes (1=success, 0=not found, -1=error)
### CLI (`include/cli.h`)
Command-line interface for batch operations:
- Parse command-line arguments
- Execute kvstore commands
- Display help and results
### String Utilities (`include/string.h`)
Helper functions for string operations:
- Copy, compare, trim, and search strings
- Safe memory management for dynamic strings
## Design Principles
- **Ownership is explicit**: Every allocated pointer is owned by someone who must free it
- **Separation of concerns**: kvstore provides data/status; CLI formats and displays
- **Batch mode**: Single execution per program run; persistence through files (future)
- **Error handling**: Consistent, simple return codes rather than exceptions
- **Learning-focused**: Prioritize clarity and understanding over optimization
## Learning Focus Areas
1. **Memory management**: malloc, free, ownership, pointers
2. **C idioms**: Out parameters, return codes, struct lifecycle
3. **Modularity**: Clear interfaces, separation of concerns
4. **Build systems**: Makefiles, compilation, linking
5. **String handling**: C strings, pointer semantics
See `ARCHITECTURE.md` for detailed design decisions and implementation notes.
# Architecture & Design
## Data Structures
### kv_store_entry_t
```c
typedef struct {
char *key;
char *value;
} kv_store_entry_t;
```
A single key-value pair. Both key and value are dynamically allocated strings (C's `char *`). This structure is relatively simple and will evolve as we add persistence and type support.
### kv_store_t
```c
typedef struct {
kv_store_entry_t *entries; // Dynamic array of entries
int length; // Current number of entries
int capacity; // Allocated capacity
} kv_store_t;
```
The main store container. Uses a dynamic array (vector-like) for storage. Tracks both used entries and available capacity.
**Design note:** Uses simple array storage for learning purposes. Later evolution might include:
- Hash tables for O(1) lookup
- B-trees for sorted iteration and range queries
- Disk persistence
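A possible starting point for the store lifecycle with this array-based layout. The signatures follow `kv_store.h`; rejecting a negative capacity and allowing a capacity of 0 are assumptions of this sketch:

```c
#include <stdlib.h>

typedef struct { char *key; char *value; } kv_store_entry_t;
typedef struct { kv_store_entry_t *entries; int length; int capacity; } kv_store_t;

/* Allocates an empty store. Returns NULL on invalid capacity or allocation
 * failure. Caller must release the result with kv_store_free(). */
kv_store_t *kv_store_init(int capacity) {
    if (capacity < 0) return NULL;
    kv_store_t *store = malloc(sizeof *store);
    if (store == NULL) return NULL;
    store->entries = NULL;
    if (capacity > 0) {
        /* calloc zero-fills the slots; on mainstream platforms the unused
         * key/value pointers therefore read as NULL */
        store->entries = calloc((size_t)capacity, sizeof *store->entries);
        if (store->entries == NULL) {
            free(store);
            return NULL;
        }
    }
    store->length = 0;
    store->capacity = capacity;
    return store;
}

/* Frees every stored string, then the array, then the store. NULL-safe. */
void kv_store_free(kv_store_t *store) {
    if (store == NULL) return;
    for (int i = 0; i < store->length; i++) {
        free(store->entries[i].key);
        free(store->entries[i].value);
    }
    free(store->entries);
    free(store);
}
```

The free order matters: strings first, then the array that holds the pointers to them, then the struct that holds the pointer to the array.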
## Modules
### kv_store (Core Data Structure)
**Status:** Interface designed, implementation pending
**Key functions:**
- `kv_store_entry_init()`: Allocate and initialize an entry
- `kv_store_entry_copy()`: Create a deep copy of an entry
- `kv_store_entry_free()`: Free an entry's memory
- `kv_store_init()`: Create an empty store with initial capacity
- `kv_store_free()`: Free a store and all its entries
- `kv_store_get_entry()`: Retrieve an entry (returns allocated copy)
- `kv_store_set_entry()`: Add or update an entry
- `kv_store_delete_entry()`: Remove an entry
**Design decisions:**
1. **Copying on get**: `get_entry()` returns an allocated copy of the entry. This ensures the caller cannot modify the store's internal state and protects against use-after-free if the store changes.
2. **Copying on set**: When storing an entry, we deep-copy the key/value strings. This prevents external modifications and clarifies ownership.
3. **Error codes**:
- `1` = success/found
- `0` = not found/created new
- `-1` = error
4. **Pointer parameters**: Store operations that modify take `kv_store_t *` (not const). Read-only operations take `const kv_store_t *`.
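Design decisions 2 to 4 can be combined into a single sketch of `kv_store_set_entry()`. The `dup_string()` helper stands in for the project's `string_copy()`, and the initial growth size of 4 is an arbitrary choice made for this example:

```c
#include <stdlib.h>
#include <string.h>

typedef struct { char *key; char *value; } kv_store_entry_t;
typedef struct { kv_store_entry_t *entries; int length; int capacity; } kv_store_t;

/* Stand-in for string_copy(): heap-allocated duplicate, or NULL on failure. */
char *dup_string(const char *s) {
    size_t size = strlen(s) + 1;
    char *copy = malloc(size);
    if (copy != NULL) memcpy(copy, s, size);
    return copy;
}

/* Returns 1 if an existing key was updated, 0 if a new entry was added,
 * -1 on invalid arguments or allocation failure. */
int kv_store_set_entry(kv_store_t *const store, const kv_store_entry_t *entry) {
    if (store == NULL || entry == NULL ||
        entry->key == NULL || entry->value == NULL) {
        return -1;
    }
    /* Copy the value first so a failed allocation leaves the store untouched. */
    char *value = dup_string(entry->value);
    if (value == NULL) return -1;

    for (int i = 0; i < store->length; i++) {
        if (strcmp(store->entries[i].key, entry->key) == 0) {
            free(store->entries[i].value);   /* replace the old value */
            store->entries[i].value = value;
            return 1;
        }
    }

    char *key = dup_string(entry->key);
    if (key == NULL) { free(value); return -1; }

    if (store->length == store->capacity) {
        int new_capacity = store->capacity > 0 ? store->capacity * 2 : 4;
        kv_store_entry_t *grown = realloc(store->entries,
            (size_t)new_capacity * sizeof(kv_store_entry_t));
        if (grown == NULL) { free(key); free(value); return -1; }
        store->entries = grown;
        store->capacity = new_capacity;
    }
    store->entries[store->length].key = key;
    store->entries[store->length].value = value;
    store->length++;
    return 0;
}
```

Every failure path frees whatever it allocated before returning, so the store is never left holding a half-written entry.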
### CLI (Command-line Interface)
**Status:** Interface designed, implementation pending
**Key functions:**
- `cli_print_help()`: Display usage information and commands
- `cli_execute()`: Parse and execute a command
- `cli_print_result()`: Format and display results
**Design:**
- **Batch mode**: Single execution per program invocation
- **Help handling**: Main checks for `--help` or `-h` before passing to cli_execute
- **GNU-style**: Follow standard CLI conventions for help text and error messages
**Supported commands (planned):**
- `set <key> <value>`: Store a value
- `get <key>`: Retrieve a value
- `delete <key>`: Remove an entry
- `list`: Show all entries
### String Utilities
**Status:** Interface designed, implementation pending
Simple helpers for string operations with safe memory management:
- `string_copy()`: Allocate and copy a string
- `string_compare()`: Compare two strings
- `string_trim()`: Copy with whitespace trimming
- `string_search()`: Find substring
- `string_free()`: Safe free (NULL-safe)
## Implementation Notes
### Memory Management Pattern
The project uses this consistent pattern:
1. **Allocation functions** return pointers and document that the caller owns the memory
2. **Free functions** take pointers and handle NULL safely
3. **Read operations** return allocated copies, not internal references
4. **Modification operations** deep-copy input data to maintain ownership
Example:
```c
// Caller allocates and owns
kv_store_entry_t *entry = kv_store_entry_init("key", "value");
// Store makes its own copy when storing
kv_store_set_entry(store, entry);
// Caller must free their copy
kv_store_entry_free(entry);
// When reading, get a new copy to work with
kv_store_entry_t *retrieved = kv_store_get_entry(store, "key");
// ... use retrieved ...
kv_store_entry_free(retrieved);
```
### Error Handling
C doesn't have exceptions. We use:
- **Return codes** for operational success/failure
- **NULL pointers** to indicate allocation failures
- **Documentation** to clarify what each code means
No exceptions or verbose error messages at the library level—those are CLI concerns.
## Next Implementation Steps
### Phase 1: Core Store (High Priority)
1. Implement `string.c` - string utilities
2. Implement `kv_store.c` - core store operations
3. Write basic tests (using Unity framework)
4. Test with simple program
### Phase 2: CLI (Medium Priority)
1. Implement `cli_print_help()` - help display
2. Implement `cli_execute()` - command parsing and routing
3. Wire commands to store operations
4. Test each command
### Phase 3: Persistence (Future)
1. Add file I/O to load/save stores
2. Consider simple serialization format
3. Handle startup with existing data
### Phase 4: Advanced Features (Future)
1. Internal data structure improvements (hash table, B-tree)
2. Type support (int, float, blob)
3. Transactions or multiple stores
4. Performance optimization
## Testing Strategy
**Future:** Use Unity testing framework
- Unit tests for each module
- Integration tests for CLI commands
- Edge cases: empty store, duplicate keys, NULL inputs
## Lessons Learned & Teaching Points
As you implement, pay attention to:
1. **Pointers and ownership**: Who allocates, who frees?
2. **const correctness**: What can and cannot be modified?
3. **Error propagation**: How do errors bubble up from library to CLI?
4. **Interface design**: How do you make it easy to use correctly and hard to use incorrectly?
5. **Memory safety**: Are there ways this could leak or crash?
This is learning code—clarity and correctness matter more than optimization.


@@ -0,0 +1,91 @@
#ifndef KV_STORE_H
#define KV_STORE_H
/*
* Key/Value structure containing a key and associated value.
*/
typedef struct {
char *key;
char *value;
} kv_store_entry_t;
/*
* Collection structure containing the entries of a Key/Value store.
*/
typedef struct {
kv_store_entry_t *entries;
int length; // Current length of store
int capacity; // Capacity of store
} kv_store_t;
/*
* kv_store_entry_init - Initialize a Key/Value Store Entry
* @key: Key for the entry
* @value: Value for the entry
*
* Returns: Pointer to newly allocated kv_store_entry_t, or NULL on failure.
* Caller must free the returned pointer.
*/
kv_store_entry_t *kv_store_entry_init(const char *key, const char *value);
/*
* kv_store_entry_copy - Copy a Key/Value Store Entry
* @entry: Entry to copy
*
* Returns: Pointer to newly allocated copy of kv_store_entry_t, or NULL on
* failure. Caller must free the returned pointer.
*/
kv_store_entry_t *kv_store_entry_copy(const kv_store_entry_t *entry);
/*
* kv_store_entry_free - Free a Key/Value Store Entry
* @entry: Entry to free
*/
void kv_store_entry_free(kv_store_entry_t *entry);
/*
* kv_store_init - Initialize a Key/Value Store
* @capacity: Initial capacity of the store
*
* Returns: Pointer to newly allocated kv_store_t, or NULL on failure. Caller
* must free the returned pointer.
*/
kv_store_t *kv_store_init(int capacity);
/*
* kv_store_free - Free a Key/Value Store
* @store: Store to free
*/
void kv_store_free(kv_store_t *store);
/*
* kv_store_get_entry - Get Key/Value Store Entry from Store
* @store: KV Store
* @key: Key to get from store.
*
* Returns: Pointer to newly allocated copy of the entry, or NULL if not found
* or on failure. Caller must free the returned pointer.
*/
kv_store_entry_t *kv_store_get_entry(const kv_store_t *store, const char *key);
/*
* kv_store_set_entry - Set Key/Value Store Entry to Store
* @store: KV Store
* @entry: Entry to set or update in store. The store keeps its own copy.
*
* Returns: 1 if value is updated, 0 if the value was added, or -1 on
* failure.
*/
int kv_store_set_entry(kv_store_t *const store, const kv_store_entry_t *entry);
/*
* kv_store_delete_entry - Delete Key/Value Store Entry from Store
* @store: KV Store
* @key: Key to delete from store.
*
* Returns: 1 if the value was deleted, 0 if the value was not found, or -1
* on failure.
*/
int kv_store_delete_entry(kv_store_t *const store, const char *key);
#endif