diff --git a/README.md b/README.md
index 548305b..bc913b1 100644
--- a/README.md
+++ b/README.md
@@ -1,27 +1,30 @@
-# Datum [](https://github.com/ceticamarco/datum/actions/workflows/datum.yml)
+
+
Datum
+ Collection of dynamic and generic data structures.
+
+[](https://github.com/ceticamarco/datum/actions/workflows/datum.yml)
+
+
Datum is a collection of dynamic and generic data structures implemented from scratch in C with no external dependencies beyond
the standard library. It currently features:
-- **Vector**: a growable, contiguous array supporting homogenous data types (both primitives and user-defined types);
+- **Vector**: a growable, contiguous array of homogenous generic data types;
- **Map**: an associative array that handles generic heterogenous data types;
-To learn more about the memory model of this library as well as the technical details
-on how to use it efficiently and safely, be sure to read [the design manual](docs/manual.pdf).
-
## Usage
At its simplest, you can use this library as follows:
-### `Vector`
+### `Vector`'s usage
```c
#include
#include "src/vector.h"
/*
-* Compile with: gcc main.c src/vector.c
-* Output: First element: 5
-* Head of vector 6, size is now: 1
-*/
+ * Compile with: gcc main.c src/vector.c
+ * Output: First element: 5
+ * Head of vector 6, size is now: 1
+ */
int main(void) {
// Create an integer vector of initial capacity equal to 5
@@ -30,7 +33,8 @@ int main(void) {
// Add two numbers
int val = 5;
vector_push(vec, &val);
- vector_push(vec, &(int){6}); // Equivalent as above
+ // Equivalent as above
+ vector_push(vec, &(int){6});
// Print 1st element
const int first = *(int*)vector_get(vec, 0).value.element;
@@ -47,7 +51,7 @@ int main(void) {
}
```
-### `Map`
+### `Map`'s usage
```c
#include
@@ -60,16 +64,15 @@ typedef struct {
} Person;
/*
-* Compile with: gcc main.c src/map.c
-* Output: Name: Bob, Surname: Smith, Age: 34
-*/
+ * Compile with: gcc main.c src/map.c
+ * Output: Name: Bob, Surname: Smith, Age: 34
+ */
int main(void) {
// Create a new map
map_t *map = map_new().value.map;
- const Person bob = { .name = "Bob", .surname = "Smith", .age = 34 };
-
// Add a key to the map
+ const Person bob = { .name = "Bob", .surname = "Smith", .age = 34 };
map_add(map, "bob", (void*)&bob);
// Retrieve 'Bob' and check if it exists
@@ -77,8 +80,12 @@ int main(void) {
if (bob_res.status == MAP_ERR_NOT_FOUND) {
puts("This key does not exist.");
} else {
- const Person *retr = (const Person*)bob_res.value.element;
- printf("Name: %s, Surname: %s, Age: %d\n", retr->name, retr->surname, retr->age);
+ const Person *ret = (const Person*)bob_res.value.element;
+ printf("Name: %s, Surname: %s, Age: %d\n",
+ ret->name,
+ ret->surname,
+ ret->age
+ );
}
// Remove map from memory
@@ -98,21 +105,142 @@ $ make clean all
This will compile the library as well as the `usage.c` file and the unit tests. After that, you can run it by typing `./usage`.
## Technical Details
-As stated earlier, refer to [the design manual](docs/manual.pdf) for a comprehensive documentation of this library. Below, there's a quick
+In this section, you can find a quick overview of the technical aspects (internal design, memory layout, etc.) of this library as well as an
overview about the design choices behind Datum. While both structures use `void*` to represent values, the way they manage memory is orthogonally different
from one another. Let's start with the `Map` data type.
+### Map
`Map` is an hash table implementation that uses open addressing with linear probing for collision resolution and the
[FNV-1a algorithm](https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function) as its hashing function. Resizing is performed automatically
-by doubling the capacity when load factor exceeds 75%. The keys are **copied** by the hashmap. This means that the hashmap **owns** them and is responsible
-to manage their memory. Values, on the other hand, **are stored as pointers**. This means that the hashmap **does NOT own** them and the caller is responsible
-to manage their memory; this includes: allocate enough memory for them, ensure that the pointers remain valid for their whole lifecycle on the map,
+by doubling the capacity when load factor exceeds 75%. Internally, this data structure is represented
+by the following structures:
+
+```c
+typedef struct {
+ char *key;
+ void *value;
+ element_state_t state;
+} map_element_t;
+
+typedef struct {
+ map_element_t *elements;
+ size_t capacity;
+ size_t size;
+ size_t tombstone_count;
+} map_t;
+```
+where the `key` represent a string used to index the `value`. The state, instead, indicates
+whether the entry is empty, occupied or deleted and is primarily used by the garbage collector
+for internal memory management. An array of `map_element_t` as well as variables indicating
+the capacity, the current size and the tombstone count (that is, the number of deleted entries)
+forms a `map_t` data type.
+
+The keys are **copied** by the hashmap. This means that the hashmap **owns** them and is responsible
+to manage their memory. Values, on the other hand, **are stored as pointers**. This means that the hashmap **does NOT own them** and that the caller is responsible
+for managing their memory; this includes: allocate enough memory for them, ensure that the pointers remain valid for their whole lifecycle on the map,
delete old values when updating a key and, if the values were heap-allocated, free them before removing them or before destroying the map.
-`Vector`, instead, is a dynamic array with generic data type support. This means that you can store any kind of homogenous value on the data structure. As in the `Map`'s case,
-resizing is performed automatically by increasing the capacity by 1.5 times when the array is full. The dynamic array copies the values upon insertion, thus it is responsible
+The `Map` data structures supports the following methods:
+
+- `map_result_t map_new()`: initialize a new map;
+- `map_result_t map_add(map, key, value)`: add a `(key, value)` pair to the map;
+- `map_result_t map_get(map, key)`: retrieve a values indexed by `key` if it exists;
+- `map_result_t map_remove(map, key)`: remove a key from the map if it exists;
+- `map_result_t map_clear(map)`: reset the map state;
+- `map_result_t map_destroy(map)`: delete the map;
+- `size_t map_size(map)`: returns map size (i.e., the number of elements);
+- `size_t map_capacity(map)`: returns map capacity (i.e., map total size).
+
+As you can see, most methods that operates on the `Map` data type return a custom type called `map_result_t` which is defined as follows:
+
+```c
+typedef enum {
+ MAP_OK = 0x0,
+ MAP_ERR_ALLOCATE,
+ MAP_ERR_INVALID,
+ MAP_ERR_NOT_FOUND
+} map_status_t;
+
+typedef struct {
+ map_status_t status;
+ uint8_t message[RESULT_MSG_SIZE];
+ union {
+ map_t *map;
+ void *element;
+ } value;
+} map_result_t;
+```
+
+Each method that returns a `map_result_t` indicates whether the operation was successful or not by setting the `status` field and by providing a descriptive message on the `message` field.
+If the operation was successful (that is, `status == MAP_OK`), you can either move on with the flow
+of the program or read the returned
+value from the sum data type. Of course,
+you can choose to ignore the return value (if you're brave enough :D), as illustrated in the first example of this document.
+
+### Vector
+`Vector` is a dynamic array with generic data type support, this means that you can store any kind of homogenous value on this data structure. As in the `Map`'s case,
+resizing is performed automatically by increasing the capacity by 1.5 times when the array is full. Internally, this data structure is represented as follows:
+
+```c
+typedef struct {
+ size_t count;
+ size_t capacity;
+ size_t data_size;
+ void *elements;
+} vector_t;
+```
+
+where the `elements` represents the actual dynamic and generic array, the `data_size`
+variable indicates the size (in bytes) of the data type while the count and
+the capacity represent the number of stored elements and the total
+size of the structure, respectively. The dynamic array copies the values upon
+insertion, thus **it owns the data** and is therefore responsible for their
+allocation and their deletion.
+
+The dynamic array copies the values upon insertion, thus it is responsible
for their allocation and their deletion.
+The `Vector` data structure supports the following methods:
+
+- `vector_result_t vector_new(size, data_size)`: create a new vector;
+- `vector_result_t vector_push(vector, value)`: add a new value to the vector;
+- `vector_result_t vector_set(vector, index, value)`: update the value of a given index if it exists;
+- `vector_result_t vector_get(vector, index)`: return the value indexed by `index` if it exists;
+- `vector_result_t vector_pop(vector)`: pop last element from the vector following the LIFO policy;
+- `vector_result_t vector_clear(vector)`: logically reset the vector. That is, new pushes
+will overwrite the memory;
+- `vector_result_t vector_destroy(vector)`: delete the vector;
+- `size_t vector_size(vector)`: return vector size (i.e., the number of elements);
+- `size_t vector_capacity(vector)`: return vector capacity (i.e., vector total size).
+
+As you can see, most methods that operates on the `Vector` data type return a custom type called
+`vector_result_t` which is defined as follows:
+
+```c
+typedef enum {
+ VECTOR_OK = 0x0,
+ VECTOR_ERR_ALLOCATE,
+ VECTOR_ERR_OVERFLOW,
+ VECTOR_ERR_UNDERFLOW,
+ VECTOR_ERR_INVALID
+} vector_status_t;
+
+typedef struct {
+ vector_status_t status;
+ uint8_t message[RESULT_MSG_SIZE];
+ union {
+ vector_t *vector;
+ void *element;
+ } value;
+} vector_result_t;
+```
+
+Each method that returns such type indicates whether the operation was successful or not by
+setting the `status` field and by providing a descriptive message on the `message` field.
+Just like for the `Map` data structure, if the operation was successful
+(that is, `status == VECTOR_OK`), you can either move on with the rest of the program
+or read the returned value from the sum data type.
+
## Unit tests
Datum provides some unit tests for both the `Vector` and the `Map` data types. To run them, you can issue the following commands:
@@ -124,4 +252,4 @@ $ ./test_map
## License
This library is released under the GPLv3 license. You can find a copy of the license with this repository or by visiting
-[the following link](https://choosealicense.com/licenses/gpl-3.0/).
\ No newline at end of file
+[the following link](https://choosealicense.com/licenses/gpl-3.0/).