Abstract Data Types: Associative Arrays
Home >
Associative Array Abstract Data Type
What is an associative array?
An associative-array (or map) is a sequential abstract data type in which entries are key-value pairs (instead of being just values) where following rules have to be obeyed:
- any entry value is accessible by its entry key
- a single value can exist at multiple keys
- a single key can only hold a single value
How values are added by key to associative array (map):
What are the operations an associative array requires?
Although, as an abstract data type, associative array comes with no implementation, it requires from data structures that implement it to provide support
for a minimal set of operations:
| Operation |
Arguments |
Returns |
Description |
| clear |
|
void |
Clears associative array of all values. |
| containsKey |
KEY |
bool |
Checks if key exists in associative array. |
| containsValue |
VALUE, COMPARATOR |
bool |
Checks if value exists in associative array. This potentially requires iterating the whole map and applying comparator to see if value matches. AVOID unless map is small! |
| get |
KEY |
VALUE |
Gets value by key. |
| set |
KEY, VALUE |
void |
Sets value by key. |
| removeKey |
KEY |
void |
Removes element by key. |
| removeValue |
VALUE, COMPARATOR |
void |
Removes all elements that match value. This potentially requires iterating the whole list, applying comparator to see if value matches and, every time it matches, deleting entry from map. AVOID unless map is very small! |
| isEmpty |
|
void |
Checks if associative array is empty |
| size |
|
ULONG |
Gets number of entries in associative array |
Where:
- VALUE: value of an associative array entry
- KEY: key at which value is found in associative array
- COMPARATOR: function with signature compare(source, destination): BOOLEAN able to compare values in associative array
What are the data structures that implement associative array?
Most common implementations of associative array are:
- Hash Table
- Strengths:
- By far the fastest in insertion/lookup/deletion of entries: because hash table only needs to compute a hash of entry key and look up in bucket (List at position @ List) that corresponds to hash to add/delete/retrieve that entry. Because bucket ideally contains one entry only, this operation is considered to occur in constant O(1) time.
- By far the smallest in memory consumption: only a List of List.
- Weaknesses:
- Unpredictable iteration, which makes it hard to test.
- It depends on:
- a fast hash function that guarantees even distribution throughout buckets, so that the List inside ideally contains only one entry.
- a reliable comparator function that guarantees there won't be duplicates with same value of key in the map (which would violate above-mentioned map principles)
- Linked Hash Table
- Strengths:
- Almost as fast, just a bit slower (5-10%) than Hash Table on insertion/lookup/deletion of entries: because it needs to maintain the doubly linked list that preserves insertion order as well.
- Predictable iteration, which makes testing easy
- Weaknesses:
- Greater memory consumption (33%) than Hash Table, because it needs to store the doubly linked list that maintains insertion order.
- Same dependency on a good hash/comparator function.
- Red Black Tree
- Strengths:
- Stays ordered by key.
- Predictable iteration, which makes testing easy
- Weaknesses:
- Much slower (6-7 times) than Hash Table in insertion/deletion of entries: because the red black tree always has to rebalance itself, which is a VERY costly operation that may involve the whole data set.
- Greater memory consumption (33%) than Hash Table, because the struct that wraps each value requires more information.
- Depends on a reliable key comparator function.
This time complexity table using Big O Notation
shows advantages and disadvantages of each data structure chosen for each associative array operation:
| Operation |
Hash Table |
Linked Hash Table |
Red Black Tree |
| clear |
O(N) |
O(N) |
O(N) |
| containsKey |
O(1) |
O(1) |
O(log(N)) |
| containsValue |
O(N) |
O(N) |
O(N) |
| get |
O(1) |
O(1) |
O(log(N)) |
| set |
O(1) |
O(1) |
O(log(N)) |
| removeKey |
O(1) |
O(1) |
O(log(N)) |
| removeValue |
O(N) |
O(N) |
O(N) |
| isEmpty |
O(1) |
O(1) |
O(1) |
| size |
O(1) |
O(1) |
O(1) |
Conclusion: Hash Table, with its speed/memory efficiency advantages, should be solution of choice unless one specifically desires structure to be predictably iterated or self-ordered!