//! # smallmap
//! A small table map with a byte sized key index.
//!
//! With a key type whose values can all be represented as unique bytes, searching this map is a single index dereference.
//! With keys only a few bytes wide it is still very efficient.
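//!
//! As a rough illustration of the "single index dereference" idea, here is a minimal sketch of a table keyed directly by a byte. `ByteTable` is a hypothetical name used only for this example; it is not part of this crate's API, and the real `Map` additionally chains pages to cope with colliding keys.
//!
//! ```rust
//! /// A hypothetical 256-slot table keyed directly by a byte (illustration only).
//! struct ByteTable<V>([Option<V>; 256]);
//!
//! impl<V> ByteTable<V> {
//!     /// Create a table with every slot empty.
//!     fn new() -> Self {
//!         Self(std::array::from_fn(|_| None))
//!     }
//!
//!     /// Store `value` in the slot for `key`, returning the previous value if any.
//!     fn insert(&mut self, key: u8, value: V) -> Option<V> {
//!         self.0[usize::from(key)].replace(value)
//!     }
//!
//!     /// Look up `key` with a single array index.
//!     fn get(&self, key: u8) -> Option<&V> {
//!         self.0[usize::from(key)].as_ref()
//!     }
//! }
//! ```
//!
//! In this sketch a lookup never hashes or probes: the key byte is the slot index.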
//!
//! ## Usage
//! The API is a subset similar to that of `HashMap`, containing the same `insert`, `get`, and `entry` functions:
//!
//! ```
//! # use smallmap::Map;
//! fn max_char(chars: &str) -> (char, usize)
//! {
//! let mut map = Map::new();
//! for x in chars.chars() {
//! *map.entry(x).or_insert(0usize) += 1;
//! }
//!
//! map.into_iter().max_by_key(|&(_, v)| v).unwrap_or_default()
//! }
//! ```
//!
//! ## Use cases
//! Designed for instances where you want a small map with small key types.
//! Performance greatly outpaces complex hash-based maps in these cases.
//!
//! ### When not to use
//! Generally, don't use this if your keys would collide often when represented in 8 bits; otherwise it might be a faster alternative to hash-based maps. Benchmark it yourself before choosing this crate over `std`'s vectorised map implementations.
#![cfg_attr(nightly, feature(test))]
#![cfg_attr(nightly, feature(drain_filter))]
#![cfg_attr(nightly, feature(const_fn))]
#![cfg_attr(nightly, feature(never_type))]
#[cfg(nightly)] extern crate test;
const MAX: usize = 256;
use std::borrow::Borrow;
pub mod iter;
use iter::*;
pub mod entry;
pub use entry::Entry;
pub mod space;
pub mod primitive;
pub use primitive::Primitive;
mod init;
mod private {
pub trait Sealed{}
}
/// A smallmap set.
///
/// Can be used to quickly insert or remove a key with no associated value, and to check whether a key is present.
///
/// Any map type with a zero-sized value is essentially a set.
pub type Set<T> = Map<T,()>;
/// A helper macro for creating `Map` instances with or without pre-set entries.
///
/// # Create empty map
/// With no parameters this just calls `Map::new()`.
/// ```
/// # use smallmap::*;
/// let map: Map<i32, i32> = smallmap!();
/// let map2: Map<i32, i32> = Map::new();
/// assert_eq!(map, map2);
/// ```
/// # Create with key-value pairs
/// You can specify some entries to pre-insert in the format `{key => value}`.
/// ```
/// # use smallmap::*;
/// let map = smallmap! {
/// {"Key" => 1},
/// {"Key two" => 2},
/// {"Key three" => 3},
/// {"Key four" => 4},
/// };
/// ```
#[macro_export]
macro_rules! smallmap {
() => {
$crate::Map::new()
};
($({$key:expr => $value:expr}),* $(,)?) => {
{
let mut map = $crate::Map::new();
$(
map.insert($key, $value);
)*
map
}
}
}
/// Trait for types that can be used as `Map` keys.
///
/// Implementors should try to minimise collisions by making `collapse` return a value that is as unique as possible, though this is not required.
/// It is automatically implemented for types implementing the `Hash` trait.
/// Simple folding implementations are provided in [`collapse()`](collapse) for byte slices and [`collapse_iter()`](collapse_iter) for byte iterators.
///
/// The default implementation collapses integer types by taking them modulo 256, whereas byte slices are collapsed with an XOR fold. This is not a requirement, though: the programmer is free to implement it however she chooses.
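///
/// The two default strategies described above can be sketched as follows. This is only an illustration: `collapse_int` and `collapse_bytes` are hypothetical names, not functions exported by this crate.
///
/// ```rust
/// /// Collapse an integer by taking it modulo 256 (hypothetical helper).
/// fn collapse_int(value: u32) -> u8 {
///     (value % 256) as u8
/// }
///
/// /// Collapse a byte slice with an XOR fold (hypothetical helper).
/// fn collapse_bytes(bytes: &[u8]) -> u8 {
///     bytes.iter().copied().fold(0u8, |acc, b| acc ^ b)
/// }
/// ```
///
/// Under this sketch `collapse_int(300)` and `collapse_int(44)` both give `44`, so those two keys would collide.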
pub trait Collapse: Eq
{
/// Create the index key for this instance. This is similar in use to `Hash::hash()`.
fn collapse(&self) -> u8;
}
/// A single page in a `Map`. Contains up to 256 key-value entries.
#[repr(transparent)]
pub struct Page<TKey,TValue>([Option<(TKey, TValue)>; MAX]);
mod page_impls;
impl<K,V> Page<K,V>
where K: Collapse
{
/// Create a new blank page
#[cfg(nightly)]
pub const fn new() -> Self
{
Self(init::blank_page())
}
/// Create a new blank page
#[cfg(not(nightly))]
pub fn new() -> Self
{
Self(init::blank_page())
}
/// The number of entries currently in this page
///
/// This count iterates over all slots; if possible, store it in a temporary instead of calling it repeatedly.
pub fn len(&self) -> usize
{
self.0.iter().flatten().count()
}
/// An iterator over all entries currently in this page
pub fn iter(&self) -> PageElements<'_, K,V>
{
PageElements(self.0.iter())
}
/// A mutable iterator over all entries currently in this page
pub fn iter_mut(&mut self) -> PageElementsMut<'_, K,V>
{
PageElementsMut(self.0.iter_mut())
}
fn search<Q: ?Sized>(&self, key: &Q) -> &Option<(K,V)>
where Q: Collapse
{
&self.0[usize::from(key.collapse())]
}
fn search_mut<Q: ?Sized>(&mut self, key: &Q) -> &mut Option<(K,V)>
where Q: Collapse
{
&mut self.0[usize::from(key.collapse())]
}
fn replace(&mut self, k: K, v: V) -> Option<(K,V)>
{
std::mem::replace(&mut self.0[usize::from(k.collapse())], Some((k,v)))
}
}
impl<K: Collapse, V> std::iter::FromIterator<(K, V)> for Map<K,V>
{
fn from_iter<I: IntoIterator<Item=(K, V)>>(iter: I) -> Self
{
//TODO: Optimise this
let mut this = Self::new();
for (key, value) in iter.into_iter()
{
this.insert(key, value);
}
this
}
}
impl<K,V> IntoIterator for Page<K,V>
where K: Collapse
{
type Item= (K,V);
type IntoIter = IntoPageElements<K,V>;
/// Consume this `Page` into an iterator of all values currently in it.
fn into_iter(self) -> Self::IntoIter
{
IntoPageElements(self.0, 0)
}
}
impl<K,V> Default for Page<K,V>
where K: Collapse
{
#[inline]
fn default() -> Self
{
Self::new()
}
}
/// A small hashtable-like map with byte sized key indices.
#[derive(Debug, Clone, PartialEq, Eq, Hash, Default)]
#[cfg_attr(feature="serde", derive(serde::Serialize, serde::Deserialize))]
pub struct Map<TKey, TValue>(Vec<Page<TKey,TValue>>);
impl<K,V> Map<K,V>
where K: Collapse
{
fn new_page(&mut self) -> &mut Page<K,V>
{
let len = self.0.len();
self.0.push(Page::new());
&mut self.0[len]
}
#[inline(always)] fn fuck_entry(&mut self, key: K) -> Option<Entry<'_, K, V>>
{
for page in self.0.iter_mut()
{
let re = page.search_mut(&key);
match re {
Some((ref ok, _)) if key.eq(ok.borrow()) => {
return Some(Entry::Occupied(entry::OccupiedEntry(re)));
},
None => {
return Some(Entry::Vacant(entry::VacantEntry(re, key)));
},
_ => (),
}
}
None
}
/// Get an `Entry` for the `key` that lets you get or insert the value
pub fn entry(&mut self, key: K) -> Entry<'_, K, V>
{
// somehow this is faster than using index, even though here we search twice????? i don't know why but there you go
if let None = self.0.iter()
.filter(|x| x.search(&key).as_ref().and_then(|(k, v)| if k==&key {None} else {Some((k,v))}).is_none())
.next() {
self.new_page();
}
self.fuck_entry(key).unwrap()
}
/// Remove all empty pages from this instance.
pub fn clean(&mut self)
{
#[cfg(nightly)]
self.0.drain_filter(|x| x.len() <1);
#[cfg(not(nightly))]
{
let mut i = 0;
while i != self.0.len() {
if self.0[i].len() <1 {
self.0.remove(i);
} else {
i += 1;
}
}
}
}
/// The number of entries currently in this map
///
/// This is an iterating count over all slots in all current pages; if possible, store it in a temporary instead of calling it repeatedly.
pub fn len(&self) -> usize
{
self.pages().map(Page::len).sum()
}
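/// Is this map empty?
///
/// Checks every page, since an earlier page can be emptied by `remove` while a later page still holds entries.
pub fn is_empty(&self) -> bool
{
self.0.iter().all(|page| page.iter().next().is_none())
}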
/// The number of pages currently in this map
pub fn num_pages(&self) -> usize
{
self.0.len()
}
/// Consume the instance, returning all pages.
pub fn into_pages(self) -> Vec<Page<K,V>>
{
self.0
}
/// An iterator over all pages
pub fn pages(&self) -> Pages<'_, K, V>
{
iter::Pages(self.0.iter())
}
/// A mutable iterator over all pages
pub fn pages_mut(&mut self) -> PagesMut<'_, K, V>
{
iter::PagesMut(self.0.iter_mut())
}
/// An iterator over all elements in the map
pub fn iter(&self) -> Iter<'_, K, V>
{
Iter(None, self.pages())
}
/// A mutable iterator over all elements in the map
pub fn iter_mut(&mut self) -> IterMut<'_, K, V>
{
IterMut(None, self.pages_mut())
}
/// Create a new empty `Map`
pub fn new() -> Self
{
Self(vec![Page::new()])
}
/// Create a new empty `Map` with a specific number of pages pre-allocated
///
/// # Panics
/// Panics if `pages` is 0.
pub fn with_capacity(pages: usize) -> Self
{
#[cold] fn cap_too_low() -> !
{
panic!("Got 0 capacity, this is invalid.")
}
if pages == 0 {
cap_too_low()
}
let mut p = Vec::with_capacity(pages);
p.push(Page::new());
Self(p)
}
/// Get a mutable reference to the value corresponding to this key if it is in the map.
pub fn get_mut<Q: ?Sized>(&mut self, key: &Q) -> Option<&mut V>
where K: Borrow<Q>,
Q: Collapse + Eq
{
for page in self.0.iter_mut()
{
match page.search_mut(key) {
Some((ref ok, ov)) if key.eq(ok.borrow()) => {
return Some(ov);
},
_ => (),
}
}
None
}
/// Check whether the map contains an entry corresponding to this key
#[inline] pub fn contains_key<Q: ?Sized>(&self, key: &Q) -> bool
where K: Borrow<Q>,
Q: Collapse + Eq
{
self.get(key).is_some()
}
/// Get a reference to the value corresponding to this key if it is in the map.
pub fn get<Q: ?Sized>(&self, key: &Q) -> Option<&V>
where K: Borrow<Q>,
Q: Collapse + Eq
{
for page in self.0.iter()
{
match page.search(key) {
Some((ref ok, ov)) if key.eq(ok.borrow()) => {
return Some(ov);
},
_ => (),
}
}
None
}
/// Remove the entry corresponding to this key in the map, returning the value if it was present
pub fn remove<Q: ?Sized>(&mut self, key: &Q) -> Option<V>
where K: Borrow<Q>,
Q: Collapse + Eq
{
for page in self.0.iter_mut()
{
let v = page.search_mut(key);
match v {
Some((ref ok, _)) if key.eq(ok.borrow()) => {
return v.take().map(|(_, v)| v);
},
_ => (),
}
}
None
}
/// Insert a new key-value entry into this map, returning the previous value if it was present
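///
/// The page-chaining strategy used here can be sketched in isolation. This is a simplified illustration, not this crate's implementation: `SketchPage`, `low_byte`, and `paged_insert` are hypothetical names, and the key and value types are fixed to `u16` and `u32` for brevity.
///
/// ```rust
/// /// One page: 256 slots indexed by the collapsed key byte.
/// type SketchPage = [Option<(u16, u32)>; 256];
///
/// /// Hypothetical collapse function: keep only the low byte of the key.
/// fn low_byte(key: u16) -> usize {
///     usize::from(key as u8)
/// }
///
/// /// Insert into the first page whose slot is free or already holds `key`,
/// /// returning the previous value if the key was present.
/// fn paged_insert(pages: &mut Vec<SketchPage>, key: u16, value: u32) -> Option<u32> {
///     for page in pages.iter_mut() {
///         let slot = &mut page[low_byte(key)];
///         match slot {
///             // Same key: overwrite and hand back the old value.
///             Some((k, v)) if *k == key => return Some(std::mem::replace(v, value)),
///             // Free slot: store the entry here.
///             None => {
///                 *slot = Some((key, value));
///                 return None;
///             }
///             // Slot held by a colliding key: try the next page.
///             _ => {}
///         }
///     }
///     // Every existing page collided, so add a fresh page.
///     let mut page: SketchPage = [None; 256];
///     page[low_byte(key)] = Some((key, value));
///     pages.push(page);
///     None
/// }
/// ```
///
/// In this sketch the keys `1` and `257` both collapse to slot `1`, so inserting both leaves the second entry on a second page.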
pub fn insert(&mut self, key: K, value: V) -> Option<V>
{
for page in self.0.iter_mut()
{
match page.search_mut(&key) {
Some((ref ok, ov)) if ok.eq(&key) => {
return Some(std::mem::replace(ov, value));
},
empty @ None => {
return empty.replace((key, value))
.map(|(_, v)| v);
},
_ => (),
}
}
let mut page = Page::new();
page.replace(key, value);
self.0.push(page);
None
}
/// Consume this `Map` by swapping its keys and values around.
pub fn reverse(self) -> Map<V,K>
where V: Collapse
{
let mut output = Map::with_capacity(self.num_pages());
for (k,v) in self.into_iter()
{
output.insert(v, k);
}
output
}
}
impl<K: Collapse, V> IntoIterator for Map<K,V>
{
type Item= (K,V);
type IntoIter = IntoIter<K,V>;
/// Consume this map into an iterator over all currently inserted entries
fn into_iter(self) -> Self::IntoIter
{
IntoIter(None, self.0.into_iter())
}
}
impl<K: Collapse, V> std::iter::Extend<(K,V)> for Map<K,V>
{
fn extend<T: IntoIterator<Item = (K,V)>>(&mut self, iter: T) {
// we can probably optimise this better, right?
for (key, value) in iter.into_iter()
{
self.insert(key,value);
}
}
}
use std::hash::{Hash, Hasher,};
impl<T: Hash+ Eq> Collapse for T
{
fn collapse(&self) -> u8 {
struct CollapseHasher(u8);
macro_rules! hash_type {
($nm:ident, u8) => {
#[inline(always)] fn $nm(&mut self, i: u8)
{
self.0 ^= i;
}
};
($nm:ident, i8) => {
#[inline(always)] fn $nm(&mut self, i: i8)
{
self.0 ^= i as u8;
}
};
($nm:ident, $ty:tt) => {
#[inline] fn $nm(&mut self, i: $ty)
{
self.0 ^= (i % MAX as $ty) as u8;
}
};
}
impl Hasher for CollapseHasher
{
#[inline] fn finish(&self) -> u64
{
self.0 as u64
}
#[inline] fn write(&mut self, buffer: &[u8])
{
self.0 ^= collapse(buffer);
}
hash_type!(write_u8, u8);
hash_type!(write_i8, i8);
hash_type!(write_i16, i16);
hash_type!(write_u16, u16);
hash_type!(write_i32, i32);
hash_type!(write_u32, u32);
hash_type!(write_i64, i64);
hash_type!(write_u64, u64);
hash_type!(write_u128, u128);
hash_type!(write_isize, isize);
hash_type!(write_usize, usize);
}
let mut h = CollapseHasher(0);
self.hash(&mut h);
h.0
}
}
#[cfg(test)]
mod tests;
/// Collapse a slice of bytes with an XOR fold
#[inline] pub fn collapse<T: AsRef<[u8]>>(bytes: T) -> u8
{
bytes.as_ref().iter().copied().fold(0, |a, b| a ^ b)
}
/// Collapse an iterator of bytes with an XOR fold
#[inline] pub fn collapse_iter<T: IntoIterator<Item=u8>>(bytes: T) -> u8
{
bytes.into_iter().fold(0, |a, b| a ^ b)
}