# refset - non-owning `HashSet` A hash-set analogue that does not own its data. It can be used to "mark" items without the need to transfer ownership to the map # Example use case ```rust /// Process arguments while ignoring duplicates fn process_args(args: impl IntoIterator) { let mut same= HashRefSet::new(); for argument in args.into_iter() { if !same.insert(argument.as_str()) { // Already processed this input, ignore continue; } //do work... } } ``` # Serialisation support with `serde` crate `HashRefSet` and `HashType` both implement `Serialize` and `Deserialize` from the `serde` crate if the `serde` feature is enabled. By default it is not. # Hashing We use the SHA512 hashing algorithm for the implementation at present. I may implement the ability to choose different types, but as of now I think it is sufficient. # Drawbacks Since the item is not inserted itself, we cannot use `Eq` to double check there was not a hash collision. While the hashing algorithm used (Sha512) is extremely unlikely to produce collisions, especially for small data types, keep in mind that it is not infallible. ## Speed `HashRefSet` is significantly slower than `HashSet`, so `HashSet` should be preferred in most cases. Even when `Clone` is required to insert into `HashSet`, it can be ~10x faster for trivial data structures. `HashRefSet` should be used if `Clone` is either not an option, or `Clone` is a significantly heavy operation on the type you're inserting. | Benchmark | Tests | Result | |--------------------|-----------------------------------------------|-----------------| | owning_strings | Inserts `String` into `HashSet` by cloning | ~4,538 ns/iter | | non_owning_strings | Inserts `str` into `HashRefSet` by reference | ~48,271 ns/iter | | owning_ints | Inserts `u32` into `HashSet` by copy | ~937 ns/iter | | non_owning_ints | Inserts `&u32` into `HashRefSet` by reference | ~31,089 ns/iter | ## When to use over `HashSet` * The type you're inserting needs to be both in the set and moved elsewhere. (see exmaple) * Simply using `Clone` to insert a copy of the item into a `HashSet` is not possible (non-`Clone` type) or is a significantly heavy operation. (see benchmarks) * The fallibility of potential (albeing extremely unlikely) collisions of the SHA512 algorithm is not a concern * You need to insert an unsized type into a `HashSet` ## Smallmap implementation With the `smallmap` feature enabled, the `small` module also provides the same API as `HashRefSet` via `SmallRefMap`. It is backed by `smallmap::Map` instead of `HashSet`, which could potentially have some performance or memory usage impacts, or not. The hashing algorithm and usage is otherwise identical for now, but this may change. ### Benchmarks of `SmallRefMap` Comparing with cloning or copying into `smallmap::Map`. Largely there are the same performance penalties as the above table, with very minor differences. | Benchmark | Tests | Result | |--------------------|------------------------------------------------|-----------------| | owning_strings | Inserts `String` into `SmallMap` by cloning | ~3,096 ns/iter | | non_owning_strings | Inserts `str` into `SmallRefMap` by reference | ~47,302 ns/iter | | owning_ints | Inserts `u32` into `SmallMap` by copy | ~316 ns/iter | | non_owning_ints | Inserts `&u32` into `SmallRefMap` by reference | ~30,046 ns/iter | Each page of the `SmallRefMap` will consume at least 16kb of memory however. This may not be very desireable, but is still an available feature. # License MIT