refset/README.md

# refset - non-owning `HashSet`

A hash-set analogue that does not own its data.

It can be used to "mark" items without the need to transfer ownership to the map

# Example use case
 ```rust
 /// Process arguments while ignoring duplicates
 fn process_args(args: impl IntoIterator<Item=String>) {
   let mut same=  HashRefSet::new();
   for argument in args.into_iter()
   {
     if !same.insert(argument.as_str()) {
       // Already processed this input, ignore
       continue;
     }
     //do work...
   }
 }
 ```
 # Serialisation support with `serde` crate
 `HashRefSet` and `HashType` both implement `Serialize` and `Deserialize` from the `serde` crate if the `serde` feature is enabled. By default it is not.

 # Hashing
 We use the SHA512 hashing algorithm for the implementation at present.
 I may implement the ability to choose different types, but as of now I think it is sufficient.

 # Drawbacks
 Since the item is not inserted itself, we cannot use `Eq` to double check there was not a hash collision.
 While the hashing algorithm used (Sha512) is extremely unlikely to produce collisions, especially for small data types, keep in mind that it is not infallible.

## Speed
`HashRefSet` is significantly slower than `HashSet`, so `HashSet` should be preferred in most cases.
Even when `Clone` is required to insert into `HashSet`, it can be ~10x faster for trivial data structures.
`HashRefSet` should be used if `Clone` is either not an option, or `Clone` is a significantly heavy operation on the type you're inserting.

| Benchmark          | Tests                                         | Result          |
|--------------------|-----------------------------------------------|-----------------|
| owning_strings     | Inserts `String` into `HashSet` by cloning    | ~4,538 ns/iter  |
| non_owning_strings | Inserts `str` into `HashRefSet` by reference  | ~48,271 ns/iter |
| owning_ints        | Inserts `u32` into `HashSet` by copy          | ~937 ns/iter    |
| non_owning_ints    | Inserts `&u32` into `HashRefSet` by reference | ~31,089 ns/iter |

## When to use over `HashSet`
* The type you're inserting needs to be both in the set and moved elsewhere. (see exmaple)
* Simply using `Clone` to insert a copy of the item into a `HashSet` is not possible (non-`Clone` type) or is a significantly heavy operation. (see benchmarks)
* The fallibility of potential (albeing extremely unlikely) collisions of the SHA512 algorithm is not a concern
* You need to insert an unsized type into a `HashSet`

## Smallmap implementation
With the `smallmap` feature enabled, the `small` module also provides the same API as `HashRefSet` via `SmallRefMap`.
It is backed by `smallmap::Map` instead of `HashSet`, which could potentially have some performance or memory usage impacts, or not.

The hashing algorithm and usage is otherwise identical for now, but this may change.

### Benchmarks of `SmallRefMap`
Comparing with cloning or copying into `smallmap::Map`.

Largely there are the same performance penalties as the above table, with very minor differences.

| Benchmark          | Tests                                          | Result          |
|--------------------|------------------------------------------------|-----------------|
| owning_strings     | Inserts `String` into `SmallMap` by cloning    | ~3,096 ns/iter  |
| non_owning_strings | Inserts `str` into `SmallRefMap` by reference  | ~47,302 ns/iter |
| owning_ints        | Inserts `u32` into `SmallMap` by copy          | ~316 ns/iter    |
| non_owning_ints    | Inserts `&u32` into `SmallRefMap` by reference | ~30,046 ns/iter |

Each page of the `SmallRefMap` will consume at least 16kb of memory however.
This may not be very desireable, but is still an available feature.

# License
MIT