You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

258 lines
19 KiB

* libkhash - kana-hash
Kana mnemonic hashes
** Example output
Input is =uguu~= using the default salt.
| Algorithm | Output |
|--------------------+------------------------------------------------------------------|
| SHA256 | おシソまツアでぅせヅモァだゅノぴヲろヂォセづマふげぁユねハァがゅ |
| CRC32 | わほヂァ |
| CRC64 | づやワえほぢレご |
| SHA256 (truncated) | おシソまツアでぅ |
** Installation
The dynamic library is built with ~Cargo~ and ~Rust~, and the CLI example program is built with ~gcc~.
*** Build and install
The default build configuration builds both the dynamic library, static library, and the CLI example program and copies them to =/usr/local/lib/libkhash.so=, =/usr/local/lib/libkhash.a= and =/usr/local/bin/kana-hash= respectively. Also installed is the C header to =/usr/local/include/khash.h=.
#+BEGIN_SRC shell
$ make && sudo make install
#+END_SRC
The install path can be changed by editing the ~INSTALL~, ~INSTALL-BIN~ and ~INSTALL-INCLUDE~ paths in the [[file:./Makefile][Makefile]].
*** Uninstall
To remove installed binaries, run:
#+BEGIN_SRC shell
$ sudo make uninstall
#+END_SRC
*** Other build configurations
The Makefile contains some other build directives.
**** Native code optimisations
By default =libkhash= builds the shared and static library with native architecture optimisations enabled.
If you are intending to move the libraries to another architecture, this might not be desireable.
To build without such optimisations, run:
#+BEGIN_SRC shell
$ make khash-nonative
#+END_SRC
**** Tests
To build and run all tests, run:
#+BEGIN_SRC shell
$ make test
#+END_SRC
**** Building the CLI
The default =make= directive builds both the library and the CLI example program.
To build just the CLI example program, run:
#+BEGIN_SRC shell
$ cd cli && make
#+END_SRC
** TODO Rust crate
This library is written in Rust and has a Rust library target. See Rustdocs for details
** C header
A header file is provided for C programs wanting to use the khash interface.
Documented more fully in [[file:./include/khash.h][./include/khash.h]].
All symbols defined here begin with either =KHASH_= (for macros) or =khash_=.
*** Example
To create a context
#+BEGIN_SRC c
#include <khash.h>
const char* input_salt = "salt!";
const char* input_data = "some data to hash".
khash_context ctx;
assert(khash_new_context(KHASH_ALGO_SHA256, KHASH_SALT_TYPE_SPECIFIC, input_salt, strlen(input_salt), &ctx) == KHASH_SUCCESS, "khash_new_context() failed.");
#+END_SRC
Find the buffer length we need and allocate a buffer.
#+BEGIN_SRC c
size_t length;
assert(khash_length(&ctx, input_data, strlen(input_data), &length) == KHASH_SUCCESS, "khash_length() failed.");
#+END_SRC
Create the buffer and hash, then print the result to ~stdout~.
#+BEGIN_SRC c
char* buffer = alloca(length+1);
assert(khash_do(&ctx, input_data, strlen(input_data), buffer, length) == KHASH_SUCCESS, "khash_do() failed.");
buffer[length+1] = 0; // Ensure we have a NUL terminator.
setlocale(LC_ALL, ""); //Ensure we can print UTF-8.
printf("Kana hash: %s\n", buffer);
#+END_SRC
Alternatively, we can allocate the max length needed instead of calculating it.
#+BEGIN_SRC c
size_t length;
assert(khash_max_length(KHASH_ALGO_SHA256, strlen(input_data), &length) == KHASH_SUCCESS, "khash_max_length() failed.");
char* buffer = alloca(length+1);
memset(buffer, 0, length+1); //Ensure NUL terminators.
#+END_SRC
*** Definitions
**** Macros
All macros defined are for options.
They cannot be combied as flags.
The =KHASH_ALGO_= prefixed ones are for use as the /algo/ parameter in the ~khash_new_context()~ function.
The =KHASH_SALT_TYPE_= prefixed ones are for use as the /salt_type/ parameter.
The =KHASH_ERROR_= prefixed ones each indicate an error code returned by all of the functions.
| Name | Description |
|-------------------------------+--------------------------------------------------------------------------------------------|
| ~KHASH_ALGO_DEFAULT~ | The default algorithm used by the library (truncated SHA256) |
| ~KHASH_ALGO_CRC32~ | CRC32 checksum algorithm |
| ~KHASH_ALGO_CRC64~ | CRC64 checksum algorithm |
| ~KHASH_ALGO_SHA256~ | SHA256 hash algorithm |
| ~KHSAH_ALGO_SHA256_TRUNCATED~ | SHA256 truncated to 64-bits |
| ~KHASH_SALT_TYPE_NONE~ | No salt |
| ~KHASH_SALT_TYPE_DEFAULT~ | The default static salt used by the library |
| ~KHASH_SALT_TYPE_SPECIFIC~ | A provided salt, as the /data/ and of the /size/ parameter passed to ~khash_new_context()~ |
| ~KHASH_SALT_TYPE_RANDOM~ | A randomly generated salt |
| ~KHASH_SUCCESS~ | The code returned by all of the functions when the operation was successful |
| ~KHASH_ERROR_IO~ | There was an IO error |
| ~KHASH_ERROR_FORMAT~ | The was a text formatting related error |
| ~KHASH_ERROR_LENGTH~ | There was a hash length mismatch |
| ~KHASH_ERROR_RNG~ | The random number generator failed |
| ~KHASH_ERROR_UNKNOWN~ | There was an unknown error or the stack attempted to unwind past the FFI boundary. |
**** Types
There are 2 exported structs, although you will rarely need to access their members directly.
| Name | Field | Description |
|-----------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------|
| ~khash_salt~ | | A salt allocated into a context by ~khash_new_context()~ and released by ~khash_free_context()~. You shouldn't mess with its field directly. |
| | /salt_type/ | The type of the salt. |
| | /size/ | The size of the salt. |
| | /body/ | A pointer to the body of the salt. (The memory allocated here is not guaranteed to be of the provided /size/.) |
|-----------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------|
| ~khash_context~ | | A context for the =khash_= functions. Allocated by ~khash_new_context()~. You can modify its fields if you want. |
| | /algo/ | The algorithm for this context. |
| | /flags/ | Placeholder for potential flags added in the future. Currently unused. |
| | /salt/ | The allocated salt. You shouldn't directly mess with this field. |
**** Functions
All defined functions return either ~KHASH_SUCCESS~ or one of the =KHASH_ERROR_= values [[Macros][above]].
| Name | Parameters | Description |
|-----------------------+------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| ~khash_new_context~ | /algo/, /salt_type/, /data/, /size/, /output/ | Creates a new context for use with other =libkhash= functions. /algo/ is expected to be one of the =KHASH_ALGO_= macros listed [[Macros][above.]] Likewise /salt_type/ is expected to be one of the =KHASH_SALT_TYPE_= macros. /data/ can be ~NULL~ unless /salt_type/ is set to ~KHASH_SALT_TYPE_SPECIFIC~, in which exactly /size/ bytes are read from /data/. /output/ is expected to be a valid pointer to a currently unused `khash_context` structure. |
| ~khash_free_context~ | /ctx/ | Free a context allocated with ~khash_new_context()~. /ctx/ is expected to be a valid pointer to a currently allocated context. |
| ~khash_clone_context~ | /src/, /dst/ | Clone a context allocated with ~khash_new_context()~ into another. The newly allocated /dst/ must be properly released (with ~khash_free_context()~ or ~khash_do()~) as well as the source. /src/ is expected to be a valid pointer to an allocated context, and /dst/ is expected to be a valid pointer to an unallocated context. |
| ~khash_length~ | /ctx/, /data/, /size/, /length/ | Compute the length required to hold the output string for ~khash_do()~ for a given input. Will read exactly /size/ bytes from /data/ and compute the value into what is pointed to by /length/ (which is expected to be a valid pointer to a type of ~size_t~.) The resulting length does not include a =NUL= terminator for the string. |
| ~khash_do~ | /ctx/, /data/, /size/, /output/, /output_size/ | Compute the kana-hash of /size/ bytes from /data/ and store no more than /output_size/ of the the result into the string pointed to by /output/. Each pointer is expected to be valid. This function frees the supplied /ctx/ after the hash has been computed, and thus /ctx/ is no longer valid afterwards. |
| ~khash_max_length~ | /algo/, /input_len/, /output_len/ | Calculate the max possible size for the given algorithm (expected to be one of the =KHASH_ALGO_= macros) and input length, and store this result in /output_len/ (expected to be a valid non-~NULL~ pointer.) /input_len/ is not required unless the algorithm is dynamically sized (all currently implemented ones are not.) |
** Node FFI bindings
NPM package in [[file:./node/index.js][./node]]
*** Installation (npm)
Follow the [[installation]] section first.
#+BEGIN_SRC shell
$ npm install --save /path/to/repo/node
#+END_SRC
*** Examples
**** Import the package
#+BEGIN_SRC javascript
const hash = require('kana-hash');
#+END_SRC
**** Create a context
Create the context by specifying an algorithm identifier, and an optional salt.
If provided, the salt must be of type ~Salt~.
#+BEGIN_SRC javascript
const ctx = new hash.Kana(hash.Kana.ALGO_DEFAULT, new hash.Salt("optional salt~"));
#+END_SRC
**** Create a hash
The ~once()~ function consumes the context and outputs a hash string.
#+BEGIN_SRC javascript
const output = ctx.once("input string");
#+END_SRC
***** Creating a hash without consuming
If you want to reuse the context, use the ~hash()~ function.
#+BEGIN_SRC javascript
const output = ctx.hash("input string");
#+END_SRC
***** Freeing the context
The context must be release after use if you have not called ~once()~.
#+BEGIN_SRC javascript
ctx.finish();
#+END_SRC
***** Cloning an existing context
The new context must also be freed with either ~once()~ or ~finish()~.
#+BEGIN_SRC javascript
const new_ctx = ctx.clone();
#+END_SRC
**** Alternatively
To create a hash in one line you can do one of the following.
#+BEGIN_SRC javascript
const hash1 = new Kana(Kana.ALGO_DEFAULT, new Salt("salt~")).once("input string~"); //Allocates the exact space required for the output string.
const hash2 = Kana.single(Kana.ALGO_DEFAULT, new Salt("salt~"), "input string~"); //Allocates the max space required for the output string, instead of the exact. Might be faster.
#+END_SRC
*** Interface documentation
The 2 exported objects are ~Kana~ and ~Salt~. ~Kana~'s constructor expects between 0 and 2 arguments. + The first is either an [[Defined constants][algorithm definition]] or empty, if empty ~Kana~ uses the default algorithm (truncated SHA256).
+ The second is either an instance of ~Salt~ or empty, if empty ~Kana~ uses the default library salt. ~Salt~'s constructor expects 0 or 1 argument. + Either a string to use as the specific salt or empty, if empty there is no salt.
~Kana~ also has a static function ~single(algo, salt, input)~ which automatically creates a context, computes the hash, frees the context, and then returns the computed hash as a JavaScript string.
**** Defined constants
| Name | Type | Description |
|------------------------------+----------------------+--------------------------------------------------------------------------|
| ~Kana.ALGO_DEFAULT~ | Algorithm definition | The default algorithm specified by the library (set to sha256 truncated) |
| ~Kana.ALGO_CRC32~ | Algorithm definition | CRC32 checksum algorithm |
| ~Kana.ALGO_CRC64~ | Algorithm definition | CRC64 checksum algorithm |
| ~Kana.ALGO_SHA256~ | Algorithm definition | SHA256 hashing algorithm |
| ~Kana.ALGO_SHA256_TRUNCATED~ | Algorithm definition | Truncated SHA256 algorithm, to 64-bits |
| ~Salt.None~ | Salt | No salt |
| ~Salt.Random~ | Salt | A cryptographically secure random salt |
| ~Salt.Default~ | Salt | The library's default static salt |
** Notes
The strings generated by this library are meant to be pretty, not secure. It is not a secure way of representing a hash as many collisions are possible.
*** Digest algorithm
The kana algorithm is a 16-bit block digest that works as follows:
- The most and least significant 8 bits are each seperated into /Stage 0/ and /Stage 1/ each operating on the first and second byte respectively.
- Stage 0:
1. The byte is sign tested (bitwise ~AND~ =0x80=), store this as a boolean in /sign0/ (Negative becomes =1=, positive becomes =0=.)
2. The valid first character range is looked up using the result of the sign test (either 0 or 1), store the range in /range/, and the slice ~KANA~ taken from the range in /kana/.
3. The first index is calculated as the unsigned first byte modulo the size (exclusive) of /range/. Store this as /index/.
4. Compute the value of the first byte bitwise ~XOR~ the second byte, store this as /index1/.
5. The swap table is checked to see if /index/ + start of /range/ has an entry. Then each following step is checked in order: + If the swap entry exists and /index1/ bitwise ~AND~ =0x2= is =0=, set the first character of the output to the value found in the swap table.
+ If the swap entry exists and /index1/ bitwise ~AND~ =0x8= is =0= and /index/ + start of /range/ has an entry in the 2nd swap table, set the first character of the output to the value found in the 2nd swap table.
+ In any other case, set the first character of the output to the value found in the /kana/ slice at /index/.
- Stage 1:
1. Compute a sub table for /index/ plus the start of /range/ using the ranges defined in ~KANA_SUB_VALID_FOR~ and store it in /sub/. If there is no sub table possible, skip to step 3.
2. If there is an entry in /sub/ for the index of the 2nd byte modulo the size of ~KANA_SUB~, set the second output character to be that character.
3. If there was no value set from the sub table, the 2nd output character becomes the first output character from inputting the 2nd byte back through /Stage 0/ as the first byte.
- Concatenate both characters and move to the next 16-bit block.
Notes:
- It is valid for a single iterator to produce between 0 and 2 characters but no more.
- If an input given to the algorithm that cannot be divided exactly into 16-bit blocks (i.e. one byte is left over), a padding byte of 0 is added as the 2nd byte to make it fit.
For more information see [[file:./src/mnemonic.rs][mnemonic.rs]] and [[file:./src/map.rs][map.rs]].
** License
GPL'd with love <3