Crate data_encoding
Efficient and customizable data-encoding functions like base64, base32, and hex
This crate provides little-endian ASCII base-conversion encodings for bases of size 2, 4, 8, 16, 32, and 64. It supports:
- padding for streaming
- canonical encodings (e.g. trailing bits are checked)
- in-place encoding and decoding functions
- partial decoding functions (e.g. for error recovery)
- character translation (e.g. for case-insensitivity)
- most and least significant bit-order
- ignoring characters when decoding (e.g. for skipping newlines)
- wrapping the output when encoding
- no-std environments with default-features = false, features = ["alloc"]
- no-alloc environments with default-features = false
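To make the bit-order option more concrete, here is a minimal std-only sketch that does not use this crate. The helper name `encode_base2` and its `msb_first` flag are hypothetical, and the sketch assumes that bit order simply selects which end of each byte is read first.

```rust
/// Hypothetical helper (not part of this crate): renders bytes in base 2,
/// one ASCII symbol per bit, reading each byte in the requested bit order.
fn encode_base2(data: &[u8], msb_first: bool) -> String {
    let mut out = String::with_capacity(data.len() * 8);
    for &byte in data {
        for i in 0..8 {
            // Choose which bit of the byte is emitted next.
            let shift = if msb_first { 7 - i } else { i };
            out.push(if (byte >> shift) & 1 == 1 { '1' } else { '0' });
        }
    }
    out
}
```

With this sketch, the byte 0x05 renders as 00000101 when the most significant bit comes first and as 10100000 when the least significant bit comes first.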
You may use the binary or the website to play around.
Examples
This crate provides predefined encodings as constants. These constants are of type Encoding. This type provides encoding and decoding functions with in-place or allocating variants. Here is an example using the allocating encoding function of BASE64:
use data_encoding::BASE64;
assert_eq!(BASE64.encode(b"Hello world"), "SGVsbG8gd29ybGQ=");
Here is an example using the in-place decoding function of BASE32:
use data_encoding::BASE32;
let input = b"JBSWY3DPEB3W64TMMQ======";
let mut output = vec![0; BASE32.decode_len(input.len()).unwrap()];
let len = BASE32.decode_mut(input, &mut output).unwrap();
assert_eq!(&output[0 .. len], b"Hello world");
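The example above slices the output with the returned length, which suggests that the buffer size is an upper bound rather than the exact decoded size. The following std-only sketch shows the arithmetic one might expect for padded base32, where 8 symbols carry 5 bytes. The function name `base32_padded_decode_len` is hypothetical, and this is not the crate's implementation.

```rust
/// Hypothetical helper (not part of this crate): upper bound on the decoded
/// size of a padded base32 input of `input_len` symbols.
/// Returns `None` when the length is not a whole number of 8-symbol blocks.
fn base32_padded_decode_len(input_len: usize) -> Option<usize> {
    if input_len % 8 != 0 {
        return None;
    }
    // Each full block of 8 symbols (40 bits) yields at most 5 bytes.
    Some(input_len / 8 * 5)
}
```

For the 24-symbol input above, this bound is 15 bytes, while the decoded "Hello world" only occupies 11 of them because the last block is padded.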
You are not limited to the predefined encodings. You may define your own encodings (with the
same correctness and performance properties as the predefined ones) using the Specification
type:
use data_encoding::Specification;
let hex = {
    let mut spec = Specification::new();
    spec.symbols.push_str("0123456789abcdef");
    spec.encoding().unwrap()
};
assert_eq!(hex.encode(b"hello"), "68656c6c6f");
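To illustrate what a 16-symbol specification means, here is a std-only sketch of a lowercase hexadecimal encoder driven by the same symbol string. The name `encode_hex_lower` is hypothetical, and the sketch is only an illustration of the symbol-table idea, not how this crate encodes.

```rust
/// Hypothetical helper (not part of this crate): encodes bytes with a
/// 16-symbol table, most significant nibble first.
fn encode_hex_lower(data: &[u8]) -> String {
    const SYMBOLS: &[u8; 16] = b"0123456789abcdef";
    let mut out = String::with_capacity(data.len() * 2);
    for &byte in data {
        // Each byte is split into two 4-bit values that index the table.
        out.push(SYMBOLS[(byte >> 4) as usize] as char);
        out.push(SYMBOLS[(byte & 0x0f) as usize] as char);
    }
    out
}
```

Calling `encode_hex_lower(b"hello")` gives the same string as the custom encoding above, "68656c6c6f".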
You may use the macro library to define a compile-time custom encoding:
use data_encoding::Encoding;
use data_encoding_macro::new_encoding;
const HEX: Encoding = new_encoding!{
    symbols: "0123456789abcdef",
    translate_from: "ABCDEF",
    translate_to: "abcdef",
};
const BASE64: Encoding = new_encoding!{
    symbols: "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",
    padding: '=',
};
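The translate_from and translate_to fields in the HEX definition map each character of the first string to the character at the same position in the second before decoding. A std-only sketch of that idea follows. The function name `decode_hex_translated` is hypothetical, and the sketch assumes translation is a simple per-character substitution applied ahead of the symbol lookup.

```rust
/// Hypothetical helper (not part of this crate): decodes lowercase hex after
/// translating "ABCDEF" to "abcdef", which makes decoding case-insensitive.
fn decode_hex_translated(input: &str) -> Option<Vec<u8>> {
    const SYMBOLS: &str = "0123456789abcdef";
    const FROM: &str = "ABCDEF";
    const TO: &str = "abcdef";
    if input.len() % 2 != 0 {
        return None;
    }
    let mut values = Vec::with_capacity(input.len());
    for c in input.chars() {
        // Apply the translation first, then look the symbol up.
        let c = match FROM.find(c) {
            Some(i) => TO.as_bytes()[i] as char,
            None => c,
        };
        values.push(SYMBOLS.find(c)? as u8);
    }
    // Combine pairs of 4-bit values into bytes, high nibble first.
    Some(values.chunks(2).map(|p| (p[0] << 4) | p[1]).collect())
}
```

In this sketch, "68656C6C6F" and "68656c6c6f" both decode to the bytes of "hello", while an input containing a character outside the symbols and the translation table is rejected.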
Properties
The HEXUPPER, BASE32, BASE32HEX, BASE64, and BASE64URL predefined encodings conform to RFC 4648.
In general, the encoding and decoding functions satisfy the following properties:
- They are deterministic: their output only depends on their input
- They have no side-effects: they do not modify a hidden mutable state
- They are correct: encoding then decoding gives the initial data
- They are canonical (unless is_canonical returns false): decoding then encoding gives the initial data
This last property is usually not satisfied by base64 implementations. This is a matter of choice, and this crate lets the user choose. Support for canonical encoding as described by the RFC is provided, but it is also possible to disable checking trailing bits, to add character translation, to decode concatenated padded inputs, and to ignore some characters.
Since the RFC specifies the encoding function on all inputs and the decoding function on all
possible encoded outputs, the differences between implementations come from the decoding
function which may be more or less permissive. In this crate, the decoding function of canonical
encodings rejects all inputs that are not a possible output of the encoding function. Here are
some concrete examples of decoding differences between this crate, the base64 crate, and the base64 GNU program:
| Input | data-encoding | base64 | GNU base64 |
|---|---|---|---|
| AAB= | Trailing(2) | Last(2) | \x00\x00 |
| AA\nB= | Length(4) | Length | \x00\x00 |
| AAB | Length(0) | Last(2) | Invalid input |
| AAA | Length(0) | [0, 0] | Invalid input |
| A\rA\nB= | Length(4) | Byte(1) | Invalid input |
| -_\r\n | Symbol(0) | Byte(0) | Invalid input |
| AA==AA== | [0, 0] | Byte(2) | \x00\x00 |
We can summarize these discrepancies as follows:
| Discrepancy | data-encoding | base64 | GNU base64 |
|---|---|---|---|
| Check trailing bits | Yes | Yes | No |
| Ignored characters | None | None | \n |
| Translated characters | None | None | None |
| Check padding | Yes | No | Yes |
| Support concatenated input | Yes | No | Yes |
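The trailing-bits and concatenation rows can be illustrated with a small std-only base64 sketch that does not use this crate. The names `decode_block` and `decode_concatenated` and the `check_trailing` flag are hypothetical. The sketch only reports success or failure and does not reproduce this crate's error kinds or positions.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Looks up the 6-bit value of a base64 symbol.
fn value(symbol: u8) -> Option<u32> {
    ALPHABET.iter().position(|&s| s == symbol).map(|v| v as u32)
}

/// Hypothetical helper: decodes one padded 4-character block. When
/// `check_trailing` is true, non-zero bits that belong to no output byte
/// make the block invalid.
fn decode_block(block: &[u8; 4], check_trailing: bool) -> Option<Vec<u8>> {
    // Number of symbols before the padding starts.
    let n = block.iter().position(|&c| c == b'=').unwrap_or(4);
    if n < 2 || block[n..].iter().any(|&c| c != b'=') {
        return None;
    }
    let mut bits: u32 = 0;
    for &c in &block[..n] {
        bits = (bits << 6) | value(c)?;
    }
    let total = 6 * n as u32; // 12, 18, or 24 bits of input
    let out_len = total / 8; // 1, 2, or 3 output bytes
    let trailing = total % 8; // 4, 2, or 0 leftover bits
    if check_trailing && (bits & ((1 << trailing) - 1)) != 0 {
        return None;
    }
    let bits = bits >> trailing;
    Some((0..out_len).rev().map(|i| (bits >> (8 * i)) as u8).collect())
}

/// Hypothetical helper: decodes a concatenation of padded blocks by
/// decoding every 4-character block independently.
fn decode_concatenated(input: &[u8], check_trailing: bool) -> Option<Vec<u8>> {
    if input.len() % 4 != 0 {
        return None;
    }
    let mut out = Vec::new();
    for chunk in input.chunks(4) {
        let block = [chunk[0], chunk[1], chunk[2], chunk[3]];
        out.extend(decode_block(&block, check_trailing)?);
    }
    Some(out)
}
```

In this sketch, "AAB=" is rejected when trailing bits are checked and decodes to two zero bytes when they are not, "AAB" is rejected for its length, and "AA==AA==" decodes to two zero bytes because each padded block is handled on its own.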
This crate allows disabling the trailing-bits check, ignoring some characters, translating characters, and using unpadded encodings. However, for padded encodings, support for concatenated inputs cannot be disabled, because using padding only makes sense when concatenated inputs must be supported.
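Ignoring characters when decoding and wrapping the output when encoding are two sides of the same feature, which the following std-only sketch tries to show. The names `wrap` and `strip_ignored` are hypothetical, and the sketch assumes that the separator terminates every line, including the last one. It is not this crate's implementation.

```rust
/// Hypothetical helper (not part of this crate): splits ASCII encoded text
/// into lines of `width` symbols, each followed by `separator`.
fn wrap(encoded: &str, width: usize, separator: &str) -> String {
    assert!(width > 0, "width must be positive");
    let mut out = String::new();
    for chunk in encoded.as_bytes().chunks(width) {
        // Encoded output is ASCII, so every byte chunk is valid UTF-8.
        out.push_str(std::str::from_utf8(chunk).expect("ASCII input"));
        out.push_str(separator);
    }
    out
}

/// Hypothetical helper: removes every character listed in `ignored`,
/// which is what a decoder that ignores them effectively sees.
fn strip_ignored(input: &str, ignored: &str) -> String {
    input.chars().filter(|c| !ignored.contains(*c)).collect()
}
```

Wrapping "SGVsbG8gd29ybGQ=" at 8 symbols with a newline separator produces two terminated lines, and stripping the newline from that result gives back the original string.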
Structs
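- DecodeError: Decoding error
- DecodePartial: Decoding error with partial result
- Encoding: Base-conversion encoding
- Specification: Base-conversion specification
- SpecificationError: Specification error
- Translate: How to translate characters when decoding
- Wrap: How to wrap the output when encoding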
Enums
- BitOrder: Order in which bits are read from a byte
- DecodeKind: Decoding error kind
Constants
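- BASE32: Padded base32 encoding
- BASE32HEX: Padded base32hex encoding
- BASE32HEX_NOPAD: Unpadded base32hex encoding
- BASE32_DNSCURVE: DNSCurve base32 encoding
- BASE32_DNSSEC: DNSSEC base32 encoding
- BASE32_NOPAD: Unpadded base32 encoding
- BASE64: Padded base64 encoding
- BASE64URL: Padded base64url encoding
- BASE64URL_NOPAD: Unpadded base64url encoding
- BASE64_MIME: MIME base64 encoding
- BASE64_NOPAD: Unpadded base64 encoding
- HEXLOWER: Lowercase hexadecimal encoding
- HEXLOWER_PERMISSIVE: Lowercase hexadecimal encoding with case-insensitive decoding
- HEXUPPER: Uppercase hexadecimal encoding
- HEXUPPER_PERMISSIVE: Uppercase hexadecimal encoding with case-insensitive decoding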