Commit

Reviewed type_hash; added Copy bound to ZeroCopy
vigna committed Nov 6, 2023
1 parent 75fc5c3 commit 469fd3b
Showing 11 changed files with 29 additions and 25 deletions.
27 changes: 16 additions & 11 deletions README.md
@@ -2,6 +2,11 @@

ε-serde is a Rust framework for *ε*-copy *ser*ialization and *de*serialization.

## WARNING

With release 0.2.0 we had to change the type hash function, so all serialized data
must be regenerated. We apologize for the inconvenience.

## Why

Large immutable data structures need time to be deserialized using the [serde](https://serde.rs/)
@@ -26,7 +31,7 @@ at deserialization time one can build quickly a proper Rust structure whose refe
memory, however, is not copied. We call this approach *ε-copy deserialization*, as
typically a minuscule fraction of the serialized data is copied to build the structure.
The result is similar to that of the frameworks above, but the performance of the
deserialized structure will be identical to that of a standard, in-memory
deserialized structure will be identical to that of a standard, in-memory
Rust structure, as references are resolved at deserialization time.

We provide procedural macros implementing serialization and deserialization methods,
@@ -135,9 +140,8 @@ memory-mapped region that supports it.
## Examples: ε-copy of standard structures

Zero-copy deserialization is not that interesting because it can be applied only to
data whose memory layout and size is fixed and known at compile time.
This time, let us serialize a `Vec` containing a
a thousand zeros: ε-serde will deserialize its associated
data whose memory layout and size are fixed and known at compile time.
This time, let us serialize a `Vec` containing a thousand zeros: ε-serde will deserialize its associated
deserialization type, which is a reference to a slice.
```rust
use epserde::prelude::*;
@@ -171,7 +175,7 @@ Note how we serialize a vector, but we deserialize a reference
to a slice; the same would happen when serializing a boxed slice.
The reference points inside `b`, so there is very little
copy performed (in fact, just a field containing the length of the slice).
All this is due to the fact that `usize` is a zero-copy type.
All this is because `usize` is a zero-copy type.
Note also that we use the convenience method [`Deserialize::load_full`](`deser::Deserialize::load_full`).

If your code must work both with the original and the deserialized
@@ -180,7 +184,7 @@ by both types, such as `AsRef<[usize]>`.

## Example: Zero-copy structures

You can define your own types to be zero-copy, in which case they will
You can define your types to be zero-copy, in which case they will
work like `usize` in the previous examples. This requires the structure
to be made of zero-copy fields, and to be annotated with `#[zero_copy]`
and `#[repr(C)]`:
@@ -221,7 +225,7 @@ let u: MemCase<&[Data]> =
<Vec<Data>>::mmap(&file, Flags::empty()).unwrap();
assert_eq!(s, **u);
```
If a structure is not zero-copy, vectors will be always deserialized to vectors.
If a structure is not zero-copy, vectors will be always deserialized into vectors.

## Example: Structures with parameters

@@ -366,8 +370,8 @@ upon deserialization;
There is no constraint on the associated deserialization type: it can be literally
anything. In general, however, one tries to have a deserialization type that is somewhat
compatible with the original type: for example, ε-serde deserializes vectors as
references to slices, so all mutation method that do not change the length work on both.
And in general [`ZeroCopy`](traits::ZeroCopy) types deserialize to themselves.
references to slices, so all mutation methods that do not change the length work on both.
And in general [`ZeroCopy`](traits::ZeroCopy) types deserialize into themselves.

Being [`ZeroCopy`](traits::ZeroCopy) or [`DeepCopy`](traits::DeepCopy) decides
instead how the type will be treated
@@ -402,12 +406,13 @@ The basic idea in ε-serde is that *if a field has a type that is a parameter, d
the deserialization type is defined recursively, replacement can happen at any depth level. For example,
a field of type `A = Vec<Vec<Vec<usize>>>` will be deserialized as a `A = Vec<Vec<&[usize]>>`.

This approach makes it possible to write ε-serde-aware structures that hide completely
This approach makes it possible to write ε-serde-aware structures that hides completely
from the user the substitution. A good example
is the `CompactArray` structure from [`sux-rs`](http://crates.io/sux/), which exposes an array of fields of fixed
bit width using (usually) a `Vec<usize>` as backend. If you have your own struct and one
of the fields is of type `A`, when serializing your struct with `A` equal to `CompactArray<Vec<usize>>`,
upon ε-copy deserialization you will get a version of your struct with `CompactArray<&[usize]>`. All this will happen under the hood because `CompactArray` is ε-serde-aware, and in fact you will not
upon ε-copy deserialization you will get a version of your struct with `CompactArray<&[usize]>`.
All this will happen under the hood because `CompactArray` is ε-serde-aware, and in fact you will not
even notice the difference, because you will access the same methods of `CompactArray` before
and after.

2 changes: 1 addition & 1 deletion epserde/Cargo.toml
@@ -14,7 +14,7 @@ mmap-rs = {version="0.5.0", optional=true}
bitflags = {version="2.3.1", default-features=false}
bytemuck = "1.14.0"
xxhash-rust = {version="0.8.5", default-features=false, features=["xxh3"]}
epserde-derive = {path="../epserde-derive", optional = true} #{ version = "=0.1.2", optional = true }
epserde-derive = { version = "=0.2.0", optional = true }
anyhow = "1.0.75"

[features]
2 changes: 1 addition & 1 deletion epserde/examples/array_inner.rs
@@ -7,7 +7,7 @@

use epserde::prelude::*;

#[derive(Epserde, Debug)]
#[derive(Epserde, Copy, Clone, Debug)]
#[repr(C)]
#[zero_copy]
struct Data {
2 changes: 1 addition & 1 deletion epserde/examples/newtype_zero_copy.rs
@@ -13,7 +13,7 @@
*/
use epserde::prelude::*;

#[derive(Epserde, Debug, PartialEq, Eq, Default, Clone)]
#[derive(Epserde, Copy, Debug, PartialEq, Eq, Default, Clone)]
#[repr(C)]
#[zero_copy]
struct USize {
2 changes: 1 addition & 1 deletion epserde/examples/repr_c.rs
@@ -14,7 +14,7 @@ struct Object<A> {
}

#[repr(C)]
#[derive(Epserde, Debug, PartialEq, Eq, Default, Clone)]
#[derive(Epserde, Debug, PartialEq, Eq, Default, Clone, Copy)]
// We want to use zero-copy deserialization on Point,
// and thus ε-copy deserialization on Vec<Point>, etc.
#[zero_copy]
2 changes: 1 addition & 1 deletion epserde/examples/test_vec_struct.rs
@@ -7,7 +7,7 @@

use epserde::prelude::*;

#[derive(Epserde, Debug, PartialEq, Eq, Default, Clone)]
#[derive(Epserde, Debug, PartialEq, Eq, Default, Clone, Copy)]
#[repr(C)]
#[zero_copy]
struct Data {
2 changes: 1 addition & 1 deletion epserde/src/impls/boxed_slice.rs
@@ -22,7 +22,7 @@ impl<T> CopyType for Box<[T]> {

impl<T: TypeHash> TypeHash for Box<[T]> {
fn type_hash(hasher: &mut impl core::hash::Hasher) {
"[]".hash(hasher);
"Box<[]>".hash(hasher);
T::type_hash(hasher);
}
}
3 changes: 1 addition & 2 deletions epserde/src/impls/prim.rs
@@ -196,9 +196,8 @@ impl<T: ?Sized> CopyType for PhantomData<T> {
impl<T: ?Sized + TypeHash> TypeHash for PhantomData<T> {
#[inline(always)]
fn type_hash(hasher: &mut impl core::hash::Hasher) {
"PhantomData<".hash(hasher);
"PhantomData".hash(hasher);
T::type_hash(hasher);
">".hash(hasher);
}
}

3 changes: 1 addition & 2 deletions epserde/src/impls/slice.rs
@@ -43,9 +43,8 @@ use std::hash::Hash;
impl<T: TypeHash> TypeHash for [T] {
#[inline(always)]
fn type_hash(hasher: &mut impl core::hash::Hasher) {
"[".hash(hasher);
"[]".hash(hasher);
T::type_hash(hasher);
"]".hash(hasher);
}
}
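
Review note: the `type_hash` hunks above (boxed slice, `PhantomData`, slice) all converge on the same shape, namely one tag string for the container followed by the hash of its type parameter. Below is a minimal, self-contained sketch of that shape. The local `TypeHash` trait, the `"usize"` tag, the `hash_of` helper, and the use of `DefaultHasher` are assumptions made for illustration; only the `"[]"` and `"Box<[]>"` tags are taken from the diff.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the crate's `TypeHash` trait: it feeds a
/// description of the *type* (no value is involved) into a hasher.
trait TypeHash {
    fn type_hash(hasher: &mut impl Hasher);
}

impl TypeHash for usize {
    fn type_hash(hasher: &mut impl Hasher) {
        // Assumed tag; the tag used by the crate for primitives is not shown in the diff.
        "usize".hash(hasher);
    }
}

impl<T: TypeHash> TypeHash for [T] {
    fn type_hash(hasher: &mut impl Hasher) {
        // Tag taken from the new version of slice.rs.
        "[]".hash(hasher);
        T::type_hash(hasher);
    }
}

impl<T: TypeHash> TypeHash for Box<[T]> {
    fn type_hash(hasher: &mut impl Hasher) {
        // Tag taken from the new version of boxed_slice.rs.
        "Box<[]>".hash(hasher);
        T::type_hash(hasher);
    }
}

/// Illustrative helper: computes the type hash of `T` with a fresh hasher.
fn hash_of<T: TypeHash + ?Sized>() -> u64 {
    let mut hasher = DefaultHasher::new();
    T::type_hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // A slice and a boxed slice of the same element type use different tags,
    // so their type hashes differ.
    assert_ne!(hash_of::<[usize]>(), hash_of::<Box<[usize]>>());
    // Nesting changes the hash as well.
    assert_ne!(hash_of::<Box<[usize]>>(), hash_of::<Box<[Box<[usize]>]>>());
}
```

Since the tags feed directly into the hash, any change to them changes the hash of every affected type, which is consistent with the README warning that serialized data must be regenerated.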

4 changes: 2 additions & 2 deletions epserde/src/traits/copy_type.rs
@@ -99,8 +99,8 @@ pub trait CopyType: Sized {

/// Marker trait for zero-copy types. You should never implement
/// this trait directly, but rather implement [`CopyType`] with `Copy=Zero`.
pub trait ZeroCopy: CopyType<Copy = Zero> + MaxSizeOf {}
impl<T: CopyType<Copy = Zero> + MaxSizeOf> ZeroCopy for T {}
pub trait ZeroCopy: CopyType<Copy = Zero> + Copy + MaxSizeOf {}
impl<T: CopyType<Copy = Zero> + Copy + MaxSizeOf> ZeroCopy for T {}

/// Marker trait for deep-copy types. You should never implement
/// this trait directly, but rather implement [`CopyType`] with `Copy=Deep`.
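
Review note: the new `Copy` supertrait explains why every `#[zero_copy]` example in this commit gained `Copy` in its derive list. The sketch below mirrors the marker-trait-plus-blanket-impl pattern from the hunk in a simplified form. It is an illustration under assumptions: the `MaxSizeOf` bound is omitted, `Zero` is a local unit struct, and `Point` and `duplicate` are hypothetical names that do not come from the crate.

```rust
/// Simplified stand-in for the crate's `Zero` selector type.
#[allow(dead_code)]
struct Zero;

/// Simplified stand-in for `CopyType`: the associated type selects the copy strategy.
trait CopyType: Sized {
    type Copy;
}

/// Marker trait with the `Copy` bound added by this commit
/// (the `MaxSizeOf` bound is left out of this sketch).
trait ZeroCopy: CopyType<Copy = Zero> + Copy {}

/// Blanket implementation: a type is `ZeroCopy` exactly when it satisfies the bounds.
impl<T: CopyType<Copy = Zero> + Copy> ZeroCopy for T {}

/// Hypothetical zero-copy structure; without `Copy` in the derive list
/// it would no longer satisfy `ZeroCopy`.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C)]
struct Point {
    x: usize,
    y: usize,
}

impl CopyType for Point {
    type Copy = Zero;
}

/// Because `Copy` is a supertrait, generic code bounded only by `ZeroCopy`
/// may copy a value out of a reference.
fn duplicate<T: ZeroCopy>(value: &T) -> T {
    *value
}

fn main() {
    let p = Point { x: 1, y: 2 };
    assert_eq!(duplicate(&p), p);
}
```

A type that implements `CopyType<Copy = Zero>` but not `Copy` (for example, one that derives only `Clone`) would be rejected at compile time wherever a `ZeroCopy` bound is required.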
5 changes: 3 additions & 2 deletions epserde/tests/test_slice.rs
@@ -16,10 +16,11 @@ struct Data<A: PartialEq = usize, const Q: usize = 3> {
}

#[test]
fn test_box_slice_usize() -> Result<()> {
fn test_cheaty_serialize() -> Result<()> {
let a = vec![1, 2, 3, 4];
let s = a.as_slice();
let mut cursor = epserde::new_aligned_cursor();
a.serialize(&mut cursor)?;
s.serialize(&mut cursor)?;
cursor.set_position(0);
let b = <Vec<i32>>::deserialize_full(&mut cursor)?;
assert_eq!(a, b.as_slice());
