Support for unsigned integer values #99

bmatsuo · 2017-02-07T05:39:39Z

Closes #11

I am opening this now because it is close to done, though it is probably not quite finished. It is a rather large change, so I am letting it bake here for a little bit.

The one thing that differs from the discussion in #11 is that the interface here is called Data, not Value. The word "value" was too overloaded already. Data is pretty precise, and shorter by one character. So I have opted to use it.

edit: The only unfortunate thing about the use of the word Data is its appearance near the beginning of the function index generated by go doc. This is fairly minor, but it does bug me with the number of conversion functions I had to define for working with the output of Txn.GetData and Cursor.GetData.

edit: I have actually stopped using an interface. The base code is the same and I can reintroduce an interface if needed in the future. But the API felt unwieldy and even without the new interfaces the API change is not negligible. But I think the result is quite clean now, and it's a few ns faster than the version that used interfaces.

The initial benchmark results are not impressive. Writing a bunch of size_t values doesn't appear to be significantly faster than encoding uint64 values in big-endian. And, when optimized to eliminate allocations, big-endian byte slices have consistently better performance characteristics.
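For readers unfamiliar with the big-endian approach mentioned above, here is a minimal sketch of allocation-free encoding using only the standard library. The helper names `putUint64BE` and `uint64BE` are made up for illustration and are not part of lmdb-go; the sketch only shows the encoding that the benchmarks compare against, not the benchmark code itself.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// putUint64BE (hypothetical helper) writes v into buf in big-endian order
// and returns a slice over buf. The caller reuses buf across calls, so no
// per-value allocation is needed.
func putUint64BE(buf *[8]byte, v uint64) []byte {
	binary.BigEndian.PutUint64(buf[:], v)
	return buf[:]
}

// uint64BE (hypothetical helper) decodes an 8-byte big-endian value.
// It reports false instead of panicking when the length is wrong.
func uint64BE(b []byte) (uint64, bool) {
	if len(b) != 8 {
		return 0, false
	}
	return binary.BigEndian.Uint64(b), true
}

func main() {
	var buf [8]byte // reused for every value
	for _, v := range []uint64{1, 256, 1 << 40} {
		k := putUint64BE(&buf, v)
		got, ok := uint64BE(k)
		fmt.Printf("%d -> %x -> %d (ok=%v)\n", v, k, got, ok)
	}
}
```

Big-endian encodings of equal width sort bytewise in numeric order, which is why they work as keys under a plain byte comparison.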

It doesn't seem like a good idea to expose the String function because there may be a StringValue implementation of Value in the future. So this type either needs to be implemented preemptively as a stub or omitted for the time being.

This will provide better grouping of topics in godoc. Uint stuff will all be together and Uintptr stuff will all be together. The byte-related Value stuff won't really be together, but that is a tragedy of evolution. Maybe for lmdb-go 2.0 this part of the Value and FixedPage interfaces can be made consistent for all data types.

Instead of using a method named Put, the name Append is used to more closely match what it is doing. This also more transparently exposes slice appending semantics and its quirks (which were already there).
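As an illustration of the slice-appending quirks referred to above, here is a small sketch in plain Go. The function `appendItem` is a hypothetical stand-in for an Append-style method and is not the lmdb-go implementation; it only demonstrates that two appends to the same base slice can share a backing array when spare capacity exists.

```go
package main

import "fmt"

// appendItem is a hypothetical stand-in for an Append-style method.
// Like the built-in append, it returns the extended slice and may or may
// not reuse the backing array of page.
func appendItem(page []byte, item ...byte) []byte {
	return append(page, item...)
}

func main() {
	// base has spare capacity, so both appends write into the same array.
	base := make([]byte, 2, 8)
	a := appendItem(base, 0xAA)
	b := appendItem(base, 0xBB) // overwrites the byte that a[2] refers to
	fmt.Println(a[2] == b[2])

	// full has no spare capacity, so each append copies to a new array.
	full := make([]byte, 2, 2)
	c := appendItem(full, 0x01)
	d := appendItem(full, 0x02)
	fmt.Println(c[2] == d[2])
}
```

The usual way to avoid the first case is to always assign the result back to the variable being extended, as with the built-in `append`.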

There is some additional overhead processing integer Multi variants but I couldn't trust users if I were to make the types raw byte slices.

Using the word Value so much was really confusing because LMDB is a key-value store and the word "value" already has a reserved meaning.

This is where the Data interface is defined as well as the generic BytesData type. This is also where the FixedPage interface is defined because it seems like the most appropriate place, though the generic Multi type has been left in val.go, mostly for historical reasons. It may be moved later.

It's not really clear how those functions should behave, and the data conversion functions should be more helpful overall.

They are faster.

The convention in Go is to differentiate functions based on their type using a name suffix. The previous names did not meet this convention and were inconsistent with other functions in the package that follow it.

It turns out I have been using the official terminology backwards. LMDB associates keys with data. Keys and data are both values (this was a bit of a facepalm given the type names).

Fixes #103. This will effectively deprecate Cursor.PutMulti once merged. It can survive for a while longer. But PutMultiple is a bit simpler, if somewhat slower. In practice, the overhead of interfaces should be dominated by the actual work of inserting items into the database.

I am still working back from a spot where there were way too many panics. A new type, Stride, has been introduced with a method Stride.Multiple which has the same signature as MultipleCUint and MultipleCSizet.
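To illustrate the general idea of returning an error instead of panicking on malformed fixed-size pages, here is a rough sketch. The type `stride` and its method `split` are invented for this example; they are not the Stride type or the Stride.Multiple signature from this change, whose exact form is not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// stride is a hypothetical item size in bytes (NOT the lmdb-go Stride type).
type stride int

var errBadPage = errors.New("page length is not a multiple of the stride")

// split validates page and cuts it into fixed-size items. Invalid input
// produces an error value rather than a panic.
func (s stride) split(page []byte) ([][]byte, error) {
	if s <= 0 {
		return nil, errors.New("stride must be positive")
	}
	n := int(s)
	if len(page)%n != 0 {
		return nil, errBadPage
	}
	items := make([][]byte, 0, len(page)/n)
	for i := 0; i < len(page); i += n {
		items = append(items, page[i:i+n])
	}
	return items, nil
}

func main() {
	items, err := stride(4).split(make([]byte, 8))
	fmt.Println(len(items), err)

	_, err = stride(4).split(make([]byte, 7))
	fmt.Println(err)
}
```

Callers can then decide how to handle a bad page, which is harder to do when the wrapper panics.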

These methods can be useful occasionally. Examples were added to demonstrate when slice syntax can be used instead. Also, this change renames the receiver in CUintValue and CSizetValue methods to be u and z respectively. This will help drive the naming convention that the Value* conversion functions adopted.

bmatsuo added 30 commits March 7, 2016 16:54

lmdb: define a Value interface that will be used to write data

02217ba

lmdb: add benchmarks for storing Uint values

d2bb4ac

lmdb: Txn.GetValue() and Cursor.GetValue() which accept Value arguments

1cb589c

lmdb: benchmarks for GetValue() in the same vein as PutValue()

e77a236

lmdb: fix PutValue benchmarks

5a93cf5

lmdb: GetValue and PutValue functions fall back to []byte counterparts

67f0a6a

Merge branch 'master' into bmatsuo/value-interface

349e708

lmdb: add RawRead variants of the GetValue benchmarks

8318115

Merge branch 'master' into bmatsuo/value-interface

f267f0f

Merge branch 'master' into bmatsuo/value-interface

f58f2a9

Merge branch 'master' into bmatsuo/value-interface

8ede2e7

lmdb: Guard uint overflow detection and add MultiUint[ptr] types

ac0ca5a

lmdb: Cleaner integer Value implementations

6a97330

lmdb: Use fewer interfaces -- expose UintValue and UintptrValue directly

f31f925

lmdb: Refine implementations for UintMulti and UintptrMulti

3811e47

lmdb: Remove commented code from the first Value implementation attempt

79afa63

lmdb: Make Value-based APIs more complete

523d82f

lmdbscan: Add support for lmdb.Value types with methods Set[Next]Value

24c5942

lmdb: Rename element accessors for UintMulti and UintptrMulti. Tests

bf67c1d

lmdb: Add constant UintMax to help checking Uint values

f6fca22

lmdb: Fix Multi.Append and make *Multi implementations consistent

7e2e22f

lmdb: Rename Value interface to Data and rename related types/functions

d056a4d

all: Get tests passing after changes to the Data API

8529146

lmdb: Data conversion functions -- hide GetUint[ptr] functions

2afd0e9

lmdbscan: Add method Scanner.Item to ease integer conversions

60339fb

docs: Top-level discussion of IntegerKey and IntegerDup in lmdb godoc

57c729a

bmatsuo added 8 commits February 11, 2017 02:24

lmdb: Add internal note about the Data interface

0df256c

lmdb: Run benchmarks where interfaces are bypassed completely

c1fd3d8

lmdb: Bypass interfaces for the C.size_t values as well

6c96219

lmdb: Remove Data interface and related functions

aadc434

lmdb: Clean interfaces up a bit

9bcff9d

lmdb: Clean up implementation of Data* functions

8acf1fb

lmdb: Make Multi type names more idiomatic

73067cc

lmdbscan: Remove Set[Next]Data functions after removal of Data interface

3dd1fb5

bmatsuo changed the title ~~Data interface~~ Support for unsigned integer data Feb 11, 2017

lmdb: Add benchmarks for explicitly writing 32-bit unsigned data

0133151

bmatsuo mentioned this pull request Feb 12, 2017

PutMulti does not return the number of items written #103

Open

bmatsuo added 4 commits February 12, 2017 08:02

Merge branch 'master' into bmatsuo/value-interface

721473f

lmdb: Rename data.go to convert.go

a4cdf3e

lmdb: Rename Data to Value again... ugh

80fbc2e

bmatsuo changed the title ~~Support for unsigned integer data~~ Support for unsigned integer values Feb 12, 2017

lmdb: Rename arguments so they are more consistent key/data

9f15f38

bmatsuo force-pushed the bmatsuo/value-interface branch from 6eeba8b to 9f15f38 Compare February 12, 2017 17:04

bmatsuo added 2 commits February 12, 2017 10:38

lmdb: Multi wrappers produce error values to reduce possible panics

69b88e4

Merge branch 'master' into bmatsuo/value-interface

7ff2a98

bmatsuo mentioned this pull request Feb 13, 2017

PutMulti incorrectly panics when passed an empty page #106

Open

bmatsuo added 4 commits February 14, 2017 00:29

docs: indicate that Cursor.PutMulti has a bug if passed invalid values

2e3b8e7

Merge branch 'master' into bmatsuo/value-interface

0caab90

Merge branch 'master' into bmatsuo/value-interface

1aba5ce

bmatsuo mentioned this pull request Feb 15, 2017

lmdb: Support integer flags (MDB_INTEGERKEY/MDB_INTEGERDUP) #11

Open

1 task

bmatsuo added 4 commits February 20, 2017 16:01

lmdb: Uncomment integer example and update calls to the current API

fa49d9e

lmdb: Fix comments referencing old API funcs/types in integers example

2af0755

docs: minor godoc cleanup for integer values

314cbed

docs: clarification about the performance benefits of integer values

79596bf
