Since cppcodec aims for implementations of common encodings, URL encoding seems like a good candidate to add to the pool.
Implementing it will require a few changes to the codec template because, unlike base-N encodings, URL encoding produces variable-length encoded strings (the length depends on the number of escaped characters).
That means the encoded_size() guarantee of returning the exact number of encoded result bytes, without looking at the actual string, cannot hold anymore. The number of encoded bytes can be anywhere between the original size (nothing escaped) and three times the original size (every byte escaped as %XX). These bounds should probably be represented by an encoded_min_size() and encoded_max_size(), but using either for allocation of result bytes is probably not a great idea: the minimum will usually be too small and the maximum will usually over-allocate.
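To make the bounds concrete, here is a minimal sketch. The free functions and the helper is_unreserved() are hypothetical names for illustration, not existing cppcodec API, and the escaping rule assumes that only the RFC 3986 unreserved characters pass through unescaped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not cppcodec API): true for RFC 3986 "unreserved"
// characters, which this sketch assumes are the only ones left unescaped.
constexpr bool is_unreserved(unsigned char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
        || (c >= '0' && c <= '9')
        || c == '-' || c == '.' || c == '_' || c == '~';
}

// Lower bound: no byte needs escaping.
constexpr std::size_t encoded_min_size(std::size_t binary_size) {
    return binary_size;
}

// Upper bound: every byte becomes "%XX".
constexpr std::size_t encoded_max_size(std::size_t binary_size) {
    return binary_size * 3;
}

// The exact size is only known after a pass over the actual input.
inline std::size_t encoded_exact_size(const std::string& in) {
    std::size_t n = 0;
    for (unsigned char c : in) {
        n += is_unreserved(c) ? 1 : 3;
    }
    return n;
}
```

For the input "a b", the bounds are 3 and 9 while the exact size is 5, which shows how far off both bounds can be for typical input.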
Having an unknown output size also means we won't be able to use the "pre-allocation & write to data pointer" optimization that mutable data pointer types make use of right now. Falling back to push_back() instead of direct data access for variable-length encodings should be doable with a bit of thought, though.
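A push_back()-based encoder could look roughly like the following sketch. The function name url_encode_append() is made up for illustration and is not how cppcodec's result-type handling is structured; the only requirement on Result here is a push_back(char) member, which std::string and std::vector<char> both provide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch (not cppcodec API): appends the URL-encoded form of
// [data, data + size) to any container that offers push_back(char).
template <typename Result>
void url_encode_append(Result& out, const unsigned char* data, std::size_t size) {
    static const char hex[] = "0123456789ABCDEF";
    for (std::size_t i = 0; i < size; ++i) {
        const unsigned char c = data[i];
        // Assumption: only RFC 3986 unreserved characters stay unescaped.
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back('%');
            out.push_back(hex[c >> 4]);    // high nibble
            out.push_back(hex[c & 0x0F]);  // low nibble
        }
    }
}
```

The container grows as needed, so no exact size is required up front, at the cost of possible reallocations along the way.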
Maybe we can have something like an "average encoding ratio" for variable-length codecs like URL encoding, ideally defaulted by the library but customizable by the user. To preserve the same consistent encode()/decode() API that's used for base-N codecs, it might work better to make the average encoding ratio a template argument to the url codec class.
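One possible shape for the ratio-as-template-argument idea is sketched below. The class name url_codec_sketch, the member encoded_size_estimate(), and the default ratio of 3/2 are all assumptions made for illustration; the ratio is passed as numerator and denominator because floating-point template arguments are not available before C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch (not cppcodec API). RatioNum/RatioDen is the assumed
// average encoded-to-original size ratio; the default of 3/2 is arbitrary.
template <std::size_t RatioNum = 3, std::size_t RatioDen = 2>
class url_codec_sketch {
public:
    // Size estimate used for reserve(); rounded up, not an exact size.
    static std::size_t encoded_size_estimate(std::size_t binary_size) {
        return (binary_size * RatioNum + RatioDen - 1) / RatioDen;
    }

    static std::string encode(const std::string& in) {
        static const char hex[] = "0123456789ABCDEF";
        std::string out;
        out.reserve(encoded_size_estimate(in.size()));
        for (unsigned char c : in) {
            if (passes_through(c)) {
                out.push_back(static_cast<char>(c));
            } else {
                out.push_back('%');
                out.push_back(hex[c >> 4]);
                out.push_back(hex[c & 0x0F]);
            }
        }
        return out;
    }

private:
    // Assumption: only RFC 3986 unreserved characters stay unescaped.
    static bool passes_through(unsigned char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
               (c >= '0' && c <= '9') ||
               c == '-' || c == '.' || c == '_' || c == '~';
    }
};
```

With this layout the call site stays at url_codec_sketch<>::encode(input), and a user who expects heavily escaped input could pick url_codec_sketch<3, 1> to reserve the worst case up front.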
Given the difference in processing, a class structure entirely separate from base-N codecs might be necessary, similar to the planned baseN_num codecs which need a different encode()/decode() API as well.