Add url codec, a.k.a. urlencode/urldecode #7

Open
jpetso opened this issue Jan 17, 2016 · 1 comment
Comments

@jpetso
Collaborator

jpetso commented Jan 17, 2016

Since cppcodec aims for implementations of common encodings, URL encoding seems like a good candidate to add to the pool.

Implementing it will require a few changes to the codec template, because unlike base-N encodings, URL encoding results in variable length encoded strings (depending on the number of escaped characters).

That means encoded_size() can no longer promise the exact number of encoded result bytes without looking at the actual string. The number of encoded bytes can be anywhere between the original size (nothing escaped) and three times the original size (every byte escaped as %XX). These bounds should probably be represented by an encoded_min_size() and encoded_max_size(), but neither is a good choice for allocating the result, since the real length will usually fall somewhere in between.
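To make the bounds concrete, here is a minimal sketch of what such size helpers could look like. The function names follow the wording above but are hypothetical, not an existing cppcodec API, and the set of unescaped characters is assumed to be the RFC 3986 "unreserved" set.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers, named after the proposal; not part of cppcodec.
// Lower bound: no byte needs escaping.
constexpr std::size_t encoded_min_size(std::size_t raw_size) noexcept {
    return raw_size;
}

// Upper bound: every byte becomes "%XX".
constexpr std::size_t encoded_max_size(std::size_t raw_size) noexcept {
    return raw_size * 3;
}

// Assumption: only RFC 3986 unreserved characters are left unescaped.
constexpr bool is_unreserved(unsigned char c) noexcept {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

// The exact size is only available after scanning the input once.
inline std::size_t encoded_exact_size(const std::string& raw) noexcept {
    std::size_t n = 0;
    for (unsigned char c : raw) {
        n += is_unreserved(c) ? 1 : 3;
    }
    return n;
}
```

An exact-size pass like encoded_exact_size() would keep the pre-allocation approach working, at the cost of reading the input twice.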

Having an unknown output size also means we won't be able to use the "pre-allocation & write to data pointer" optimization that mutable data pointer types make use of right now. Using push_back() instead of direct data access for variable-length encodings should be fixable with a bit of thought, though.
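As a rough illustration of the push_back() route, the sketch below appends to any result type that offers push_back(char). The function name url_encode_append is made up for this example, and the escaping rule (RFC 3986 unreserved characters pass through, everything else becomes %XX with uppercase hex) is an assumption rather than a decided design.

```cpp
#include <string>

// Assumption: only RFC 3986 unreserved characters are left unescaped.
inline bool url_passthrough(unsigned char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

// Hypothetical free function, not cppcodec API: appends the encoded form
// of `raw` to `out` using push_back() only, so no size is needed up front.
template <typename Result>
void url_encode_append(Result& out, const std::string& raw) {
    static const char hex[] = "0123456789ABCDEF";
    for (unsigned char c : raw) {
        if (url_passthrough(c)) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back('%');
            out.push_back(hex[c >> 4]);    // high nibble
            out.push_back(hex[c & 0x0F]);  // low nibble
        }
    }
}
```

Because the result grows on demand, this works with std::string or std::vector<char>, but not with raw pre-sized buffers, which is the trade-off described above.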

Maybe we can have something like an "average encoding ratio" for variable-length codecs like URL encoding, ideally defaulted by the library but customizable by the user. To preserve the same consistent encode()/decode() API that's used for base-N codecs, it might work better to make the average encoding ratio a template argument to the url codec class.
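One possible shape for this, sketched under several assumptions: the ratio is passed as a std::ratio type (C++11 has no floating-point template arguments), the default of 3/2 is an arbitrary placeholder, and the class name url_codec is hypothetical. The estimate is only used as a reserve() hint, so a wrong guess costs a reallocation, never correctness.

```cpp
#include <cstddef>
#include <ratio>
#include <string>

// Hypothetical class; neither the name nor the default ratio comes from cppcodec.
template <typename AvgRatio = std::ratio<3, 2>>
class url_codec {
public:
    // Guess at the encoded size, used only as a reserve() hint.
    // (Overflow for extremely large inputs is ignored in this sketch.)
    static constexpr std::size_t estimated_size(std::size_t raw_size) noexcept {
        return raw_size * static_cast<std::size_t>(AvgRatio::num)
                        / static_cast<std::size_t>(AvgRatio::den);
    }

    // Same call shape as the base-N codecs: raw bytes in, encoded string out.
    static std::string encode(const std::string& raw) {
        static const char hex[] = "0123456789ABCDEF";
        std::string out;
        out.reserve(estimated_size(raw.size()));
        for (unsigned char c : raw) {
            if (keep_as_is(c)) {
                out.push_back(static_cast<char>(c));
            } else {
                out.push_back('%');
                out.push_back(hex[c >> 4]);
                out.push_back(hex[c & 0x0F]);
            }
        }
        return out;
    }

private:
    // Assumption: only RFC 3986 unreserved characters are left unescaped.
    static constexpr bool keep_as_is(unsigned char c) noexcept {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
               (c >= '0' && c <= '9') ||
               c == '-' || c == '.' || c == '_' || c == '~';
    }
};
```

A caller who knows the input is mostly escaped could write url_codec<std::ratio<3, 1>>::encode(data) to reserve the worst case up front, while the default stays a single url_codec<>::encode(data) call.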

Given the difference in processing, a class structure entirely separate from base-N codecs might be necessary, similar to the planned baseN_num codecs which need a different encode()/decode() API as well.

@GTValentine
Contributor

👍
