-
Notifications
You must be signed in to change notification settings - Fork 483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: Producer json & base64 support & better test cases #295
base: master
Are you sure you want to change the base?
Conversation
This would be really helpful to copy messages between topics or brokers! Nice work! Another feature that would be amazingly helpful in this scenario would be to preserve headers. WDYT? |
It's a good idea to be able to keep the headers. I'll have a look at it when I have time. (Haven't used headers before) |
Hi @qiujiangkun I've implemented the header production from JSON over in this branch: https://github.com/masoncj/kafkacat/tree/producer_json_headers However, I'm seeing an issue with this PR (absent any of my code above) where the produced data seems to be corrupted:
Results in some messages like:
Any thoughts on where this corruption might be coming from? |
@masoncj It's probably because |
@qiujiangkun, I was able to identify and fix the source of the corruption I was seeing above. It didn't have anything to do with jq or line breaks ( I used
(I ❤️ valgrind.) I think the issue here is that the yajl parser is allocated and freed in I have a branch here that fixes these issues. It's a bit messy because I have to manage both the buffer and the parser memory pool, so I use a small allocated stub to store both of these references and free them in the message sent callback from librdkafka. I'd welcome stylistic comments here. With these changes valgrind reports no errors from |
Fix JSON produce memory issues
Support headers in JSON production
@edenhill this is proving really useful for me in copying data between topics/servers. For example:
(Note messsage.max.bytes required; see #137 (comment) ) Could we please discuss what it might take to get this merged? Thanks so much. |
This would be very useful! |
This would definitely be a very welcomed feature! |
That'd be a very nice addition but no response for the last 2 years. Is the project somehow on hold? |
@JakkuSakura, very nice work! Thank you. |
I'm seeing malloc assertions when trying to produce more than a few messages using this branch. First I tried using a pipe.
When that failed, I consumed a large number of messages into a file and then tried to produce by reading from stdin, but I'm not able to get more than 9 messages (about 25kB) before I see the same assertion. Consume some messages
Try to produce a few to the topic; this is successful
Try to produce them all with
Search for the first message that fails
Is the 10th message corrupt? No, this is also successful
How much data are we talking about?
Concatenating @masoncj 's test data together three times didn't generate the error. I don't have a reproducible test case I can share (yet). Any ideas on how this could be data-dependent (so I can generate a test case to share)? For those interested in core dump, the binary was built with
|
Thanks for testing and reporting. It could be passing the wrong pointer when re-allocating memory chunk. So that when you process beyond certain bytes, it fails to re-allocate. I'm becoming really busy recently. I will have a look and get it fixed and resolve conflicts when I have time |
Hello. Sorry, could you please continue and resolve outstanding issues in this PR? |
# Conflicts: # Makefile # json.c # kcat.c # tests/0000-unit.sh # tests/0002-delim.sh # tests/helpers.sh
I just merged the code with master branch. however, I don't have test env atm. |
Now option
-J
works in producer mode.-B k
for keys,-B v
for payloads,-B a
for both key and payloads to to be encoded/decoded in base64. base64 works in both stdin and json.I wrote this producer json and base64 support for copying binary topics. There is no lisencing issue, since the new library base64 is in public domain(and I have modified it a bit).
There are still some issues, though. I can't figure out the memory management of this project, so there may be some memery leakage.
I also wrote 2 test cases and beautified other old 2 test cases. I removed the assumption of minimal default partition number of 3.