increment.decode('utf-8') issue #85

mdlsunny · 2017-08-21T03:47:59Z

Lines 210 and 211 in sortphotos.py:
increment.decode('utf-8') fails because increment is a 4096-byte chunk that can end in the middle of a character.
A UTF-8 character takes 1 to 4 bytes, so increment = os.read(fd, 4096) may cut through a character. I found the issue while sorting 80,000 photos, many of which had Chinese characters in their names. In my case, the error occurred because decode('utf-8') was called on each increment rather than on the complete output.
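
A minimal illustration of the failure described above, not taken from sortphotos.py. The file name is hypothetical, and the chunk boundary is placed at 4 bytes instead of 4096 to keep the example small:

```python
# Illustration only: a multi-byte UTF-8 character split across a chunk boundary.
name = u"\u7167\u7247.jpg"       # hypothetical file name with two Chinese characters
data = name.encode("utf-8")      # each of these two characters takes 3 bytes

# Simulate a read that stops after 4 bytes: the first character is complete,
# but only the first byte of the second character is present.
chunk = data[:4]

try:
    chunk.decode("utf-8")
    split_failed = False
except UnicodeDecodeError:
    # The trailing partial character cannot be decoded on its own.
    split_failed = True

# Decoding the complete byte string works as expected.
full_ok = data.decode("utf-8") == name
```

The same thing happens with a 4096-byte read whenever byte 4096 falls inside a multi-byte character.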

I think the decoding should be done on the full output: do output += increment first, and only then call output.decode('utf-8'). I modified the source code and tested it. It worked.
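
A rough sketch of that change, assuming a loop that reads from a file descriptor until EOF; the real loop in sortphotos.py may terminate differently. The function names and the chunk_size parameter are made up for this example. The second function shows a possible alternative using the standard library's incremental decoder, which buffers a partial character until the next chunk arrives:

```python
import codecs
import os


def read_all_then_decode(fd, chunk_size=4096):
    """Accumulate raw bytes and decode once, after all chunks are read."""
    # The accumulator must be bytes: os.read returns bytes in Python 3.
    output = b""
    while True:
        increment = os.read(fd, chunk_size)
        if not increment:  # empty result means EOF
            break
        output += increment
    return output.decode("utf-8")


def read_with_incremental_decoder(fd, chunk_size=4096):
    """Decode chunk by chunk; the decoder keeps incomplete characters buffered."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = []
    while True:
        increment = os.read(fd, chunk_size)
        if not increment:
            break
        parts.append(decoder.decode(increment))
    # final=True raises an error if the stream ended mid-character.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)
```

The first version is the simplest fix if the whole output is only needed at the end. The incremental version may be preferable if decoded text has to be inspected while reading is still in progress.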

Thank you for your effort to make such a great tool!

andrewning · 2019-07-27T23:52:12Z

Sorry for the delay. That threw an error for me with Python 3. I'm not well versed enough in encoding/decoding to know offhand what the best fix is here.
