Upstream memory and cache fixes. Some book loading optimisations #349
By using LVArray instead of handling it ourselves.
If the document buffer size is too small, ldomTextStorageChunk::ensureUnpacked() can remove itself from the buffer/memory and cause a crash. Includes some additional small code cleanup.
Fix many memory errors detected by gcc -fsanitize=address -fsanitize=undefined -fno-sanitize-recover: heap-buffer-overflow, use-after-free, index out of range, uninitialized value, etc.
The tinyNodeCollection::loadNodeData() fix seems to actually solve a long-standing issue of crashes when moving nodes around the DOM when there is a cache file. This should allow us to remove the ~15 constructs like: if ( getDocument()->hasCacheFile() ) getDocument()->setBoxingWishedButPreventedByCache(); else { ... do the boxing ...}
Switch ldomNode from 'class' to 'struct': they are allocated with alloc()/realloc() and not with the new[] operator, and cleared with memset() - they were a lightweight class with neither polymorphism nor a vtable. So, they can be a 'struct'.
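To illustrate the last point, here is a minimal sketch of why a node type without virtual functions can safely live in memory managed with realloc() and memset(). TinyNode and growNodes are hypothetical names made up for this example, not crengine's actual ldomNode or allocator code:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a lightweight node: plain data, no vtable,
// no user-defined constructor or destructor.
struct TinyNode {
    unsigned handle;
    unsigned parentIndex;
};

// These properties are what make realloc()/memset() acceptable here.
static_assert(std::is_trivially_copyable<TinyNode>::value,
              "node bytes can be moved by realloc()");
static_assert(!std::is_polymorphic<TinyNode>::value,
              "no vtable pointer that memset() could wipe");

// Grow (or create) an array of nodes; new slots are zero-filled.
// Returns nullptr on allocation failure, in which case the caller
// still owns the old array.
TinyNode* growNodes(TinyNode* nodes, std::size_t oldCount, std::size_t newCount) {
    assert(newCount > 0);
    void* p = std::realloc(nodes, newCount * sizeof(TinyNode));
    if (!p)
        return nullptr;
    TinyNode* grown = static_cast<TinyNode*>(p);
    if (newCount > oldCount)
        std::memset(grown + oldCount, 0, (newCount - oldCount) * sizeof(TinyNode));
    return grown;
}
```

If someone later added a virtual method to such a type, the static_assert would fail at compile time instead of the memset() silently corrupting the vtable pointer.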
crengine/src/lvtinydom.cpp
@@ -5107,7 +5107,7 @@ void ldomNode::ensurePseudoElement( bool is_before ) {
         }
     }
     if ( insertChildIndex >= 0 ) {
-        if ( getDocument()->hasCacheFile() ) {
+        if ( false && getDocument()->hasCacheFile() ) { // (20200626: this is no more an issue)
no longer ;-)
(raaaah... I always do that mistake... but it's my signature ! :)
Well, my conscience hit me, and I went checking. And it might not be all good... I'll recheck and fix it tomorrow.
With the previous commit, which seems to solve the crashes we used to get when boxing/removing/moving nodes around when there is a cache file, we might no longer need these hasCacheFile() checks and the _boxingWishedButPreventedByCache flag. (Not fully removing them for now, just in case we notice some other issue and decide this might still be preferred.) We still keep updating _nodeDisplayStyleHash when the CSS display property changes for a node, as a previous boxing might have removed some whitespace-only text node when a node was block, which might have some value when the node becomes inline. So, isBuiltDomStale() will continue to be triggered and propose a full reload, even if a proper boxing is now being made - and the rendering could be totally fine.
If not yet rendered (initial loading with XML parsing), we can ignore some global flags that have not yet produced any effect, so they can possibly be updated between loading and rendering without triggering a drop of all the styles and rend methods set up in the XML loading phase. This is mostly only needed for TextLangMan::getHash(), as the lang can be set by frontend code after the loading phase, once the book language is known from its metadata, before the rendering that will use the language set. With KOReader, this should speed up a book's first load when its language is not yet known and is different from the default or fallback typography language.
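The idea can be sketched roughly as below. This is only an illustration under assumptions: GlobalSettings, its fields and this globalHash signature are hypothetical and do not mirror the actual crengine code; only the principle (leave out of the hash what has not produced any effect yet) comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of global settings; field names are illustrative.
struct GlobalSettings {
    std::uint32_t fontHash;      // already matters during XML loading
    std::uint32_t textLangHash;  // only has an effect once text is rendered
};

// Combine the settings into one hash (FNV-1a style mixing).
// Before the first rendering, the text lang part is left out, so the
// frontend can still change the language without invalidating the
// styles computed while loading.
std::uint32_t globalHash(const GlobalSettings& s, bool rendered) {
    std::uint32_t h = 2166136261u;
    h = (h ^ s.fontHash) * 16777619u;
    if (rendered)
        h = (h ^ s.textLangHash) * 16777619u;
    return h;
}
```

With this shape, two settings snapshots that differ only by language compare equal until the document has been rendered once, and differ afterwards.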
The getUnboxedSibling calls are expensive, and we can cache the resulting index of a node in its parent's children collection as it won't change. We can cache these values in a node's RenderRectAccessor, as it's not yet used at this point. (On some books with a few dozen p:nth-child(N) rules, it can reduce the loading and rendering time from a few minutes to only a few seconds.)
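A simplified sketch of the caching idea, assuming a plain tree where each node knows its parent. Node, cachedIndex and indexInParentCached are hypothetical names; the real code stores the value in the RenderRectAccessor and has to deal with boxing elements, which this sketch ignores:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal DOM node.
struct Node {
    Node* parent = nullptr;
    std::vector<Node*> children;
    int cachedIndex = -1;  // 1-based position among siblings, -1 = unknown
};

// Return the 1-based index of a node among its siblings (what nth-child()
// needs), or -1 for a node without a parent. The first call for any child
// walks the siblings once and fills the cache for all of them, so later
// calls for the same parent are O(1) instead of a new walk each time.
int indexInParentCached(Node* n) {
    assert(n != nullptr);
    if (n->cachedIndex >= 0)
        return n->cachedIndex;
    Node* p = n->parent;
    if (!p)
        return -1;
    for (std::size_t i = 0; i < p->children.size(); ++i)
        p->children[i]->cachedIndex = static_cast<int>(i) + 1;
    return n->cachedIndex;
}
```

Without the cache, checking K nth-child() selectors against N sibling paragraphs costs on the order of K*N*N sibling steps; with it, the walk happens once per parent.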
Not bumping yet. @virxkane @Frenzie @NiLuJe @pkb: crengine/crengine/include/lvref.h Lines 82 to 90 in ff3640d
It's some kind of wrapper object with reference counting that is supposed to delete the object when there are no more references. But sometimes, we use it to get the real pointer, to later get a member, so we get a new reference (to something that already exists), increase the ref count, keep the pointer we want, decrease the ref count (when the thing gets out of scope) - and it should still exist. At least, that's what I'm understanding.
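The pattern described above can be sketched like this. Counted, Ref and readThroughTemporary are hypothetical names written for this illustration; this is not the code from lvref.h, just a minimal intrusive ref count showing why the raw pointer stays valid as long as another reference exists:

```cpp
#include <cassert>

// Hypothetical ref-counted object: the count lives in the object itself.
struct Counted {
    int refCount = 0;
    int value = 0;
};

// Hypothetical minimal reference wrapper.
class Ref {
    Counted* p_;
public:
    explicit Ref(Counted* p) : p_(p) { if (p_) ++p_->refCount; }
    Ref(const Ref& o) : p_(o.p_) { if (p_) ++p_->refCount; }
    Ref& operator=(const Ref&) = delete;  // not needed for this sketch
    ~Ref() { if (p_ && --p_->refCount == 0) delete p_; }
    Counted* get() const { return p_; }
};

// Take a temporary extra reference, keep only the raw pointer, and let
// the temporary go out of scope. The count goes 1 -> 2 -> 1, so nothing
// is deleted, provided 'owner' really holds a reference at that point.
int readThroughTemporary(const Ref& owner) {
    Counted* raw = nullptr;
    {
        Ref tmp(owner);
        raw = tmp.get();
    }  // tmp destroyed here, object still alive thanks to 'owner'
    return raw->value;
}
```

A static analyzer cannot always prove that a second reference exists, so it may explore the path where the temporary was the last one and report a use-after-free on the raw pointer.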
I was reminded by <koreader#349 (comment)> that the current clang-tidy is a few years old. It mainly includes new checks but it might also behave or display more intelligently here and there.
@poire-z If it's a false alarm you can always add something like
Sorry, I have no idea how to fix this bug. From time to time I use the clang analyzer, see tons of errors and warnings, fix one or two problems and... go on to the next task.
No, I fix all the bugs that AddressSanitizer shows me, except memory leaks. Why don't you use ASan? Clang supports it too.
I think this is a real bug, but we just don't create the appropriate conditions for it. A static analyzer must process ALL possible branches of code execution, even those that are unreachable in real life.
Clang's SA can trigger a wild array of false-positives, so if the codepath taken sounds a bit too wild/unrealistic, feel free to ignore it ;). On the other hand, the runtime ASAN stuff (either Clang's or GCC's, Clang's should be a bit further ahead) should be much more detailed, and actually report stuff that actually happened ^^.
Actually running scan-build manually will allow you to spawn the web UI, showing you the exact codepath taken to arrive at that conclusion. (No idea if tidy has a similar behavior, and/or if one is a superset of the other. I've never really used -tidy because I don't do C++, but I have used the SA via scan-build [it's as easy as
Well, I'm not much interested in mastering all these debugging aids :) I'm fine with these analyses on PRs, and with valgrind when I get crashes and I'm really in the dark.
Yes, that's what I always assumed, and as we don't get crashes in real life, I'm fine with that.
That just vaguely looks like it's looping on the same index twice?
If you do re-use the Vector contents, you should definitely avoid dangling pointers by assigning to nullptr after delete. (Said the guy who doesn't do C++, so, err, take that with a grain of salt ;p).
Yeah, I would expect that if I were not advancing in the indexes.
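For what it's worth, a minimal sketch of that "assign nullptr after delete" advice, using a plain std::vector of owning raw pointers rather than crengine's own containers (Item and clearItems are made-up names for illustration):

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type.
struct Item {
    int id;
};

// Delete every element but keep the slots, so the vector can be re-used.
// Resetting each slot to nullptr means a later pass over the same vector
// sees "empty" instead of a dangling pointer, and deleting it again is
// harmless because delete on nullptr is a no-op.
void clearItems(std::vector<Item*>& items) {
    for (Item*& p : items) {
        delete p;
        p = nullptr;
    }
}
```

Calling clearItems() twice on the same vector is then safe, which would not be the case if the slots kept their old addresses.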
Another thing:
So, if out of fear you added a check for a thing not being null, clang will assume it can be null and will take paths with it being NULL in many other places, where you forgot to live in fear.
Yeah, in a for loop, that's always executed at the end of an iteration, so not an issue. It'd be another thing in a while loop, though ;).
Actually, this did silence these warnings :/
Upstream fixes and cleanup from @virxkane , taken from his not-yet-merged https://github.com/virxkane/coolreader/commits/koreader-merge-post branch.
Discussion at #329 (comment).
See individual commit messages for details and comments.
I did not totally remove hasCacheFile()/setBoxingWishedButPreventedByCache() - I'll do that when I'm perfectly sure we won't ever need it.
globalHash: exclude some settings on initial loading
On a first book opening, avoid recomputing styles because the text lang was set to the default of en_US and the book is FR. (This was hardly noticeable on a normal book, but with a crappy book with ugly CSS that takes 40 seconds to be applied, load time being reduced from 80s to 40s is noticeable :)
CSS: optimize pseudoclass nth-child() and friends
(Hope I didn't mess up; I didn't test them all, I just tested nth-child() with the sample below.)
I have a (not really huge) book which has:
And this book HTML does not even have a single node with these itemtiret, itemalpha class names!
But it was still checking all P. So, for the 282nd paragraph in a DIV, it would walk all its previous 281 siblings with the non-recursive unboxing walker, and do the same for each of these 60 selectors!
Loading time before this commit was 108s on the emulator, 200s on my Kobo.
With this commit, it goes down to 4s on the emulator and 6s on my Kobo :)
Just did a test on the emulator with the book mentioned at #276 (comment) (still with 2 of the 3 huge CSS empty), which has a few such pseudoclasses, but only applied to table elements (so, a lot fewer than the P in the previous book). Not such a noticeable gain: 39s + 30s => 36s + 28s.
Before:
After: