-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lexical-markdown] Bug Fix: support link and inline code text formats #7004
base: main
Are you sure you want to change the base?
Conversation
…most matching applies to both text format and text match transformers, instead of just text format transformers
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
size-limit report 📦
|
Just squeezed in another fix for formatted inline code markdown, e.g. **`code`** Didn't want to open a separate PR, as this one is a relatively large refactor and I wanted to make sure this fix is compatible |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't look too carefully at the logic in the while loop other than to confirm that it should still terminate, I think the unit test coverage is probably sufficient to show that it's not more wrong than the status quo
} else { | ||
return linkContent; | ||
} | ||
? `[${textContent}](${node.getURL()} "${title}")` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably this title should be escaped because there could be embedded "
and/or )
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done - I escaped this with some simple regex.
Should we consider a library like dompurify
to protect against XSS attacks here? While it's just markdown, XSS could still be an issue if (and depending on how) this is rendered to HTML.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reverted the escaping change. This broke a unit test and did not work when the markdown was re-imported to lexical.
The latter is a separate issue I have experienced, where escaped markdown is not un-escaped properly when imported. E.g. \*text\*
is imported as<span> \*text\*</span>
instead of <span>*text*</span>
I think fixing this properly may not be trivial and is out of scope for this PR anyways, since this PR did not introduce this unescaped title being outputted
result.nodeAfter && | ||
$isTextNode(result.nodeAfter) && | ||
!result.nodeAfter.hasFormat('code') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With how often this expression is repeated it's probably worth making a function to cover $isTextNode(node) && !node.hasFormat('code')
. Checking for non-null/undefined is redundant because $isTextNode
already does that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done! I went for canContainTransformableMarkdown
as that conveys the intent and makes the code where it's used more readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…pdate call if node can not contain transformable markdown
Fixes #5148. Additionally, it fixes an issue where formatted code blocks (e.g.
are not exported to markdown correctly after they have been imported.
This PR refactors the logic for applying text match and text format transformers to enable support for nested text formats within text match transformers, such as link nodes.
Unlike the previous attempt to fix it, this change does not include any node-specific logic. It fixes the root cause of the issue by ensuring that nested combinations of textmatch and textformat transformers are applied in optimal order.
Before
CleanShot.2024-12-30.at.11.56.42.mp4
After
CleanShot.2024-12-30.at.11.56.03.mp4
The Problem
Previously, the import process for text transformers roughly followed this sequence:
ElementTransformers ($importBlocks)
=>
TextFormatTransformers=if not found>
TextMatchTransformersFor link nodes containing formatted text, the process failed as follows:
Initially, I attempted to solve this issue by adjusting the sequence to prioritize text match transformers:
ElementTransformers ($importBlocks)
=>
TextMatchTransformers=>
TextFormatTransformersWhile this resolved the issue with nested formats, it introduced a new problem in scenarios where links were wrapped by text format markdown, like this:
Now, the link is created first and we get
normal text, link, normal text
. However, the bold text transformer could no longer identify and apply formatting to the entire outer bold range.The Solution
Text format transformers already include logic to identify the outermost match, allowing them to handle scenarios like:
In this case, the bold transformer runs first, followed by the italic transformer, ensuring proper formatting.
However, text match transformers currently lack similar logic. Consider this example:
The existing sequence processes it as:
However, to achieve correct results, the sequence should be:
To address this, the PR introduces logic for identifying the outermost match across both text match and text format transformers.
With this change, text match and text format transformers are treated as equals in priority, allowing their results to be compared directly. This ensures that the outermost match—whether from a text match or a text format transformer—is correctly identified and then applied, enabling seamless handling of nested text transformations.
Implementation details