-
-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unexpected sourcepos. #301
Comments
That is a bit surprising! It's actually almost per upstream. Here's cmark-gfm: $ cat o
[AB
CD](/)
$ build/src/cmark-gfm o -t xml --sourcepos
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document sourcepos="1:1-2:6" xmlns="http://commonmark.org/xml/1.0">
<paragraph sourcepos="1:1-2:6">
<link sourcepos="2:1-2:6" destination="/" title="">
<text sourcepos="1:2-1:3" xml:space="preserve">AB</text>
<softbreak />
<text sourcepos="2:1-2:2" xml:space="preserve">CD</text>
</link>
</paragraph>
</document>
$ build/src/cmark-gfm o --sourcepos
<p data-sourcepos="1:1-2:6"><a href="/">AB
CD</a></p> So, in terms of the AST, comrak is doing the same thing as upstream, as shown in I'd be happy with fixing this in either direction; either fix how we record sourcepos on the link node, or stop outputting in it HTML. (It's possible that "fixing" the former might have unintended consequences.) |
I see. I prefer the latter(stop outputting). |
Me too! PRs happily accepted; I don't have the time for this right now myself. |
Wait...the idea is to not have I tried running it against
|
In general, I'd rather "matches the upstream we explicitly follow" than not. Right now we have "doesn't match upstream and is incorrect". I'd accept a solution either way. |
But it does match the upstream, doesn't it? Block-level sourcepos works correctly I think. Inline sourcepos is a great enhancement. I'd certainly rather see a fix, but if that's not feasible at the moment, then how about an option to enable it? Those that want it can enable it, and work on a fix. |
#439 fixes this, but only by no longer reporting sourcepos information for inlines in output HTML. (Block-level is unaffected, and the data's still there in the AST or XML output if you want it.) See #439 (comment) for comments on effort required to get inline sourcepos working consistently. |
@kivikakk ouch. This is a huge breaking change for us. We've built significant functionality on top of inline sourcepos. Its removal is a real problem for us. I understand it's not 100% accurate, but it's been pretty darn accurate for most things. Also releasing it along with the other great fixes you just put in means we can't use them. Not sure what I'm going to do yet. Probably the only thing I can do is build our own HTML outputter. Or if an |
I'd be happy to add it back in with an option, yeah! But "sometimes inaccurate" to me reads as "garbage in, garbage out"; of course, if your pipeline means those cases aren't hit, it's not such a big deal, but we can't even enumerate them all because there's some fundamental brokenness here. (It's why, I'm guessing, they aren't exposed in So, happy to take an option to re-add them in the HTML output, but it needs to be called something like |
Thanks @kivikakk 🙇. I agree with naming an option I'm not happy with "garbage in, garbage out" either. So far we haven't hit these particular cases, but as we come to rely more heavily on it, I'm sure we will eventually run across them. I'll see about reallocating some of my time to work on some type of fix, though I may have to lean on your expertise 😄 |
To better clarify what we use it for, in GitLab we have a rich text editor for our markdown fields (comments, descriptions, etc). Markdown is the single source of truth, and we convert from markdown to rich text and back. There is actually a button in the edit field that allow you to switch back and forth. Sourcepos allows us to have fidelity when converting between to the two representations - in general we don't want to change markdown the user has created but not edited. It's still a work in progress, but it's evolving. You can test it out in any comment field in an issue, for example in https://gitlab.com/gitlab-org/gitlab/-/issues/456980 |
I don't quite understand how sourcepos is used in that conversion (my experiments resulted in changes on every roundtrip), but I'm glad it's helpful! |
Yeah, it's very much a work in progress 😅 Though I am going to have to go back and adjust expectations. |
:D I see! It looks like you might be after more CST-like support as well (thinking of #422 here). In general I don't mind going that route, but for sourcepos fidelity, the big change forewarned about in #439 (comment) will be unavoidable. |
Possibly, though I don't understand the ramifications of that yet, particularly since at the moment it's the generated HTML that gets converted into the editor.
yeah I figured |
Hi. "sourcepos" is exactly what I was looking for.
What do you think of this result?
markdown
my expected
comrak v0.18.0
Similar pattern.
markdown
comrak
cmark
The text was updated successfully, but these errors were encountered: