Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

templates: add raw_text to directly output content #4593

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

avamsi
Copy link
Collaborator

@avamsi avamsi commented Oct 7, 2024

Templates can be formatted (using labels) and are usually sanitized
(unless for plain text output). raw_text(content) bypasses both.

	'hyperlink(url, text)' = '''
		raw_text("\e]8;;" ++ url ++ "\e\\") ++ text ++ raw_text("\e]8;;\e\\")
	'''

In this example, raw_text not only outputs the intended escape codes,
it also strips away any escape codes that might otherwise be part of the
url (from any labels attached to the url content).


Checklist

If applicable:

  • I have updated CHANGELOG.md
  • I have updated the documentation (README.md, docs/, demos/)
  • I have updated the config schema (cli/src/config-schema.json)
  • I have added tests to cover my changes

TODO: will update changelog and documentation post alignment

@avamsi avamsi requested a review from yuja October 7, 2024 06:47
cli/src/template_builder.rs Outdated Show resolved Hide resolved
Copy link
Owner

@martinvonz martinvonz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

templates: add plain_text to strip formatting

This is missing the most important part (IMO): the "why" (the "what" can at least be inferred from the patch). Unless it's obvious why it's a good thing, I think a commit description should explain what the point is, e.g. by providing a use case.

@avamsi
Copy link
Collaborator Author

avamsi commented Oct 7, 2024

Re: why, apologies -- I was a bit over eager with sending this out as I wasn't sure if I had a good implementation.
I'll amend the commit shortly, but to answer your question in the meantime, I'm trying to hyperlink change IDs in jj log to Gerrit CLs, with something like this:

	'hyperlink(url, text)' = '''
		"\e]8;;" ++ plain_text(url) ++ "\e\\" ++ text ++ "\e]8;;\e\\"
	'''
	'format_short_change_id(id)' = '''
		hyperlink("https://gerrit/q/Id0000000" ++ id.normal_hex(), id.shortest())
	'''

(From avamsi#1)

id.normal_hex() in this example would have formatting applied (from the change_id label) and the corresponding ANSI escape codes would result in a wrong URL. plain_text proposed in this PR strips them (and so, corrects the URL).

@martinvonz
Copy link
Owner

I'm trying to hyperlink change IDs in jj log to Gerrit CLs

Nice!

@avamsi avamsi force-pushed the av/jj_vzlptkmrlxms branch 2 times, most recently from beb0986 to 3de8526 Compare October 8, 2024 18:46
@avamsi avamsi changed the title templates: add plain_text to strip formatting templates: add raw_text to directly output content Oct 8, 2024
cli/src/formatter.rs Show resolved Hide resolved
}
if raw_text {
let mut plain_text_formatter = PlainTextFormatter::new(formatter.raw());
write_data(&mut plain_text_formatter, last_pos..pos)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, is_raw: bool will have to be passed to the callback. The "indent" template function prints prefix by using the callback formatter argument, and the prefix part should be colorized if any.

BTW, I think it's okay to disable raw_text() within reformatting template functions. We can add the support later if needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think without this support, raw_text won't work with wrapping :(.

The (/ your) comment in write_indented says prefix inherits current labels, which arguably is nothing while raw_text is in play? Basically, it doesn't sound to me like we're breaking the existing behaviour.

But either way, I think indent() is the only function that needs this? And do you think this should be blocking or is it okay to fix in a follow up?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think without this support, raw_text won't work with wrapping :(.

Right. However, wrapping wouldn't work correctly if raw_text() contained escaped characters. In your use case, I assume you wouldn't want to count raw_text() content as visible characters, so it might make sense to put them as FormatOp::Raw(Vec<u8>). This suggests that raw_text() might be a bad function name.

The (/ your) comment in write_indented says prefix inherits current labels, which arguably is nothing while raw_text is in play?

With the current implementation, indent text will be printed by PlainTextFormatter if the trailing text is raw. That's the problem I pointed out. Try something like indent(label("error", "foo"), raw_text("bar\n")).

And do you think this should be blocking or is it okay to fix in a follow up?

The problem of indent() isn't a blocker, but I guess we would need a non-trivial rewrite to fix the problem. So it's nice if you can address it within this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, wrapping wouldn't work correctly if raw_text() contained escaped characters.

Could you elaborate on this? I was going through the code earlier and there was a comment (again, yours :P) saying textwrap handles escape codes, so I just didn't worry about it. My testing now seems to confirm this as well:

    #[test]
    fn test_wrap() {
        assert_eq!(
            wrap_bytes(
                b"\x1b]8;;http://example.com\x1b\\Example Text\x1b]8;;\x1b\\",
                12
            ),
            vec![b"\x1b]8;;http://example.com\x1b\\Example Text\x1b]8;;\x1b\\".as_ref()]
        );
        assert_eq!(
            wrap_bytes(
                b"\x1b]8;;http://example.com\x1b\\Example Text\x1b]8;;\x1b\\",
                10
            ),
            vec![
                b"\x1b]8;;http://example.com\x1b\\Example".as_ref(),
                b"Text\x1b]8;;\x1b\\".as_ref(),
            ]
        );
    }

(I haven't tested if the 2nd example above is broken by graph nodes in log, but I'm satisfied with this level of support from textwrap.)


With the current implementation, indent text will be printed by PlainTextFormatter if the trailing text is raw. That's the problem I pointed out. Try something like indent(label("error", "foo"), raw_text("bar\n")).

I understand that but I'm not sure why you say it's a problem -- behaviour of indent(label("prefix", "foo"), label("rest", "bar")) and indent(label("prefix", "foo"), raw_text(label("rest", "bar"))) seems consistent to me (which is, the formatting (or the lack of) of the actual content affect the prefix as well).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, wrapping wouldn't work correctly if raw_text() contained escaped characters.

Could you elaborate on this? I was going through the code earlier and there was a comment (again, yours :P) saying textwrap handles escape codes,

Oh, I didn't remember textrap() could parse ANSI escapes (because the existing code doesn't depend on it.) Then, the remaining issue is escape sequence containing \n? If we add some padding function, it won't work, but we don't have such function right now.

Anyway, if the primary use case is to insert (invisible) escape characters, it should be easier to process these characters as metadata rather than content to be reformatted. The hard part is to find a short descriptive name for it. :)

With the current implementation, indent text will be printed by PlainTextFormatter if the trailing text is raw. That's the problem I pointed out. Try something like indent(label("error", "foo"), raw_text("bar\n")).

I understand that but I'm not sure why you say it's a problem -- behaviour of indent(label("prefix", "foo"), label("rest", "bar")) and indent(label("prefix", "foo"), raw_text(label("rest", "bar"))) seems consistent to me (which is, the formatting (or the lack of) of the actual content affect the prefix as well).

The expected out put for indent(label("prefix", "foo"), raw_text(label("rest", "bar"))) is <prefix>foo</prefix>bar. You'll get foobar because foo is printed through the PlainTextFormatter.

cli/src/formatter.rs Show resolved Hide resolved
cli/src/template.pest Outdated Show resolved Hide resolved
Change-Id: Id0000000ba8346364d31fa2419922e336e985102
@avamsi avamsi force-pushed the av/jj_vzlptkmrlxms branch 2 times, most recently from 6004d5f to f8a278b Compare October 9, 2024 09:07
@avamsi avamsi requested a review from yuja October 9, 2024 09:21
With `\e` as a shorthand for `\x1b`.

Change-Id: Id000000040ea6fd8e2d720219931485960c570dd
Not all formatters (namely FormatRecorder) are supported yet.

Change-Id: Id00000004492dbf39e50f3b7090706839d1d8d45
Change-Id: Id0000000e1d0d4f5257723c580231db8732d62ad
cli/src/template_builder.rs Show resolved Hide resolved
cli/src/template.pest Show resolved Hide resolved
Comment on lines +2021 to +2024
insta::assert_snapshot!(env.render_ok(r#""\e""#), @"␛");
insta::assert_snapshot!(env.render_ok(r#""\x1b""#), @"␛");
insta::assert_snapshot!(env.render_ok(r#""\x1B""#), @"␛");
insta::assert_snapshot!(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you move the escape syntax test to test_string_literal() in template_parser.rs?

It's also better to add a few non-ASCII characters (e.g. \xff), invalid escapes (e.g. \x, \x0, ..)

lib/src/dsl_util.rs Show resolved Hide resolved
lib/src/dsl_util.rs Show resolved Hide resolved
Ok(content) => formatter.raw().write_all(content.as_bytes()),
Err(err) => formatter.handle_error(err),
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appears that requiring T: Template is better. (I thought String version would be way simpler, but I was wrong.)

impl<T: Template> Template for RawTextTemplate<T> {
    fn format(&self, formatter: &mut TemplateFormatter) -> io::Result<()> {
        let rewrap = formatter.rewrap_fn();
        let mut raw_formatter = PlainTextFormatter::new(formatter.raw());
        self.0.format(&mut rewrap(&mut raw_formatter))
    }
}

@@ -1119,7 +1120,7 @@ fn builtin_functions<'a, L: TemplateLanguage<'a> + ?Sized>() -> TemplateBuildFun
map.insert("raw_text", |language, diagnostics, build_ctx, function| {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reminder: you'll need to update the templates doc.

}
if raw_text {
let mut plain_text_formatter = PlainTextFormatter::new(formatter.raw());
write_data(&mut plain_text_formatter, last_pos..pos)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think without this support, raw_text won't work with wrapping :(.

Right. However, wrapping wouldn't work correctly if raw_text() contained escaped characters. In your use case, I assume you wouldn't want to count raw_text() content as visible characters, so it might make sense to put them as FormatOp::Raw(Vec<u8>). This suggests that raw_text() might be a bad function name.

The (/ your) comment in write_indented says prefix inherits current labels, which arguably is nothing while raw_text is in play?

With the current implementation, indent text will be printed by PlainTextFormatter if the trailing text is raw. That's the problem I pointed out. Try something like indent(label("error", "foo"), raw_text("bar\n")).

And do you think this should be blocking or is it okay to fix in a follow up?

The problem of indent() isn't a blocker, but I guess we would need a non-trivial rewrite to fix the problem. So it's nice if you can address it within this PR.

cli/src/formatter.rs Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for future reference, can you include the "why" in raw_text() patch description?
#4593 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants