Feature Proposal: Add Angle Brackets (< and >) to format tokens

Proposed by Thomas Weigert

Motivation

Thomas proposed this to TWiki, and for consistency with TWiki we need to consider it as well, plus it's a sensible idea.

Some character sequences in a format string may be interpreted as HTML when the format string itself is used literally, when we only want them expanded as HTML in the output strings.

Correction: The issue is not that these characters are interpreted as HTML. The issue is that early TWiki/Foswiki rendering phases expand tags inside format strings too early, and therefore break the tags they are embedded in.

Description and Documentation

Add the following format tokens:
| Token | Meaning | Expands to |
| $lt | less than/left angle bracket | < |
| $gt | greater than/right angle bracket | > |
| $amp | ampersand | & |

Thomas didn't propose $amp - I added it because I've had issues with it myself.

Impact

There may be some effect on wiki applications that do not escape dollars, resulting in changes to the expansion of some strings.

Implementation

Simple extension to Foswiki::expandStandardEscapes
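
A minimal sketch of what such an extension might look like, written in Python purely for illustration (Foswiki itself is Perl, and this is not the actual Foswiki::expandStandardEscapes code). The names ESCAPES and expand_standard_escapes are hypothetical, and handling of $n is omitted. The sketch assumes that each token may be followed by an optional (), and that $dollar is expanded last so that a doubly escaped token such as $dollarpercnt survives one round as $percnt.

```python
import re

# Hypothetical token table; order matters: $dollar goes last so that
# "$dollarpercnt" becomes "$percnt" instead of being expanded twice.
ESCAPES = [
    ("nop", ""),
    ("quot", '"'),
    ("percnt", "%"),
    ("lt", "<"),    # proposed
    ("gt", ">"),    # proposed
    ("amp", "&"),   # proposed
    ("dollar", "$"),
]


def expand_standard_escapes(text):
    """Replace each $token (optionally written $token()) with its character."""
    for name, char in ESCAPES:
        pattern = r"\$" + name + r"(\(\))?"
        # A function replacement avoids any special treatment of the character.
        text = re.sub(pattern, lambda match, c=char: c, text)
    return text


print(expand_standard_escapes("$lt()i$gt$percntWIKINAME$percnt$lt/i$gt"))
```

The () form matters when a token is directly followed by a letter, as in $lt()i above.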

-- Contributors: CrawfordCurrie, KennethLavrsen - 24 May 2009

Discussion

Another way to look at this is to aim to deprecate the $ escape syntax ASAP, and move instead towards a more intelligent escape, such as backslash. This would allow any character to be escaped by a preceding \, as in all C-based programming languages. This may be a bit geeky, though. Also, there's an issue with this sort of escape, in that it doesn't help you with HTML entities, e.g. \& will still be expanded as an entity when the format string is used literally (though \&\; won't be).

-- CrawfordCurrie - 24 May 2009

From a comment I committed today in Search.pm:

 
 the implementation of %FORMAT{}%
 
+TODO: rewrite to take a resultset, a set of params? and a hash of sub's to
+enable evaluations of things like '$include(blah)' in format strings.
+
+have a default set of replacements like $lt, , %, $ etc, and then
+the hash of subs can take care of %MACRO{}% specific complex to evaluate replacements..
+
+(that way we don't pre-evaluate and then subst)
+
 =cut
 
 sub formatResults {

I'm not thrilled about the $D etc - it requires people to remember more oddities.

I think Crawford meant to say "deprecate the $ escape syntax ASAP", leaving the $topic syntax, but I'd like confirmation.

-- SvenDowideit - 24 May 2009

I hadn't even considered the $topic syntax, but yes, I was only referring to the escapes. $D isn't particularly memorable, but it's much faster to type and IMHO easier on the eye than $ etc.

-- CrawfordCurrie - 24 May 2009

I would love to see an easier syntax for escaping but for me, the proposal from above makes it harder.

$lt()i$gt$P()WIKINAME$P$lt/i$gt like$D $Q()rock $amp roll$Q

Especially $lt()i$gt$P() is a nightmare and totally unreadable. While a $D on its own is easier to read, it becomes harder to read when combined with stuff like $lt.

-- CarloSchulz - 24 May 2009

It is important to keep Foswiki a direct upgrade path from TWiki. I support adding the escapes that Thomas Weigert suggested (the final proposal version using $gt, $lt and $amp).

I do not support the deprecation of the horror $percnt in favour of a $P etc. I cannot see $P as easier to read than $percnt. And nothing really gets deprecated in a way that it gets forgotten. The only thing you get out of it is additional syntax that causes additional confusion. I'd rather see another escape method like \$ or \\$ or similar.

So I would suggest extending with the 3 new escapes and not adding further escapes that add to the confusion.

And for the escapes that will make their way into TWiki, I would also do it on the Release branch, as it is a pretty safe feature to add.

-- KennethLavrsen - 24 May 2009

TWiki didn't discuss $amp, though I'm sure it's just because they didn't think about it yet.

I don't object to dropping $P, $Q and $D - they were never intended as replacements, just shorthand.
$lt()i$gt%WIKINAME$P$lt/i$gt like$ "rock $amp roll"
is fine with me, however clumsy it is.

Regarding \< etc, bear in mind that the purpose of the escaping is to make those specific characters "disappear" from the input stream. \<img> will still be interpreted as an HTML image tag, despite the \. So the escaping scheme has to support substituting these characters - hence $lt.

I look forward to better ideas for escaping characters.

-- CrawfordCurrie - 24 May 2009

Please rethink why we have these standard escapes, i.e. $percnt, $dollar and $quot: the reason for them is to delay parsing of argument strings in a macro until the surrounding statement has finished.

By default the parser goes left-to-right, inside-out. That is, anything in argument position is expanded first, which makes it a kind of "call by value". This is where TWiki's and Foswiki's parser differs most from other wiki engines such as MediaWiki, where wiki markup is parsed left-to-right, outside-in, making it a kind of "call by reference". That is why translating MediaWiki markup to Foswiki or TWiki is a non-trivial task. Most Wikipedia apps cannot be translated to Foswiki apps at all, for just that reason. It gets incredibly complicated if you cannot get away with ignoring these differences and have to translate or rewrite a non-trivial set of MediaWiki apps for Foswiki.
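
The inside-out behaviour, and why escapes delay expansion, can be illustrated with a toy expander. This is a sketch in Python under loud assumptions: the macros TOPIC and EACH, the RESULTS tuple and the expand function are all hypothetical and far simpler than the real parser. EACH stands in for a search that applies its argument as a format to every result.

```python
import re

# An innermost macro: %NAME% or %NAME{args}% with no macro nested in its args.
INNERMOST = re.compile(r"%([A-Z]+)(?:\{([^%{}]*)\})?%")

# Hypothetical search results that the toy EACH macro iterates over.
RESULTS = ("A", "B")


def expand(text, topic):
    """Expand toy macros inside-out, in the context of `topic`."""

    def run(match):
        name, arg = match.group(1), match.group(2) or ""
        if name == "TOPIC":
            return topic
        if name == "EACH":
            # Unescape the format only now, then expand it once per result.
            fmt = arg.replace("$percnt", "%")
            return "".join(expand(fmt, result) for result in RESULTS)
        return match.group(0)  # unknown macro: leave untouched

    previous = None
    while previous != text:  # repeat until nothing is left to expand
        previous = text
        text = INNERMOST.sub(run, text)
    return text


# Unescaped: the inner %TOPIC% is expanded first, in the outer context.
print(expand("%EACH{%TOPIC%,}%", "Main"))
# Escaped: expansion is delayed until each result is formatted.
print(expand("%EACH{$percntTOPIC$percnt,}%", "Main"))
```

In the unescaped call the argument has already become the enclosing topic's name by the time EACH runs, which is the "call by value" effect described above.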

My experience with wiki apps has shown that you end up writing the bulk of the application logic on "depth one" escape level (depth zero is direct macros, depth two is stuff like $percnt - the most ugly beast in town). It would be great to write the wiki app in a more direct way, that is, without any need for escape sequences such as $dollar, $percnt and $quot.

The other half of the $xxx() stuff in %SEARCH is that this establishes a programming language of itself inside the format strings which is totally different from the rest of TML. Their justification is completely independent from "standard" escapes.

From that background I cannot see the advantage of obscuring a wiki app with even more escape sequences, even those that do not participate in the parsing logic in the way described above. For instance, < is not relevant to the Foswiki parser. It only matters for XML. I have not read Thomas' motivations for $lt and $gt, but I suspect it has something to do with TWiki's and Foswiki's weakness in generating proper XML, i.e. RSS/Atom, as the renderer (not the parser) gets in the way all the time. That might be a different issue. But that is where I have seen a need for $lt, $gt etc. This, however, should be solved at its real roots and not by obscuring the markup language even more.

The only reason not to add myself to ConcernesRaisedBy would be to keep up with "backwards" compatibility. Even this must stop somewhere when those guys on the other project add more nonsense.

Finally, I must say, I can't see a real reason for this feature. Please try to explain a bit more.

-- MichaelDaum - 24 May 2009

Okay, I've read Thomas' justifications for $lt and $gt: it is because of <section>, which is a SectionalEditPlugin feature. The motivation similarly applies to <noautolink> or <verbatim>. So the question is how to prevent those XML tags from being parsed before the output they are meant to surround has been generated. The answer is: use $nop. No beauty, but his proposal isn't either.

-- MichaelDaum - 24 May 2009

I understood his goal, and felt I sympathised with it. However you are right, the justification is indeed very weak. noautolink and verbatim are removed from the parsing process very early, so even if you were able to escape them, I'm not sure they would work. I certainly don't want to encourage anyone to add more XML tags.

Removed my commitment.

-- CrawfordCurrie - 24 May 2009

Then I add myself as committed developer. If TWiki introduces this simple little enhancement we should too. We are not in a position to dominate. We are still in a "win customers and developers over" mode.

While I support any action towards making nested searches more simple, I do not see it as a reason not to follow this simple no brainer enhancement of standard escapes.

So Michael, are you going to open a feature proposal that introduces a better feature for nested searches and similar, with yourself as committed developer?

-- KennethLavrsen - 24 May 2009

@Crawford, wrt noautolink oh I see, true.

@Kenneth, no. However I block nonsense from TWiki-land to enter Foswiki. I can't follow your arguments that this has to enter Foswiki just because they are TWiki.

While this is a simple change, it still does not make much sense. If they accept that proposal for TWiki then they have to suffer from even more cruft all by themselves. There's nobody over there that would add any arguments against it, unfortunately.

As far as I see this is all about SectionalEditPlugin spuriously offering a section to be edited while it actually is a format string that does not relate to a topic section to be edited. This is by no means a basis for a core change no matter how small.

-- MichaelDaum - 24 May 2009

Raising concern stops the 14-day clock. No one can block anything. I want this out in the community for a vote then.

And I still miss the proposal that addresses Thomas Weigert's problem. I am at the moment trying to get Thomas to cross over to Foswiki and this could be the reason why he and many others will not. This is important.

And I do not agree the feature is silly or that we have to suffer cruft.

We have a potential community member asking for help. If TWiki adds this and we do not, we lose users. That is a fact. And it is two bloody code lines we are talking about that are easy to test and easy to add to the existing unit test suite.

-- KennethLavrsen - 24 May 2009

I can imagine even less code that makes no sense in the core and that would easily be covered by a test suite.

Talking to Thomas about his section problems first is a good idea. I suspect that wiki apps and topic net data are mixed in an "unhealthy" way thus causing these problems.

@Thomas, why do you have a <section> tag inside a format string?

-- MichaelDaum - 24 May 2009

Let me give you some more background how I ran into this problem....

I have many searches using FormQueryPlugin which use nested searches. The way to do this in FormQueryPlugin is to have the nested search in the format parameter. In order for it not to be expanded prematurely, escapes have to be applied (just putting $nop there is not good enough).

Sometimes these embedded searches return something that cannot be put into a table cell, for example, a bullet list or another table. In this case, one can wrap the MultiEditPlugin around and thus can put the bullet list, or table, etc., into a single cell. (Remember, typically the output of a search goes into a table.)

Here is where then the problem occurs. MultiEditPlugin does some manipulation of the tags before the common tags are expanded. But this messes up the embedded search by introducing a quote character prematurely. The right thing to do is to escape the tag for MultiEditPlugin.

This is consistent with the rest of the platform (escaping $percnt, $dollar, etc., in these situations).
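
The failure mode described above can be sketched as follows, in Python for illustration only. Everything here is an assumption made for the sake of the example: early_tag_phase is a hypothetical stand-in for a plugin that rewrites an HTML-like tag before common tags are expanded, the replacement markup is invented, and format_param is a toy attribute parser rather than Foswiki's real one.

```python
import re


def early_tag_phase(text):
    """Hypothetical early phase: rewrite <section> into markup with a quote."""
    return text.replace("<section>", '<div class="section">')


def format_param(macro_text):
    """Toy attribute parser: the value ends at the first double quote."""
    match = re.search(r'format="([^"]*)"', macro_text)
    return match.group(1) if match else None


raw = '%SEARCH{format="<section>$topic"}%'
escaped = '%SEARCH{format="$ltsection$gt$topic"}%'

# The early phase injects a quote, so the format value is cut short.
print(format_param(early_tag_phase(raw)))
# The escaped tag is invisible to the early phase and survives intact.
print(format_param(early_tag_phase(escaped)))
```

With the escaped form, the tag only comes into existence when the format tokens are expanded, after the early phase has already run.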

After I ran into this problem, I realized that there are other scenarios also that fall victim to this problem.

For your entertainment, here is an example TML section that needs that escape:

%TABLEFORMAT{name="ptable2" header="|*Project*|*Responsible*|*Reports*|*Minutes*|*Includes...*|" format="|[[Development.$percntCALC{$dollarLISTJOIN(P,$Label,roject)}$percnt][$Program]]|$percntDOANDSHOWQUERY{\"topic='$percntCALC{$dollarLISTJOIN(C,$dollarSUBSTITUTE($Label,<nop>),harter)}$percnt'\" web=\"Saydo\" format=\"$dollarProjectmanager\"}$percnt|$Reports| $percntCALC{ [[Notes.$dollarLISTJOIN(M,$Label,eetings)][$percntICON{\"days\"}$percnt]] }$percnt | $ltsection edit=0&gt; $percntDOANDSHOWQUERY{\"topic='$percntCALC{$dollarLISTJOIN(C,$dollarSUBSTITUTE($Label,<nop>),harter)}$percnt'\" web=\"Saydo\" format=\"$dollarGoalStatement\"}$percnt  |" sort="Label"}% 

Note the nested DOANDSHOWQUERY search in the last column. The <section> tag protects the value that is being returned from messing up the resultant table, as it is a bulleted list in some cases.

You might argue that one could also change FormQueryPlugin to have a protect parameter which prevents an indicated column from messing up the table in that way (sort of like DBIQueryPlugin protects expansion of returned values). But then one should also change standard Search, and who knows what else.

-- ThomasWeigert - 25 May 2009

Michael, I do not understand your opposition here. The suggestion is a minor one, requiring exactly one line change in TWiki.pm (or whatever the counterpart is in Foswiki). It is consistent with the usage of escapes in other places.

I am fine with somebody coming up with a scheme not requiring escapes, but until then, I need something to help me here....

Note that this is not an issue that can be fixed with the parser, as you initially explained, as the problem has to do with the different rendering phases, not with parsing.

Secondly, this problem will apply to the use of any tag looking like an html tag which is manipulated before the common tags expansion and introduces, for example, quotes as its result.

The html like tags are not that rare in plugins by the way.

-- ThomasWeigert - 25 May 2009

Oh, by the way, I did not propose & as an escape, as I am not aware of that character occurring in tags. Remember, this is not an issue of escaping HTML parsing, but of preventing an earlier phase of the rendering loop to make changes that screw up escaped portions to be handled in a later phase of the rendering loop.

-- ThomasWeigert - 25 May 2009

No, I proposed $amp because I have had exactly the same problems, but with HTML entities. Of relevance is another plugin I have behind a corporate firewall. This plugin extracts data from a ClearQuest database and presents it to the user. The extracted fields are frequently post-processed by other plugins to enhance presentation of the results, and it is from here that I encountered the $amp issue.

I have tried to think of a way to address Thomas' requirement with another core change, but all I can think of involves changes to the plugins. In an ideal world, I would push back on the plugin authors to improve the way they do things; but that's a lot harder than accepting what is really a minor change into the core.

BTW I coded this up quickly, early on in the discussion, and it's 3 lines of code, 3 lines of tests, and about 10 lines of doc, so anyone could manage to do it.

Oh, and another point. $lt, $gt and $amp are likely to be pretty rarely needed, at least in comparison to $percnt etc. An alternative approach is to accept that these format tokens will only be used by the more expert developer, who can tolerate a bit more geekiness. Given that, why not adopt the same approach as taken for HTML entities, and use:
| $aaa; (note the terminating ';') | Maps to whatever 8-bit character the HTML entity &aaa; would expand to |
| $lt; | < |
| $gt; | > |
| $#nn; | Maps to whatever 8-bit character the HTML decimal entity &#nn; would expand to |
| $#37; | % |
| $#60; | < |
| $#62; | > |
| $#38; | & |
Geeky, I know, but it extends cleanly to other characters that may need to be escaped, and is highly unlikely to break existing apps. This is a little bit more complex than the simple $lt proposal, but is also a heck of a lot more extensible and consistent.
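
A rough sketch of this entity-style scheme, in Python for illustration (not Foswiki code). The function name expand_entity_escapes is hypothetical; the lookup uses the name2codepoint table from Python's standard html.entities module as a stand-in for "whatever the HTML entity would expand to", and it assumes that anything outside the 8-bit range, or any unknown name, is left untouched.

```python
import re
from html.entities import name2codepoint

# $name; or $#nn; (the terminating ';' is required in this scheme).
ENTITY_ESCAPE = re.compile(r"\$(#\d+|[A-Za-z][A-Za-z0-9]*);")


def expand_entity_escapes(text):
    """Expand $name; and $#nn; to the character the HTML entity denotes."""

    def run(match):
        body = match.group(1)
        if body.startswith("#"):
            code = int(body[1:])
        else:
            code = name2codepoint.get(body)
        # Assumption: only 8-bit characters are mapped; the rest is kept as-is.
        if code is None or code > 255:
            return match.group(0)
        return chr(code)

    return ENTITY_ESCAPE.sub(run, text)


print(expand_entity_escapes("$lt;b$gt; $#37;X$#37; $amp; $dollar"))
```

Because the semicolon is mandatory, existing unterminated tokens such as $dollar do not match and pass through unchanged, which is why the scheme is unlikely to break existing apps.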

-- CrawfordCurrie - 25 May 2009

@Thomas, thanks for the example that led to this discussion. My advice, (1) format the table inside the QUERY; this will allow you to eliminate the second QUERY call as well as far as I can see. (2) Use HTML tables. Don't use TML tables to hold search results. They are too fragile.

-- MichaelDaum - 25 May 2009

Michael, I don't know what you mean by "formatting the table inside the query". The reason for the second query is to do a query on the results of the first query. That is how FormQueryPlugin does a left join. You cannot do this "inside the query".

Regarding the discussion on TML tables vs. HTML tables: for one, this is religion. But more importantly, I am doing further processing on the TML tables with another plugin. (I am applying the ExcelExportImportPlugin to allow the resultant table to be extracted into an Excel file.) This is not possible with HTML tables, nor are many other kinds of additional formatting that one can easily apply to TML tables.

Most importantly, you have not really presented an argument other than "I don't like adding escapes." Can you concisely express what your concern is?

-- ThomasWeigert - 25 May 2009

Crawford, your suggestion is fine, except unfortunately this is then inconsistent with $percnt, $dollar, etc. You could make the same table omitting the trailing semicolon, saying, "an escape $xxx maps to whatever &xxx; maps to".

-- ThomasWeigert - 25 May 2009

Hmm. Crawford's proposal is not bad at all.

The terminating ; would mainly be there to have a well-defined way to parse the value, and a clear rule for whether $eaf means $ followed by UTF8 eaf or $ea followed by the letter f.

If we go through with Crawford's proposal - while still keeping the old $percnt etc - we will once and for all have covered any special case people may run into in future. And no more coming back to adding more and more strange $strings.

It is only an advanced application builder that will need this so I agree that some higher NerdoMeter score is OK

-- KennethLavrsen - 25 May 2009

Thomas, exactly, it's deliberately consistent with HTML entities, rather than the slightly mad unclosed text string syntax of $dollar etc. The trailing semicolon has to be there for the $#99; number formats, because otherwise a following number would merge.

-- CrawfordCurrie - 25 May 2009

While Crawford's idea is super general, I am not sure whether the generality is needed or desired. We don't really want to encourage this escaping; it just so happens that you do need to be able to escape things that can mess with your content when expanded too early. The only things I am aware of that can do that are: dollar sign, double quote, percent sign, and left angle bracket. For separation, the $nop or $nop() syntax can be used. Admittedly, it would have been smarter if these escapes had been terminated, but they are not. It still appears to me that adding one more escape ($lt) in the same consistent manner is less of a burden than adding the very nice and generic feature, but making things inconsistent.

But I am not hung up on that.... Currently I am using $lt.

-- ThomasWeigert - 27 May 2009

True, the () syntax has been used as a separator, however clumsy it may be. I don't like it, personally, because () is conventionally function call syntax. However it would be more consistent. Despite what you say, Thomas, TML does not have a static definition. A plugin might add a special interpretation of another character. If we support the "super general" approach, this is a no-brainer because $#NN; will allow it to be escaped.

There's an obvious compromise here.
  1. Support unterminated names for a subset of entity names, including $lt, $gt and $amp
  2. Support optional () termination of these named entities
  3. Support $#NN; numerical entities, to mop up anything else that needs to be escaped.
-- CrawfordCurrie - 27 May 2009
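
The three-part compromise above can be sketched in a few lines of Python (an illustration only, not Foswiki code). The names NAMED, TOKEN and expand_compromise are hypothetical, and limiting numeric escapes to 8-bit values is an assumption carried over from the earlier table.

```python
import re

# Rule 1: a small subset of unterminated entity names.
NAMED = {"lt": "<", "gt": ">", "amp": "&"}

# Rule 2: optional () after a named token. Rule 3: $#NN; numeric escapes.
TOKEN = re.compile(r"\$(?:(lt|gt|amp)(?:\(\))?|#(\d+);)")


def expand_compromise(text):
    """Expand $lt/$gt/$amp (optionally with ()) and $#NN; numeric escapes."""

    def run(match):
        name, number = match.groups()
        if name:
            return NAMED[name]
        code = int(number)
        # Assumption: values beyond the 8-bit range are left untouched.
        return chr(code) if code < 256 else match.group(0)

    return TOKEN.sub(run, text)


print(expand_compromise("$lt()i$gt $#37;5 $amp"))
```

The numeric form keeps its semicolon so that a following digit, as in $#37;5, cannot merge into the number.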

I would like to come to a conclusion on this.

The NEED we have is actually only the implementation of $lt.

But if we do $lt we should also do $gt

That is all Thomas ever needed. And I would like to give this little feature to Thomas so we can help him cross over to Foswiki where he works.

Is everybody OK with me adding just those two now for 1.0.6, so that we can then continue to see what we want to add in 1.1?

It would be cool to give this to Thomas in 1.0.6. I would love to have Thomas with us. This alone may not do it, but it would be a good gesture, and Thomas would be able to cross over without having to hack things up.

-- KennethLavrsen - 05 Jun 2009

From IRC Michael is still against the proposal so we will have a vote.

I have narrowed down the feature proposal to implementing ONLY $lt, $gt, and $amp

So we vote: should Foswiki::expandStandardEscapes be extended by adding $lt, $gt and $amp?

-- KennethLavrsen - 05 Jun 2009

I liked Crawford's general proposal a lot and I don't like exceptions. I voted yes to this feature as Kenneth supports it, but we need to review TML consistency and document its specs. This documentation would, for example, support the development of new, faster parsers or help extension developers keep their plugins consistent with core.

-- GilmarSantosJr - 08 Jun 2009

I like Crawford's idea, but I understand Kenneth's wish to help Thomas. Thomas, could you just change your TWiki proposal to use Crawford's idea instead of the $lt one? This would give the best solution.

-- ColasNahaboo - 08 Jun 2009

The cruftiness of TML is something we need to live with for the time being, until someone implements a way for different parsers/syntaxes to live side by side in Foswiki while maintaining backward compatibility.

We can at least ease the pain with minor enhancements like this.

-- RafaelAlvarez - 08 Jun 2009

Colas, I think it would be unfair to the ones that already voted to change the proposal now. And nothing prevents us from later also implementing Crawford's extended proposal. But I needed to get the initial simple approach decided now because the simple proposal is intended for 1.0.6. Not only to help Thomas cross over but also to maintain compatibility with TWiki for a little while longer while we still see many crossing over.

Remember that this is not a personal vote. It is perfectly OK to vote No. It is also OK to remain neutral. And it is OK to extend the syntax later to what Crawford proposed and maybe even something better in a new feature proposal.

It is my hope that we will develop a much better syntax for nested searches than the horror $dollardollardollar we always end up with even with the existing escapes syntax.

I think this is where the effort should go for 1.1 or 2.0. I encourage people to start thinking about how to extend the syntax so that searches can happen based on searches without using any escapes at all and with much more clarity. If Foswiki can get a much better syntax for this, people will choose that in future instead of using these escapes. We saw how significant a change it was when we introduced the query searches.

Let us see this vote to the end and respect whatever outcome it is. No matter what the result is, the democratic release process wins over the dumb BDFL approach we all left.

-- KennethLavrsen - 08 Jun 2009

VOTE

Should Foswiki::expandStandardEscapes be extended by adding $lt, $gt and $amp?

| Name | Vote (Yes/No) |
| KennethLavrsen | Yes |
| MichaelDaum | No |
| MichaelLorenzen | Yes |
| EugenMayer | Yes |
| OliverKrueger | No |
| WillNorris | No |
| GeorgeClark | Yes |
| MichaelTempest | Yes |
| JanDreyer | Yes |
| AndrewJones | No |
| ArthurClemens | No |
| GilmarSantosJr | Yes1 |
| RafaelAlvarez | Yes1 |
| Total | No: 5, Yes: 6, Yes1: 2 |


Proposal has been accepted with 8 votes for and 5 against.

-- KennethLavrsen - 11 Jun 2009

It was implemented by Kenneth in June 2009 for 1.0.6.

-- SvenDowideit - 22 Nov 2009


Topic revision: r43 - 22 Nov 2009, SvenDowideit
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere.