Item8846: Warn about template errors that causes newlines after closing html tag

pencil
Priority: Urgent
Current State: Closed
Released In: 1.1.0
Target Release: minor
Applies To: Engine
Component:
Branches:
Reported By: ArthurClemens
Waiting For:
Last Change By: KennethLavrsen
Each time when someone edits a template, newlines get inserted by the text editor. This results in newlines at the end of the text, that get rendered as <p /> tags.

To make this less fragile, we should remove newlines after the closing </html> tag.

-- ArthurClemens - 05 Apr 2010

The problem with this is that the rendering code doesn't actually know it's generating HTML (and in the case of RSS, it isn't).

How about we add a <trim /> directive that will hoover up any whitespace on either side of the directive in the rendered output? The directive could be added before and after the first and last tags (or even in the body, if it's used there)

Alternatively a %TMPL:TRIM% directive could instruct the TMPL reader to do the same if the directive is seen in a template file it has read. This would have the added value that when used in a template file containing only %TMPL:DEF blocks, it would make the file a NOP, allowing us to lay the files out more readably.

-- CrawfordCurrie - 06 Apr 2010

KISS:

--- lib/Foswiki.pm      (revision 7104)
+++ lib/Foswiki.pm      (working copy)
@@ -743,6 +743,9 @@
     # Call final handler
     $this->{plugins}->dispatch('completePageHandler', $text, $hdr);

+    # clean up html
+    $text =~ s/(<\/html>).*?$/$1/gs if $contentType eq 'text/html';
+

-- MichaelDaum - 06 Apr 2010

What you propose would probably work, though it's a hack worthy of (forgive me) hwsnbn.

-- CrawfordCurrie - 09 Apr 2010

Alternatively create directive %TMPL:EOF%. Anything after it will be ignored.

-- ArthurClemens - 09 Apr 2010

Excellent idea. Implemented.

-- CrawfordCurrie - 06 May 2010

More cruft instead frown, sad smile

-- MichaelDaum - 06 May 2010

What is the impact on %TMPL:EOF% in extension templates when an extension is used in 1.0.9? Is such a tag ignorred or will we forever see extentions that cannot run in 1.0 context just because of this change?

Why is a FEATURE like this implemented

  • After the feature freeze
  • Without feature proposal topic
  • Without feature proposal accepted

??

Flipping this back to new.

-- KennethLavrsen - 08 May 2010

I fixed it the way I did because the hack that Michael suggested is just that - a hack, of the type that we constantly fight with in the core. I thought the TMPL:EOF approach was much cleaner and more consistent, and could be cleanly and consistently unit tested. But if the process mavens feel it's a feature and not a fix, then I have no problem with it being reverted and re-done as a hack, as long as the hack is tested.

Note that the hack also has to work for all the other XMLish content types as well.

-- CrawfordCurrie - 08 May 2010

True, people won't be able to use TMPL:EOF for a very long time. But that also holds for almost any new feature where no backwards compatibility plugin to augment legacy engines are at hand. For now it is mostly PatternSkin that will profit from TMPL:EOF, and this is shipped with any new foswiki release. One and the same PatternSkin version doesn't require to run on different foswiki egines. So this argument is not as strong as it seems, Kenneth.

Anyway.

But, why is the proposed patch a hack? Sorry don't buy/understand it... a discussion that normally would have fleshed out understanding for the best solution in a proper feature proposal.

For now, I'd think that TMPL:EOF would require a proper feature proposal, but my above patch not, as it fixes a real shortcoming of all foswikis and tmwikis renderers.

-- MichaelDaum - 08 May 2010

Crawford - It'd help if you answered Kenneth's questions in detail

I think the answers are :

  1. I recon it'll be ignored like other undefined TML vars
  2. after feature freeze - its a bug fix that we should really have thought of years ago
  3. yes - please make a proposal topic!
  4. I'd support a vote, and would vote in favour

and lastly - what Crawford alludes to is a very important point. Micha's hack totally forgets that our template system is used for more than just html output, and cannot ever be capable of dealing correctly with future outputs - so its not anything other than a poor hack.

(I've used the tmpl system for all sorts of native outputs - the most 'normal' being json and one of the least normal being output to toss at a mainframe. and EOF marker would have been fantastic.)

-- SvenDowideit - 08 May 2010

Strange. It is pertinent for contenttype=text/html only. Why is it a problem for other content types?

-- MichaelDaum - 08 May 2010

Extra trailing whitespace is an issue for all structured content types - text/html is actually one of the more forgiving ones

-- SvenDowideit - 08 May 2010

Whitespace is a problem when the renderer inserts markup, like empty paragraphs tags. How is whitespace with json and xml rendered?

-- ArthurClemens - 08 May 2010

Well any empty paragraph plain breaks parsing json. While it is out of specs for rss and atom, feed parsers can cope with it. Most other validating xml parsers will bail out. That is: better don't show the tml renderer a complete page producing these kinds of content types.

Other than that empty lines and other whitespace don't harm html at all. It only becomes a problem when the tml render converts them to empty paragraphs. In that sense Arthur is spot on, and that's the root cause of the problem.

So why can't we suppress generating p's under these circumstances?

-- MichaelDaum - 08 May 2010

First - we need Crawford to investigate what the consequences are for the TMPL:EOF in extension templates when an extension is used in 1.0.X. If that breaks anything then the MINIMUM is to add a big fat warning AGAINST USING THIS IN EXTENSIONS the next 1-2 years. This warning must be with the documentation of TMPL:EOF.

Second. If an extension author adds does not use TMPL:EOF and by mistake has one of the \n at the end of a template he still has a problem.

I cannot see that it is such a hack to remove paragraphs after the terminating /html tag.

Third and last. Please create this feature proposal so we can community vote if we should allow this particular spec change to get a future cleaner termination of templates. I have no personal concern with this in itself. I do have concern about the haste it is introduced with. It is clear that not all consequences have been thought through.

-- KennethLavrsen - 09 May 2010

As I and Crawford pointed out, there is no terminating html tag in several of the formats used for years - rss, xml, txt, json.... so we can't just assume we're only doing html.

-- SvenDowideit - 11 May 2010

The point about TMPL:EOF breaking forward compatibility (not backward compatibility) is a fair one. Older systems will not support %TMPL:EOF%. Frankly I don't think this is a particularly important point, but that's just my opinion.

There is a compromise solution, which I just thought of, and that's to configure the set of content types that do not like whitespace around the output. For example,
# **PERL**
$Foswiki::cfg{ContentTypes}{TrimWhitespace} = {'text/html' => 'both', 'text/rss' => 'end', text/plain => 'none'};
That would avert the hard-coding of the content type and allow end-user configuration. However I remain uncomfortable with this approach, because there are corner cases where it will fail - for example, where templates are used to generate text/html for a command-line script, but that HTML is generated in two phases. TMPL:EOF is the general solution, but if you insist it's a feature and not a bugfix, so be it.

Michael, we should be able to conditionally suppress TML (not just p's, the whole of TML).

-- CrawfordCurrie - 13 May 2010

We have TWikiCompatibilityPlugin. Is a ForwardCompatibilityPlugin a terrible idea? It would implement Foswiki::Func::addToZone() (map to addToHEAD ), and %TMP:EOF%.

-- PaulHarvey - 13 May 2010

I tried this once before, in (tm)wiki days; it was a nightmare to maintain.

Against my better judgement, I reverted %TMPL:EOF and implemented the hack. We will live to regret this.

-- CrawfordCurrie - 13 May 2010

especially as it doesn't seem to work.

-- SvenDowideit - 14 May 2010

Tweaked it as suggested by Babar. Sven, what did you think was wrong? It works for me...

-- CrawfordCurrie - 14 May 2010

the HTMLValidationTests suggest that its not working.

(when there's a failure, I get those tests to output html files too..)

for eg:
---------------------------
HTMLValidationTests::verify_switchboard_function_rdiff_plain

Expected:''
 But got:'HTMLValidation_rdiff_plain (61:8) Warning: content occurs after end of body
HTMLValidation_rdiff_plain (61:8) Warning: content occurs after end of body
HTMLValidation_rdiff_plain (61:8) Warning: content occurs after end of body
HTMLValidation_rdiff_plain (61:8) Warning: content occurs after end of body
HTMLValidation_rdiff_plain (61:8) Warning: content occurs after end of body
HTMLValidation_rdiff_plain (75:1) Warning: <table> lacks "summary" attribute
HTMLValidation_rdiff_plain (66:1) Warning: trimming empty <h1>'
 at /home/sven/src/foswiki/trunk/core/lib/Unit/TestCase.pm line 218
   Unit::TestCase::assert_equals('HTMLValidationTests=HASH(0x1f3ace60)', '', 'HTMLValidation_rdiff_plain (61:8) Warning: content occurs aft...') called at /home/sven/src/foswiki/trunk/core/test/unit/HTMLValidationTests.pm line 174
   HTMLValidationTests::verify_switchboard_function('HTMLValidationTests=HASH(0x1f3ace60)') called at (eval 831400) line 4
   Unit::TestCase::__ANON__('HTMLValidationTests=HASH(0x1f3ace60)') called at /home/sven/src/foswiki/trunk/core/lib/Unit/TestRunner.pm line 310
   Unit::TestRunner::__ANON__() called at /home/sven/src/foswiki/trunk/core/lib/CPAN/lib/Error.pm line 379
   eval {...} called at /home/sven/src/foswiki/trunk/core/lib/CPAN/lib/Error.pm line 371
   Error::subs::try('CODE(0x7fb7cb7a8a80)', 'HASH(0x1fefb168)') called at /home/sven/src/foswiki/trunk/core/lib/Unit/TestRunner.pm line 329
   Unit::TestRunner::runOne('HTMLValidationTests=HASH(0x1f3ace60)', 'HTMLValidationTests', undef) called at /home/sven/src/foswiki/trunk/core/lib/Unit/TestRunner.pm line 122
   Unit::TestRunner::start('Unit::TestRunner=HASH(0x1616268)', 'F') called

-----
and the html file ends with...
</body></html>
<p />
<p />
<p />
<p />
<p />

-- SvenDowideit - 14 May 2010

There is another solution to this issue - stop using the really shite non-named sections at all.

replace it with a named TMPL:DEF such as:

%TMPL:DEF{foswikiresponse}%%TMPL:P{"htmldoctype"}%%TMPL:P{"head"}%%TMPL:P{"bodystart"}%%TMPL:P{"main"}%%TMPL:P{"bodyend"}%%TMPL:END%

and tada, no new syntax, no weird hash configurations, full control of the rendering output, and you can pretend to have backwards compatibility by still defining the older mess.

(I would add to this some detection of non-Sectional TMPL bits to help admins/users/devs transition, and some kind of comment marker (essentialy a version) to allow them to mark up a compatibility non-section)

-- SvenDowideit - 16 May 2010

Sure, good idea.

The more i think about it, the more inclined I am to replace the current hack with an assert:
@@ -772,13 +773,14 @@
     # Remove <nop> and <noautolink> tags
     $text =~ s/([\t ]?)[ \t]*<\/?(nop|noautolink)\/?>/$1/gis;
 
-    # Trim whitespace from the start and end of selected content types
-    if ( $contentType =~ m#text/html# ) {
-
-        # Use \s to match unicode whitespace, in anticipation of the day when
-        # we unicode everything.
-        $text =~ s#^\s+##;
-        $text =~ s#(</html>).*$#$1#is;
+    # Check that the templates specified clean HTML
+    if (DEBUG
+          && $contentType =~ m#text/html#
+            && $text =~ m#</html>(.*?\S.*)$#s) {
+        ASSERT(0, <<BOGUS );
+Junk after </html>: $1. Templates may be bogus
+(check for excess blank lines at ends of .tmpl files)
+BOGUS
     }
 
     $this->generateHTTPHeaders( $pageType, $contentType, $text, $cachedPage );
Anyone want to scream before I make this change?

-- CrawfordCurrie - 21 May 2010

I'm not against the warning, but does it help a skin hacker identify how to fix his .tmpl files? Well, it probably doesn't have to.

For what it's worth, I really do like the %TMPL:DEF{"foswikiresponse"}% idea, assuming it's not a giant change for 1.1 scope.

-- PaulHarvey - 21 May 2010

Extending the hack: can't we write the name of the template?

-- ArthurClemens - 21 May 2010

Paul, it helps because it tells the template author there's something wrong - assuming they have enabled DEBUG, that is. No, it doesn't say where it's wrong - nothing can, by the time we can detect the error case it's far too late.

Not sure what you mean there, Arthur.

The hack is bad news - I have been unhappy with it from the time it was proposed - because it is slapdash. It is intended to deal with this common case:
</html>
<p/>
<p/>
which we suffer when newlines are left at the end of templates, and TML expands them to P's. However, there are other template errors that a user could inflict on themselves, and even valid HTML syntax which will bork them:
</html>
Whoops I really meant this to be inside the HTML but now the hack will chomp it :-(
and
<!-- </html> No, Mr. Bond, I expect you to _die_ -->
</html>
The problem is that the hack is trying to shut the stable door after the horse has bolted, and we've been fighting aginst ill-considered "fixes" like that in the core code for far too long.

Sven's proposal - instantiating a "root template" - is probably what should have been done since day one, but it doesn't help all the existing skins out there that don't define this root template.

-- CrawfordCurrie - 22 May 2010

I believe you're right that the hack can't stay. I think the warning is a good idea - although I do wonder what percentage of skin hackers out there ever run with DEBUG, so it should be mentioned prominently in the SkinTemplates docs.

Sven's root template proposal seems to be the correct path forward - at first glance, it seems both strategies can probably co-exist?

-- PaulHarvey - 22 May 2010

Absolutely. And with the recently coded TRACE mode in templates, it has all become a lot more debuggable.

-- CrawfordCurrie - 23 May 2010

OK, thanks for the feedback. Closed.

-- CrawfordCurrie - 23 May 2010

 
Topic revision: r43 - 04 Oct 2010, KennethLavrsen
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere. See Copyright Statement. Creative Commons License    Legal Imprint    Privacy Policy