Feature Proposal: Deprecate Contextless URL Constructs

Motivation

See DeprecateContextlessURLConstructs?rev=14 for the original proposal by SvenDowideit in 2008. Re-worked by PaulHarvey with the aim of leveraging the Foswiki::Address work.

Constructing URLs using PUBURL, SCRIPTURL etc. reduces the parseability and information content of the intended effect. It reduces the engine's ability to just work: renames, cut-and-pastes and other operations require an understanding of technical details.

Further, constructing URLs this way tends to lock the physical and logical hierarchies together (the "pub effect").

Description and Documentation

Instead, we can increase the information content by abstracting the URI generation to be more REST-like, and to focus on doing what the user wants.

As a side effect, topic renames have a better chance of applying to any URL-constructed references to attachments, and we get a more internally consistent way of producing URLs. (It also simplifies CDN and distributed server linking :-) )

Examples

From Foswiki::Address:

Unambiguous strings

To build less ambiguous address strings, use the following conventions:

  • Terminate web addresses with '/'
  • Separate subwebs in the web path with '/'
  • Separate topic from web path with '.'
  • Separate file attachments from topics with '/'

Examples:

  • Web/SubWeb/, Web/
  • Web/SubWeb.Topic
  • Web.Topic/Attachment.pdf
  • Web/SubWeb.Topic/Attachment.pdf

Many strings commonly used in Foswiki will always be ambiguous (such as Foo, Foo/Bar, Foo/Bar/Cat, Foo.Bar.Cat). Supplying an isA specification will prevent the parser from using the (somewhat expensive) exist hinting heuristics.
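
The four conventions above can be illustrated with a minimal Python sketch. This is not the Foswiki::Address implementation or its API; the names Address and parse_unambiguous are hypothetical, and the sketch simply rejects the ambiguous forms rather than applying exist hinting or an isA hint.

```python
from typing import NamedTuple, Optional, Tuple


class Address(NamedTuple):
    # Hypothetical result type, not part of Foswiki::Address.
    webs: Tuple[str, ...]              # web path, outermost web first
    topic: Optional[str] = None
    attachment: Optional[str] = None


def parse_unambiguous(s: str) -> Address:
    """Parse a string written with the four conventions listed above."""
    if "." not in s:
        # Without a '.', only a trailing '/' tells us this is a web path.
        if not s.endswith("/"):
            raise ValueError(f"ambiguous without a type hint: {s!r}")
        webs = tuple(s[:-1].split("/"))
        topic = attachment = None
    else:
        # The first '.' ends the web path; later dots can only belong
        # to an attachment file name such as 'Attachment.pdf'.
        web_path, rest = s.split(".", 1)
        topic, _, attachment = rest.partition("/")
        webs = tuple(web_path.split("/"))
        if not topic or "." in topic:
            raise ValueError(f"ambiguous without a type hint: {s!r}")
    if not all(webs):
        raise ValueError(f"empty web name in {s!r}")
    return Address(webs, topic, attachment or None)
```

With this sketch, Web/SubWeb.Topic/Attachment.pdf parses into the web path (Web, SubWeb), the topic Topic and the attachment Attachment.pdf, while Foo, Foo/Bar and Foo.Bar.Cat raise an error because nothing in the string says which kind of resource is meant.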

Impact

%WHATDOESITAFFECT%

Implementation

  • Need Foswiki::Address to be able to render URLs of an internal address. This might incorporate some sort of listener pattern, so that WebDAVContrib or EnableCloudStorageForAttachments could intercept & render the URL differently. Or delegate the responsibility of rendering Foswiki::Address objects to some other class?
  • Proof of concept will be implemented in SemanticLinksPlugin (and we can test performance impact). There will be performance impact, especially if exist-hinting is enabled.

References

-- Contributors: SvenDowideit - 16 Nov 2008, PaulHarvey 13 Jun 2011

Discussion

DAMN BLOODY RIGHT

The voice of experience here. I have been working on a Filesys::Virtual::Foswiki implementation for WebDAV. This exposes the Foswiki store as a "traditional" UNIX filesystem. I tried various approaches to structuring the data in a filesystem type way, and ended up with the following:
   /web/
      /subweb/
         /topic.txt
         /topic_files/
            /Somefile.xls
I found that the .txt and _files suffixes were essential to be able to distinguish between the three cases of 'topic' - the topic text, the attachments, and a subweb called 'topic'. With this scheme in place you can visit and manipulate a TWiki store through WebDAV. (I selected _files because that's where most browsers save files when you "Save Page As". A .files suffix may be a better choice, because '.' is conveniently already excluded from web and topic names, whereas '_' isn't.)
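
The suffix rule described above can be sketched in Python as follows. This is an illustration only, not the Filesys::Virtual::Foswiki code: the function name classify, the kind labels and the returned logical path format are all assumptions made for the example.

```python
from typing import Tuple

TOPIC_SUFFIX = ".txt"      # marks the topic text
FILES_SUFFIX = "_files"    # marks a topic's collection of attachments


def classify(path: str) -> Tuple[str, str]:
    """Return (kind, logical_path) for a path in the layout shown above."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        return ("root", "")
    *parents, leaf = parts
    # Anything directly inside '<topic>_files' is an attachment, even if
    # its own name happens to end in '.txt'.
    if parents and parents[-1].endswith(FILES_SUFFIX):
        topic = parents[-1][: -len(FILES_SUFFIX)]
        return ("attachment", "/".join(parents[:-1] + [topic, leaf]))
    if leaf.endswith(TOPIC_SUFFIX):
        return ("topic", "/".join(parents + [leaf[: -len(TOPIC_SUFFIX)]]))
    if leaf.endswith(FILES_SUFFIX):
        return ("attachments", "/".join(parents + [leaf[: -len(FILES_SUFFIX)]]))
    # No suffix: the leaf can only be a (sub)web.
    # Caveat from the text: '_' is legal in web names, so a web literally
    # called 'x_files' would be misread here; '.files' would avoid that.
    return ("web", "/".join(parts))
```

The point of the sketch is that the kind of resource is decided from the string alone, with no lookup in the store.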

This gives a clean, consistent scheme without any tortured syntax:
  • %URL{"/Web/Topic_files/Somefile.xls" action="manage"}%
  • %URL{"/Web/Topic.txt"}%
  • %URL{"/Web/Topic.txt" action="reparent"}% - reparent a topic
  • %URL{"/Web/Topic" action="reparent"}% - reparent a web called Topic
I'm not wedded to the .txt and _files suffixes. Indeed, when we extend this to a context access syntax, .txt doesn't "feel" quite right:
  • %VALUE{"/Web/Subweb/Topic.txt:parent.name"}%
Another suffix might make more sense, e.g.:
  • %VALUE{"/Web/Subweb/Topic.content:parent.name"}%
Gets a bit wordy though:
  • %VALUE{"/Web/Subweb/Topic.content:META:PREFERENCE[name='FaveColour'].value"}%

-- CrawfordCurrie - 16 Nov 2008 - 07:29

ok, initial thoughts - I think I'd like to mull over it a little, as instinctively i don't like your proposed syntax.

First up, the dot is used in TWiki to separate web and topic - so not using it here only confuses the main constituents. Using / isn't better either. What I wanted to do was create a consistent abstraction that allows users to not think about URLs - something a lot of them don't understand.

.txt is terrible - as it's even more inconsistency, but this time as a crutch to make it easier for the parser - when the point of the syntax is to make it easier for the user - who rightly expects TWiki to know what they meant.
  • interesting aside though - if you use .txt to denote encoding, you could use .jsan and .xml etc - not quite REST - and perhaps we should remember that we contemplated thinking of attachments as another type of topic...

So for my part, when extending the syntax to topic contents, it must be the same syntax as in IF and query SEARCH.

give me a day though, i might flipflop :/

-- SvenDowideit - 16 Nov 2008 - 09:41

My suggestions to incorporate url params and anchors...

Anchors:
%URL{"Web.Topic" anchor="Anchor"}% - set anchor position
%URL{"Web.Topic#Anchor"}% - shorthand

URL params:
%URL{"Web.Topic" params="sort=Author;limit=20"}% - set params sort and limit
%URL{"Web.Topic" setparams="sort=Author;limit=20"}% - identical
%URL{"Web.Topic?sort=Author;limit=20"}% - identical
%URL{"Web.Topic" params="%QUERYSTRING%"}% - copy the current query string
%URL{"Web.Topic" params="%QUERYSTRING%" setparams="sort=Date"}% - changes one parameter in the existing query string
%URL{"Web.Topic" params="%QUERYSTRING%" setparams="sort=Date;limit=10"}% - changes two parameters in the existing query string
%URL{"Web.Topic" params="%QUERYSTRING%" setparams="sort="}% - voids a parameter
%URL{"Web.Topic" params=""}% - voids all parameters
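
One possible reading of how params and setparams combine, as a small Python sketch. The function name build_query is hypothetical, no URL encoding is attempted, and the merge rules are inferred only from the examples above: params supplies the base query string, setparams overrides single parameters, and an empty value voids a parameter.

```python
def build_query(params: str = "", setparams: str = "") -> str:
    """Merge a base query string with per-parameter overrides."""
    merged = {}  # insertion-ordered, so untouched parameters keep their place
    for source in (params, setparams):
        for pair in filter(None, source.split(";")):
            key, _, value = pair.partition("=")
            if value:
                merged[key] = value
            else:
                # 'sort=' with no value voids the parameter.
                merged.pop(key, None)
    return ";".join(f"{key}={value}" for key, value in merged.items())
```

For example, build_query("sort=Author;limit=20", "sort=Date") gives sort=Date;limit=20, and build_query("sort=Author;limit=20", "sort=") gives limit=20.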

Link label:
%URL{"Web.Topic" label="form.TopicTitle.value"}% - creates a link label from a form field value
%URL{"Web.Topic" label="form.TopicTitle"}% - identical, shorthand
%URL{"Web.Topic" label="TopicTitle"}% - literal "TopicTitle" as label text

-- ArthurClemens - 16 Nov 2008 - 15:08

I'm not a great fan of .txt either, but it does solve a massive problem. The rationale is not just to make it easier for the parser. There has to be a way to express the difference between a web and a topic.

I agree that the ideal would be a syntax that is consistent with IF and query search. I'm not sure that's possible, though. The trouble is that that syntax is context-dependent (it has to find out if the leaf is a web or a topic). This leads to ambiguities all over the shop. An in-your-face example is:
   %IF{"'Webname.Blah' allows 'view'"....
where Blah exists as a subweb and a topic (at the moment, the topic wins, making the web invisible :-().

One approach is to introduce the concept of a "disambiguation operator" - in the way %X and $X and @X are different variables in perl. e.g. you could write %IF{"'Webname.Blah/' allows 'view'".... to indicate that you mean the web "Webname.Blah" rather than the topic (well, you could if we didn't already have a / operator). This is in fact exactly what I did in Filesys::Virtual, except I used the ".txt" postfix operator to indicate a topic, and the "_files" postfix operator to indicate the collection of attachments.

When we associate more meta-data with collections (and we will) this can only get worse.

-- CrawfordCurrie - 16 Nov 2008 - 16:48

How about / between webs and subwebs, . before topic name (so the common case of Web.Topic matches what most users will expect), and a trailing : separator to designate properties? A selector could be used to specify different kinds of properties.
Web.Topic
Web/Subweb
Web/Subweb.Topic
Web/Subweb.Topic:ATTACHMENT:logo.png
Web/Subweb.Topic:FIELD:field_name
Web/Subweb.Topic:field_name           shorthand for lower/mixed case field names

-- IsaacLin - 16 Nov 2008 - 18:05

I like the original proposal using : to address attachments. Making it another slash introduces ambiguity with sub(sub)webs. Encoding even more semantics into dot versus slash, as Isaac proposed, will most probably make things worse as both are used interchangeably in all contexts (plugins etc). Displaying a fully qualified WikiWord like Web/SubWeb.WikiWord does not look good either. I'd prefer to make it all dots within (sub)webs and a colon to address an attachment to that topic.

Wrt %URL{}% , I like it very much. Comes in handy. But that's mostly it.

However, its potential to ease rewriting WikiWords (during topic/web rename) is most probably only temporary, maybe even shortsighted. Renaming WikiWords needs a complete redesign based on indexing WikiWord occurrences automatically and thus eliminating the need to do an over-expensive search&replace operation. That does not scale at all.

Besides, a WikiWord occurrence index would serve a lot of other purposes as well, e.g. quick backlinks, ordering search results by importance.

Apropos backlinks: that's implemented separately from the search used in rename, returning quite different search results. There's lots of bugs in renaming anyway, e.g. corrupting wiki apps when WikiWords in formfields get renamed and so on.

So justifying anything because it would help rename is probably no solid foundation.

-- MichaelDaum - 17 Nov 2008 - 07:39

Isaac is just trying to address the problem I described, which the colon just doesn't help with, of disambiguating between a subweb and a topic with the same name. Note that in the original design of hierarchical subwebs this problem was never foreseen (see http://twiki.org/cgi-bin/view/Codev/HierarchicallyNestedTwikiWebs) and the eventual choice was down to "preference". Indeed, the choice was never consciously made and in many places '/' and '.' are interchangeable.

I don't see any mention of rewriting wikiwords in the original proposal - I assume you meant during topic/web rename?

I think %URL is a lot more than just "handy", it's really important in decoupling logical and physical hierarchies.

-- CrawfordCurrie - 17 Nov 2008 - 08:31

I see unfinished discussion on the syntax but sense agreement on the principle.

We can deprecate the PUBURL and SCRIPTURL but in reality we can NEVER remove them. They are used too much in real life. But we can strongly recommend the new syntax in documentation and for sure the original syntax suggested by Sven is easier for the user to use and remember.

I put Crawford as concern raiser to stop the 14 day clock but once you agree on syntax I assume this can pass with consensus and no need for voting.

The target release would be 1.1.

-- KennethLavrsen - 03 Dec 2008 - 08:13

We're fighting inertia again. The bottom line is: it is impossible to disambiguate a topic and a subweb of the same name using the '.' syntax. So far there have been a couple of proposals for disambiguating them, but no-one is particularly excited by any of them. Some IMHO observations:
  • Inertia means users are used to Web.Topic. However they are not so accustomed to Web.Subweb.Topic
  • We should aim to be consistent with the search query language, if possible.
  • The '/' operator is already defined in the query language, as a sort of perl ->.
  • So is the '.' operator, as a sort of C field accessor.
If we think of topics as fields of webs, then it becomes natural to write:
  • Web.Subweb.Subsubweb.Topic.size
This is an unambiguous reference to a topic meta-data (size). The problem comes where there are meta-data in webs. Say we have a subweb called Web.Subweb.Subsubweb.Topic, and it has a meta-attribute size. What does the field specifier refer to now? The web meta-data, or the topic meta-data?

We can have a rule that says "if there is a reference that could be ambiguous between a topic and a web, the topic always wins" but we then face the problem of referencing the web meta-data. If we go back to the idea of the disambiguation operator, I think we can keep the current query syntax intact. Let's say I have the disambiguation operator '~' that means 'the following word always refers to a web'. We would then write:
  • Web.Subweb.Subsubweb.Topic.size to get at the topic size and
  • Web.Subweb.Subsubweb.~Topic.size to get at the web size.
OK, back to %URL. Since attachments is a field of a topic, using the current query syntax:
  • %URL{"Web.Subweb.Subsubweb.Topic.attachments[name='Somefile.xls']"}%
This is unambiguous, because if we meant the web 'Topic' we would have written '~Topic'. Note that all operators still mean what they have always meant in the query language. .attachments[name='Somefile.xls'] is rather clunky. If you introduced a shorthand operator to the query language ';' meaning 'attachment called' you could simplify that to:
  • %URL{"Web.Subweb.Subsubweb.Topic;Somefile.xls"}%
URL just became shorthand for something like this (current syntax):
  • %SEARCH{"Web.Subweb.Topic.attachments[name='Somefile.xls']" type="query" nonoise="on" format="$percntPUBURL$percnt/$web/$topic/Somefile.xls"}%
Damn, there is no way in the SEARCH format specifier to refer to "the result of the query that matched this topic". If there was, you could:
  • %SEARCH{"Web.Subweb.Topic.attachments[name='Somefile.xls'].size" type="query" nonoise="on" format="$value"}%
from an XMLHttpRequest.

Or, we could just do things The TWiki® Way:
  • %URL{web="Web.Subweb.SubSubweb" topic="Topic" attachment="Somefile.xls"}%
This might ultimately be the most expedient, but it doesn't get us any closer to a content access syntax :-(

-- CrawfordCurrie - 03 Dec 2008 - 08:29

parking. my simple idea got so completely bike shedded that i found other things to do. Hopefully someone will find the discussion useful at a later date.

-- SvenDowideit - 06 Mar 2010

http://trunk.foswiki.org/System/PerlDoc?module=Foswiki::Address correctly parses Web/Topic/Attachment paths, and even has "does it exist as a web/topic/attachment" heuristics so that it does what you mean, 99% of the time. It also supports the @rev notation suggested in LoadDifferentTopicVersions.

I'll have to add it to SemanticLinksPlugin so we can test it in practice. This should allow backlinks to attachments to work quite well (SemanticLinksPlugin already helps backlinking by caching a topic link into a META:LINK datum).

On the other hand, it does bother me that a link might resolve to different things as webs/topics pop in and out of existence (when exist-hinting heuristics are enabled) - this becomes very important for context-sensitive (i.e. not fully-qualified) links which omit the web part (or even topic part, if you're linking to an attachment on the topic where it's written, e.g. [[Attachment.pdf]]).

This proposal is not yet finished. Outstanding issues:
  • Design/delegation of "URL rendering API"
  • Foswiki::Address only deals with resources, and won't help build an edit vs view link. Do we need to geek up the [[link:edit]] syntax?
    • TODO: how do other platforms do it (review the list of wiki/cms software I researched for the @rev notation in LoadDifferentTopicVersions).
  • Foswiki::Address parser represents my best efforts at recycling the existing topic/attachment path notation. Sadly, in this form, non-fully-qualified paths (well over 90% of links) will probably always need "exist hinting" enabled (read: performance overhead), to determine if an attachment or a topic was linked. And this also means that such links may not always point to the same thing (as attachments/topics pop in and out of existence, the hinting heuristics might resolve to a different resource).
    • Is this fuzziness acceptable, or
    • Is it acceptable to introduce new syntax to solve this problem?
    • IMHO both of these approaches, neither of which is perfect, are nonetheless infinitely better than the %ATTACHURL% and %PUBURL% mess!
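
To make the fuzziness concrete, here is a toy Python sketch of exist hinting for a context-sensitive link. It is not the Foswiki::Address heuristic: the Store class, the resolve function and in particular the "topic wins over attachment" order are assumptions chosen for illustration.

```python
from typing import Optional, Set, Tuple


class Store:
    """Toy stand-in for the Foswiki store: just two sets of names."""

    def __init__(self) -> None:
        self.topics: Set[Tuple[str, str]] = set()
        self.attachments: Set[Tuple[str, str, str]] = set()


def resolve(link: str, web: str, topic: str, store: Store,
            is_a: Optional[str] = None) -> Optional[str]:
    """Decide what a bare link written on web.topic refers to."""
    if is_a is not None:
        # The caller said what it is: no store lookups are needed.
        return is_a
    # Exist hinting: ask the store. The order below is an assumption.
    if (web, link) in store.topics:
        return "topic"
    if (web, topic, link) in store.attachments:
        return "attachment"
    return None
```

If Main.Home has an attachment called Budget, a bare Budget link written there resolves to the attachment. As soon as someone creates the topic Main.Budget, the same link text resolves to the topic instead, unless a type hint is supplied. That change of target, with no edit to the linking topic, is the behaviour the two questions above are about.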

-- PaulHarvey - 13 Jun 2011

The only strong feeling that I have, is that I didn't like the idea of a %URL{...}% macro (as per the old proposal). We should be able to make [[bracketed links]] work for us.

-- PaulHarvey - 13 Jun 2011

imo, %URL{}% is the same thing as [[bracketed links]] - the spelling of the thing doesn't matter to me much :). I do prefer to use one 'style' that utilises named parameters, and %URL{}% had the advantage that I could leverage the pre-existing Component Edit parser&generator - but converting from one to the other is trivial - so go, go, go.

-- SvenDowideit - 13 Jun 2011

+1 on [[bracketed links]], and I must make the time to read the Foswiki::Address code - I'm really impressed that it can parse an attachment URL in a contextless way. I have a few evil unit tests for it :-)

I raised my concern, as this seems to be coming together now.

-- CrawfordCurrie - 13 Jun 2011

Bracket links and %URL{}% are not equivalent and there are use cases where you'd need a %URL and a bracket link won't work. That's because a %URL can have additional parameters whereas bracket links cannot. Example: there's a need to get different URLs to one and the same attachment:

  1. the current (or some earlier) version as stored in the file system
  2. the same file but stored in the cloud, thus served to the browser via the cloud/CDN and not thru the original foswiki server
  3. a link to the location where to upload a new version
  4. the thumbnail of the attachment
  5. a link to a resized version of an image or processed in any other way
  6. an icon for the mime type of an attachment
  7. a link to the streaming server for that attachment
  8. a link to the webdav server for that attachment, thus being able to open it up in office directly
  9. a link to a preview rendition, e.g. embedding a pdf into a webpage
  10. a link to a cached version of the concatenation of a set of attachments (%URL{"reset.css, grid.css, forms.css, typography.css" cache="on" compress="on" minify="on"}%)

Just a few ideas on "urls" broadening the view a bit inspired by CMIS. Using an %URL would come in handy, whereas using bracket links will only work out for a subset I guess.

Both bracket links as well as %URL are probably directed towards different kinds of users. Bracket links are much more user friendly and are probably mostly used by content authors of wiki content, whereas an %URL notation and the flexibility of it lends more towards wiki admins and developers taking care of the general infrastructure. Not that content authors would find an %URL unacceptable. Some might actually prefer it for one or the other reason.

-- MichaelDaum - 13 Jun 2011

Agreed. I'm still not completely sure what a %URL macro would look like. Would be great if we can make it consistent with the %ADDRESSPART macro required for TopicAddressing (the Foswiki::Address proposal), and it'd also be cool if we had a $url() token equivalent in FormattedSearch.

Also, I was hoping there would be a way that we can make an API that would allow [[bracketed]] links to automatically link to "cloud storage" or webdav resource.

-- PaulHarvey - 13 Jun 2011

MichaelDaum pointed out on #foswiki - if %URL is used for dynamic resolution of the best location for an attachment (cloud storage, distributed content, etc), then there is also an interaction issue with the PageCache. It would cache and return the rendered html link, which might not be valid or optimum as attachments are moved into alternate storage locations.

Do %URL macros need to be considered "dirty" from a cache perspective? Will changes that impact URLs have to go back in and invalidate cache locations? How does that work with changes external to Foswiki (e.g. a cloud storage location becomes unreachable)? See EnableCloudStorageForAttachments.

-- GeorgeClark - 13 Jun 2011

Any cache entry of a page with a %URL macro (or bracket link) that now expands to something else simply has to be purged from the cache: a simple API call, though an important one at that point of the code flow.
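
A rough Python sketch of that purge, assuming the cache records which addresses each rendered page linked to. The class and method names are hypothetical and are not the Foswiki PageCache API. The sketch also does not cover changes that happen outside Foswiki, since something still has to call url_changed.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


class PageCache:
    """Toy page cache with a reverse index from address to pages."""

    def __init__(self) -> None:
        self._html: Dict[str, str] = {}
        self._pages_by_address: Dict[str, Set[str]] = defaultdict(set)

    def store(self, page: str, html: str, addresses: Iterable[str]) -> None:
        """Cache rendered html and remember every address it linked to."""
        self._html[page] = html
        for address in addresses:
            self._pages_by_address[address].add(page)

    def get(self, page: str) -> Optional[str]:
        return self._html.get(page)

    def url_changed(self, address: str) -> Set[str]:
        """Purge every cached page whose links to address are now stale."""
        stale = self._pages_by_address.pop(address, set())
        for page in stale:
            self._html.pop(page, None)
        return stale
```

Whatever moves an attachment (for example into cloud storage) would call url_changed with the attachment's address, and the affected pages are re-rendered on their next view.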

-- MichaelDaum - 14 Jun 2011

Changing to Parked. Needs developer to adopt.

-- GeorgeClark - 19 Nov 2015

 
Topic revision: r22 - 19 Nov 2015, GeorgeClark