Content Access Syntax
Introduction
TopicObjectModel,
AutomaticLinkLabelBasedOnHeading,
SearchingSectionsOfTheTopics,
NamedIncludeSections,
FormattedTWikiFormDataInTopicText,
ReinventingTWiki
The above links contain much of the background that has driven this topic; the contributors to those topics are hereby acknowledged and thanked. Of especial note is RaymondLutz in TopicObjectModel.
All of the above topics discuss or allude to the idea that we should rationalise the way content is stored and accessed, both from a user aspect and from a code aspect.
This topic is focused on the user aspect, and discusses the syntax that might be used to reference meta-data fields. Such a syntax doesn't require big changes to the underlying TWiki engine to be implemented, though implementation is discussed in ContentAccessImplementation.
Meta-data fields are fields within %META tags in the topic, and information from the version control system. So the following needs to be recovered:
- attachments
   - for each attachment
      - name
      - attr
      - comment
      - path
      - size
      - user
      - version
         - rev
         - date
         - comment (empty at the moment)
         - user
- parent
- topicinfo
   - author
   - date
   - format
   - version
- topicmoved
- form
   - for each field in form
      - name
      - value
- version
   - rev
   - date
   - comment (empty at the moment)
   - user
- text (raw text of the topic)
- heading
   - for each heading in text
      - level
      - text
- table
   - for each table in text
      - row
         - for each row in table
            - cell
- list
   - for each bullet list in text
      - text
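As a rough illustration of the field list above, one topic's meta-data could be pictured as a nested structure. The sketch below is only an assumption about shape: every value is invented, the `by` subfield of `topicmoved` is borrowed from the examples later in this topic, and the path notation printed by the hypothetical `walk` helper is chosen purely for display.

```python
# Hypothetical in-memory picture of one topic's meta-data.
# All names and values are invented for illustration only.
version = {"rev": "1.1", "date": "2004-08-05", "comment": "", "user": "ExampleUser"}

topic = {
    "attachments": [
        {
            "name": "diagram.gif",
            "attr": "",
            "comment": "Architecture diagram",
            "path": "diagram.gif",
            "size": 7200,
            "user": "ExampleUser",
            "version": dict(version),
        }
    ],
    "parent": "WebHome",
    "topicinfo": {"author": "ExampleUser", "date": "2004-08-05",
                  "format": "1.0", "version": "1.3"},
    "topicmoved": {"by": "ExampleUser"},  # subfield assumed from later examples
    "form": [{"name": "Status", "value": "Open"}],
    "version": dict(version),
    "text": "---+ Example heading\n   * a bullet",
    "heading": [{"level": 1, "text": "Example heading"}],
    "table": [{"row": [{"cell": ["a", "b"]}]}],
    "list": [{"text": "a bullet"}],
}


def walk(node, prefix=""):
    """Yield (path, value) for every leaf in the nested structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from walk(value, f"{prefix}[{index}]")
    else:
        yield prefix, node


if __name__ == "__main__":
    for path, value in walk(topic):
        print(path, "=", repr(value))
```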
PeterThoeny has suggested that the first heading in a topic is also meta-information, in that it is useful to present as the link to a topic (see AutomaticLinkLabelBasedOnHeading). This can be added quite cleanly as a firstHeading field.
Of course, most of these fields can currently be accessed using %TAG style tags, though the syntax and semantics are somewhat variable. There are three good end-user reasons for rationalising these into a single consistent syntax:
- Consistency. A single consistent syntax is easier for users to remember.
- Clarity. %VARIABLE syntax has often been criticised as hard to read; a single consistent syntax can be much easier on the eye.
- Future proofing. As shown by the adding of firstHeading above, new fields can easily be incorporated into a consistent syntax without requiring the constant invention of new syntaxes.
(Another good reason is performance, but this topic is about syntax, so we won't say any more about that.)
Note that none of the following proposes the instant removal of existing syntaxes for field accesses. Instead, their gradual deprecation and phasing out can take place over time.
Proposal 1

Note: Proposal 1 is not an option anymore because of the web hierarchy feature.
Note that the field structure described above is the schema for a simple database - the TWiki database. So, a syntax that is reminiscent of standard database access methods is strongly indicated. We already have the roots of this syntax in the Web/Topic relationship; a topic is a field in a web, and Web.Topic is its specifier. The DBCacheContrib already recognises and implements this kind of field access, so the first proposal is that we use the DBCacheContrib C/C++/Perl/Java based syntax:
- Web.TopicName.topicmoved.by
- TopicName.version.rev
- [[TopicName][TopicName.firstHeading]]
Of course this syntax presents a number of problems to the parser, but that's a technical problem that can always be solved.
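To make the dotted form concrete, here is a minimal sketch of how such a path might be resolved against nested data. It is not how DBCacheContrib works internally; the `webs` structure, the `resolve` function, and the default-web rule are all assumptions made for illustration.

```python
# Hypothetical data: webs contain topics, topics contain meta-data fields.
webs = {
    "Web": {
        "TopicName": {
            "topicmoved": {"by": "ExampleUser"},
            "version": {"rev": "1.2"},
            "firstHeading": "Content Access Syntax",
        }
    }
}


def resolve(path, root, default_web="Web"):
    """Follow a dotted path such as Web.TopicName.version.rev."""
    parts = path.split(".")
    # Assumption: a path that does not start with a known web
    # is relative to the default web.
    if parts[0] not in root:
        parts.insert(0, default_web)
    node = root
    for part in parts:
        node = node[part]  # raises KeyError for an unknown field
    return node


if __name__ == "__main__":
    print(resolve("Web.TopicName.topicmoved.by", webs))
    print(resolve("TopicName.version.rev", webs))
```

Note that the sketch relies on "." meaning exactly one thing at each step. Once webs can nest, a path such as Web.Sub.TopicName.version no longer splits unambiguously, which is the problem noted for this proposal.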
An issue with any syntax is the representation of arrays. The DBCacheContrib solves this problem using square brackets:
- Web.TopicName.attachments[0]
DBCacheContrib also offers some powerful extension syntax for extracting subfields, such as:
- Web.TopicName.attachments[*name]
which would return a space-delimited list of attachment names. To handle arrays successfully you also need to embed searches, such as:
- Web.TopicName.attachments[?size>5000][*name]
which would return a space-delimited list of the names of attachments of size > 5000. And
- Web.TopicName.attachments[?size>5000][?name=~'*.gif'][*name]
would return a space-delimited list of gif files larger than 5K.
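The behaviour described for the bracket expressions can be sketched as follows. This is an illustration of the semantics as stated in the text, not DBCacheContrib's implementation: the function names are hypothetical, and treating the `=~` pattern as a shell-style glob is an assumption based on the `'*.gif'` example.

```python
import re
from fnmatch import fnmatchcase

BRACKET = re.compile(r"\[([^\]]*)\]")
CONDITION = re.compile(r"\?(\w+)\s*(=~|>|<|=)\s*(.+)")


def _matches(value, op, raw):
    """Test one field value against one condition."""
    if op == "=~":
        # Assumption: the pattern is a glob, as '*.gif' suggests.
        return fnmatchcase(str(value), raw.strip("'\""))
    if op == ">":
        return value > float(raw)
    if op == "<":
        return value < float(raw)
    return str(value) == raw.strip("'\"")


def apply_brackets(items, expr):
    """Apply a chain such as [?size>5000][*name] to a list of dicts."""
    result = items
    for body in BRACKET.findall(expr):
        if body.startswith("?"):
            field, op, raw = CONDITION.fullmatch(body).groups()
            result = [item for item in result if _matches(item[field], op, raw)]
        elif body.startswith("*"):
            # Extract one subfield and flatten to a space-delimited string.
            return " ".join(str(item[body[1:]]) for item in result)
        else:
            result = result[int(body)]  # plain array index
    return result


# Invented sample data.
attachments = [
    {"name": "logo.gif", "size": 7200},
    {"name": "notes.txt", "size": 9000},
    {"name": "icon.gif", "size": 300},
]

if __name__ == "__main__":
    print(apply_brackets(attachments, "[?size>5000][?name=~'*.gif'][*name]"))
```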
All bracketing in DBCacheContrib is done using square brackets.
While DBCacheContrib currently flattens out lists using space-separation, it would be interesting to consider how formatting might be introduced to mirror the current "standard" for header= and format= parameters to tags.
Proposal 2
A second proposal makes it easier for the parser and also underlines the difference between a topic and a field within a topic. It also uses the DBCacheContrib syntax for field accesses. This is a simple modification of the previous syntax to use a : to separate topic and fields.
- Web.TopicName:topicmoved.by
- TopicName:version.rev
- [[TopicName][TopicName:firstHeading]]
- Web.TopicName:attachments[?size>5000][?name=~'*.gif'][*name]
- this:attachments[0].name
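The parsing benefit of the ":" separator can be sketched briefly. The function below is hypothetical, and two points are assumptions rather than anything stated above: that `this` refers to the topic currently being rendered, and that a topic given without a web belongs to the current web.

```python
def split_reference(ref, current_web="Main", current_topic="WebHome"):
    """Split 'Web.TopicName:field.path' into (web, topic, field path)."""
    where, sep, what = ref.partition(":")
    if not sep:
        raise ValueError(f"no ':' separator in {ref!r}")
    if where == "this":
        # Assumption: 'this' means the topic being rendered.
        web, topic = current_web, current_topic
    elif "." in where:
        # Everything before the last '.' is the web, so nested webs
        # do not interfere with the field path.
        web, topic = where.rsplit(".", 1)
    else:
        web, topic = current_web, where
    return web, topic, what


if __name__ == "__main__":
    print(split_reference("Web.TopicName:topicmoved.by"))
    print(split_reference("this:attachments[0].name"))
```

Because the field path starts only after the colon, a reference such as Web.Sub.TopicName:parent splits cleanly, which is not the case with the purely dotted form.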
Proposal 3
The third proposal introduces a curly-brace in place of the ., making it even easier for the parser, in a syntax highly reminiscent of Perl hashes:
- Web.TopicName{topicmoved}{by}
- TopicName{version}{rev}
- [[TopicName][TopicName{firstHeading}]]
- Web.TopicName{attachments}[?size>5000][?name=~'*.gif'][*name]
- this{attachments}[0]{name}
This is perhaps the most technically appealing syntax, because it is easy to parse. From a user perspective, it is also easy to type. The main criticism of it might be that it is not a logical progression from the "." syntax used to access topics in webs, though that could easily be remedied by allowing Web{TopicName} as an alternative to Web.TopicName.
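The claim that the curly-brace form is easy to parse can be illustrated with a small tokeniser. This is a sketch only; the `tokenize` function and its token names are hypothetical, and it makes no attempt to evaluate the tokens it finds.

```python
import re

HEAD = re.compile(r"[\w.]+")                      # Web.TopicName, TopicName or this
TOKEN = re.compile(r"\{(\w+)\}|\[([^\]]*)\]")     # {field} or [bracket expression]


def tokenize(ref):
    """Split a reference into its topic part and a list of accessors."""
    head = HEAD.match(ref)
    if head is None:
        raise ValueError(f"no topic name at start of {ref!r}")
    tokens = []
    pos = head.end()
    while pos < len(ref):
        match = TOKEN.match(ref, pos)
        if match is None:
            raise ValueError(f"unexpected text at position {pos}: {ref[pos:]!r}")
        if match.group(1) is not None:
            tokens.append(("field", match.group(1)))
        else:
            tokens.append(("bracket", match.group(2)))
        pos = match.end()
    return head.group(0), tokens


if __name__ == "__main__":
    print(tokenize("this{attachments}[0]{name}"))
```

Every accessor is delimited on both sides, so the tokeniser never has to decide whether a "." belongs to the web, the topic, or a field.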
Proposal 4
The fourth proposal is to retain the %VARIABLE syntax, probably with some re-use of the "." subfield syntax:
- %FIELD{web="Web" topic="TopicName" field="topicmoved.by"}%
- %FIELD{topic="TopicName" field="version.rev"}%
- [[TopicName][%FIELD{topic="TopicName" field="firstHeading"}%]]
- %FIELD{web="Web" topic="TopicName" field="attachments[?size>5000][?name=~'*.gif'][*name]"}%
While this is the most consistent with the existing syntaxes, IMHO it is the least user-friendly. - CC
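For comparison, the %FIELD form can be picked apart with two regular expressions. %FIELD is only a proposed tag, so the sketch below is not TWiki's tag parser; the function name is hypothetical and it assumes attribute values never contain a double quote.

```python
import re

FIELD_TAG = re.compile(r"%FIELD\{(.*?)\}%")   # the whole tag, body captured
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')    # one name="value" pair


def parse_field_tags(text):
    """Return one dict of attributes per %FIELD{...}% tag found in text."""
    return [dict(ATTRIBUTE.findall(body)) for body in FIELD_TAG.findall(text)]


if __name__ == "__main__":
    sample = '[[TopicName][%FIELD{topic="TopicName" field="firstHeading"}%]]'
    print(parse_field_tags(sample))
```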
Proposal 5
This proposal is similar to proposal 2, with a TOM: (Topic Object Model) prefix. This makes it easier to parse by eye and machine. It is however more typing.
- TOM:Web.TopicName:topicmoved.by
- TOM:TopicName:version.rev
- [[TopicName][TOM:TopicName:firstHeading]]
- TOM:this:attachments[0].name
Proposal 6
The {{where:what}} syntax is easy to parse by eye and machine.
- {{Web.TopicName:topicmoved.by}}
- {{TopicName:version.rev}}
- [[TopicName][{{TopicName:firstHeading}}]]
- {{this:attachments[0].name}}
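A sketch of how {{where:what}} references might be found and expanded in topic text follows. The `expand` function and the `lookup` callback are hypothetical; how a reference is actually resolved is left to the caller.

```python
import re

# where: anything up to the first ':'; what: the rest, up to the closing braces.
REFERENCE = re.compile(r"\{\{([^:{}]+):([^{}]+)\}\}")


def expand(text, lookup):
    """Replace each {{where:what}} with the value returned by lookup."""
    def replace(match):
        where, what = match.group(1), match.group(2)
        return str(lookup(where, what))
    return REFERENCE.sub(replace, text)


if __name__ == "__main__":
    # Invented values standing in for a real meta-data store.
    values = {("TopicName", "firstHeading"): "Content Access Syntax"}
    line = "[[TopicName][{{TopicName:firstHeading}}]]"
    print(expand(line, lambda where, what: values[(where, what)]))
```

The doubled braces give the scanner an unambiguous start and end, so surrounding link markup such as [[...][...]] is left untouched.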
Questions to be answered
Here are some questions that need answering:
- Would it be better to go with a standard syntax, such as CDML?
- How do you represent formatting? For example, I want a table of attachment names and sizes. How about this:
   - TopicName{attachments}[*|$name|$size] (there's got to be something better than that)
   - or %FORMATARRAY{"TopicName{attachments}" header="|*Name*|*Size*|" format="|$name|$size|"}%
   - or
     |*Name*|*Size*|
     |Web.TopicName.attachments[*name]|Web.TopicName.attachments[*size]|
- How do we recover revisions? Is there an argument for a full object-oriented approach? PaulineCheung suggests:
   - Web.TopicName::parent::rev{1.2}::wikiusername
   - this could be extended by plugins providing new methods...
- How long do we have to keep the old syntax around?
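On the formatting question, the header= and format= idea can be sketched as a simple template expansion over an array of records. %FORMATARRAY is only a suggestion in the list above, so `format_array` here is a hypothetical stand-in, and the sample attachments are invented.

```python
import re


def format_array(items, header, fmt):
    """Render one header line, then one formatted line per item."""
    def fill(item):
        # Replace each $field token with that field's value.
        return re.sub(r"\$(\w+)", lambda m: str(item[m.group(1)]), fmt)
    return "\n".join([header] + [fill(item) for item in items])


if __name__ == "__main__":
    attachments = [
        {"name": "logo.gif", "size": 7200},
        {"name": "notes.txt", "size": 9000},
    ]
    print(format_array(attachments, "|*Name*|*Size*|", "|$name|$size|"))
```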
Contributors:
CrawfordCurrie,
PeterThoeny
Another idea is to consider the sections of the topic text either side of INCLUDE statements as separate. The text then consists of an array of text blocks interleaved with references to other topics. This would require the concept of "parameterised references" to support parameterised includes (currently only supported by MacrosPlugin).
--
CrawfordCurrie - 05 Aug 2004
I would like to get a major feature into DakarRelease that makes an impact. This one is a good candidate.
--
PeterThoeny - 10 Jun 2005
We need a syntax for AttachmentRenderings too.
--
MartinCleaver - 18 Jul 2005
I'd suggest that as DakarRelease's focus is performance, and refactoring to enable future releases, this is highly inappropriate at this late stage in development. We have already locked down development to bug fixes and docco only.
As this is a very critical feature to get right, and Crawford and I (both of us really want it) never quite see eye to eye on some of the details, I'd suggest that it will take quite a number of months of work (and design) to make it releasable.
--
SvenDowideit - 19 Jul 2005
I would like to ImplementIdeaBeforeTheCompetition, but this is now a CompetitiveNecessity (see WikiTalk).
--
PeterThoeny - 07 Nov 2005
See also BringTopicVarsIntoCore.
--
ArthurClemens - 07 Nov 2005
A solid access syntax should be defined and implemented in EdinburghRelease. This helps avoid introducing many VARIABLES (see AttachmentCount as one example).
--
PeterThoeny - 27 Feb 2006
Is that the TopicObjectModel and/or a DocumentObjectModel (DOM)?
--
WillNorris - 27 Feb 2006
Ideally the TopicObjectModel and the ContentAccessSyntax should be done at the same time. It can be done in phases as well. A logical approach:
- Define and implement an underlying TopicObjectModel
- Define and implement the ContentAccessSyntax
Alternatively, we could turn this around:
- Define the ContentAccessSyntax and implement a subset of it based on the existing architecture (low hanging fruit)
- Define an underlying TopicObjectModel
- Finish the ContentAccessSyntax implementation, this time based on the TopicObjectModel
--
PeterThoeny - 27 Feb 2006
Actually, I think it is wise to keep them separate. The TOM model employed by code has to be compromised to some extent for efficiency, but the user should only see the "pure" form of the content access syntax; it should look consistent and simple to the user, irrespective of what horrors lurk beneath. ;-) The TopicObjectModel would actually be easier to define if the requirements imposed by the ContentAccessSyntax were known in advance.
On Will's point; the DOM is indeed extremely similar, though the access syntax to the DOM is somewhat arcane. I would prefer something a lot more user-friendly.
Note that proposal 1 has been blown away by subwebs, which have "stolen" the .
--
CrawfordCurrie - 27 Feb 2006
Picking up on the idea again. See updated proposals.
--
PeterThoeny - 08 Jul 2006
Here is another idea: ModifyContentAccessSyntax.
--
PeterThoeny - 08 Jul 2006
This is also needed for specifying that templates are held as attachments - see TemplatePathIsCounterintuitive
--
MartinCleaver - 08 Jul 2006
Food for thought how to approach this: The goal is to strengthen the TWikiApplication platform by creating a powerful and intuitive server side DOM for content access (this topic) and content change (ModifyContentAccessSyntax). On top of that, add an ApplicationFramework to create applications easily. TWiki a better PHP? Borrow ideas from RubyOnRails?
--
PeterThoeny - 16 Nov 2006
We can, of course, go and define all kinds of fancy extensions.
As far as I am concerned, the DBCacheContrib and FormQueryPlugin (in their "YetAnother..." incarnation) go a long way toward addressing my application needs, except that I would like to have a join query, which I will sooner or later get around to doing.
In certain applications I am accessing all the topics that way only.
--
ThomasWeigert - 11 Jan 2007
I'm opening this up again because of recent discussions relating to REST and indexing.
--
CrawfordCurrie - 09 Mar 2009
Added some discussion in TopicObjectModel.
--
PaulHarvey - 06 Nov 2009
Discarded, as it's well past its sell-by date.
--
Main.CrawfordCurrie - 18 Jan 2016 - 18:27