Item1917: Decouple store from Foswiki

pencil
Priority: Enhancement
Current State: Closed
Released In: 1.1.0
Target Release: minor
Applies To: Engine
Component:
Branches:
Reported By: CrawfordCurrie
Waiting For:
Last Change By: KennethLavrsen
Sounds drastic, doesn't it?

Well, it isn't. The goal is to remove references to the session from the Store. This increases the encapsulation of the store, and also releases the session from having to worry about how it is implemented. There is one outstanding problem; because some query algorithms require knowledge of the session, it has to be passed down to them. Next step in the refactor has to be to encapsulate these algorithms in objects, so that dependency can also be eliminated.

Note that this refactor means the NativeSearch won't work, but from reading the code i believe that's already the case for trunk.

-- CrawfordCurrie - 11 Aug 2009

I am working towards getting the algo's into registered query 'type' or 'scope' handlers - so that the old search algo would register for 'regex' and 'word' etc

yes, native search has fallen behind - though i think it'll be easier once we're closer to the registration

commiting some cleanups, while I mull over the impact these changes has on my search and db work.

-- SvenDowideit - 12 Aug 2009

I asked Sven some Qs on IRC, which he answered and asked mor on email. So the discussion doesn't get lost, here are my Qs and his As and his Qs and my As:

> [23:27:19] SvenDowideit: what was your intent behind defining the
> 'query(...)' method in the search algorithms?
> [23:27:42] were you just trying to normalise the calling
> interface between query searches and text searches?
> [23:28:10] if so, why did you leave the core calling the ::search
> method? Just "work in progress"?
> [23:29:42] the reason I ask is I just eliminated ->{session} from
> Store.pm and Store/*.pm; but it still has to be passed down to the
> search algorithm
>
> interesting - i'm reading through the diff atm, will have to do more reading
>
> did you benchmark these changes?

No. I looked through the code and realised that the session was only actually used in a limited number of places:

  1. The search and query algos, where it was masquerading as a store object
  2. When auto-attaching pub files
  3. For logging (in Store, not passed to the VC:: classes)
  4. When converting the default user login to a cUID (for default revision info on a non-existing topic)

None of these sites is performance-critical. I added a method to Foswiki::Meta to eliminate (2), passed in the logger for (3). The changes to BaseUserMapping were simply to expose the default user login for (4).

> yes, I was normalising the calling interface
> ::search - i left it as a legacy support, as there were(ok, i think
> are) some existing uses that need preserving or at best deprecating.
> query is the new API for all search algo types - the differentiation you
> made based on directory is too limiting - I want to be able to register
> type=tag or type=attachment

OK, I read a bit more code after my questions and concluded that. I also concluded it was premature of me to push knowledge of that interface any higher; though that is now the main unknown in the Store interface that make plugging in a new store a bit.... questionable.

> ****oh, did you write the SEARCH unit tests against foswiki 1.0? as
> trunk's SEARCH has bugs - last time I had to re-write quite a few unit
> tests when i realised that they were written against trunk, so didn't
> show what I/we thought they did :/
The new ones? No, against trunk. > good thing i havn't been working on porting my multiple stores to trunk,
> seeing as you keep changing it (whining, but in a positive way) - I had
> a hard enough time just working out what the VC stuff's was about (but
> then, you never really groked the design TJ made, so its fair smile )

I understand his design, I just never liked it. For me, it failed to encapsulate embedded meta-data, which is an ultimate sin in my mind, and something that has caused me endless grief over the years.

> ----++ StaticMethod query( $query, $web, $inputTopicSet, $store, $options ) -> $infoCache
> +---++ StaticMethod query( $query, $web, $inputTopicSet, $session, $options ) -> $infoCache
>
> it'll take a while for me to work out if this is inline with my desires smile

Sure - I intended "$session" to be a placeholder for "whatever the search algo needs - the Store won't make any assumptions about what it is".

> does
>
> our @ISA = ('Foswiki::Store::VC::Store');
>
> -# This constructor is required to hide the =Foswiki::Store::VC::RcsLiteHandler=
> -# class during construction, so that the core can simply say "give me a new
> -# XXX store" without having to pass additional parameters.
> -sub new {
> - my ( $class, $session ) = @_;
> - return $class->SUPER::new( $session, 'Foswiki::Store::VC::RcsLiteHandler' );
> +use Foswiki::Store::VC::RcsLiteHandler ();
> +
> +sub getHandler {
> + my ( $this, $web, $topic, $attachment ) = @_;
> + return new Foswiki::Store::VC::RcsLiteHandler( $web, $topic, $attachment );
>
> mean that the store looks and works like the mapping code? if so, maybe we should make it identical

Erm, depends on what you mean. I made that change so I could SELECTCLASS Foswiki::Store::* in configure. By eliminating the {IMPL} from VC::Store it made the external interface to that class much simpler too.

> but this brings up the trouble with the query change - if we have more than one store impl active at one time, we'll need different query algo's depending on what is optimal for a particular store.

Agreed. "Foswiki major" (the stuff outside the store) knows nothing about the query/search algo. The algo is only known about in 2 places - configure, and in the lowest levels of the store impl, where it is used. So a store can treat the "advice" about {Store}{SearchAlgorithm} and {Store}{QueryAlgorithm} with the same disdain as {RCS}{...} - it can choose not to use it.

> ala the TagsStore, tuid's will be useful, and will hurt more people's brains.
>
> mmm
>
>
> sub getRevisionInfo {
> my ($this) = @_;
> -
> + require Foswiki::Users::BaseUserMapping;
> }
>
> feels like an un-improvement - now the other wise independant store needs a mapping?

No; all it needs is $Foswiki::Users::BaseUserMapping::DEFAULT_USER_CUID. Unfortunately you have to require the class to get access to this (I tried it without, and the unit tests failed).

It might be better to abstract $Foswiki::Users::BaseUserMapping::DEFAULT_USER_CUID out of Foswiki::Users::BaseUserMapping. Either a Foswiki::Users::Constants class, or even pushing it to %Foswiki::cfg

>
> - my ( $this, $searchString, $web, $inputTopicSet, $store, $options ) = @_;
> + my ( $this, $searchString, $web, $inputTopicSet, $session, $options ) = @_;
>
>
> huh? i thought you removed Session, instead you've replaced knowing the store ($this) with session?

No, as I said above, $session is a placeholder for whatever you need to pass in to the algo. The store doesn't know, and doesn't care, what $session actually is.

> as i seem to be getting a little more time now (though in small chunks) I think you and I need to make a decision (one that i've been trying not to make):
>
> should we limit each foswiki instance to using one store only - thus simplifying the query type and scope registration, or continue with my hope to have more than one store - thus requiring a store-web register and a query-store-web-type-crap-crap-crap mapping

That's a philosophical question I find hard to answer. I can imagine scenarios in which multiple store impls are important. My prime example is: personal wiki on a laptop, where I "import" some webs from the corporate wiki. My local webs are all stored in my local MySQL, but the "corporate" webs are read-only cached in local files and synched when I reconnect to the corporate network i.e. use a different store impl. So I want to mix store impls, at least at the "root webs" level - I don't think we need to mix at topic, attachment or subweb level, however.

> http://svn.foswiki.org/trunk/TagsPlugin/lib/Foswiki/Store/TagsStore.pm
>
> shows how I hacked around it in twiki/foswiki1.0, and that plugin incidentally contains one of the initial dbschema's I was using for a fast scalable dbstore - though in another I added a topic table that contained the META::TOPICINFO, and had the history for each topic in a table containing the txt serialisation

OK, I'll take a butchers. I have a rough sketch of a DBI store up-and-running, but it doesn't do versioning yet, and I'm still debating how to do attachments. Storing large binary data in the DB always feels like overkill to me, so I'm not sure yet how to do attachments.

I really want a simple DBI store framework in place for 1.1, to afford scope for all the optimisations - indexes, caches etc. - that other people with move DB experience are better suited to address. So I'm aiming at a really simple schema that basically maps meta-data + text 1:1 to the schema (you may have noticed my recent change to filter "non-compliant" meta-data from the Meta object during load) and does a simple mapping of the Query class to SQL.

-- CrawfordCurrie - 12 Aug 2009

Added XML generation to further experimentation with dbxml.

-- CrawfordCurrie - 29 Nov 2009

Do you sucessfully work with Sleepycat's/Oracle's dbxml under perl? Last time I worked with dbxml 2.4.13 I had massive deadlock issue when I simultaneously accessed the db (with or without transaction support). According to this article (scroll down to the second last) there is a bug in the perl binding, which has never been fixed.

-- OliverKrueger - 29 Nov 2009

I hadn't heard about that - many thanks for the heads up. I haven't encountered issues yet, though that may be because I haven't tried to make multiple parallel accesses.

-- CrawfordCurrie - 30 Nov 2009

Crawford, please let's discuss the new xml schema in Development before casting the current one into stone. Remember, only a nice schema makes nice xpath queries ... as far as they could be called so at all.

-- MichaelDaum - 30 Nov 2009

OK, well, regard my checkin as a "first pass proposal" then. I'm well aware of the shortcomings of the %META-implied schema, and am happy to discuss better.

-- CrawfordCurrie - 30 Nov 2009

Sure thing. Code rulez.

-- MichaelDaum - 30 Nov 2009

Removed the code from the core again, as it sits more happily in a plugin at this stage.

-- CrawfordCurrie - 16 Dec 2009

OK, so far as it goes, I think this is releaseable (subject to final testing, of course).

-- CrawfordCurrie - 23 Jun 2010
Topic revision: r24 - 04 Oct 2010, KennethLavrsen
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere. See Copyright Statement. Creative Commons License    Legal Imprint    Privacy Policy