This question about Missing functionality: Answered

Rendering of broken links

In our installation, we create project pages from a template that generates links to documentation (PDFs, for example) that lives on our network. These files are served by our Apache server, for example: http://ourwiki.company.com/v/somedocument.pdf

I would like to find a way to display some sort of special "broken" icon if the PDF file does not actually exist at the link target. Currently you have to click on the link, and then you get redirected to the "link missing" page.

It would be great if there were a way to render the link (prior to clicking it) in a way that showed the destination was missing, similar to the way undefined WikiWords get decorated with "?" automatically.

-- JimParker - 04 Jun 2012

None of the current extensions I could find provide this feature. Checking every external link would add a bit of page load time and server load, so some sort of server-side caching of the link status would most likely be required.

The ExternalLinkPlugin, which marks off-site links with a small icon, could probably be extended to do checking like this. Unfortunately, nothing off the shelf comes to mind.

-- GeorgeClark - 04 Jun 2012
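
The caching idea mentioned above could look roughly like the following. This is only a sketch of the logic: the names `%link_cache` and `cached_link_status` are hypothetical, and the checker is passed in as a code reference so the cache does not depend on how the link is actually tested.

```perl
use strict;
use warnings;

# Hypothetical in-memory cache: URL => [ status, time it was checked ].
my %link_cache;

# Return the cached status for $url if it is younger than $ttl seconds;
# otherwise call $checker->($url), store the result, and return it.
# $now is optional and defaults to the current time.
sub cached_link_status {
    my ( $url, $checker, $ttl, $now ) = @_;
    $now = time unless defined $now;

    my $entry = $link_cache{$url};
    if ( $entry && $now - $entry->[1] < $ttl ) {
        return $entry->[0];
    }

    my $status = $checker->($url) ? 1 : 0;
    $link_cache{$url} = [ $status, $now ];
    return $status;
}
```

Note that a plain hash only lives as long as the Perl process. Under plain CGI every request starts a new process, so a real plugin would probably need to keep the status in a file or database instead.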

OK, good idea. I will see if I can hack that plugin. I only want to do this checking on a handful of links on certain pages, so I might be able to deal with the performance cost.

-- JimParker - 05 Jun 2012

Hmmm... I would need some sort of function I could call from Perl that would tell me whether the target link is valid. If I had that info, I could add code to decorate the external link one way for valid links and another way for invalid links.

Any ideas?

-- JimParker - 05 Jun 2012

The CPAN:LWP library can be used to fetch URLs using Perl. Rather than issuing a GET, it would probably be better to use the HEAD operation so that the complete download is avoided. Foswiki::Net also provides an HTTP read function, but it doesn't expose the HEAD method.

-- GeorgeClark - 08 Jun 2012
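
A minimal sketch of the HEAD check described above, assuming LWP::UserAgent is installed. The function name `link_is_alive` and the 5-second timeout are assumptions for illustration, not part of any existing plugin. The user agent is an optional argument so the function can be exercised without network access.

```perl
use strict;
use warnings;

# Return 1 if a HEAD request to $url succeeds (a 2xx response), else 0.
# $ua is optional; any object with a head() method that returns a
# response supporting is_success() will do.
sub link_is_alive {
    my ( $url, $ua ) = @_;

    unless ($ua) {
        # Load LWP only when no user agent was supplied.
        require LWP::UserAgent;
        $ua = LWP::UserAgent->new( timeout => 5 );    # assumed timeout
    }

    my $response = $ua->head($url);
    return $response->is_success ? 1 : 0;
}
```

It could be called as `link_is_alive('http://ourwiki.company.com/v/somedocument.pdf')`. Some servers refuse HEAD requests, so a failed HEAD does not always mean the file is missing.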

Thanks, good tip. I had already implemented this with a simple call to LWP::Simple::head(). It works fine now. I even added a config field where the user can select the icon to display for a good or bad link.

The next challenge I have is that I now want to be able to selectively turn off the call to "head", as it causes slower page loading (as expected!). It's a challenge for me, the novice, because this is not actually a macro extension but something that gets invoked automatically for all external links. Still debating the approach.

-- JimParker - 11 Jun 2012

One last question: how could I modify this to make it also process links that are directly written into the page without the text format? I am having a tough time decoding the regex that does it now.

-- JimParker - 11 Jun 2012

Matching HTML links can get a bit tricky. There is a regex for finding HTML tags in WysiwygPlugin/TML2HTML.pm that might be helpful:

    $text =~ s/(\<a
         (?:\s+
           (?: href|target|title|class )=                 # Supported attribute
           (?: \'[^\']*\' | \"[^\"]*\" | [^\'\"\s]+ )+    # One or more SQ, DQ or space delimited strings
         )+                                               # One or more attributes - href is required
         \s*\>
         .*?                                              # the link text
         \<\/a\s*\>                                       # closing tag
         )/
         $this->_liftOutLink($1)/geixo;

This pulls out the entire <a>...</a> tag for processing by the _liftOutLink subroutine. It's written in the "x" regex format so that the individual components can be documented.

In thinking about this task, note that this plugin currently uses the commonTagsHandler, which probably runs at the wrong time. %MACROS will not have been completely expanded, so link verification might fail if the links are made up of %MACRO results. You might have to do some of this work in a postRenderingHandler. I'm not sure where the best place to do this would be.

-- GeorgeClark - 12 Jun 2012
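
To illustrate the substitute-with-callback approach, here is a simplified sketch. It is not the TML2HTML.pm regex: it only handles quoted href values, and the function name `decorate_links`, the `$is_alive` callback, and the ' [broken]' text marker (standing in for an icon) are all hypothetical.

```perl
use strict;
use warnings;

# Append a marker after every <a ... href="...">...</a> in $html whose
# target is reported as dead. $is_alive is a code reference that takes
# a URL and returns true or false.
sub decorate_links {
    my ( $html, $is_alive ) = @_;

    $html =~ s{
        ( <a \s [^>]*? href= (["']) (.*?) \2   # opening tag up to the quoted href
          [^>]* >                              # rest of the opening tag
          .*?                                  # the link text
          </a \s* > )                          # closing tag
    }{
        # Copy the captures before calling out, in case the callback
        # runs its own regex matches.
        my ( $link, $url ) = ( $1, $3 );
        $link . ( $is_alive->($url) ? '' : ' [broken]' );
    }gseix;

    return $html;
}
```

The "e" flag makes Perl evaluate the replacement as code for each match, which is what lets the callback decide per link.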

Well, it DOES work with $formfield()s being expanded first. I have a search that results in $formfield extraction, and I use that in a table format statement that generates a link. The link is properly adorned.

The bigger issue I have is that I wanted a raw http:// link in the page to be checked. But if the raw http:// URL isn't bracketed ( text) then the commonTagsHandler() never gets a crack at the URL.

So my regex isn't going to get a chance to work.

-- JimParker - 12 Jun 2012

OK, new update. I DID get this to work... I WAS getting the info I needed in the commonTagsHandler(). I just needed a special regex that would find the links I wanted to test.

-- JimParker - 13 Jun 2012
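
The thread does not show the regex that was actually used. As one hypothetical way to pick http:// and https:// URLs out of raw topic text, whether or not they are bracketed, something like this could serve as a starting point. The function name `find_urls` is an assumption.

```perl
use strict;
use warnings;

# Return every http:// or https:// URL found in $text, in order.
# A URL is taken to end at whitespace, a quote, an angle bracket,
# or a square bracket.
sub find_urls {
    my ($text) = @_;
    my @urls;

    while ( $text =~ m{\b(https?://[^\s<>"'\[\]]+)}gi ) {
        my $url = $1;
        $url =~ s/[.,;:!?)]+$//;    # drop trailing sentence punctuation
        push @urls, $url;
    }
    return @urls;
}
```

A plugin would probably want to filter this list further, for example to only the handful of hosts or file types worth checking.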

Is there any way to check all pages for broken links? The idea is to do this as a periodic maintenance task to make sure all external links are still valid.

-- DonaldFast - 24 Jan 2014

Never mind -- I see that a general site link checker will work. (I picked the wrong one and it was failing.)

-- DonaldFast - 24 Jan 2014
 

QuestionForm

Subject Missing functionality
Extension
Version Foswiki 1.1.4
Status Answered
Related Topics
Topic revision: r11 - 24 Jan 2014, DonaldFast