You are here: Foswiki>Tasks Web>Item13444 (03 Feb 2016, GeorgeClark)Edit Attach

Item13444: Foswiki::Net::getExternalResource does a poor job of parsing URLs

pencil
Priority: Enhancement
Current State: Closed
Released In: 2.1.0
Target Release: minor
Applies To: Engine
Component: FoswikiNet
Branches: Item13444 master Item13897
Reported By: CrawfordCurrie
Waiting For:
Last Change By: GeorgeClark
I happened to notice the twiki guys had modified the URL parsing in getExternalResource. I had a look at what they had done (which is wrong, but I've given up trying to help them) and it struck me how poor our URL parsing is.

This is really lame, especially considering CPAN has URI and URI::URL which do all the heavy lifting for us.

Recommend we add URI and URI::URL to our CPAN dependencies (LWP may already pull them in, I'm not sure).

Marking as Urgent so it gets considered for 1.2.0

-- CrawfordCurrie - 05 Jun 2015

Done, not had time to test, not urgent for beta2

diff --git core/lib/Foswiki.spec core/lib/Foswiki.spec
index 53f0ad9..2b409fa 100644
--- core/lib/Foswiki.spec
+++ core/lib/Foswiki.spec
@@ -932,8 +932,8 @@ $Foswiki::cfg{AccessibleCFG} = [
 # URLs. Only enable it if you are in an environment where a DoS attack is not
 # a high risk.
 #
-# You may also need to configure the proxy settings ({PROXY}{HOST} and
-# {PROXY}{PORT}) if your server is behind a firewall and you allow %INCLUDE of
+# You may also need to configure the proxy setting ({PROXY}{HOST})
+# if your server is behind a firewall and you allow %INCLUDE of
 # external webpages (see Proxies).
 $Foswiki::cfg{INCLUDE}{AllowURLs} = $FALSE;
 
@@ -1046,14 +1046,9 @@ $Foswiki::cfg{AccessibleENV} =
 #              authtype:hostip' **
 # Hostname or address of the proxy server.
 # If your proxy requires authentication, simply put it in the URL, as in:
-# http://username:password@proxy.your.company.
+# http://username:password@proxy.your.company:8080.
 $Foswiki::cfg{PROXY}{HOST} = undef;
 
-# **STRING 30 LABEL="Proxy Port" CHECK='undefok emptyok'**
-# Some environments require outbound HTTP traffic to go through a proxy
-# server. Set the port number here (e.g: 8080).
-$Foswiki::cfg{PROXY}{PORT} = undef;
-
 #---++ Anti-spam
 # Foswiki incorporates some simple anti-spam measures to protect
 # e-mail addresses and control the activities of benign robots, which
diff --git core/lib/Foswiki/Net.pm core/lib/Foswiki/Net.pm
index 691867b..aa76f08 100644
--- core/lib/Foswiki/Net.pm
+++ core/lib/Foswiki/Net.pm
@@ -120,14 +120,11 @@ if (!$response->is_error() && $response->isa('HTTP::Respon
 sub getExternalResource {
     my ( $this, $url, %options ) = @_;
 
-    my $protocol;
-    if ( $url =~ m!^([a-z]+):! ) {
-        $protocol = $1;
-    }
-    else {
-        require Foswiki::Net::HTTPResponse;
-        return new Foswiki::Net::HTTPResponse("Bad URL: $url");
-    }
+    require URI::URL;
+
+    my $uri = URI::URL->new($url);
+    my $proxyHost = $this->{PROXYHOST} || $Foswiki::cfg{PROXY}{HOST};
+    my $puri = $proxyHost ? URI::URL->new($proxyHost) : undef;
 
     # Don't remove $LWPAvailable; it is required to disable LWP when unit
     # testing
@@ -137,16 +134,16 @@ sub getExternalResource {
         $LWPAvailable = ($@) ? 0 : 1;
     }
     if ($LWPAvailable) {
-        return _GETUsingLWP( $this, $url, %options );
+        return _GETUsingLWP( $this, $uri, $puri, %options );
     }
 
     # Fallback mechanism
-    if ( $protocol ne 'http' ) {
-        if ( $protocol eq 'https' && !defined $SSLAvailable ) {
+    if ( $uri->scheme() ne 'http' ) {
+        if ( $uri->scheme() eq 'https' && !defined $SSLAvailable ) {
             eval 'require IO::Socket::SSL';
             $SSLAvailable = $@ ? 0 : 1;
         }
-        unless ( $protocol eq 'https' && $SSLAvailable ) {
+        unless ( $uri->scheme() eq 'https' && $SSLAvailable ) {
             require Foswiki::Net::HTTPResponse;
             return new Foswiki::Net::HTTPResponse(
                 "LWP not available for handling protocol: $url");
@@ -157,16 +154,6 @@ sub getExternalResource {
     my $response;
 
     try {
-        $url =~ s!^\w+://!!;    # remove protocol
-        my ( $user, $pass );
-        if ( $url =~ s!([^/\@:]+)(?::([^/\@:]+))?@!! ) {
-            ( $user, $pass ) = ( $1, $2 || '' );
-        }
-
-        unless ( $url =~ s!([^:/]+)(?::([0-9]+))?!! ) {
-            die "Bad URL: $url";
-        }
-        my ( $host, $port ) = ( $1, $2 || ( $protocol eq 'https' ? 443 : 80 ) )
 
         my $sclass;
         eval {
@@ -178,72 +165,61 @@ sub getExternalResource {
             $sclass = 'IO::Socket::INET';
         }
         my @ssloptions;
-        if ( $protocol eq 'https' ) {
+        if ( $uri->scheme() eq 'https' ) {
             $sclass     = 'IO::Socket::SSL';
             @ssloptions = (
-                SSL_hostname    => $host,
+                SSL_hostname    => $uri->host(),
                 SSL_verify_mode => IO::Socket::SSL::SSL_VERIFY_NONE(),
-            );
+                );
         }
 
         $url = '/' unless ($url);
 
-        my $req = "$method $url HTTP/1.0\r\nHost: $host";
-        if (   $protocol eq 'http' && $port == 80
-            || $protocol eq 'https' && $port == 443 )
+        my $req = "$method $url HTTP/1.0\r\nHost: ".$uri->host();
+        if (   $uri->scheme() eq 'http' && $uri->port() == 80
+               || $uri->scheme() eq 'https' && $uri->port() == 443 )
         {
             $req .= "\r\n";
         }
         else {
-            $req .= ":$port\r\n";
+            $req .= ":".$uri->port()."\r\n";
         }
 
-        my ( $proxyHost, $proxyPort );
-        $proxyHost = $this->{PROXYHOST} || $Foswiki::cfg{PROXY}{HOST};
-        $proxyPort = $this->{PROXYPORT} || $Foswiki::cfg{PROXY}{PORT};
-        if ( $proxyHost && $proxyPort ) {
-            my ( $proxyProtocol, $proxyUser, $proxyPass );
-            if ( $proxyHost =~
-                m#^(https?)://(?:(.*?)(?::(.*?))?@)?(.*)(?::(\d+))?/*# )
-            {
-                $proxyProtocol = $1;
-                $proxyUser     = $2;
-                $proxyPass     = $3;
-                $proxyHost     = $4;
-                $proxyPort     = $5 if defined $5;
-                if ( $proxyProtocol eq 'https' ) {
-                    $proxyPort = 443 if ( !$proxyPort );
-                    if ( !defined $SSLAvailable ) {
-                        eval 'require IO::Socket::SSL';
-                        $SSLAvailable = $@ ? 0 : 1;
-                    }
-                    $sclass = 'IO::Socket::SSL';
-                }
-                elsif ( !$proxyPort ) {
-                    $proxyPort = 8080;
-                }
-            }
-            if ( !defined $proxyProtocol
-                || $proxyProtocol eq 'https' && !$SSLAvailable )
+        if ( $puri ) {
+            if ( !defined $puri->scheme()
+                || $puri->scheme() eq 'https' && !$SSLAvailable )
             {
                 require Foswiki::Net::HTTPResponse;
                 return new Foswiki::Net::HTTPResponse(
-                    "Proxy settings are invalid, check configure ($proxyHost)")
+                    "Proxy settings are invalid, check configure ({PROXY}{HOST}
+            }
+            elsif ( $puri->scheme() eq 'https' ) {
+                $puri->port() = 443 if ( !$puri->port() );
+                if ( !defined $SSLAvailable ) {
+                    eval 'require IO::Socket::SSL';
+                    $SSLAvailable = $@ ? 0 : 1;
+                }
+                $sclass = 'IO::Socket::SSL';
+            }
+            elsif ( !$puri->port() ) {
+                $puri->port(8080);
             }
-            $req      = "$method $protocol://$host:$port$url HTTP/1.0\r\n";
-            $protocol = $proxyProtocol;
-            $host     = $proxyHost;
-            $port     = $proxyPort;
-            if ($proxyUser) {
+            $req      = "$method $uri->scheme()://"
+                .$uri->host().":".$uri->port()."$url HTTP/1.0\r\n";
+            $uri->scheme($puri->scheme());
+            $uri->host($puri->host());
+            $uri->port($puri->port());
+            if ($puri->userinfo()) {
                 require MIME::Base64;
                 my $base64 =
-                  MIME::Base64::encode_base64( "$proxyUser:$proxyPass", '' );
+                  MIME::Base64::encode_base64( $puri->userinfo(), '' );
                 $req .= "Proxy-Authorization: Basic $base64\r\n";
             }
         }
-        if ($user) {
+
+        if ($uri->userinfo()) {
             require MIME::Base64;
-            my $base64 = MIME::Base64::encode_base64( "$user:$pass", '' );
+            my $base64 = MIME::Base64::encode_base64( $uri->userinfo(), '' );
             $req .= "Authorization: Basic $base64\r\n";
         }
 
@@ -266,14 +242,14 @@ sub getExternalResource {
         $req .= $options{content} if defined $options{content};
 
         my $sock = $sclass->new(
-            PeerAddr => $host,
-            PeerPort => $port,
+            PeerAddr => $uri->host(),
+            PeerPort => $uri->port(),
             Proto    => 'tcp',
             Timeout  => 120,
             @ssloptions,
         );
         unless ($sock) {
-            die "Unable to connect to $host: $!"
+            die "Unable to connect to ".$uri->host().": $!"
               . ( @ssloptions ? ' - ' . IO::Socket::SSL::errstr() : '' ) . "\n"
         }
         $sock->autoflush(1);
@@ -308,16 +284,12 @@ sub getExternalResource {
 }
 
 sub _GETUsingLWP {
-    my ( $this, $url, %options ) = @_;
+    my ( $this, $uri, $puri, %options ) = @_;
 
-    my ( $user, $pass );
-    if ( $url =~ s!([^/\@:]+)(?::([^/\@:]+))?@!! ) {
-        ( $user, $pass ) = ( $1, $2 );
-    }
     my $request;
     require HTTP::Request;
     my $method = $options{method} || 'GET';
-    $request = HTTP::Request->new( $method => $url );
+    $request = HTTP::Request->new( $method => $uri->as_string() );
     my %headers = ();
     %headers = %{ $options{headers} } if $options{headers};
     $request->header(
@@ -329,7 +301,9 @@ sub _GETUsingLWP {
     $request->content( $options{content} ) if defined $options{content};
 
     require Foswiki::Net::UserCredAgent;
+    my ($user, $pass) = split(':', $uri->userinfo(), 2);
     my $ua = new Foswiki::Net::UserCredAgent( $user, $pass );
+    $ua->proxy( [ 'http', 'https' ], $puri->as_string() ) if $puri;
     my $response = $ua->request($request);
     return $response;
 }
diff --git core/lib/Foswiki/Net/UserCredAgent.pm core/lib/Foswiki/Net/UserCredAg
index 201064a..38d8cca 100644
--- core/lib/Foswiki/Net/UserCredAgent.pm
+++ core/lib/Foswiki/Net/UserCredAgent.pm
@@ -16,13 +16,6 @@ sub new {
     my $this = $class->SUPER::new();
     $this->{user} = $user;
     $this->{pass} = $pass;
-    if ( $Foswiki::cfg{PROXY}{HOST} ) {
-        my $proxy = $Foswiki::cfg{PROXY}{HOST};
-        if ( $Foswiki::cfg{PROXY}{PORT} ) {
-            $proxy .= ':' . $Foswiki::cfg{PROXY}{PORT};
-        }
-        $this->proxy( [ 'http', 'https' ], $proxy );
-    }
     return $this;
 }


Hi Crawford. The patch no longer completely fit. I managed to get it to apply with some manual cut/paste required. Had to fix some syntax errors - a missing ");, and had a unit test failure with undefined $uri->userinfo(). With those fixed, the unit tests pass. However I think they are pretty limited - no authenticated or proxy tests.

If you could review, test, and if it looks good, merge the branch, (or tell me to) and I'll get it into 2.1. I don't have proxy or auth servers to test against. Thanks.

-- Main.GeorgeClark - 15 Dec 2015 - 01:52
 
Topic revision: r8 - 03 Feb 2016, GeorgeClark
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere. See Copyright Statement. Creative Commons License    Legal Imprint    Privacy Policy