Welcome to the new location of Alien's Wiki, sharing a single dokuwiki install with the SlackDocs Wiki.

Welcome to Eric Hameleers (Alien BOB)'s Wiki pages.

Differences

This shows you the differences between two versions of the page.

slackware:proxy [2006/06/07 18:39] alien
slackware:proxy [2006/06/16 22:47] (current) – Removed FIXME for link to parental control article. alien
Line 1: Line 1:
 ====== Transparent Proxy with contentfilter ======
- 
-**FIXME** UNDER CONSTRUCTION **FIXME** 
  
 ===== Introduction =====
Line 7: Line 5:
 This article describes the setup of an HTTP proxy chained to a content filter (think "parental control" for instance).
 
-There are many cases where it is not desirable to grant unlimited Internet access to certain groups of people. The most obvious group is children, either at home, or in schools, whom you want to protect from hitting upon too explicit imagery or language. Or perhaps you just want to block certain sites from them.\\
+There are many cases where it is not desirable to grant unlimited Internet access to certain groups of people. The most obvious group is children, either at home, in schools or public libraries, whom you want to protect from hitting upon too explicit imagery or language. Or perhaps you just want to block certain sites from them.\\
 Another use for a content-filtering web proxy is in companies that want to limit the Internet access they grant their employees. Content filtering does not stop at blocking undesirable content - blocking viruses in downloaded materials and malicious HTML code is another form of filtering incoming web traffic.\\
 A web proxy is what you need in this case. The proxy server intercepts all browser requests for web pages, and in co-operation with one or more filtering programs, decides whether the browser will or won't be able to retrieve the full content it was requesting.
Line 15: Line 13:
 Ideally, you would want a solution where such a filtering proxy server is installed on the Internet Gateway Server, and the desktop computers would not have to be reconfigured to make use of the proxy. This is called //transparent proxying// - the proxy will intercept the browser requests without the end users knowing their traffic is monitored - until they hit a censored page of course!
 
-I will show you how you can use [[http://tinyproxy.sf.net/|tinyproxy]] and [[http://dansguardian.org/|dansguardian]] in combination with a few iptables firewall rules, to accomplish such a transparent proxy. I chose tinyproxy because it is small and very fast in comparison with the more widely used Squid proxy. Tinyproxy is just fine for small to medium-sized networks. For larger networks, you can just replace tinyproxy with Squid and it will still work (if you configure Squid right). But I will concentrate on tinyproxy.
+I will show you how you can use [[http://tinyproxy.sf.net/|tinyproxy]] and [[http://dansguardian.org/|dansguardian]] in combination with a few iptables firewall rules, to accomplish such a transparent proxy service. I chose tinyproxy because it is small and very fast in comparison with the more widely used Squid proxy. Tinyproxy is just fine for small to medium-sized networks. For larger networks, you can replace tinyproxy with Squid without much effort and it will still work (if you configure Squid right). But I will concentrate on tinyproxy for now.
 <note>Tinyproxy and Squid are licensed under the GPL. Dansguardian is licensed under the GPL, with the addition that it is free for non-commercial use.</note>
 +
 +<note tip>This article focuses on the configuration of a transparent proxy on a gateway/router for **small networks**. Another scenario is that of the **family computer** with a single network interface, running Linux, where you want to restrict the children in their Internet browsing while still being able to have unrestricted Internet access for your own account (assuming //you// are the parent) or as the root user.\\ I have a [[:slackware:parentalcontrol|Wiki page]] that points out the steps that differ from the ones on this page.</note>
 +
 +<note warning>When using this proxy/contentfilter, it will not be possible for the content filter to examine //HTTPS// requests. This is of course due to the nature of the encryption used - if it //were// possible for the content filter to examine the content of secure HTTPS connections, then this would pose a serious threat to all secure communication on the Internet. This would be called a "man in the middle" attack.\\ Tinyproxy by itself can proxy the HTTPS traffic because //it// does not need to inspect the content of the HTTPS traffic, it just passes the received data on to the client browser. This is the reason why in the rest of the article, there will be a few examples of redirecting HTTPS traffic (tcp port 443); it is only for the benefit of people who use this article to just set up a proxy without filtering.</note>
  
 ===== How it works =====
Line 36: Line 39:
                ->(contentfilter -> proxy) _|
  
-The browser request is again routed through the default gateway as in the first picture, only now we have an ''iptables'' rule in place that detects the traffic targeted at an external web server (ports 80 or 443). The iptables rule causes this traffic to be //re-directs// to the port where our content filter is listening (''192.168.2.1:8080'')!\\ The browser is unaware of this "hi-jacking" process. Browser requests will be filtered and pages retrieved by the proxy will be examined and scanned just like happens in the second picture.
+The browser request is again routed through the default gateway as in the first picture, only now we have an ''iptables'' rule in place that detects the traffic targeted at an external web server (ports 80 or 443). The iptables rule causes this traffic to be //re-directed// to the port where our content filter is listening (''192.168.2.1:8080'')!\\ The browser is unaware of this "hi-jacking" process. Browser requests will be examined by the contentfilter and URLs that appear in a blacklist will trigger an immediate //access denied// page to be returned to the browser. Other page requests are retrieved by the proxy and then examined and scanned, just as was shown in the second picture.
 
-<note> I touched on the issue of scanning for viruses and malicious HTML code - dansguardian can not do this by itself, but if compiled with the appropriate support (see [[#building_dansguardian|below]]) and configured to actually use it, dansguardian will ask an available ClamAV virus scanning daemon to do the scanning.</note>
+<note> I touched on the issue of scanning for viruses and malicious HTML code - dansguardian can not do this by itself, but if compiled with the appropriate support (see [[#building_dansguardian|below]]) and configured to actually use it, dansguardian will contact an available ClamAV virus scanning daemon and let it do the scanning.</note>
  
 ===== Network layout =====
 
-For the sake of simplicity, I will assume the proxy server and content filter will be installed on the server that is also acting as the Internet Gateway. This server has two network interfaces, one connecting to your ADSL/Cable router and the other is connected to your internal network.\\ This also means that this server's IP address is configured as the de       fault gateway on the computers in your network. If the proxy server is going to be installed on another server than the Internet Gateway, you will have to do some additional work to redirect traffic from the Gateway to the proxy server and back out through the Gateway to the Internet. This is left as an exercise to the reader :-)
+For the sake of simplicity, I will assume the proxy server and content filter will be installed on the server that is also acting as the Internet Gateway. This server has two network interfaces, one connecting to your ADSL/Cable router and the other is connected to your internal network.\\ This also means that this server's IP address is configured as the default gateway on the computers in your network. If the proxy server is going to be installed on another server than the Internet Gateway, you will have to do some additional work to redirect traffic from the Gateway to the proxy server and back out through the Gateway to the Internet. This is left as an exercise to the reader :-)
 
 The TCP/IP configuration of our network will be as follows:
Line 48: Line 51:
 //Table 1.// __Server:__
 ^ Interface      ^ Type    ^ IP address      ^ Netmask            ^ Default gateway  ^
-| eth0           | dynamic | 10.111.111.129  | 255.255.255.128    | 10.111.111.254   |
+| eth0           | dynamic | 10.111.111.1   | 255.255.255.128    | 10.111.111.254   |
-| eth1           | static  | 192.168.2.1     | 255.255.255.0      |                  |
+| eth1           | static  | 192.168.2.1    | 255.255.255.0      |                  |
 
 //Table 2.// __Internal Network:__
Line 123: Line 126:
 make install
 </code> I have a SlackBuild and a Slackware package for dansguardian in [[http://www.slackware.com/~alien/slackbuilds/dansguardian/|my repository]] which you can use as well. The advantage is that I added a start script and a logrotate script to the package. If you want those without building from my SlackBuild script, I added them in the [[#example_configuration_files|last section]].\\
-I configured dansguardian to run as user //nobody// - because that is an existing account without privileges, and Apache uses it too. If you want another account change the ''./configure'' step, and create the account you want it to use in case the account does not yet exist. We will configure tinyproxy to run as user //nobody// as well, but in that case, we don't have to define that at compile-time. Tinyproxy has the effective user as a parameter in its configuration file (see below).
+I configured dansguardian to run as user //nobody// - because that is an existing account without privileges, and Apache uses it too. If you want another account change the ''./configure'' step, and create the account you want it to use in case the account does not yet exist. We will configure tinyproxy to run as user //nobody// as well, but in tinyproxy's case, we don't have to define that at compile-time. Tinyproxy has the effective user as a parameter in its configuration file (see below).
  
  
 ===== Configuration =====
 
-That was it! Now, it is time to start configuring our proxy server.
+That was it! Now, it is time to start configuring our proxying service.
 
 ====  tinyproxy config ====
  
-This is how the content of the tinyproxy configuration file ''/etc/tinyproxy/tinyproxy.conf'' (stripped of comments and empty lines) should look for our example setup: <file>
-User nobody
-Group nogroup
-Port 3128
-Listen 127.0.0.1
-Bind 10.111.111.1
-Timeout 600
-DefaultErrorFile "/usr/share/tinyproxy/default.html"
-StatFile "/usr/share/tinyproxy/stats.html"
-Logfile "/var/log/tinyproxy.log"
-LogLevel Info
-PidFile "/var/run/tinyproxy.pid"
-XTinyproxy qemu.lan
-MaxClients 100
-MinSpareServers 5
-MaxSpareServers 20
-StartServers 10
-MaxRequestsPerChild 0
-Allow 127.0.0.1
-Allow 192.168.0.0/24
-ViaProxyName "tinyproxy"
-ConnectPort 443
-ConnectPort 563
-</file>
+The content of the tinyproxy configuration file ''/etc/tinyproxy/tinyproxy.conf'' (stripped of comments and empty lines) for our example setup can be found in the [[#example_configuration_files|last section]].\\
+I entered the domain name for my internal lan //my.net// in this configuration file. If yours is different, please change accordingly.\\ To show you where this differs from the tinyproxy defaults, here is a diff from the original file: <code diff>
+diff /etc/tinyproxy/tinyproxy.conf-dist /etc/tinyproxy/tinyproxy.conf
+20c20
+< Port 8888
+---
+> Port 3128
+27c27
+< #Listen 192.168.0.1
+---
+> Listen 127.0.0.1
+34c34
+< #Bind 192.168.0.1
+---
+> Bind 10.111.111.1
+112c112
+< #XTinyproxy mydomain.com
+---
+> XTinyproxy my.net
+192c192
+< Allow 192.168.1.0/25
+---
+> Allow 192.168.2.0/24
+</code> The important lines here are as follows:\\
 +  Port 3128 
 +  Listen 127.0.0.1 
 +  Bind 10.111.111.1 
 +  Allow 127.0.0.1 
 +  Allow 192.168.2.0/24 
 +These achieve the following:\\
 +  * make tinyproxy listen on ''127.0.0.1:3128'' where dansguardian will contact it
 +  * bind to the external interface (IP address ''10.111.111.1''), which makes all Internet-bound traffic originate from the external interface (this line is required on a host with multiple network interfaces)
 +  * allow access from the localhost (IP address 127.0.0.1, for dansguardian) as well as from all the computers in your internal LAN (IP address range 192.168.2.0-192.168.2.255), implicitly denying access attempts from any other IP address.
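To double-check that your own file really carries these settings, a small shell helper along the following lines could print just the directives discussed above. This is only a sketch: the function name ''show_tinyproxy_keys'' is made up for this example, and it assumes one directive per line, as in the listing above.

```shell
# Hypothetical helper (not part of tinyproxy): print only the directives
# discussed above from a tinyproxy-style configuration file.
show_tinyproxy_keys() {
  # $1 = path to the configuration file
  # Commented-out lines (starting with '#') do not match the anchored pattern.
  grep -E '^(Port|Listen|Bind|Allow)[[:space:]]' "$1"
}
```

With the example configuration, running ''show_tinyproxy_keys /etc/tinyproxy/tinyproxy.conf'' should list the five lines shown above.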
  
  
 ==== dansguardian config ==== ==== dansguardian config ====
  
 +You will find the content of the dansguardian configuration file ''/etc/dansguardian/dansguardian.conf'' (stripped of comments and empty lines) for our example setup in the [[#example_configuration_files|last section]]. The difference with the originally distributed file is quite small: <code diff>
 +diff /etc/dansguardian/dansguardian.conf.new /etc/dansguardian/dansguardian.conf
 +48a50
 +> anonymizelogs = off
 +74c76
 +< filterip =
 +---
 +> filterip = 192.168.2.1
 +97c99
 +< accessdeniedaddress = 'http://YOURSERVER.YOURDOMAIN/cgi-bin/dansguardian.pl'
 +---
 +> accessdeniedaddress = 'http://192.168.2.1/cgi-bin/dansguardian.pl'
 +</code> The important lines in this file are:
 +  filterip = 192.168.2.1
 +  filterport = 8080
 +  proxyip = 127.0.0.1
 +  proxyport = 3128
 +They show that dansguardian
 +  * will listen at the //filterip/filterport// address, i.e. ''192.168.2.1:8080''. Sound familiar? This is the Proxy URL! Dansguardian is the primary entry point for the browsers' http requests.
 +  * will look for a compatible proxy at IP address:port ''127.0.0.1:3128''. This matches exactly with how we configured tinyproxy.
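Because the two configuration files have to agree on where tinyproxy listens, it can be handy to compare them mechanically. The following is a minimal sketch under a few assumptions: the helper names ''dg_value'', ''tp_value'' and ''check_chain'' are invented for this example, dansguardian settings are assumed to be written as ''key = value'', and tinyproxy settings as ''Key value''.

```shell
# Hypothetical helpers, not shipped with either program.

# Print the value of "key = value" from a dansguardian-style file.
# $1 = key, $2 = file; quotes and spaces around the value are stripped.
dg_value() {
  sed -n "s/^$1[[:space:]]*=[[:space:]]*//p" "$2" | tr -d "' "
}

# Print the first value of "Key value" from a tinyproxy-style file.
# $1 = key, $2 = file
tp_value() {
  awk -v key="$1" '$1 == key { print $2; exit }' "$2"
}

# Succeed only if dansguardian's proxyip/proxyport match tinyproxy's
# Listen/Port. $1 = dansguardian.conf, $2 = tinyproxy.conf
check_chain() {
  [ "$(dg_value proxyip "$1")" = "$(tp_value Listen "$2")" ] &&
  [ "$(dg_value proxyport "$1")" = "$(tp_value Port "$2")" ]
}
```

You could then run something like ''check_chain /etc/dansguardian/dansguardian.conf /etc/tinyproxy/tinyproxy.conf || echo "proxy settings do not match"'' after editing either file.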
  
 +The line
 +  accessdeniedaddress = 'http://192.168.2.1/cgi-bin/dansguardian.pl'
 +does not really matter, because in dansguardian's (and our) default configuration, the //access denied// web page is generated from a language-dependent template page. That is why you define ''language = 'ukenglish''' - if you want dansguardian to show its messages in another language, look in directory ''/usr/share/dansguardian/languages/'' for the available languages.
 +
 +Of course, there are many fine-tuning possibilities in this configuration file, as well as in the other files in the ''/etc/dansguardian'' directory tree. But in its default setup dansguardian already does some impressive (perhaps //aggressive// is the better word) filtering.
  
-**FIXME** UNDER CONSTRUCTION **FIXME** 
  
 ===== The iptables firewall =====
Line 191: Line 227:
    -j REDIRECT --to-ports 8080
 </code>
 +
 +===== Starting the programs =====
 +
 +If you (built and) installed my Slackware package for dansguardian, the rc script is installed non-executable by default. In order to run dansguardian on boot (as shown below) you will have to make the script executable by running <code>
 +chmod +x /etc/rc.d/rc.dansguardian</code>
 +
 +If you configured your firewall rules in the file ''/etc/rc.d/rc.firewall'', then this script will be detected by Slackware and automatically started with the ''start'' parameter on boot. This happens in the Slackware init script ''/etc/rc.d/rc.inet2'' to be precise, like this: <code>
 +if [ -x /etc/rc.d/rc.firewall ]; then
 +  /etc/rc.d/rc.firewall start
 +fi
 +</code> so we don't have to worry about that. The important bit is the order in which tinyproxy and dansguardian are started: if dansguardian does not find a proxy service listening at the configured address, it will refuse to start. So, we add these lines to the file ''/etc/rc.d/rc.local'': <code>
 +if [ -x /usr/sbin/tinyproxy ]; then
 +  /usr/sbin/tinyproxy > /dev/null 2>&1
 +fi
 +
 +# Start dansguardian
 +if [ -x /etc/rc.d/rc.dansguardian ]; then
 +  /etc/rc.d/rc.dansguardian start
 +fi
 +</code>
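Since dansguardian refuses to start without its proxy, you might want ''rc.local'' to wait briefly until tinyproxy is up before continuing. The following is a rough sketch, not part of my package: the function names are made up, and it assumes tinyproxy writes the ''PidFile'' named in the example configuration. Note that it only checks that the process exists, not that the port already accepts connections.

```shell
# Hypothetical helpers for use in an rc script (run as root).

# Succeed if the pidfile exists and names a process we can signal.
# $1 = path to the pidfile
pidfile_alive() {
  [ -r "$1" ] || return 1
  pid=$(cat "$1")
  # kill -0 sends no signal; it only tests that the process exists.
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Poll once per second until the pidfile is alive or we run out of tries.
# $1 = path to the pidfile, $2 = number of tries (default 10)
wait_for_pidfile() {
  pidfile=$1
  tries=${2:-10}
  while [ "$tries" -gt 0 ]; do
    pidfile_alive "$pidfile" && return 0
    sleep 1
    tries=$((tries - 1))
  done
  return 1
}
```

In ''rc.local'' this could be used as ''wait_for_pidfile /var/run/tinyproxy.pid 10 && /etc/rc.d/rc.dansguardian start'', placed after the line that starts tinyproxy.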
 +Both programs log their actions, to ''/var/log/tinyproxy.log'' and ''/var/log/dansguardian/access.log'' respectively. If their logging is a bit too verbose for your taste (the default is to log a //lot//) you can turn the log levels down in the configuration files.
 +
 +That is all there is to it! Now, test your rig by booting a client computer in your LAN and trying out a couple of URLs. I will leave it to your own imagination as to what URLs will be considered //naughty//.
 +
 +To stop these programs if you need to, you run <code>
 +/etc/rc.d/rc.dansguardian stop
 +killall -TERM tinyproxy
 +</code>
 +
 +===== Adding virus scanning =====
 +
 +
 +**FIXME** UNDER CONSTRUCTION **FIXME**
  
 ===== Pitfalls =====
 
 There might be cases where you don't want transparent proxying. For instance, some applications will not correctly connect through a transparent proxy. The (client side) user agent does not know it passes through a proxy, so it possibly will not send correct HTTP headers to the remote server. Most modern browsers are standards-compliant however and will work fine. If your users need Internet Explorer, it must be newer than 5.5_SP1. For those cases where transparent proxying is impossible, you must configure your browsers explicitly to use the proxy. How to do this in a semi-centralized way is described in [[#manual_proxy_configuration|the next section]]. Web browsers that are specifically configured to use a proxy have no problems connecting to external servers, because they "know" they use a proxy and add cache-aware HTTP headers to each request. Because of that, the remote server knows the client is behind a proxy and adjusts its behaviour.
- 
  
 ===== Manual proxy configuration =====
 
 A handy way of browser configuration is the //Proxy Auto-Configuration// or PAC standard. On one of your internal webservers you install a "pac" file with proxy parameters, and then you instruct all your browsers to go fetch that "pac" file and interpret its directives. The advantages over just entering the proxy server's IP address and port number are that you can define a much more fine-grained configuration in the "pac" file, and you can change its contents without having to re-do your manual browser configuration. The "pac" file will be downloaded and interpreted //every// time a browser starts.\\
-You can read more about PAC on the [[http://wp.netscape.com/eng/mozilla/2.0/relnotes/demo/proxy-live.html|Netscape web pages]] (yes, Netscape :-) )
+You can read more about PAC on the [[http://wp.netscape.com/eng/mozilla/2.0/relnotes/demo/proxy-live.html|Netscape web pages]] (yes, Netscape :-) ). This is also [[http://homepages.tesco.net/J.deBoynePollard/FGA/web-browser-auto-proxy-configuration.html|an excellent page]].
 
 To give an example, create a file on your Gateway server's //DocumentRoot// and call it ''proxy.pac''. This will make it available under the URL "%%http://192.168.2.1/proxy.pac%%". Let the contents of the ''proxy.pac'' file be like this: <file>
Line 235: Line 303:
  
 ===== Example configuration files =====
 +
 +''/etc/tinyproxy/tinyproxy.conf'':
 +<file>
 +User nobody
 +Group nogroup
 +Port 3128
 +Listen 127.0.0.1
 +Bind 10.111.111.1
 +Timeout 600
 +DefaultErrorFile "/usr/share/tinyproxy/default.html"
 +StatFile "/usr/share/tinyproxy/stats.html"
 +Logfile "/var/log/tinyproxy.log"
 +LogLevel Info
 +PidFile "/var/run/tinyproxy.pid"
 +XTinyproxy my.net
 +MaxClients 100
 +MinSpareServers 5
 +MaxSpareServers 20
 +StartServers 10
 +MaxRequestsPerChild 0
 +Allow 127.0.0.1
 +Allow 192.168.2.0/24
 +ViaProxyName "tinyproxy"
 +ConnectPort 443
 +ConnectPort 563
 +</file>
 +
 +''/etc/dansguardian/dansguardian.conf'':
 +<file>
 +reportinglevel = 3
 +languagedir = '/usr/share/dansguardian/languages'
 +language = 'ukenglish'
 +loglevel = 2
 +logexceptionhits = on
 +logfileformat = 1
 +anonymizelogs = off
 +filterip = 192.168.2.1
 +filterport = 8080
 +proxyip = 127.0.0.1
 +proxyport = 3128
 +accessdeniedaddress = 'http://192.168.2.1/cgi-bin/dansguardian.pl'
 +nonstandarddelimiter = on
 +usecustombannedimage = 1
 +custombannedimagefile = '/usr/share/dansguardian/transparent1x1.gif'
 +filtergroups = 1
 +filtergroupslist = '/etc/dansguardian/lists/filtergroupslist'
 +bannediplist = '/etc/dansguardian/lists/bannediplist'
 +exceptioniplist = '/etc/dansguardian/lists/exceptioniplist'
 +showweightedfound = on
 +weightedphrasemode = 2
 +urlcachenumber = 1000
 +urlcacheage = 900
 +scancleancache = on
 +phrasefiltermode = 2
 +preservecase = 0
 +hexdecodecontent = 0
 +forcequicksearch = 0
 +reverseaddresslookups = off
 +reverseclientiplookups = off
 +logclienthostnames = off
 +createlistcachefiles = on
 +maxuploadsize = -1
 +maxcontentfiltersize = 256
 +maxcontentramcachescansize = 2000
 +maxcontentfilecachescansize = 20000
 +filecachedir = '/tmp'
 +deletedownloadedtempfiles = on
 +initialtrickledelay = 20
 +trickledelay = 10
 +downloadmanager = '/etc/dansguardian/downloadmanagers/fancy.conf'
 +downloadmanager = '/etc/dansguardian/downloadmanagers/default.conf'
 +contentscannertimeout = 60
 +contentscanexceptions = off
 +recheckreplacedurls = off
 +forwardedfor = off
 +usexforwardedfor = off
 +logconnectionhandlingerrors = on
 +logchildprocesshandling = off
 +maxchildren = 120
 +minchildren = 8
 +minsparechildren = 4
 +preforkchildren = 6
 +maxsparechildren = 32
 +maxagechildren = 500
 +maxips = 0
 +ipcfilename = '/tmp/.dguardianipc'
 +urlipcfilename = '/tmp/.dguardianurlipc'
 +ipipcfilename = '/tmp/.dguardianipipc'
 +nodaemon = off
 +nologger = off
 +logadblocks = off
 +softrestart = off
 +mailer = '/usr/sbin/sendmail -t'
 +</file>
  
 ''/etc/rc.d/rc.dansguardian'':