Quick HOWTO : Ch32 : Controlling Web Access with Squid

来自Ubuntu中文
Oneleaf留言 | 贡献2007年7月2日 (一) 22:35的版本 (新页面: {{From|http://www.linuxhomenetworking.com/wiki/index.php/Quick_HOWTO_:_Ch32_:_Controlling_Web_Access_with_Squid}} {{Languages|Quick HOWTO : Ch32 : Controlling Web Access with Squid}} = In...)
(差异) ←上一版本 | 最后版本 (差异) | 下一版本→ (差异)
跳到导航跳到搜索

{{#ifexist: :Quick HOWTO : Ch32 : Controlling Web Access with Squid/zh | | {{#ifexist: Quick HOWTO : Ch32 : Controlling Web Access with Squid/zh | | {{#ifeq: {{#titleparts:Quick HOWTO : Ch32 : Controlling Web Access with Squid|1|-1|}} | zh | | }} }} }} {{#ifeq: {{#titleparts:Quick HOWTO : Ch32 : Controlling Web Access with Squid|1|-1|}} | zh | | }}

Introduction

Two important goals of many small businesses are to:

  • Reduce Internet bandwidth charges
  • Limit access to the Web to only authorized users.

The Squid web caching proxy server can achieve both these goals fairly easily.

Users configure their web browsers to use the Squid proxy server instead of going to the web directly. The Squid server then checks its web cache for the web information requested by the user. It will return any matching information that finds in its cache, and if not, it will go to the web to find it on behalf of the user. Once it finds the information, it will populate its cache with it and also forward it to the user's web browser.

As you can see, this reduces the amount of data accessed from the web. Another advantage is that you can configure your firewall to only accept HTTP web traffic from the Squid server and no one else. Squid can then be configured to request usernames and passwords for each user that users its services. This provides simple access control to the Internet.

Download and Install The Squid Package

Most RedHat Linux software products are available in the RPM format. Downloading and installing RPMs isn't hard. If you need a refresher, Chapter 6, "Installing Linux Software", provides details. It is best to use the latest version of Squid.

Starting Squid

Use the chkconfig configure Squid to start at boot::

[root@bigboy tmp]# chkconfig squid on

Use the service command to start, stop, and restart Squid after booting:

[root@bigboy tmp]# service squid start
[root@bigboy tmp]# service squid stop
[root@bigboy tmp]# service squid restart

You can test whether the Squid process is running with the pgrep command:

[root@bigboy tmp]# pgrep squid

You should get a response of plain old process ID numbers.

The /etc/squid/squid.conf File

The main Squid configuration file is squid.conf, and, like most Linux applications, Squid needs to be restarted for changes to the configuration file can take effect.

The Visible Host Name

Squid will fail to start if you don't give your server a hostname. You can set this with the visible_hostname parameter. Here, the hostname is set to the real name of the server bigboy.

visible_hostname bigboy

Access Control Lists

You can limit users' ability to browse the Internet with access control lists (ACLs). Each ACL line defines a particular type of activity, such as an access time or source network, they are then linked to an http_access statement that tells Squid whether or not to deny or allow traffic that matches the ACL.

Squid matches each Web access request it receives by checking the http_access list from top to bottom. If it finds a match, it enforces the allow or deny statement and stops reading further. You have to be careful not to place a deny statement in the list that blocks a similar allow statement below it. The final http_access statement denies everything, so it is best to place new http_access statements above it

Note: The very last http_access statement in the squid.conf file denies all access. You therefore have to add your specific permit statements above this line. In the chapter's examples, I've suggested that you place your statements at the top of the http_access list for the sake of manageability, but you can put them anywhere in the section above that last line.

Squid has a minimum required set of ACL statements in the ACCESS_CONTROL section of the squid.conf file. It is best to put new customized entries right after this list to make the file easier to read.

Restricting Web Access By Time

You can create access control lists with time parameters. For example, you can allow only business hour access from the home network, while always restricting access to host 192.168.1.23.

#
# Add this to the bottom of the ACL section of squid.conf
#
acl home_network src 192.168.1.0/24
acl business_hours time M T W H F 9:00-17:00
acl RestrictedHost src 192.168.1.23

#
# Add this at the top of the http_access section of squid.conf
#
http_access deny RestrictedHost
http_access allow home_network business_hours

Or, you can allow morning access only:

#
# Add this to the bottom of the ACL section of squid.conf
#
acl mornings time 08:00-12:00
 
#
# Add this at the top of the http_access section of squid.conf
#
http_access allow mornings

Restricting Access to specific Web sites

Squid is also capable of reading files containing lists of web sites and/or domains for use in ACLs. In this example we create to lists in files named /usr/local/etc/allowed-sites.squid and /usr/local/etc/restricted-sites.squid.

# File: /usr/local/etc/allowed-sites.squid
www.openfree.org
linuxhomenetworking.com

# File: /usr/local/etc/restricted-sites.squid
www.porn.com
illegal.com

These can then be used to always block the restricted sites and permit the allowed sites during working hours. This can be illustrated by expanding our previous example slightly.

#
# Add this to the bottom of the ACL section of squid.conf
#
acl home_network src 192.168.1.0/24
acl business_hours time M T W H F 9:00-17:00
acl GoodSites dstdomain "/usr/local/etc/allowed-sites.squid"
acl BadSites  dstdomain "/usr/local/etc/restricted-sites.squid"

#
# Add this at the top of the http_access section of squid.conf
#
http_access deny BadSites
http_access allow home_network business_hours GoodSites

Restricting Web Access By IP Address

You can create an access control list that restricts Web access to users on certain networks. In this case, it's an ACL that defines a home network of 192.168.1.0.

#
# Add this to the bottom of the ACL section of squid.conf
#
acl home_network src 192.168.1.0/255.255.255.0

You also have to add a corresponding http_access statement that allows traffic that matches the ACL:

#
# Add this at the top of the http_access section of squid.conf
#
http_access allow home_network

Password Authentication Using NCSA

You can configure Squid to prompt users for a username and password. Squid comes with a program called ncsa_auth that reads any NCSA-compliant encrypted password file. You can use the htpasswd program that comes installed with Apache to create your passwords. Here is how it's done:

1) Create the password file. The name of the password file should be /etc/squid/squid_passwd, and you need to make sure that it's universally readable.

[root@bigboy tmp]# touch /etc/squid/squid_passwd
[root@bigboy tmp]# chmod o+r /etc/squid/squid_passwd

2) Use the htpasswd program to add users to the password file. You can add users at anytime without having to restart Squid. In this case, you add a username called www:

[root@bigboy tmp]# htpasswd /etc/squid/squid_passwd www
New password:
Re-type new password:
Adding password for user www
[root@bigboy tmp]#

3) Find your ncsa_auth file using the locate command.

[root@bigboy tmp]# locate ncsa_auth
/usr/lib/squid/ncsa_auth
[root@bigboy tmp]#

4) Edit squid.conf; specifically, you need to define the authentication program in squid.conf, which is in this case ncsa_auth. Next, create an ACL named ncsa_users with the REQUIRED keyword that forces Squid to use the NCSA auth_param method you defined previously. Finally, create an http_access entry that allows traffic that matches the ncsa_users ACL entry. Here's a simple user authentication example; the order of the statements is important:

#
# Add this to the auth_param section of squid.conf
#
auth_param basic program /usr/lib/squid/ncsa_auth /etc/squid/squid_passwd
 
#
# Add this to the bottom of the ACL section of squid.conf
#
acl ncsa_users proxy_auth REQUIRED
 
#
# Add this at the top of the http_access section of squid.conf
#
http_access allow ncsa_users

5) This requires password authentication and allows access only during business hours. Once again, the order of the statements is important:

#
# Add this to the auth_param section of squid.conf
#
auth_param basic program /usr/lib/squid/ncsa_auth /etc/squid/squid_passwd
 
#
# Add this to the bottom of the ACL section of squid.conf
#
acl ncsa_users proxy_auth REQUIRED
acl business_hours time M T W H F 9:00-17:00

#
# Add this at the top of the http_access section of squid.conf
#
http_access allow ncsa_users business_hours

Remember to restart Squid for the changes to take effect.

Forcing Users To Use Your Squid Server

If you are using access controls on Squid, you may also want to configure your firewall to allow only HTTP Internet access to only the Squid server. This forces your users to browse the Web through the Squid proxy.

Making Your Squid Server Transparent To Users

It is possible to limit HTTP Internet access to only the Squid server without having to modify the browser settings on your client PCs. This called a transparent proxy configuration. It is usually achieved by configuring a firewall between the client PCs and the Internet to redirect all HTTP (TCP port 80) traffic to the Squid server on TCP port 3128, which is the Squid server's default TCP port.

Squid Transparent Proxy Configuration

Your first step will be to modify your squid.conf to create a transparent proxy. The procedure is different depending on your version of Squid.

Prior to version 2.6: In older versions of Squid, transparent proxy was achieved through the use of the httpd_accel options which were originally developed for http acceleration. In these cases, the configuration syntax would be as follows:

httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on

Version 2.6 and Beyond: Newer versions of Squid simply require you to add the word "transparent" to the default "http_port 3128" statement. In this example, Squid not only listens on TCP port 3128 for proxy connections, but will also do so in transparent mode.

http_port 3128 transparent

Configuring iptables to Support the Squid Transparent Proxy

The examples below are based on the discussion of Linux iptables in Chapter 14, "Linux Firewalls Using iptables". Additional commands may be necessary for you particular network topology.

In both cases below, the firewall is connected to the Internet on interface eth0 and to the home network on interface eth1. The firewall is also the default gateway for the home network and handles network address translation on all the network's traffic to the Internet.

Only the Squid server has access to the Internet on port 80 (HTTP), because all HTTP traffic, except that coming from the Squid server, is redirected.

If the Squid server and firewall are the same server, all HTTP traffic from the home network is redirected to the firewall itself on the Squid port of 3128 and then only the firewall itself is allowed to access the Internet on port 80.

iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 \
        -j REDIRECT --to-port 3128
iptables -A INPUT -j ACCEPT -m state \
        --state NEW,ESTABLISHED,RELATED -i eth1 -p tcp \
        --dport 3128
iptables -A OUTPUT -j ACCEPT -m state \
        --state NEW,ESTABLISHED,RELATED -o eth0 -p tcp \
        --dport 80
iptables -A INPUT -j ACCEPT -m state \
        --state ESTABLISHED,RELATED -i eth0 -p tcp \
        --sport 80
iptables -A OUTPUT -j ACCEPT -m state \
        --state ESTABLISHED,RELATED -o eth1 -p tcp \
        --sport 80

Note: This example is specific to HTTP traffic. You won't be able to adapt this example to support HTTPS web browsing on TCP port 443, as that protocol specifically doesn't allow the insertion of a "man in the middle" server for security purposes. One solution is to add IP masquerading statements for port 443, or any other important traffic, immediately after the code snippet. This will allow non HTTP traffic to access the Internet without being cached by Squid.

If the Squid server and firewall are different servers, the statements are different. You need to set up iptables so that all connections to the Web, not originating from the Squid server, are actually converted into three connections; one from the Web browser client to the firewall and another from the firewall to the Squid server, which triggers the Squid server to make its own connection to the Web to service the request. The Squid server then gets the data and replies to the firewall which then relays this information to the Web browser client. The iptables program does all this using these NAT statements:

iptables -t nat -A PREROUTING -i eth1 -s ! 192.168.1.100 \
        -p tcp --dport 80 -j DNAT --to 192.168.1.100:3128
iptables -t nat -A POSTROUTING -o eth1 -s 192.168.1.0/24 \
        -d 192.168.1.100 -j SNAT --to 192.168.1.1
iptables -A FORWARD -s 192.168.1.0/24 -d 192.168.1.100 \
        -i eth1 -o eth1 -m state 
         --state NEW,ESTABLISHED,RELATED \
        -p tcp --dport 3128 -j ACCEPT
 iptables -A FORWARD -d 192.168.1.0/24 -s 192.168.1.100 \
        -i eth1 -o eth1 -m state --state ESTABLISHED,RELATED \
        -p tcp --sport 3128 -j ACCEPT

In the first statement all HTTP traffic from the home network except from the Squid server at IP address 192.168.1.100 is redirected to the Squid server on port 3128 using destination NAT. The second statement makes this redirected traffic also undergo source NAT to make it appear as if it is coming from the firewall itself. The FORWARD statements are used to ensure the traffic is allowed to flow to the Squid server after the NAT process is complete. The unusual feature is that the NAT all takes place on one interface; that of the home network (eth1).

You will additionally have to make sure your firewall has rules to allow your Squid server to access the Internet on HTTP TCP port 80 as covered in Chapter 14, "Linux Firewalls Using iptables".

Manually Configuring Web Browsers To Use Your Squid Server

If you don't have a firewall that supports redirection, then you need to configure your firewall to only accept HTTP Internet access from the Squid server, as well as configure your PC browser's proxy server settings manually to use the Squid server. The method you use depends on your browser.

For example, to make these changes using Internet Explorer

  1. Click on the "Tools" item on the menu bar of the browser.
  2. Click on "Internet Options"
  3. Click on "Connections"
  4. Click on "LAN Settings"
  5. Configure with the address and TCP port (3128 default) used by your Squid server.

Here's how to make the same changes using Mozilla or Firefox.

  1. Click on the "Edit" item on the browser's menu bar.
  2. Click on "Preferences"
  3. Click on "Advanced"
  4. Click on "Proxies"
  5. Configure with the address and TCP port (3128 default) used by your Squid server under "Manual Proxy Configuration"

Squid Disk Usage

Squid uses the /var/spool/squid directory to store its cache files. High usage squid servers need a large amount of disk space in the /var partition to get optimum performance.

Every webpage and image accessed via the Squid server is logged in the /var/log/squid/access.log file. This can get quite large on high usage servers. Fortunately, the logrotate program automatically purges this file.

Troubleshooting Squid

Squid logs both informational and error messages to files in the /var/log/squid/ directory. It is best to review these files first whenever you have difficulties.The squid.out file can be especially useful as it contains Squids' system errors.

Another source of errors could be unintended statements in the squid.conf file that cause no errors; mistakes in the configuration of hours of access and permitted networks that were forgotten to be added are just two possibilities.

By default, Squid operates on port 3128, so if you are having connectivity problems, you'll need to follow the troubleshooting steps in Chapter 4, "Simple Network Troubleshooting", to help rectify them.

Note: Some of Squid's capabilities go beyond the scope of this book, but you should be aware of them. For example, for performance reasons, you can configure child Squid servers on which certain types of content are exclusively cached. Also, you can restrict the amount of disk space and bandwidth Squid uses.

Conclusion

Tools such as Squid are popular with many company mangers. By caching images and files on a server shared by all, Internet bandwidth charges can be reduced.

Squid's password authentication feature is well liked because it allows only authorized users to access the Internet as a means of reducing usage fees and distractions in the office. Unfortunately, an Internet access password is usually not viewed as a major security concern by most users who are often willing to share it with their colleagues. Although it is beyond the scope of this book, you should consider automatically tying the Squid password to the user's regular login password. This will make them think twice about giving their passwords away. Internet access is one thing, letting your friends have full access to your e-mail and computer files is quite another.