Running INN
The most popular conventional news server on
Linux today is InterNetNews (INN; http://www.isc.org/products/INN/). This package actually consists of several programs that work
together as a unified system. The core program is innd, which processes
news articles and handles incoming connections. The nnrpd program
takes over connections from news readers (as opposed to other news servers). INN
initiates connections with other news servers using the nntpsend program, which in turn uses innxmit to do much of the work. Each of these programs has an associated
configuration file; indeed, some use multiple files. The most basic
configuration files usually reside in /etc/news, but some files
are stored in /var/lib/news or elsewhere.

INN is usually shipped with Linux
distributions in a package called inn or some variant of that. This chapter
describes INN version 2.2.2, but configuration for other INN 2.x-series packages should be virtually identical. Some
distributions make INN dependent upon Cleanfeed, which is an add-on package
that can automatically remove some types of spam from the news server. (Newsgroup
spam is a major problem that most end users don't notice because of the
automated spam cleaning that's so common today.)
Obtaining a News Feed
If you want to run a news server that carries
even part of Usenet, you need to obtain a news feed for the server, and
configure your system to use that feed. The latter task is covered in the
upcoming section, "Controlling Access,"
but the former deserves a few words, as well. NNTP is designed such that you
can use just about any existing news server as a feed. The trouble is that you
can't simply enter some random news server in a configuration file and expect
the feed to work; most news servers accept feeds only from selected sites. You
must therefore locate a site that will agree to serve as your news feed. Indeed,
you might need to locate multiple feeds if you can't find a single site that
can provide all the newsgroups you want to carry.

One place to start looking for a news feed is
with whoever provides your network connectivity: your ISP. Assuming you have a
connection fast enough to support a full news feed, your ISP might be willing
to feed your news server, or at least give you a pointer to some potential
providers. Many ISPs, though, don't provide news feeds to their customers. This
is particularly common for ISPs that sell primarily to residential and small
business customers, and such customers seldom have enough bandwidth to handle
the news feed in any event. Another potential source of news feeds is
third-party news providers like those mentioned earlier. For instance, NewsGuy (http://www.newsguy.com)
offers news feeds as well as access via news readers.

A full news feed is likely to be costly, both
in terms of a cash outlay for the service and in terms of the hardware and
bandwidth required to support it. In 2002, for instance, the NewsGuy feed cost
$1200 a month, and the recommended hardware was a 400MHz Pentium with
500MB of RAM and 64GB of disk space, connected to a link with at least 3Mbps of
bandwidth. You'll also need bandwidth for the news readers that will connect to
your server. Of course, the exact requirements may be higher or lower than
this, depending upon the exact mix of newsgroups you want to carry, how long
you want to retain posts, and so on. As a general rule, the requirements have
gone up every year, so the values cited here are likely to be very inadequate
in the not-too-distant future.
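
To put the figures above in perspective, the following Python sketch estimates how much data a 3Mbps link could deliver in a day and how long 64GB of disk would last at that rate. It assumes decimal units and a link that is fully saturated by the incoming feed, neither of which the text states, so the result is only an upper bound on traffic and a lower bound on retention.

```python
def daily_volume_gb(link_mbps: float) -> float:
    """Upper bound on data received per day, in GB, over a saturated link."""
    bytes_per_second = link_mbps * 1_000_000 / 8   # megabits/s -> bytes/s
    return bytes_per_second * 86_400 / 1_000_000_000


def days_until_full(disk_gb: float, link_mbps: float) -> float:
    """Days of incoming traffic that fit on the disk if nothing is expired."""
    return disk_gb / daily_volume_gb(link_mbps)


if __name__ == "__main__":
    print(round(daily_volume_gb(3), 1))       # GB per day on a 3Mbps link
    print(round(days_until_full(64, 3), 1))   # days before 64GB fills up
```

Under these assumptions the link can carry roughly 32GB a day, so the recommended disk would hold only about two days of a saturated feed. That is one reason the expiration policy described later matters.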
Serving News on Limited Resources
If you want to run a regular news server on limited hardware
and network connectivity, drop the binary
newsgroups. These newsgroups carry binary files, encoded to be transmissible
as Usenet posts, which were originally intended for text. Binary newsgroups
carry sound clips, digitized photos, program files, and more. Some carry
material that's copyrighted and should not be distributed in this way. Most
see a huge amount of traffic, in terms of the size of posts, because the
binary files are much larger than a typical post in a text newsgroup. Most
binary newsgroups have binary or binaries in their names. Many are in the alt
hierarchy, but some are in comp or elsewhere.

If you want to provide a full news feed but lack the
resources to host it, an outsourcing arrangement is often a reasonable
compromise. You contract with a news provider and create a DNS entry for your
domain to point to your news provider's IP address. From the point of view of
your users, it appears that you're running a news server, when in fact
somebody else is doing the bulk of the work. Many small ISPs use outsourced
news servers. One potential drawback is that a news provider might or might
not be willing to create special newsgroups for your local use.
Configuring INN
Configuring INN involves setting many options in many
different files. A default installation will probably set many of the options
to reasonable values, but you'll have to adjust some settings to fit your own
system. For instance, you must define your newsgroups and set up access control
features. (If you want to run a full Usenet server, your news feed may be able
to provide you with help on control files relating to newsgroups and access to
their own server.) You must also set up message expiration policies and tell
INN how to handle any special control messages it may receive (these may tell
the server to cancel a message, add a new group, and so on).
General Configuration
The main INN configuration file is /etc/news/inn.conf. This file sets options using the
following syntax:

optionname:value

Many other INN configuration files use a similar syntax. Most
of the options in inn.conf may
be left at their default values. The most important options to change are the
first few. These include:

organization: This option sets the name of your organization. This string will be included in
the headers of all the posts made through your news server.

server: This is the name of the computer on which INN is running. It's critical because
some of the programs that make up the INN package establish a network connection
with this computer in order to deliver articles. The default value of localhost may work, but it's best to
change this to your true hostname.

pathhost: Whenever INN accepts a post, the server adds its name to a header line called Path. This header helps identify where a
message has been, and thus helps prevent loops in which a message is passed
back and forth between various servers. You should place your news server's
fully-qualified domain name here, such as news.threeroomco.com.

moderatormailer: Some newsgroups are moderated, which means that
an individual (the moderator) must approve
posts to the group before they appear. You can either keep track of moderator
mailing addresses yourself, or send posts to moderated newsgroups to a
centralized address, which will forward them to the moderator. Entering %s@uunet.uu.net will do the latter.

domain: This is the name of your domain, such as threeroomco.com. It's used internally for DNS-related
functions by INN component programs.

fromhost: When a local user posts a message, INN creates a From header to identify the poster. The program uses the
value of this line as the machine name, so you should use something appropriate
here, probably your domain name, but possibly a mail machine within the domain.

complaints: Unfortunately, some users abuse news. They may spam, post offensive material,
post binaries to text newsgroups, or do other obnoxious things. The complaints option lets you specify an
e-mail address that will be visible in the headers of messages that originate
from your site, so that others can contact you in the event some abuse occurs.

There are many additional options in the inn.conf file, but these should have
reasonable default values. Consult the inn.conf
man page for further information about the meanings of these options.
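
As an illustration of the optionname:value syntax, here is a minimal Python sketch that reads the options discussed above into a dictionary. The sample values, including the abuse@threeroomco.com address, are hypothetical, and the parser assumes that whitespace after the colon is insignificant and that lines starting with # are comments; it is a way to sanity-check a file, not a reimplementation of INN's own parser.

```python
# Hypothetical inn.conf fragment; every value here is made up for illustration.
SAMPLE_INN_CONF = """\
# Basic site identity
organization:    Three Room Company
server:          news.threeroomco.com
pathhost:        news.threeroomco.com
moderatormailer: %s@uunet.uu.net
domain:          threeroomco.com
fromhost:        threeroomco.com
complaints:      abuse@threeroomco.com
"""


def parse_inn_conf(text):
    """Return a dict of option names to values from optionname:value lines."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not an option line
        options[name.strip()] = value.strip()
    return options


if __name__ == "__main__":
    for name, value in parse_inn_conf(SAMPLE_INN_CONF).items():
        print(f"{name} = {value}")
```

A check like this makes it easy to confirm that server and pathhost no longer hold the localhost default before you start the daemon.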
Setting Up Newsgroups
The inn.conf
file doesn't include any newsgroup definitions or descriptions. That task is
left to two other configuration files: active
and newsgroups, both of which
are stored in the directory set by the pathdb
option in inn.conf (normally /var/lib/news).

The active file
contains a list of newsgroups that are supported on the system. Newsgroups
appear one to a line in this file, and order doesn't matter. Each line contains
four fields, separated by whitespace:

newsgroup.name himark lowmark flag

As you might expect, newsgroup.name
is the name of the newsgroup, such as comp.os.linux.misc.
The himark and lowmark fields specify the highest-
and lowest-numbered posts in the group, respectively. These values begin at 0000000000 and 0000000001, respectively, for a new group. (INN stores
posts as individual files with filenames being sequential numbers corresponding
to a local post number. This local post number is unrelated to the message ID,
and is likely to be different from one news server to another.) The flag field contains a flag that
denotes how the newsgroup is to be treated:

y: This is the most common flag; it indicates a newsgroup to which users can post.

n: Newsgroups with this flag accept new messages from news feeds, but not posts
from local users or news clients.

m: This flag indicates a moderated newsgroup; local posts are mailed to the group
moderator for approval.

j: Newsgroups with this flag accept posts, but don't keep them; INN only passes on
new posts to its feeds.

x: This newsgroup is static; new posts aren't accepted, either locally or from news
feeds.

=news.group: Posts to this group are moved into the group specified by news.group. You might use this flag to
redirect posts made to a defunct newsgroup.

A news server that only supports local operations is likely to
have just a few newsgroups. You can call these whatever you like, but
following the tiered naming convention of Usenet makes sense. For instance, you
might create all your local groups in a hierarchy that's named after your
organization, like threeroomco.support, threeroomco.support.bigproduct,
and threeroomco.accounting for three discussion groups at threeroomco.com.
If you acquire a news feed from an outside source, that source should be able
to provide you with a list of newsgroups, or even a complete active file, that you can use.

As your news server operates, INN will modify the himark and lowmark fields of the active file. As
articles are added, the himark
value will increase. As older articles are expired, the lowmark value will increase, although
it may not change with every expiration or
cancellation; as described later, articles may not expire in exactly the order
in which they're created.

The newsgroups file
is less critical to day-to-day operation of the news server than is the active file. As in the active file, the first field of each line
of newsgroups is the name of the
newsgroup. After this group name comes one or more tabs and a description of
the newsgroup. Clients can retrieve this file to help users locate groups, or
differentiate between two groups whose names are similar.
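
The four-field layout of the active file is simple enough to check with a short script. The following Python sketch parses a small active file for the local threeroomco hierarchy and reports how each group is treated. The threeroomco.oldgroup entry and the wording of the flag descriptions are made up for illustration, and the sketch handles only the flags described above.

```python
from typing import NamedTuple

# Short descriptions of the single-letter flags discussed in the text.
FLAG_MEANINGS = {
    "y": "local posting allowed",
    "n": "feeds only, no local posting",
    "m": "moderated",
    "j": "passed on to feeds but not kept",
    "x": "static, no new posts",
}


class ActiveEntry(NamedTuple):
    name: str
    himark: int
    lowmark: int
    flag: str

    def describe(self) -> str:
        """Explain the flag, including the =news.group redirect form."""
        if self.flag.startswith("="):
            return "posts filed in " + self.flag[1:]
        return FLAG_MEANINGS.get(self.flag, "unknown flag")


def parse_active(text):
    """Parse whitespace-separated active lines, skipping malformed ones."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        name, himark, lowmark, flag = fields
        entries.append(ActiveEntry(name, int(himark), int(lowmark), flag))
    return entries


# New groups start with himark 0000000000 and lowmark 0000000001.
SAMPLE_ACTIVE = """\
threeroomco.support 0000000000 0000000001 y
threeroomco.support.bigproduct 0000000000 0000000001 y
threeroomco.accounting 0000000000 0000000001 m
threeroomco.oldgroup 0000000000 0000000001 =threeroomco.support
"""

if __name__ == "__main__":
    for entry in parse_active(SAMPLE_ACTIVE):
        print(entry.name, "->", entry.describe())
```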
Controlling Access
Most sites restrict access to their news servers to prevent
abuse and conserve their network resources. There are three aspects to this
configuration: feeding news to other sites, restricting access for news feeds,
and restricting access for news clients. The first two options are important
mainly when your news server exchanges messages with others, but even if you
operate a standalone news server, you should check that it's not configured to
accept exchanges with other news servers, to avoid abuse. Configuration for
news clients is important for all news servers.
Feeding News to Other Sites
If you want messages posted by your users to reach other
sites, or if you've arranged to feed entire newsgroups to some other site, you
need to configure your system to contact other news servers to send messages on
their way. This is controlled through the /etc/news/newsfeeds
file, which contains lines of the following format:

sitename:pattern[,pattern...]:flag[,flag...]:param

These lines can become quite long, so you can split them by
using the backslash (\) line
continuation character; any line that ends in a backslash is continued on the
following line, so you can break a very long line across multiple lines for
ease of reading and editing. Each of the colon-delimited fields has a specific
meaning, as follows:

sitename: This is a code for the site. This code need not be a conventional hostname;
it's matched to a hostname in another configuration file, described shortly.

pattern: A pattern is a code for
one or more newsgroups. You may specify newsgroups individually if you want to
pass on posts in just a few, or you may use the asterisk (*) wildcard to match any string; for
instance, comp.os.* matches all
newsgroups in the comp.os
hierarchy. If you precede a pattern by an exclamation mark (!), posts in that group will not be passed on unless they're cross-posted to
another group. The at-sign (@)
has a similar meaning, but cross-posted messages are blocked, as well. For
instance, if you specify !comp.os.linux,
a message cross-posted to comp.os.linux
and comp.os.linux.hardware will
be passed on as part of the latter group; but @comp.os.linux
will cause the message to not be passed on at all. INN applies patterns in
sequence, so if you specify comp.os.*,!comp.os.linux,
INN will pass on messages in the comp.os
hierarchy except for those in comp.os.linux. Reversing the order would
pass on all groups, because the less-specific comp.os.*
would override the more specific !comp.os.linux.

flag: You can include one or more flags that limit what types of messages are passed
on to the remote site. For instance, <size
restricts messages to those less than size
bytes, and Gcount passes
a post only if it's posted to fewer than count
newsgroups. The newsfeeds man
page includes a description of additional flags.

param: This final field's meaning depends upon the type of news feed. It's usually the
name of a file in which the outgoing feed is stored. In other cases it may be
blank. The default newsfeeds
file includes many examples that are commented out.

The newsfeeds
file controls the creation of a file that will ultimately be transmitted to
another site. The /etc/news/nntpsend.ctl
file controls how INN contacts that site. Like newsfeeds,
nntpsend.ctl consists of four
colon-delimited fields:

sitename:site.host.name:max_size:[args]

The sitename
is the name of the site from the newsfeeds
file, and the site.host.name
is the site's conventional hostname. You can restrict the amount of data you'll
pass in a single transfer with the max_size
argument; for instance, 2m
limits transfers to 2MB or less. Finally, you can include optional arguments
that are passed to the innxmit
program, which does the actual transmission. Consult the innxmit man page for information on the
arguments it accepts.

You'll only need to deal with these configurations if you're
feeding news to other sites, or if you want to set up a feed from another site.
To be truly effective, a news feed from another site must accept posts that
originate with your users. Without this reciprocal connection, posts from your
users won't be available to readers on the Internet at large, just locally.
Thus, although you may think of yourself as accepting an external news feed,
you must configure your system to provide a feed to
your feeder, as well as accepting its input.
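
The interaction between pattern order, the exclamation mark, and the at-sign in newsfeeds is easy to get wrong, so it can help to model it. The Python sketch below is a simplified model of the behavior described above, not INN's own matcher: it uses Python's fnmatch module as a stand-in for INN's wildcard matching, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase


def group_verdict(group, patterns):
    """Classify one newsgroup as 'send', 'skip', or 'poison'.

    Patterns are applied in sequence, so the last one that matches wins.
    """
    verdict = "skip"
    for pattern in patterns:
        if pattern.startswith("!"):
            glob, result = pattern[1:], "skip"
        elif pattern.startswith("@"):
            glob, result = pattern[1:], "poison"
        else:
            glob, result = pattern, "send"
        if fnmatchcase(group, glob):
            verdict = result
    return verdict


def should_feed(article_groups, patterns):
    """Decide whether an article posted to article_groups is passed on."""
    verdicts = [group_verdict(group, patterns) for group in article_groups]
    if "poison" in verdicts:
        return False          # an @ pattern blocks cross-posts too
    return "send" in verdicts  # any wanted group is enough


if __name__ == "__main__":
    crosspost = ["comp.os.linux", "comp.os.linux.hardware"]
    print(should_feed(crosspost, ["comp.os.*", "!comp.os.linux"]))
    print(should_feed(crosspost, ["comp.os.*", "@comp.os.linux"]))
    print(should_feed(["comp.os.linux"], ["!comp.os.linux", "comp.os.*"]))
```

The three calls correspond to the cases in the text: a cross-post that survives a ! exclusion, the same cross-post blocked by @, and a reversed pattern list in which the broader comp.os.* overrides the exclusion.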
Setting News Feed Access
INN can control access to itself. The main daemon, innd, accepts connections from your feeder
news sites and from various other programs in the INN package. Although innd handles the initial connection from
news clients, it passes those connections to another program as quickly as
possible. Therefore, the main innd
connections control file, /etc/news/incoming.conf,
should list only the local computer and news feeder sites.

The basic unit in the incoming.conf
file is a key/value pair, which is how you
specify attributes and their values. These take the form key: value. These pairs may be
collected into peers, which are specifications
of individual computers. (Some key/value pairs are global in scope, though;
they don't appear in peers.) Peers may also be collected into groups. Both peers and groups use curly braces ({}) to delimit their extent. Listing 12.1 shows a typical incoming.conf file for a site that uses
one news feed.
Listing 12.1 A sample incoming.conf file

    # Global settings
    streaming: true
    max-connections: 50

    # Allow NNTP posting from localhost
    peer ME {
       hostname: "localhost, 127.0.0.1"
    }

    # Allow fiveroomco.com to send us most groups
    peer fiveroom {
       hostname: news.fiveroomco.com
       patterns: *,!threeroomco.*
    }
The most important key is hostname,
which specifies the hostname of the computer that's to be allowed a connection.
You can list specific newsgroups that may be transferred using the patterns key, using the newsgroup-naming
conventions of the newsfeeds
file. The default is to accept all newsgroups fed by the remote system. Various
other keys are described in the incoming.conf
man page.
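
To see how the global key/value pairs and the peer blocks in Listing 12.1 relate, the following Python sketch pulls them apart with a regular expression. It is deliberately minimal: it assumes peers are not nested inside group blocks, that values contain no # characters, and that comments never contain the word peer followed by braces. The function names are hypothetical.

```python
import re

LISTING_12_1 = """\
# Global settings
streaming: true
max-connections: 50

# Allow NNTP posting from localhost
peer ME {
   hostname: "localhost, 127.0.0.1"
}

# Allow fiveroomco.com to send us most groups
peer fiveroom {
   hostname: news.fiveroomco.com
   patterns: *,!threeroomco.*
}
"""

# Matches "peer NAME { ... }" and captures the name and the block body.
PEER_BLOCK = re.compile(r"peer\s+(\S+)\s*\{(.*?)\}", re.DOTALL)


def parse_pairs(block):
    """Collect key: value pairs from a block of text, ignoring comments."""
    pairs = {}
    for raw in block.splitlines():
        line = raw.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if sep:
            pairs[key.strip()] = value.strip().strip('"')
    return pairs


def parse_incoming(text):
    """Return (global_pairs, {peer_name: peer_pairs})."""
    peers = {name: parse_pairs(body) for name, body in PEER_BLOCK.findall(text)}
    global_pairs = parse_pairs(PEER_BLOCK.sub("", text))
    return global_pairs, peers


if __name__ == "__main__":
    global_pairs, peers = parse_incoming(LISTING_12_1)
    print(global_pairs)
    for name, pairs in peers.items():
        print(name, pairs)
```

Listing the peers this way is a quick audit that only the local computer and your feeder sites are allowed to connect to innd.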
Setting News Reader Access
Chances are you want authorized users to be able to access
your news server. Because innd
delegates this task to another program, you don't configure news reader access
in the incoming.conf file;
instead, you use the /etc/news/nnrp.access
file for this purpose. Each line in this file consists of five colon-delimited
entries, thus:

hostname:permissions:username:password:newsgroups

The meanings of specific entries are as follows:

hostname: This is the name or IP address of an individual host, or a pattern using an
asterisk wildcard to match a range of hosts, such as *.threeroomco.com to match any
client in the threeroomco.com
domain. When using IP addresses, you may use an IP address/netmask pair, as in 172.20.0.0/16.

permissions: This field contains one or more of R
(message reading is allowed), P
(posting is allowed), N (the
client may use the NEWNEWS
command), or L (the client may
post even to groups to which local posting is prohibited). These last two
options override global settings for specific clients.

username: If you want to restrict access to the server based on a username and password,
you should specify the username here; when this is done, the user must
authenticate before being allowed to post. A plus sign (+) causes the server to try to use the
Linux password database for authentication, but this often doesn't work,
particularly when the system is configured to use shadow passwords. If you
leave this and the next field blank, no authentication will be required to read
or post news.

password: This field contains the password that's required to access the news server.
Leaving this field and the username
field blank causes the system to not require authentication.

newsgroups: You can specify newsgroups using patterns like those used in the newsfeeds file if you want to restrict
the newsgroups to which certain hosts have access. Leaving this field blank
causes the server to make no newsgroups
available to the client, so to make all newsgroups available, the entry must
end in an asterisk (*).

If you include multiple lines in the nnrp.access file, later lines take
precedence over earlier ones. Thus, if you want to make global settings but
provide exceptions for specific hosts, you should place the lines for the
global settings earlier in the file.
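
The rule that later lines override earlier ones can be sketched in a few lines of Python. The access lines below are hypothetical, the hostname matching uses Python's fnmatch module rather than INN's own matcher, and only IPv4 address/netmask pairs are handled, so treat this as a model of the precedence rule and nothing more.

```python
from fnmatch import fnmatchcase
from ipaddress import ip_address, ip_network

# Hypothetical nnrp.access contents: a restrictive global line first,
# then more generous exceptions for local clients.
SAMPLE_ACCESS = """\
*:R:::threeroomco.*
*.threeroomco.com:RP:::*
172.20.0.0/16:RP:::*
"""


def host_matches(pattern, hostname, address):
    """Match a client against a hostname wildcard or an address/netmask pair."""
    if "/" in pattern:
        return address is not None and ip_address(address) in ip_network(pattern)
    if fnmatchcase(hostname.lower(), pattern.lower()):
        return True
    return address is not None and fnmatchcase(address, pattern)


def lookup(access_text, hostname, address=None):
    """Return the fields of the last matching line, or None if none match."""
    chosen = None
    for raw in access_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 5:
            continue  # not a five-field access line
        host, perms, user, password, groups = fields
        if host_matches(host, hostname, address):
            chosen = {"permissions": perms, "username": user,
                      "password": password, "newsgroups": groups}
    return chosen


if __name__ == "__main__":
    print(lookup(SAMPLE_ACCESS, "pc1.threeroomco.com", "172.20.5.9"))
    print(lookup(SAMPLE_ACCESS, "host.example.org", "192.0.2.10"))
```

With these lines, a local client ends up with read and post access to every group, while an outside host falls back to the first line and can only read the local hierarchy.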
Setting Message Expiration Options
The /etc/news/expire.ctl
file controls the automatic expiration (deletion) of messages. Most of the
lines in this file follow a pattern similar to that in other configuration
files, with five colon-delimited fields:

pattern:modflag:keep:default:purge

The meanings of these fields are as follows:

pattern: This is a newsgroup specification. As with others, an asterisk (*) is a
wildcard, so * alone matches all newsgroups, comp.os.* matches the
entire comp.os hierarchy, and so on.

modflag: This flag is a single character that indicates the rule applies to
moderated groups only (M), unmoderated groups only (U), or all newsgroups (A).

keep: Articles may include a header called Expires that specifies a
unique expiration time for a specific article. You may set an overriding
minimum value (in days) in this field. For instance, if keep is
set to 6, 7.5, or some higher value, an article that's set to expire in only five
days won't expire until the later date you specify. The value of keep may
be a floating-point number, and the value never indicates that the article will
never expire. (Use never very cautiously, though, since it can cause your hard disk to fill
quite quickly.)

default: This is the most important value, since it sets the expiration
time for articles without an Expires header, which most news postings lack. As with keep, you
specify the value in days, which may be expressed as a floating-point value. The
value never means that articles
are never expired.

purge: The keep field lets you override an Expires header when it's
lower than you might like. The purge field lets you override an Expires header
that's longer than you might like. For instance, if you set purge to 10 and receive
an article with an Expires header that specifies it's to be kept for 100 days, your system
will expire the article after only ten days. As with keep and default,
the value of purge may be floating-point or never.
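
The way keep, default, and purge combine can be expressed as a small calculation. The Python sketch below reads one rule in the five-field format and returns how many days an article would be retained, given the number of days requested by its Expires header (or None if it has no such header). The rule values and the function names are made up for illustration; this models the description above rather than the expiration program itself.

```python
import math


def parse_days(value):
    """Convert a keep/default/purge field to days; 'never' means no limit."""
    return math.inf if value == "never" else float(value)


def days_kept(rule, expires_after=None):
    """Days an article is retained under one expire.ctl-style rule.

    expires_after is the lifetime requested by the article's Expires
    header, in days, or None when the header is absent.
    """
    pattern, modflag, keep, default, purge = rule.split(":")
    if expires_after is None:
        return parse_days(default)
    # Clamp the requested lifetime between the keep and purge limits.
    return min(max(expires_after, parse_days(keep)), parse_days(purge))


if __name__ == "__main__":
    rule = "*:A:6:8:10"  # hypothetical: keep 6, default 8, purge 10
    print(days_kept(rule))        # no Expires header
    print(days_kept(rule, 5))     # Expires sooner than keep allows
    print(days_kept(rule, 100))   # Expires later than purge allows
```

With this rule, an article with no Expires header lasts 8 days, one asking for 5 days is held for 6, and one asking for 100 days is removed after 10.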
Ongoing News Server Maintenance
INN normally runs directly as a daemon,
started by SysV startup scripts as described in Chapter 4,
Starting Servers. If you installed INN from a package included with your
distribution, you should be able to start it by running such a startup script.

Some of the tasks described earlier, such as
sending messages to other servers and expiring articles, aren't handled
automatically by innd as it runs in normal operation. Instead, these tasks are controlled
by scripts or utility programs that are called by cron. If you installed INN
from a package that came with your distribution, chances are that it created
appropriate configuration files to have these tasks occur automatically by
placing crontab files in /etc/cron.d, /etc/cron.interval, or some other location. If you want to change the frequency with
which these tasks occur, you should check these locations or use your package
manager to find out what cron files were placed where. You can then modify, move, or delete the
files, and if necessary create new ones to take over these tasks.

Other server maintenance tasks involve
modifying your configuration. For instance, you may need to add a new
newsgroup, delete an existing newsgroup, or temporarily disable access to the
server. These tasks can be accomplished with the ctlinnd utility, which
accepts a large number of options. Type ctlinnd -h to view a
list of its options.
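
If you script these maintenance tasks, a thin wrapper that defaults to a dry run is a cautious way to start. The Python sketch below builds ctlinnd command lines for adding a group, removing a group, and pausing and resuming the server. The subcommand names (newgroup, rmgroup, pause, go) are assumptions drawn from the ctlinnd man page rather than from this chapter, so check them against the output of ctlinnd -h on your system; the group names and helper functions are hypothetical.

```python
import subprocess


def ctlinnd_command(action, *args):
    """Build an argument list for ctlinnd without running anything."""
    return ["ctlinnd", action, *args]


def run(command, dry_run=True):
    """Print the command, or execute it when dry_run is False."""
    if dry_run:
        print(" ".join(command))
        return None
    # check=True raises CalledProcessError if ctlinnd reports a failure.
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Assumed subcommands; confirm with "ctlinnd -h" before disabling dry_run.
    run(ctlinnd_command("newgroup", "threeroomco.support", "y"))
    run(ctlinnd_command("rmgroup", "threeroomco.oldgroup"))
    run(ctlinnd_command("pause", "maintenance"))
    run(ctlinnd_command("go", "maintenance"))
```

Keeping the command construction separate from execution lets you review exactly what would be sent to the server before anything changes.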