integcheck

Name

integcheck -- Downloads and checks remote HTTP pages.

Description

The integcheck module downloads a set of remote HTTP pages and checks them against a set of rules. It connects to an HTTP server and downloads a page (which can be either publicly accessible or protected by login/password authentication). Once the page is retrieved, integcheck applies two rules to the document to assess its validity:

  • Error Keywords

    If a word or phrase listed in this category is found in the document, then integcheck reports an error. Things like "404: Not Found", "Permission Denied", "Internal Server Error", or other words that should not appear on your page belong in this category.

  • Required Keywords

    These are words that should appear on your page, and should often be something unique to this page. If integcheck does not find any of these words or phrases on the page, it reports an error.

Note that when integcheck applies these rules, it does NOT strip out the HTML. Instead, be careful to select words or phrases that will not be matched within HTML (or, select words or phrases of HTML that are unique to that page).
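The two rules can be sketched in a few lines of Python. This is only an illustration, not integcheck's actual implementation: the function name check_page is hypothetical, matching is assumed to be a plain case-sensitive substring search against the raw HTML, and a page is assumed to fail the second rule as soon as any one required phrase is missing.

```python
def check_page(body, error_keywords, require_keywords):
    """Return a list of problems found in the raw page body (HTML included)."""
    problems = []
    # Rule 1: any error keyword present in the page is a failure.
    for phrase in error_keywords:
        if phrase in body:
            problems.append(("error_keywords", phrase))
    # Rule 2 (assumed reading): every required phrase must be present.
    for phrase in require_keywords:
        if phrase not in body:
            problems.append(("require_keywords", phrase))
    return problems


page = "<html><title>My Company Inc: Home</title><body>Welcome</body></html>"
print(check_page(page, ["Internal Server Error"], ["My Company Inc: Home"]))

# Markup is not stripped, so a phrase that only occurs inside a tag still matches.
print(check_page('<a href="/404.html">help</a>', ["404"], []))
```

The second call shows why a short error keyword such as "404" can misfire: it matches the link target even though the page itself loaded correctly.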

When integcheck fails to retrieve a document within the specified timeout, it will try again at the next cycle (if multiple retries are specified). In this case, it may take several cycles for integcheck to report that a remote page is not retrievable: integcheck reports an error only after it has tried to retrieve the page the specified number of times. If no retries are specified, then integcheck will fail the page as soon as it cannot retrieve it.
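The retry behaviour across cycles can be pictured with the following sketch. The class RetryTracker is hypothetical, and two details are assumptions rather than documented behaviour: that a successful retrieval resets the count, and that an unset or zero retry value means "fail on the first miss".

```python
class RetryTracker:
    """Counts consecutive failed retrievals of one page across cycles."""

    def __init__(self, retries=0):
        # Assumption: with no retries configured, the first failure is reported.
        self.allowed_attempts = max(retries, 1)
        self.failed_attempts = 0

    def record(self, retrieved):
        """Record one cycle's outcome; return True if an error should be reported."""
        if retrieved:
            self.failed_attempts = 0  # assumed: a success resets the count
            return False
        self.failed_attempts += 1
        return self.failed_attempts >= self.allowed_attempts


tracker = RetryTracker(retries=2)
# Four cycles: two misses, one success, one more miss.
print([tracker.record(ok) for ok in (False, False, True, False)])
# -> [False, True, False, False]
```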

The authentication for integcheck is simplistic: it will only authenticate using the "Basic" HTTP authentication method. In other words, if you navigate to a page with a popular GUI web browser, and a login/password prompt appears in a dialog on your screen, integcheck can (probably) authenticate with that. Note that integcheck does not authenticate using HTML forms, nor does it handle SSL encryption.
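For reference, "Basic" authentication sends the login and password joined by a colon and Base64-encoded in an Authorization request header. The sketch below builds that header value with Python's standard library; the helper name basic_auth_header is hypothetical and the credentials are the well-known sample pair from the HTTP specification, not anything integcheck ships with.

```python
import base64


def basic_auth_header(login, password):
    """Build the value of the Authorization header for HTTP Basic authentication."""
    credentials = f"{login}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


print(basic_auth_header("Aladdin", "open sesame"))
# -> Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Base64 is an encoding, not encryption. Because integcheck does not handle SSL, credentials sent this way can be read by anyone able to observe the traffic.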

Configuration

The integcheck module can be configured to get its list of URLs (along with keywords and authentication information) in two ways. Sites may be listed in a MySQL database, or may be entered directly into the rspd.conf configuration file (or the RSPD Configuration dialog box in Windows). Where the number of URLs is large or subject to frequent change, it may be easier to administer them in a MySQL table. If the list is small and relatively constant, it is probably easier to enter the sites directly into the integcheck configuration.

Obtaining sites via MySQL

If you choose to use a database, its connection details must be given to integcheck in the RSPD configuration file. The following parameters apply to the integcheck module. Some are optional and have default values, while others are required.

  • driver

    Currently only supports "mysql".

  • hostname

    The address (or IP) of the MySQL database server (default is localhost).

  • port

    The port of the MySQL database server (default is 3306).

  • user

    The user name to log into the MySQL database server.

  • password

    The password for the user name of the MySQL database server. Can be empty (using "") if no password is necessary for the given user.

  • database

    The database to use in the MySQL database server (default is integcheck).

  • sites_list

    The name of the table in the database (specified above) that lists the integcheck sites to check.

Here is an example configuration:

# Example configuration of integcheck, using MySQL to get sites
config
{
	driver = mysql
	hostname = 192.168.1.4
	user = rsp
	password = dtr1234
	database = integcheck
	sites_list = integcheck_sites
}

To specify a list of sites, use a MySQL database. Create a new database (call it "integcheck"). Then, create a new table, as follows:

create table integcheck_sites (
	name VARCHAR(128) DEFAULT '' NOT NULL,
	url text,
	error_keywords VARCHAR(255) DEFAULT '',
	require_keywords VARCHAR(255) DEFAULT '',
	retries int(30) DEFAULT '2' NOT NULL,
	timeout int(30) DEFAULT '100' NOT NULL,
	login VARCHAR(64) DEFAULT '',
	password VARCHAR(64) DEFAULT '',
	PRIMARY KEY (name)
);

You now have the appropriate table and database needed for integcheck. Next, insert each site to monitor into the integcheck_sites table, as follows:

insert into integcheck_sites values('draconis software main page', 'http://www.dracoware.com/index.php', '"404", "Permission Denied"', '"Draconis Software: Home"', 2, 5, '', '');

Note the syntax for the keywords listings. They must be specified in this manner if integcheck is to recognize what keywords to look for. Insert each site you wish to monitor in the above fashion, and integcheck will download the list of URLs to monitor. Each time you change the list of URLs, you must restart the RSPD (by issuing a hangup signal).
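The keyword listing is a comma-separated series of double-quoted phrases, with the whole listing wrapped in single quotes in the SQL statement. A reader of that column might split it as sketched below. The function parse_keywords is hypothetical, and the sketch assumes phrases never contain an escaped double quote, since this document does not say how integcheck treats one.

```python
import re


def parse_keywords(column_value):
    """Split a keyword listing such as '"404", "Permission Denied"' into phrases."""
    # Collect the text between each pair of double quotes; an empty or
    # missing column value yields an empty list.
    return re.findall(r'"([^"]*)"', column_value or "")


print(parse_keywords('"404", "Permission Denied"'))
# -> ['404', 'Permission Denied']
```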

Most of the columns should be self-explanatory. "Retries" refers to the number of times integcheck will attempt to access a site before it treats the failure as an error (and possibly a crossed threshold). "Timeout" is the number of seconds for which integcheck will attempt to reach the site.

Obtaining sites via configuration file

As an alternate way of giving sites to integcheck, URLs may be specified directly in the configuration file for RSPD.

Sites are given to integcheck in a manner similar to the way thresholds are set for RSPD. Each site is given a name, and then parameters for that site are set by referring to its name. Possible parameters are "url", "error_keywords", "require_keywords", "login"/"password" (if authentication is needed), and optionally "retries" and "timeout". These last two refer, respectively, to the number of times a connection will be attempted before the site is deemed down, and the number of seconds for which integcheck will attempt to reach a site. All of these parameters, with the exception of "url", may be set globally by using them without any site name. If set this way, the value applies to all of the sites listed. The following example should make this clear.

# Example configuration of integcheck, using rspd.conf to get sites
config
{
	# Global settings that will be the defaults for sites
	retries = 2
	timeout = 5

	# Sites are listed here
	# If you want a site name with more than one word, put it in double quotes
	site = main_page
	main_page.url = http://www.mysite.com/index.php
	main_page.error_keywords = "Internal Server Error", "Permission Denied"
	main_page.require_keywords = "My Company Inc: Home"

	# This page requires authentication
	site = admin_page
	admin_page.url = http://www.mysite.com/admin/reports/index.html
	admin_page.error_keywords = "Internal Server Error", "Not Found"
	admin_page.login = admin
	admin_page.password = dDr332Hp

	# This page overrides the default timeout value
	site = "site map for homepage"
	"site map for homepage".url = http://www.mysite.com/site_map.php
	"site map for homepage".require_keywords = "Site Map", "My Company"
	"site map for homepage".timeout = 30
}
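
The way global values act as defaults for each site can be sketched as a simple merge. This is an illustration under assumptions, not integcheck's code: the function effective_settings and the dictionary layout are hypothetical, and a per-site value is assumed to always win over the global one.

```python
def effective_settings(global_settings, site_settings):
    """Merge global defaults with one site's own parameters."""
    if "url" not in site_settings:
        raise ValueError("every site needs its own url")
    merged = dict(global_settings)
    merged.pop("url", None)  # "url" cannot be set globally
    merged.update(site_settings)  # per-site values override the defaults
    return merged


# Values taken from the example configuration above.
global_settings = {"retries": 2, "timeout": 5}
site_map = {"url": "http://www.mysite.com/site_map.php", "timeout": 30}
print(effective_settings(global_settings, site_map))
```

Here the site map keeps the global retries value of 2 but uses its own timeout of 30 seconds.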
			

Thresholds

There are a number of different possible thresholds for integcheck. Sites may fail because they are down, because an error keyword was found, or because a required keyword was not found. A threshold may be set on one of these events for a site, or on any of them. Likewise, a threshold can be set on a problem with any of the sites given to integcheck.

Thresholds in integcheck do not use any logical operators. Instead, a threshold gives a possible error condition, and notifies integcheck that the event occurring constitutes a crossed threshold. Sites are referred to by their names given in the configuration. You can append one of the following to the name to refer to its event: ".connection", ".error_keywords", ".require_keywords". So if "main_page.connection" was given as a threshold, then you would be warned if the site is inaccessible, but not if it fails any kind of keyword check. If only the name of a site is given with no event, then the threshold will be crossed if any kind of error occurs.

You can also use "any" to refer to any of the sites. So, to be warned if any of the sites go down, "any.connection" could be used as a threshold. If you just specify "any", then any kind of error on any of the sites would constitute a threshold crossing.

Here are some examples of possible thresholds for integcheck:

# Example thresholds for integcheck
thresh1.threshold = IntegCheck.any.connection
thresh2.threshold = IntegCheck.main_page.error_keywords
thresh3.threshold = IntegCheck."site map for homepage"

The first threshold shown will be crossed if any of the sites become unreachable. The second will be crossed if one of the error keywords is found on site "main_page". The third will be crossed if "site map for homepage" is unreachable, if an error keyword is found, or if a required keyword is not found.
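The matching rules for threshold names can be sketched as follows. Everything here is hypothetical and for illustration only: the functions parse_threshold and is_crossed, the regular expression, and the representation of failures as (site, event) pairs. The sketch also assumes the leading "IntegCheck." module prefix is optional and that site names contain no dots unless quoted.

```python
import re

EVENTS = ("connection", "error_keywords", "require_keywords")

# Optional module prefix, a bare or double-quoted site name, an optional event.
SPEC = re.compile(
    r'^(?:IntegCheck\.)?(?:"(?P<quoted>[^"]+)"|(?P<bare>[^."]+))'
    r'(?:\.(?P<event>\w+))?$'
)


def parse_threshold(spec):
    """Split a threshold into (site, event); event is None when omitted."""
    match = SPEC.match(spec)
    if match is None:
        raise ValueError(f"unrecognised threshold: {spec!r}")
    event = match.group("event")
    if event is not None and event not in EVENTS:
        raise ValueError(f"unknown event: {event!r}")
    return match.group("quoted") or match.group("bare"), event


def is_crossed(spec, failures):
    """failures is a set of (site, event) pairs observed in the current cycle."""
    site, event = parse_threshold(spec)
    return any(
        site in ("any", failed_site) and event in (None, failed_event)
        for failed_site, failed_event in failures
    )


failures = {("main_page", "error_keywords")}
print(is_crossed("IntegCheck.any.connection", failures))            # False
print(is_crossed("IntegCheck.main_page.error_keywords", failures))  # True
print(is_crossed('IntegCheck."site map for homepage"', failures))   # False
```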

Some of these thresholds may be crossed in multiple ways. For example, a threshold of "any.connection" might be crossed if site1 goes down or if site2 goes down. Integcheck will send multiple warnings if a threshold is crossed in different ways. For example, say the threshold "any.connection" is initially not crossed. When site1 goes down, an email warning is sent (assuming the user has configured the threshold in this way). The threshold is now crossed. Say site2 then goes down as well. A second email warning will be sent even though the threshold has remained crossed. This way, users do not have to worry about missing information if they use a broad type of threshold.
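One way to picture this behaviour is to remember which failures have already produced a warning and to warn only about new ones. The class ThresholdNotifier below is a hypothetical sketch, not integcheck's implementation. In particular, forgetting a failure once it clears (so that it warns again if it recurs) is an assumption.

```python
class ThresholdNotifier:
    """Remembers which failures have already triggered a warning."""

    def __init__(self):
        self.reported = set()

    def update(self, matching_failures):
        """Return the failures that are new since the last cycle."""
        current = set(matching_failures)
        new = current - self.reported
        # Assumed: failures that have cleared are forgotten, so they warn
        # again if they recur later.
        self.reported = current
        return sorted(new)


notifier = ThresholdNotifier()
print(notifier.update({("site1", "connection")}))
# -> [('site1', 'connection')]
print(notifier.update({("site1", "connection"), ("site2", "connection")}))
# -> [('site2', 'connection')]
print(notifier.update({("site1", "connection"), ("site2", "connection")}))
# -> []
```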

History Data

Integcheck does not currently save graphing data, since none of its data is numerical. However, the data strings that integcheck outputs may still be stored for later viewing and analysis.