The following simple script is available in the doc/InstallationTest.pl file. It must be run as 'root' and tests that the basic functions of the Combine installation work.
Basically it creates and initializes a new jobname, crawls one specific test page and exports it as XML. This XML is then compared to a known-correct XML record for that page.
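A minimal way to run it (assuming the current directory is the top of the Combine source tree; the text above does not state any arguments):

perl doc/InstallationTest.pl    # must be run as root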
This example gives more details on how to write a topic filter Plug-In.

The first algorithm sums term weights over all matched locations:
\begin{displaymath}
\mbox{Relevance\_score} = \sum_{\mbox{all locations}} \left( \sum_{\mbox{all terms}} hits[\mbox{location}_{j}][\mbox{term}_{i}] * weight[\mbox{term}_{i}] * weight[\mbox{location}_{j}] \right)
\end{displaymath}
where $weight[\mbox{term}_{i}]$ is the weight of $\mbox{term}_{i}$ from the term list, $weight[\mbox{location}_{j}]$ is the weight assigned to $\mbox{location}_{j}$ in the document, and $hits[\mbox{location}_{j}][\mbox{term}_{i}]$ is the number of matches of $\mbox{term}_{i}$ in $\mbox{location}_{j}$.

The second algorithm also takes the position of each match and the proximity of term components into account:
\begin{displaymath}
\mbox{Relevance\_score} = \sum_{\mbox{all terms}} \left( \sum_{\mbox{all matches}} \frac{weight[\mbox{term}_{i}]}{\log(k * position[\mbox{term}_{i}][\mbox{match}_{j}]) * proximity[\mbox{term}_{i}][\mbox{match}_{j}]} \right)
\end{displaymath}
where $weight[\mbox{term}_{i}]$ is the weight of the term, $position[\mbox{term}_{i}][\mbox{match}_{j}]$ is the position in the document of $\mbox{match}_{j}$ of $\mbox{term}_{i}$, and $proximity[\mbox{term}_{i}][\mbox{match}_{j}]$ is calculated as $\log(distance\_between\_components)$ for multi-word terms.
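To make the first formula concrete, here is a minimal sketch of a topic filter Plug-In in Perl. The module name, the classify() entry point, the XWI accessor names, and all terms, weights and the cutoff are illustrative assumptions, not the definitive Combine Plug-In API:

package Combine::MyTopicFilter;    # hypothetical module name
use strict;
use warnings;

# weight[term_i]: illustrative topic terms and weights (placeholders)
my %term_weight = ( 'focused crawler' => 10, 'web harvesting' => 5 );
# weight[location_j]: illustrative per-location weights (placeholders)
my %loc_weight = ( title => 4, text => 1 );
my $cutoff = 20;                   # relevance threshold (placeholder)

sub classify {
    my ($xwi) = @_;
    # Accessor names are assumptions modelled on the XWI description
    # later in this document.
    my %location = (
        title => $xwi->title || '',
        text  => $xwi->text  || '',
    );
    my $score = 0;
    for my $loc (keys %location) {
        for my $term (keys %term_weight) {
            # hits[location_j][term_i]: case-insensitive match count
            my $hits = () = $location{$loc} =~ /\Q$term\E/gi;
            $score += $hits * $term_weight{$term} * $loc_weight{$loc};
        }
    }
    return $score >= $cutoff ? 1 : 0;   # true = record is topic-relevant
}

1;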
CREATE TABLE recordurl (
  recordid int(11) NOT NULL auto_increment,
  urlid int(11) NOT NULL default '0',
  lastchecked timestamp NOT NULL,
  md5 char(32),
  fingerprint char(50),
  KEY md5 (md5),
  KEY fingerprint (fingerprint),
  PRIMARY KEY (urlid),
  KEY recordid (recordid)
) ENGINE=MyISAM DEFAULT CHARACTER SET=utf8;

CREATE TABLE admin (
  status enum('closed','open','paused','stopped') default NULL,
  schedulealgorithm enum('default','bigdefault','advanced') default 'default',
  queid int(11) NOT NULL default '0'
) ENGINE=MEMORY DEFAULT CHARACTER SET=utf8;

CREATE TABLE log (
  pid int(11) NOT NULL default '0',
  id varchar(50) default NULL,
  date timestamp NOT NULL,
  message varchar(255) default NULL
) ENGINE=MyISAM DEFAULT CHARACTER SET=utf8;
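To illustrate how the keys above are typically used, two hedged example queries (the MD5 value and pid are placeholders):

SELECT recordid, urlid, lastchecked
FROM recordurl
WHERE md5 = '0123456789abcdef0123456789abcdef';

SELECT date, message
FROM log
WHERE pid = 12345
ORDER BY date DESC
LIMIT 10;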
combineExport - export records in XML from Combine database
combineExport -jobname name [-profile alvis|dc|combine] [-charset utf8|isolatin] [-number n] [-recordid n] [-md5 MD5] [-pipehost server] [-pipeport n] [-incremental]
jobname is used to find the appropriate configuration (mandatory)
Three profiles are supported: alvis, dc, and combine. alvis and combine are similar XML formats.
The 'alvis' profile format is defined by the Alvis enriched document format DTD. It uses charset UTF-8 by default.
'combine' is more compact with less redundancy.
'dc' is XML encoded Dublin Core data.
Selects a specific character set, either UTF-8 or iso-latin-1. Overrides the -profile setting.
Specifies the server name and port to connect to for exporting data using the Alvis Pipeline. Exports incrementally, i.e. all changes since the last call to combineExport with the same pipehost and pipeport.
the max number of records to be exported
Export just the one record with this recordid
Export just the one record with this MD5 checksum
Exports incrementally, i.e. all changes since the last call to combineExport using -incremental
Generates records in Combine native format and converts them using this XSLT script before output. See example scripts in /etc/combine/*.xsl
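Some invocation sketches using only the switches documented above (the job name aatest, the host name and the file names are placeholders):

combineExport -jobname aatest -profile alvis > records.xml
combineExport -jobname aatest -profile dc -number 100 > first100.xml
combineExport -jobname aatest -pipehost indexer.example.org -pipeport 3333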
Combine configuration documentation in /usr/share/doc/combine/.
Alvis XML schema (-profile alvis) at http://project.alvis.info/alvis_docs/enriched-document.xsd
Anders Ardö, anders.ardo@it.lth.se
Copyright (C) 2005 - 2006 Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
combineCtrl - controls a Combine crawling job
combineCtrl action -jobname name
where action can be one of start, kill, load, recyclelinks, reharvest, stat, howmany, records, hosts, initMemoryTables, open, stop, pause, continue
jobname is used to find the appropriate configuration (mandatory)
start: takes an optional switch -harvesters n, where n is the number of crawler processes to start
kill: kills all active crawlers (and their associated combineRun monitors) for jobname
load: reads a list of URLs from STDIN (one per line) and schedules them for crawling
recyclelinks: schedules all newly found links (since the last invocation of recyclelinks) in crawled pages for crawling
reharvest: schedules all pages in the database for crawling again (in order to check if they have changed)
open: opens the database for URL scheduling (maybe after a stop)
stop: stops URL scheduling
pause: pauses URL scheduling
continue: continues URL scheduling after a pause
stat: prints out rudimentary status of the ready queue (i.e. URLs eligible for crawling now)
howmany: prints out rudimentary status of all URLs to be crawled
records: prints out the number of records in the SQL database
hosts: prints out rudimentary status of all hosts that have URLs to be crawled
initMemoryTables: initializes the administrative MySQL tables that are kept in memory
Implements various control functionality for administering a crawling job, like starting and stopping crawlers, injecting URLs into the crawl queue, scheduling newly found links for crawling, controlling scheduling, etc.
This is the preferred way of controlling a crawl job.
Seed the crawling job aatest with a URL
Start 3 crawling processes for job aatest
Schedule all newly found links for crawling
See how many URLs are eligible for crawling right now. (Commands for these four tasks are sketched below.)
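A hedged sketch of the corresponding commands (the seed URL is a placeholder; switch order follows the synopsis above):

echo 'http://www.example.com/' | combineCtrl load -jobname aatest
combineCtrl start -jobname aatest -harvesters 3
combineCtrl recyclelinks -jobname aatest
combineCtrl howmany -jobname aatest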
combine
Combine configuration documentation in /usr/share/doc/combine/.
Anders Ardö, anders.ardo@it.lth.se
Copyright (C) 2005 Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
combineRun - starts, monitors and restarts a combine harvesting process
combineRun pidfile 'combine command to run'
Starts a program and monitors it in order to make sure there is always a copy running. If the program dies it will be restarted with the same parameters. Used by combineCtrl when starting combine crawling.
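An invocation sketch (the pidfile path is a placeholder, and whether the combine command must be quoted as a single argument is an assumption):

combineRun /var/run/combine-aatest.pid 'combine -jobname aatest -logname 1'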
combineCtrl
Anders Ardö, anders.ardo@it.lth.se
Copyright (C) 2005 Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
combineReClassify - main program that reanalyses records in a Combine database
Algorithm:
  select relevant records based on the cls parameter
  for each record
    get the record from the database
    delete analysis information from the record
    re-analyse the record
    if still relevant
      save in the database
combineSVM - generate an SVM model from good and bad examples
combineSVM -jobname name [-good good-file] [-bad bad-file] [-train model-file] [-help]
jobname is used to find the appropriate configuration (mandatory)
good is the name of a file with good URLs, one per line. Default 'goodURL.txt'
bad is the name of a file with bad URLs, one per line. Default 'badURL.txt'
train is the name of the file where the trained SVM model will be stored. Default 'SVMmodel.txt'
Takes two files, one with positive examples (good) and one with negative examples (bad), and trains an SVM classifier using these. The resulting model is stored in the file given by -train.
The example files should contain one URL per line and nothing else.
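An invocation sketch using the documented switches and their default file names (the job name aatest is a placeholder):

combineSVM -jobname aatest -good goodURL.txt -bad badURL.txt -train SVMmodel.txt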
combine
Combine configuration documentation in /usr/share/doc/combine/.
Ignacio Garcia Dorado
Anders Ardö, anders.ardo@it.lth.se
Copyright (C) 2008 Ignacio Garcia Dorado, Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
combineRank - calculates various Ranks for a Combine crawled database
combineRank action -jobname name [-verbose]
where action can be one of PageRank, PageRankBL, NetLocRank, and exportLinkGraph. Results on STDOUT.
jobname is used to find the appropriate configuration (mandatory)
verbose enables printing of ranks to STDOUT as SQL INSERT statements
PageRank: calculates standard PageRank
PageRankBL: calculates PageRank with backlinks added for each link
NetLocRank: calculates a SiteRank for each site and a local DocRank for the documents within each site. Global ranks are then calculated as SiteRank * DocRank
exportLinkGraph: exports the linkgraph from the Combine database
Implements calculation of different variants of PageRank.
Results are written to STDOUT and can be huge for large databases.
The linkgraph is exported in ASCII as a sparse matrix, one row per line. The first integer is the ID (urlid) of a page with links; the rest of the integers on the line are the IDs of the pages linked to. For example, the line '121 5624 23416 51423 267178' means that page 121 links to pages 5624, 23416, 51423 and 267178.
Calculate PageRank with backlinks (result on STDOUT) and export the linkgraph to STDOUT, as sketched below.
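Hedged command sketches for these two examples (the job name and output file names are placeholders):

combineRank PageRankBL -jobname aatest > ranks.txt
combineRank exportLinkGraph -jobname aatest > linkgraph.txt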
combine
Combine configuration documentation in /usr/share/doc/combine/.
Anders Ardö, anders.ardo@it.lth.se
Copyright (C) 2006 Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
combineUtil - various operations on the Combine database
combineUtil action -jobname name
where action can be one of stats, termstat, classtat, sanity, all, serveralias, resetOAI, restoreSanity, deleteNetLoc, deletePath, deleteMD5, deleteRecordid, addAlias
jobname is used to find the appropriate configuration (mandatory)
stats: global statistics about the database
termstat: generates statistics about the terms from the topic ontology matched in documents (can be long output)
classtat: generates statistics about the topic classes assigned to documents
sanity: performs various sanity checks on the database
restoreSanity: deletes records that the sanity checks find insane
resetOAI: removes all history (i.e. 'deleted' records) from the OAI table. This is done by removing the OAI table and recreating it from the existing database.
all: does the actions stats, sanity, classtat, termstat
deleteNetLoc: deletes all records matching the ','-separated list of server net-locations (server names optionally with port) in the switch -netlocstr. Net-locations can include SQL wild cards ('%').
deletePath: deletes all records matching the ','-separated list of URL paths (excluding net-locations) in the switch -pathsubstr. Paths can include SQL wild cards ('%').
deleteMD5: deletes the record that has the MD5 given in switch -md5
deleteRecordid: deletes the record that has the recordid given in switch -recordid
serveralias: detects server aliases in the current database and does an 'addAlias' for each detected alias
addAlias: manually adds a server alias to the system. Requires switches -aliases and -preferred
Does various statistics generation as well as performing sanity checks on the database
Generate matched term statistics. (Invocation sketches are given below.)
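Hedged invocation sketches (the job name aatest is a placeholder):

combineUtil all -jobname aatest
combineUtil termstat -jobname aatest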
combine
Combine configuration documentation in /usr/share/doc/combine/.
Anders Ardö, anders.ardo@it.lth.se
Copyright (C) 2005 Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
Combine - Focused Web crawler framework
combine -jobname name [-logname id]
jobname is used to find the appropriate configuration (mandatory)
logname is used as identifier in the log (in MySQL table log)
Does crawling, parsing, an optional topic-check, and stores the result in a MySQL database. Normally started with the combineCtrl command. Briefly, it gets a URL from the MySQL database, which acts as a common coordinator for a Combine job. The Web page is fetched, provided it passes the robot exclusion protocol. The HTML is cleaned using Tidy and parsed into metadata, headings, text, links and link anchors. Then it is stored in the MySQL database in structured form (optionally only if a topic-check is passed, to keep the crawler focused).
A simple workflow for a trivial crawl job might look like:
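A minimal sketch, assuming a job named aatest and a placeholder seed URL (combineINIT's exact switches may differ; see its documentation):

combineINIT -jobname aatest
echo 'http://www.example.com/' | combineCtrl load -jobname aatest
combineCtrl start -jobname aatest
combineExport -jobname aatest > records.xml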
For more complex jobs you have to edit the job configuration file.
combineINIT, combineCtrl
Combine configuration documentation in /usr/share/doc/combine/.
Anders Ardö, anders.ardo@it.lth.se
Copyright (C) 2005 Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
PosMatcher
This is a module in the DESIRE automatic classification system. Copyright 1999.
Exported routines:
1. Fetching text: these routines all extract texts from a document (either a Combine record, a Combine XWI data structure, or a WWW page identified by a URL). They all return: $meta, $head, $text, $url, $title, $size
  $meta: metadata from the document
  $head: important text from the document
  $text: plain text from the document
  $url: the URL of the document
  $title: the HTML title of the document
  $size: the size of the document
2. Term matcher: Match accepts a text as a (reference) parameter and matches each term in the term list against the text. Matches are recorded in an associative array with class as key and summed weight as value.
  Match parameters: $text, $termlist
  $text: the text to match against the termlist
  $termlist: object pointer to a LoadTermList object with a termlist loaded
  Output: %score, an associative array with classifications as keys and scores as values
3. Heuristics: sums scores down the classification tree to the leaves.
  cleanEiTree parameters: %res, an associative array from Match
  Output: %res, the same array
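A hedged usage sketch of the Match routine; the module paths, constructor and loader names are assumptions based only on the parameter description above:

use Combine::LoadTermList;    # module path assumed
use Combine::PosMatcher;      # module path assumed

my $termlist = Combine::LoadTermList->new;                      # constructor name assumed
$termlist->LoadTermList('/etc/combine/topicdefinition.txt');    # loader name and path assumed

my $text  = 'Focused crawlers use topic definitions to guide web harvesting.';
my %score = Combine::PosMatcher::Match(\$text, $termlist);      # text passed by reference

# %score: classifications as keys, summed weights as values
for my $class (sort { $score{$b} <=> $score{$a} } keys %score) {
    print "$class => $score{$class}\n";
}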
Anders Ardö, anders.ardo@it.lth.se
Copyright (C) 2005,2006 Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
selurl - Normalise and validate URIs for harvesting
Selurl selects and normalises URIs on the basis of both general practice (hostname lowercasing, port number substitution, etc.) and Combine-specific handling (applying config_allow, config_exclude, config_serveralias and other relevant config settings).
The Config settings catered for currently are:
maxUrlLength - the maximum length of an unnormalised URL
allow - Perl regular expressions to identify allowed URLs
exclude - Perl regular expressions to exclude URLs from harvesting
serveralias - aliases of server names
sessionids - a list of sessionid markers to be removed
A selurl object can hold a single URL and has methods to obtain its subparts as defined in URI.pm, plus some methods to normalise and validate it in Combine context.
Currently, the only schemes supported are http, https and ftp. Others may or may not work correctly. For one thing, we assume the scheme has an internet hostname/port.
clone() will only return a copy of the real URI object, not a new selurl.
URI URI-escapes the strings fed into it by new() once. Existing percent signs in the input are left untouched, which implies that:
(a) there is no risk of double-encoding; and
(b) if the original contained an inadvertent sequence that could be interpreted as an escape sequence, uri_unescape will not render the original input (e.g. url_with_%66_in_it goes whoop) If you know that the original has not yet been escaped and wish to safeguard potential percent signs, you'll have to escape them (and only them) once before you offer it to new().
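A small self-contained sketch of this caveat using the standard URI and URI::Escape modules:

use URI;
use URI::Escape qw(uri_unescape);

# An inadvertent escape sequence: '%66' is the escape for 'f'.
my $u = URI->new('http://example.org/url_with_%66_in_it');
print uri_unescape($u->path), "\n";      # '/url_with_f_in_it' - the original is lost

# If the input is known NOT to be escaped yet, protect literal percent
# signs (and only them) once before calling new():
(my $safe = 'http://example.org/100%_pure') =~ s/%/%25/g;
print URI->new($safe)->as_string, "\n";  # http://example.org/100%25_pure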
A problem with URI is that its object is not a hash we can piggyback our data on, so I had to resort to AUTOLOAD to emulate inheritance. I find this ugly, but well, this *is* Perl, so what'd you expect?
XWI.pm - class for internal representation of a document record
Provides methods for storing and retrieving structured records representing crawled documents.
Saves $val using AUTOLOAD under whatever method name is used; the value can later be retrieved by calling the same method without arguments, as sketched below.
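A minimal sketch of this behaviour ('note' is an arbitrary placeholder name; AUTOLOAD handles any otherwise undefined accessor the same way):

$xwi->note('My value');    # store
my $t = $xwi->note;        # retrieve: $t is now 'My value'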
Forget all values.
*_get will start with the first value.
stores values into the datastructure
retrieves values from the datastructure
Stores the content of Meta-tags
Takes/Returns 2 parameters: Name, Content
Extended information from Meta-tags. Not used.
Stores all URLs (i.e. if there are multiple URLs for the same page) for this record
Takes/Returns 1 parameter: URL
Stores headings from HTML documents
Takes/Returns 1 parameter: Heading text
Stores links from documents
Takes/Returns 5 parameters: URL, netlocid, urlid, Anchor text, Link type
Stores calculated information, like genre, language, etc
Takes/Returns 2 parameters: Name, Value. Both are strings with max lengths (Name: 15, Value: 20)
Stores result of topic classification.
Takes/Returns 5 parameters: Class, Absolute score, Normalized score, Terms, Algorithm id
Class, Terms, and Algorithm id are strings; the max lengths are Class: 50 and Algorithm id: 25
Absolute score, and Normalized score are integers
Normalized score and Terms are optional and may be replaced with 0, and '' respectively
Combine focused crawler main site http://combine.it.lth.se/
Yong Cao tsao@munin.ub2.lu.se
v0.05 1997-03-13
Anders Ardö, anders.ardo@it.lth.se
Copyright (C) 2005,2006 Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
Matcher
This is a module in the DESIRE automatic classification system. Copyright 1999. Modified in the ALVIS project. Copyright 2004.
Exported routines:
1. Fetching text: these routines all extract texts from a document (either a Combine XWI data structure or a WWW page identified by a URL). They all return: $meta, $head, $text, $url, $title, $size
  $meta: metadata from the document
  $head: important text from the document
  $text: plain text from the document
  $url: the URL of the document
  $title: the HTML title of the document
  $size: the size of the document
2. Term matcher: Match accepts a text as a (reference) parameter and matches each term in the term list against the text. Matches are recorded in an associative array with class as key and summed weight as value.
  Match parameters: $text, $termlist
  $text: the text to match against the termlist
  $termlist: object pointer to a LoadTermList object with a termlist loaded
  Output: %score, an associative array with classifications as keys and scores as values
Anders Ardö anders.ardo@it.lth.se
Copyright (C) 2005,2006 Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
Combine::FromTeX.pm - TeX parser in combine package
SD_SQL
Reimplementation of sd.pl, SD.pm and SDQ.pm using MySQL; contains both the recyc and guard functionality.
Basic idea is to have a table (urldb) that contains most URLs ever
inserted into the system together with a lock (the guard function) and
a boolean harvest-flag. Also in this table is the host part together with
its lock. URLs are selected from this table based on urllock, netloclock and
harvest and inserted into a queue (table que). URLs from this queue
are then given out to harvesters. The queue is implemented as:
# The admin table can be used to generate sequence numbers like this:
mysql> update admin set queid=LAST_INSERT_ID(queid+1);
# and to extract the next URL from the queue:
mysql> select host,url from que where queid=LAST_INSERT_ID();
When the queue is empty it is filled from table urldb. Several different
algorithms can be used to fill it (round-robin, most urls, longest time
since harvest, ...). Since the harvest-flag and guard-lock are not updated
until the actual harvest is done it is OK to delete the queue and
regenerate it anytime.
##########################
# Questions, ideas, TODOs, etc.
# Split table urldb into 2 tables - one for URLs and one for hosts???
#   Less efficient when filling que; more efficient when updating netloclock
# Data structure TABLE hosts:
#   create table hosts(
#     host varchar(50) not null default '',
#     netloclock int not null,
#     retries int not null default 0,
#     ant int not null default 0,
#     primary key (host),
#     key (ant),
#     key (netloclock)
#   );
# Handle too many retries?
Anders Ardö anders.ardo@it.lth.se
Copyright (C) 2005,2006 Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
utilPlugIn
Utilities for:
* extracting text from XWIs
* SVM classification
* language and country identification
Ignacio Garcia Dorado
Anders Ardö anders.ardo@eit.lth.se
Copyright (C) 2008 Ignacio Garcia Dorado, Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
Combine::FromHTML.pm - HTML parser in combine package
Yong Cao tsao@munin.ub2.lu.se v0.06 1997-03-19
Anders Ardö 1998-07-18
  added <AREA ... HREF=link ...>
  fixed <A ... HREF=link ...> regexp to be more general
Anders Ardö 2002-09-20
  added 'a' as a tag not to be replaced with space
  added removal of Cntrl-chars and some punctuation marks from IP
  added <style> ... </style> as something to be removed before processing
  beefed up compression of sequences of blanks to include 240 (non-breakable space)
  changed 'remove head' before text extraction to handle multiline matching (which can be introduced by decoding html entities)
  added compress blanks and remove CRs to metadata-content
Anders Ardö 2004-04
  Changed extraction process dramatically
RobotRules.pm
Anders Ardö version 1.0 2004-02-19
HTMLExtractor
Adapted from HTML::LinkExtractor - Extract links from an HTML document - by D.H (PodMaster)
D.H (PodMaster)
Copyright (c) 2003 by D.H. (PodMaster). All rights reserved.
This module is free software;
you can redistribute it and/or modify it under
the same terms as Perl itself.
The LICENSE file contains the full text of the license.
LoadTermList
This is a module in the DESIRE automatic classification system. Copyright 1999.
LoadTermList - a class for loading and storing a stoplist with single words, and a termlist with classifications and weights
Anders Ardö Anders.Ardo@it.lth.se
Copyright (C) 2005,2006 Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/
classifySVM
Classification plug-in module using SVM (implementation: SVMLight).
Uses an SVM model loaded from the file pointed to by the configuration variable 'SVMmodel'.
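A hedged configuration sketch: only the variable name 'SVMmodel' is documented above; the plug-in selection line and the path are assumptions.

# in the job's configuration file (sketch)
classifyPlugIn = Combine::classifySVM
SVMmodel = /etc/combine/aatest/SVMmodel.txt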
Ignacio Garcia Dorado
Anders Ardö anders.ardo@eit.lth.se
Copyright (C) 2008 Ignacio Garcia Dorado, Anders Ardö
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
See the file LICENCE included in the distribution at
http://combine.it.lth.se/