SAP NetWeaver '04

com.sapportals.wcm.service.crawler
Interface ICrawler

[contained in: com.sap.km.cm.service.base.par - km.shared.service.crawler_api.jar]
All Known Subinterfaces:
IScheduledCrawler, ISpecialCrawler

Deprecated. As of NW04, the crawler service was replaced by the xcrawler service com.sapportals.wcm.service.xcrawler.

public interface ICrawler

Crawler interface.

Copyright 2004 SAP AG


Field Summary
static int DEPTH_FLAT
          Deprecated. Constant for a 'flat' crawl, where only the starting point and its direct children are crawled (this is the default value).
static int DEPTH_FULL
          Deprecated. Constant for a 'full' crawl, where the complete hierarchy is crawled.
static int HIERARCHICAL_AUTO
          Deprecated. Constant for a semi-hierarchical crawl (using children and HREF properties).
static int HIERARCHICAL_OFF
          Deprecated. Constant for a non-hierarchical crawl (using HREF properties).
static int HIERARCHICAL_ON
          Deprecated. Constant for a hierarchical crawl (using children).
static int PRIORITY_MAX
          Deprecated. Constant for the maximum priority.
static int PRIORITY_MIN
          Deprecated. Constant for the minimum priority.
 
Method Summary
 void cancelCrawl()
          Deprecated. Cancel a running crawler.
 void crawl()
          Deprecated. Start a new crawl for the crawler's default id.
(same as crawl(null, false))
 void crawl(java.lang.String crawlID)
          Deprecated. Start a new crawl for a given id. (same as crawl(crawlID, false))
 void crawl(java.lang.String crawlID, boolean retry)
          Deprecated. Start a crawl for a given id.
 void delete()
          Deprecated. Delete the crawler (tell the ICrawlerService that it won't be used any more).
 ICrawlerQueue getBackgroundQueue()
          Deprecated. Get the crawler's background queue (if it's a background crawler).
 boolean getCaseSensitiveFlag()
          Deprecated. Get the case sensitive flag of the crawler.
 long getContentSizeLimit()
          Deprecated. Get the content size limit of the crawler.
The limit determines the maximum content size, in bytes, of the resources the crawler should pass to the result receiver.
 java.lang.String getCrawlID()
          Deprecated. Get the id of the last crawl.
 int getDepth()
          Deprecated. Get the maximum crawling depth set for the crawl.
 java.lang.String[] getDocumentsInAccess()
          Deprecated. Get the documents the crawler is currently accessing.
 int getExternalLinkDepth()
          Deprecated. Get the maximum crawling depth for external links set for the crawl.
 boolean getFollowExternalLinksFlag()
          Deprecated. Check if the flag to follow external links is set.
 boolean getFollowInternalLinksFlag()
          Deprecated. Check if the flag to follow internal links is set.
 int getHierarchicalCrawlMode()
          Deprecated. Get the mode for a hierarchical crawl.
 java.lang.String getID()
          Deprecated. Get the crawler's ID.
 boolean getIncludeVersionsFlag()
          Deprecated. Check if the flag to include versions of resources is set.
 int getInternalLinkDepth()
          Deprecated. Get the maximum crawling depth for internal links set for the crawl.
 ICrawlerVisitedList getLastVisitedList()
          Deprecated. Get the list of visited resources for a given id.
 java.lang.String getName(java.util.Locale locale)
          Deprecated. Get the crawler's name.
 int getNiceness()
          Deprecated. Get the niceness factor of the crawler.
 int getPriority()
          Deprecated. Get the priority of the crawler.
 IGenericQuery getPropertyQuery()
          Deprecated. Get the query expression for searching the properties.
 ICrawlerResultReceiver getResultReceiver()
          Deprecated. Get the result receiver for this crawler.
 IResourceList getStartResources()
          Deprecated. Get the starting point(s) for this crawler.
 ICrawlerStatistics getStatistics()
          Deprecated. Get the crawler's statistics.
 boolean getTestMode()
          Deprecated. Get the testmode flag.
 long getTimeLimit()
          Deprecated. Get the time limit of the crawler.
The limit determines the maximum running time of the crawler in milliseconds.
 java.lang.String getType()
          Deprecated. Get the crawler's type.
 boolean isBackground()
          Deprecated. Check if it's a background crawler.
 boolean isBackgroundQueued()
          Deprecated. Check if it's a background crawler and if the crawler is currently queued for execution (crawl() or recrawl() was called, but not yet executed).
 boolean isCrawling()
          Deprecated. Check if the crawler is currently running (possibly suspended or resumed).
 boolean isStopping()
          Deprecated. Check if the crawler is currently stopping (a stop was signaled but has not yet finished).
 boolean isSuspended()
          Deprecated. Check if the crawler is currently suspended (running, but suspended).
 void recrawl(java.lang.String crawlID)
          Deprecated. Start another crawl (delta crawl) for a given id.
 void resume()
          Deprecated. Resume a previously suspended running crawler.
 void setCaseSensitiveFlag(boolean flag)
          Deprecated. Set the case sensitive flag for the crawler.
 void setContentSizeLimit(long limit)
          Deprecated. Set the content size limit for the crawler.
 void setDepth(int depth)
          Deprecated. Set the maximum crawling depth for the crawl.
 void setExternalLinkDepth(int depth)
          Deprecated. Set the maximum crawling depth for external links for the crawl.
Please note: calling setFollowExternalLinksFlag(false) is treated like calling setExternalLinkDepth(0).
 void setFollowExternalLinksFlag(boolean flag)
          Deprecated. Set the flag to follow external links.
 void setFollowInternalLinksFlag(boolean flag)
          Deprecated. Set the flag to follow internal links.
 void setHierachicalCrawlMode(int mode)
          Deprecated. Set the hierarchical crawl mode.
 void setIncludeVersionsFlag(boolean flag)
          Deprecated. Set the flag to include versions of resources.
 void setInternalLinkDepth(int depth)
          Deprecated. Set the maximum crawling depth for internal links for the crawl.
Please note: calling setFollowInternalLinksFlag(false) is treated like calling setInternalLinkDepth(0).
 void setNiceness(int niceness)
          Deprecated. Set the niceness factor for the crawler.
 void setPriority(int priority)
          Deprecated. Set the priority for the crawler.
 void setPropertyQuery(IGenericQuery query)
          Deprecated. Set the query expression for searching properties.
 void setResultReceiver(ICrawlerResultReceiver receiver)
          Deprecated. Set the result receiver for this crawler.
 void setStartResource(IResource resource)
          Deprecated. Set the starting point for this crawler.
 void setStartResources(IResourceList resources)
          Deprecated. Set the starting points for this crawler.
 void setStatistics(boolean flag)
          Deprecated. Set the crawler's flag for collecting statistics or not.
Please note: a crawler with its statistics flag set to false won't remain in the list of crawlers after it has finished.
 void setTestMode(boolean flag)
          Deprecated. Turn test mode on or off (in test mode the result receiver won't be called).
 void setTimeLimit(long limit)
          Deprecated. Set the time limit for the crawler.
 boolean setToBackground(ICrawlerQueue queue)
          Deprecated. Mark the crawler as a background task.
 boolean supportsDelta()
          Deprecated. Check if this crawler supports the IDeltaResultReceiver interfaces.
 boolean supportsNavigation()
          Deprecated. Check if this crawler supports onUp and onDown for a result receiver.
 void suspend()
          Deprecated. Suspend a running crawler.
 

Field Detail

DEPTH_FLAT

public static final int DEPTH_FLAT
Deprecated. 
Constant for a 'flat' crawl, where only the starting point and its direct children are crawled (this is the default value).

DEPTH_FULL

public static final int DEPTH_FULL
Deprecated. 
Constant for a 'full' crawl, where the complete hierarchy is crawled.

HIERARCHICAL_OFF

public static final int HIERARCHICAL_OFF
Deprecated. 
Constant for a non-hierarchical crawl (using HREF properties).

HIERARCHICAL_ON

public static final int HIERARCHICAL_ON
Deprecated. 
Constant for a hierarchical crawl (using children).

HIERARCHICAL_AUTO

public static final int HIERARCHICAL_AUTO
Deprecated. 
Constant for a semi-hierarchical crawl (using children and HREF properties).

PRIORITY_MAX

public static final int PRIORITY_MAX
Deprecated. 
Constant for the maximum priority.

PRIORITY_MIN

public static final int PRIORITY_MIN
Deprecated. 
Constant for the minimum priority.

Method Detail

getID

public java.lang.String getID()
Deprecated. 
Get the crawler's ID.
Returns:
a String with the crawler's unique ID.

getType

public java.lang.String getType()
Deprecated. 
Get the crawler's type.
Returns:
a String with the crawler's type.

getName

public java.lang.String getName(java.util.Locale locale)
Deprecated. 
Get the crawler's name.
Returns:
a String with the crawler's name.

getStatistics

public ICrawlerStatistics getStatistics()
Deprecated. 
Get the crawler's statistics.
Returns:
an ICrawlerStatistics object with the crawler's statistics data, or null if there are no statistics available (the crawler has never run).

setStatistics

public void setStatistics(boolean flag)
Deprecated. 
Set the crawler's flag for collecting statistics or not.
Please note: a crawler with its statistics flag set to false won't remain in the list of crawlers after it has finished.
Parameters:
flag - a boolean: true to tell the crawler that it should collect statistical information, false if it should not.

getResultReceiver

public ICrawlerResultReceiver getResultReceiver()
                                         throws WcmException
Deprecated. 
Get the result receiver for this crawler.
Returns:
an ICrawlerResultReceiver which receives the crawler's results.
Throws:
WcmException - if the result receiver cannot be retrieved.

setResultReceiver

public void setResultReceiver(ICrawlerResultReceiver receiver)
                       throws WcmException
Deprecated. 
Set the result receiver for this crawler.
Parameters:
receiver - an ICrawlerResultReceiver which receives the crawler's results.
Throws:
WcmException - if the result receiver cannot be set.

getStartResources

public IResourceList getStartResources()
                                throws WcmException
Deprecated. 
Get the starting point(s) for this crawler.
Returns:
an IResourceList with the list of resources to start crawling from.
Throws:
WcmException - if the list cannot be retrieved.

setStartResources

public void setStartResources(IResourceList resources)
                       throws WcmException
Deprecated. 
Set the starting points for this crawler.
Parameters:
resources - an IResourceList with the list of resources to start crawling from.
Throws:
WcmException - if the resource list cannot be set.

setStartResource

public void setStartResource(IResource resource)
                      throws WcmException
Deprecated. 
Set the starting point for this crawler.
Parameters:
resource - an IResource with the resource to start crawling from.
Throws:
WcmException - if the resource cannot be set.

getDepth

public int getDepth()
             throws WcmException
Deprecated. 
Get the maximum crawling depth set for the crawl.
Returns:
an int with the maximum depth to crawl.
Throws:
WcmException - if the depth cannot be retrieved.

setDepth

public void setDepth(int depth)
              throws WcmException
Deprecated. 
Set the maximum crawling depth for the crawl.
Parameters:
depth - an int with the maximum depth to crawl.
Throws:
WcmException - if the depth cannot be set.

getInternalLinkDepth

public int getInternalLinkDepth()
                         throws WcmException
Deprecated. 
Get the maximum crawling depth for internal links set for the crawl.
Returns:
an int with the maximum depth to crawl for internal links.
Throws:
WcmException - if the depth cannot be retrieved.

setInternalLinkDepth

public void setInternalLinkDepth(int depth)
                          throws WcmException
Deprecated. 
Set the maximum crawling depth for internal links for the crawl.
Please note: calling setFollowInternalLinksFlag(false) is treated like calling setInternalLinkDepth(0).
Parameters:
depth - an int with the maximum depth to crawl for internal links.
Throws:
WcmException - if the depth cannot be set.

getFollowInternalLinksFlag

public boolean getFollowInternalLinksFlag()
                                   throws WcmException
Deprecated. 
Check if the flag to follow internal links is set.
Returns:
a boolean true if the flag is set and internal links should be crawled, false if not.
Throws:
WcmException - if the flag cannot be retrieved.

setFollowInternalLinksFlag

public void setFollowInternalLinksFlag(boolean flag)
                                throws WcmException
Deprecated. 
Set the flag to follow internal links.
Parameters:
flag - a boolean true if internal links should be crawled, false if not.
Throws:
WcmException - if the flag cannot be set.

getExternalLinkDepth

public int getExternalLinkDepth()
                         throws WcmException
Deprecated. 
Get the maximum crawling depth for external links set for the crawl.
Returns:
an int with the maximum depth to crawl for external links.
Throws:
WcmException - if the depth cannot be retrieved.

setExternalLinkDepth

public void setExternalLinkDepth(int depth)
                          throws WcmException
Deprecated. 
Set the maximum crawling depth for external links for the crawl.
Please note: calling setFollowExternalLinksFlag(false) is treated like calling setExternalLinkDepth(0).
Parameters:
depth - an int with the maximum depth to crawl for external links.
Throws:
WcmException - if the depth cannot be set.
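
The note above appears to mean that turning the follow flag off has the same effect as a link depth of 0. The following is a minimal sketch of that reading only. The class LinkDepthSketch and its method are hypothetical and not part of the crawler API, and how the real implementation combines the flag and the depth is an assumption.

```java
// Hypothetical helper, not part of com.sapportals.wcm.service.crawler.
final class LinkDepthSketch {
    private LinkDepthSketch() {}

    // Assumption: a disabled follow flag behaves like a link depth of 0;
    // otherwise the configured depth is used unchanged.
    static int effectiveDepth(boolean followLinks, int configuredDepth) {
        return followLinks ? configuredDepth : 0;
    }
}
```

Under this reading, effectiveDepth(false, 5) and effectiveDepth(true, 0) both yield 0, so external links would not be followed in either case.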

getFollowExternalLinksFlag

public boolean getFollowExternalLinksFlag()
                                   throws WcmException
Deprecated. 
Check if the flag to follow external links is set.
Returns:
a boolean true if the flag is set and external links should be crawled, false if not.
Throws:
WcmException - if the flag cannot be retrieved.

setFollowExternalLinksFlag

public void setFollowExternalLinksFlag(boolean flag)
                                throws WcmException
Deprecated. 
Set the flag to follow external links.
Parameters:
flag - a boolean true if external links should be crawled, false if not.
Throws:
WcmException - if the flag cannot be set.

getIncludeVersionsFlag

public boolean getIncludeVersionsFlag()
                               throws WcmException
Deprecated. 
Check if the flag to include versions of resources is set.
Returns:
a boolean true if the flag is set and all versions should also be crawled, false if not.
Throws:
WcmException - if the flag cannot be retrieved.

setIncludeVersionsFlag

public void setIncludeVersionsFlag(boolean flag)
                            throws WcmException
Deprecated. 
Set the flag to include versions of resources.
Parameters:
flag - a boolean true if all versions should be crawled too, false if not.
Throws:
WcmException - if the flag cannot be set.

getPropertyQuery

public IGenericQuery getPropertyQuery()
                               throws WcmException
Deprecated. 
Get the query expression for searching the properties.
Returns:
an IGenericQuery with the query expression for the property search, null if not set.
Throws:
WcmException - if the query cannot be retrieved.

setPropertyQuery

public void setPropertyQuery(IGenericQuery query)
                      throws WcmException
Deprecated. 
Set the query expression for searching properties.
Parameters:
query - an IGenericQuery with the query expression for the property search, null if no property search should be performed.
Throws:
WcmException - if the query cannot be set.

getHierarchicalCrawlMode

public int getHierarchicalCrawlMode()
                             throws WcmException
Deprecated. 
Get the mode for a hierarchical crawl.
Returns:
an int with the mode used for the crawl: HIERARCHICAL_OFF if crawling should follow the links from the HREF properties, HIERARCHICAL_ON to follow the hierarchical structure given by collections and resources, HIERARCHICAL_AUTO to use both mechanisms.
Throws:
WcmException - if the mode cannot be retrieved.

setHierachicalCrawlMode

public void setHierachicalCrawlMode(int mode)
                             throws WcmException
Deprecated. 
Set the hierarchical crawl mode.
Parameters:
mode - an int with the mode to use: HIERARCHICAL_OFF if crawling should follow the links from the HREF properties, HIERARCHICAL_ON to follow the hierarchical structure given by collections and resources, HIERARCHICAL_AUTO to use both mechanisms.
Throws:
WcmException - if the mode cannot be set.
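
The three modes differ only in which mechanisms the crawler uses to find further resources. The following sketch restates that as a small enum. HierarchicalModeSketch is a hypothetical type for illustration: the real API uses the int constants HIERARCHICAL_OFF, HIERARCHICAL_ON and HIERARCHICAL_AUTO, whose numeric values are not given on this page and are not assumed here.

```java
// Hypothetical enum mirroring the three documented modes.
// It does not reproduce the int values of the real constants.
enum HierarchicalModeSketch {
    OFF(false, true),   // follow links from the HREF properties only
    ON(true, false),    // follow the collection/resource hierarchy only
    AUTO(true, true);   // use both mechanisms

    private final boolean followsHierarchy;
    private final boolean followsHrefLinks;

    HierarchicalModeSketch(boolean followsHierarchy, boolean followsHrefLinks) {
        this.followsHierarchy = followsHierarchy;
        this.followsHrefLinks = followsHrefLinks;
    }

    // True if children of collections are visited.
    boolean followsHierarchy() {
        return followsHierarchy;
    }

    // True if links found in HREF properties are visited.
    boolean followsHrefLinks() {
        return followsHrefLinks;
    }
}
```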

crawl

public void crawl()
           throws WcmException
Deprecated. 
Start a new crawl for the crawler's default id.
(same as crawl(null, false))
Throws:
WcmException - if an error occurred.

crawl

public void crawl(java.lang.String crawlID)
           throws WcmException
Deprecated. 
Start a new crawl for a given id. (same as crawl(crawlID, false))
Parameters:
crawlID - a String with the id to use for this crawl. The crawlID is used to separate different delta crawling result sets and must not exceed 32 characters in length.
Throws:
WcmException - if an error occurred.

crawl

public void crawl(java.lang.String crawlID,
                  boolean retry)
           throws WcmException
Deprecated. 
Start a crawl for a given id.
Parameters:
crawlID - a String with the id to use for this crawl. The crawlID is used to separate different delta crawling result sets and must not exceed 32 characters in length.
retry - a boolean true if the crawler should retry a previous crawl (if such a previous crawl exists), false if a new crawl should be started.
Throws:
WcmException - if an error occurred.
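
The length limit on crawlID can be checked before a crawl is started. The following is a minimal sketch, not part of the crawler API: the class CrawlIdCheck and its method are hypothetical names, and accepting null is an assumption based on crawl() being described as the same as crawl(null, false).

```java
// Hypothetical helper, not part of com.sapportals.wcm.service.crawler.
final class CrawlIdCheck {
    // Limit taken from the crawlID parameter description.
    static final int MAX_CRAWL_ID_LENGTH = 32;

    private CrawlIdCheck() {}

    // Returns true if the id respects the documented length limit.
    // Assumption: null selects the crawler's default id and is accepted.
    static boolean isAcceptable(String crawlID) {
        return crawlID == null || crawlID.length() <= MAX_CRAWL_ID_LENGTH;
    }
}
```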

recrawl

public void recrawl(java.lang.String crawlID)
             throws WcmException
Deprecated. 
Start another crawl (delta crawl) for a given id.
Parameters:
crawlID - a String with the id to use for this crawl. The crawlID is used to separate different delta crawling result sets and must not exceed 32 characters in length.
Throws:
WcmException - if an error occurred or the result receiver does not implement the ICrawlerDeltaResultReceiver interface.
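
Because recrawl fails when delta crawling is not available, a caller may want to check supportsDelta() first. The sketch below shows one possible guard. DeltaCrawlerSketch is a hypothetical stand-in that borrows three method names from this page so the example compiles without the SAP libraries; the real methods declare WcmException, which is omitted here, and falling back to crawl is an assumption, not documented behavior.

```java
// Hypothetical stand-in for the three ICrawler methods used below.
// The real crawl and recrawl methods declare "throws WcmException".
interface DeltaCrawlerSketch {
    boolean supportsDelta();
    void crawl(String crawlID);
    void recrawl(String crawlID);
}

final class DeltaStartSketch {
    private DeltaStartSketch() {}

    // Starts a delta crawl when the crawler reports delta support,
    // otherwise starts a new crawl. Returns the name of the method called.
    static String start(DeltaCrawlerSketch crawler, String crawlID) {
        if (crawler.supportsDelta()) {
            crawler.recrawl(crawlID);
            return "recrawl";
        }
        crawler.crawl(crawlID);
        return "crawl";
    }
}
```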

getLastVisitedList

public ICrawlerVisitedList getLastVisitedList()
                                       throws WcmException
Deprecated. 
Get the list of visited resources for a given id.
Returns:
an ICrawlerVisitedList with the list of visited resources or null if this list is not available (either no crawl was performed for the given id or the visited list is not persistent and no crawl was performed since the last system restart).
Throws:
WcmException - if an error occurred.

cancelCrawl

public void cancelCrawl()
                 throws WcmException
Deprecated. 
Cancel a running crawler.
Throws:
WcmException - if an error occurred.

isCrawling

public boolean isCrawling()
Deprecated. 
Check if the crawler is currently running (possibly suspended or resumed).
See Also:
isSuspended()

isStopping

public boolean isStopping()
Deprecated. 
Check if the crawler is currently stopping (a stop was signaled but has not yet finished).

isSuspended

public boolean isSuspended()
Deprecated. 
Check if the crawler is currently suspended (running, but suspended). The following states are possible:
not crawling -> crawler is currently not running
crawling & not suspended -> crawler is currently running and crawling
crawling & suspended -> crawler is running but suspended (waiting)
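
The three states can be derived from the results of isCrawling() and isSuspended(). The following is a minimal sketch: CrawlerStateSketch is a hypothetical helper, not part of the crawler API, and it assumes the suspended flag carries no meaning while the crawler is not crawling.

```java
// Hypothetical helper, not part of com.sapportals.wcm.service.crawler.
final class CrawlerStateSketch {
    private CrawlerStateSketch() {}

    // Maps the two boolean results to the three states listed in the
    // isSuspended description.
    static String describe(boolean crawling, boolean suspended) {
        if (!crawling) {
            // Assumption: "suspended" is ignored when no crawl is running.
            return "not running";
        }
        return suspended ? "running but suspended (waiting)" : "running and crawling";
    }
}
```

A caller would pass the live values, for example describe(crawler.isCrawling(), crawler.isSuspended()).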

suspend

public void suspend()
             throws WcmException
Deprecated. 
Suspend a running crawler.
Throws:
WcmException - if an error occurred.
See Also:
isSuspended()

resume

public void resume()
            throws WcmException
Deprecated. 
Resume a previously suspended running crawler.
Throws:
WcmException - if an error occurred.
See Also:
isSuspended()

delete

public void delete()
            throws WcmException
Deprecated. 
Delete the crawler (tell the ICrawlerService that it won't be used any more).
Throws:
WcmException - if an error occurred.

getNiceness

public int getNiceness()
                throws WcmException
Deprecated. 
Get the niceness factor of the crawler.
Returns:
an int with the current niceness factor.
Throws:
WcmException - if the niceness factor cannot be retrieved.

setNiceness

public void setNiceness(int niceness)
                 throws WcmException
Deprecated. 
Set the niceness factor for the crawler.
Parameters:
niceness - an int with the niceness factor to use.
Throws:
WcmException - if the niceness factor cannot be set.
See Also:
getNiceness()

getPriority

public int getPriority()
                throws WcmException
Deprecated. 
Get the priority of the crawler.
Returns:
an int with the current priority.
Throws:
WcmException - if the priority cannot be retrieved.

setPriority

public void setPriority(int priority)
                 throws WcmException
Deprecated. 
Set the priority for the crawler.
Parameters:
priority - an int with the priority to use.
Throws:
WcmException - if the priority cannot be set.
See Also:
getPriority()

getTimeLimit

public long getTimeLimit()
                  throws WcmException
Deprecated. 
Get the time limit of the crawler.
The limit determines the maximum running time of the crawler in milliseconds. Values less than or equal to 0 are interpreted as unlimited running time.
Returns:
a long with the time limit for the crawler.
Throws:
WcmException - if the time limit cannot be retrieved.
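
The rule that values less than or equal to 0 mean unlimited running time is easy to get wrong in calling code. The following sketch shows one way to interpret a value returned by getTimeLimit(). TimeLimitSketch is a hypothetical helper, and treating the limit as exceeded only when the elapsed time is strictly greater is an assumption.

```java
// Hypothetical helper, not part of com.sapportals.wcm.service.crawler.
final class TimeLimitSketch {
    private TimeLimitSketch() {}

    // Values <= 0 are documented as "unlimited running time".
    static boolean isUnlimited(long limitMillis) {
        return limitMillis <= 0;
    }

    // Assumption: the limit counts as exceeded once the elapsed time is
    // strictly greater than a positive limit.
    static boolean isExceeded(long limitMillis, long elapsedMillis) {
        return !isUnlimited(limitMillis) && elapsedMillis > limitMillis;
    }
}
```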

setTimeLimit

public void setTimeLimit(long limit)
                  throws WcmException
Deprecated. 
Set the time limit for the crawler.
Parameters:
limit - a long with the time limit to use in msec.
Throws:
WcmException - if the time limit cannot be set.
See Also:
getTimeLimit()

getCaseSensitiveFlag

public boolean getCaseSensitiveFlag()
                             throws WcmException
Deprecated. 
Get the case sensitive flag of the crawler.
Returns:
a boolean true if RIDs are treated as case sensitive, false if not.
Throws:
WcmException - if the case sensitive flag cannot be retrieved.

setCaseSensitiveFlag

public void setCaseSensitiveFlag(boolean flag)
                          throws WcmException
Deprecated. 
Set the case sensitive flag for the crawler.
Parameters:
flag - a boolean true if RIDs are to be treated as case sensitive, false if not.
Throws:
WcmException - if the case sensitive flag cannot be set.
See Also:
getCaseSensitiveFlag()

getContentSizeLimit

public long getContentSizeLimit()
                         throws WcmException
Deprecated. 
Get the content size limit of the crawler.
The limit determines the maximum content size, in bytes, of the resources the crawler should pass to the result receiver. Values less than or equal to 0 are interpreted as unlimited size.
Returns:
a long with the content size for the crawler.
Throws:
WcmException - if the content size limit cannot be retrieved.

setContentSizeLimit

public void setContentSizeLimit(long limit)
                         throws WcmException
Deprecated. 
Set the content size limit for the crawler.
Parameters:
limit - a long with the content size limit to use, in bytes.
Throws:
WcmException - if the content size cannot be set.
See Also:
getContentSizeLimit()

supportsNavigation

public boolean supportsNavigation()
Deprecated. 
Check if this crawler supports onUp and onDown for a result receiver.
Returns:
a boolean true if onUp/onDown are supported, false if not.

supportsDelta

public boolean supportsDelta()
Deprecated. 
Check if this crawler supports the IDeltaResultReceiver interfaces.
Returns:
a boolean true if delta crawling is supported, false if not.

getCrawlID

public java.lang.String getCrawlID()
Deprecated. 
Get the id of the last crawl.
Returns:
a String with the ID that was used for the last crawl or null if never crawled before.

setToBackground

public boolean setToBackground(ICrawlerQueue queue)
Deprecated. 
Mark the crawler as a background task. If a queue is specified the crawler is put to that queue and any calls to crawl() or recrawl() will be delayed until the queue's method runCrawlers() is called.
Parameters:
queue - an ICrawlerQueue the crawler should be put into, or null if it should be started as a thread of its own.
Returns:
a boolean true if the crawler can be run in the given queue, false if not.
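
The deferred execution described above, where calls to crawl() or recrawl() are delayed until the queue's runCrawlers() method is called, can be pictured with a small stand-in. DeferredQueueSketch is hypothetical: the ICrawlerQueue interface is not documented on this page, so everything except the name runCrawlers is an assumption, including the first-in, first-out order.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-in for a crawler queue; not the real ICrawlerQueue.
final class DeferredQueueSketch {
    private final Queue<Runnable> pending = new ArrayDeque<>();

    // Records a crawl request instead of running it immediately.
    void enqueue(Runnable crawl) {
        pending.add(crawl);
    }

    // Number of crawl requests still waiting to run.
    int pendingCount() {
        return pending.size();
    }

    // Runs every queued request in submission order and empties the queue.
    void runCrawlers() {
        Runnable next;
        while ((next = pending.poll()) != null) {
            next.run();
        }
    }
}
```

In this picture, a queued crawler corresponds to a request that has been passed to enqueue but has not yet been run by runCrawlers(), which matches the state reported by isBackgroundQueued().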

isBackground

public boolean isBackground()
Deprecated. 
Check if it's a background crawler.
Returns:
a boolean true if the crawler is a background crawler (either belongs to a queue or is a thread of its own), false if not.

isBackgroundQueued

public boolean isBackgroundQueued()
Deprecated. 
Check if it's a background crawler and if the crawler is currently queued for execution (crawl() or recrawl() was called, but not yet executed).
Returns:
a boolean true if the crawler is a background crawler (belongs to a queue), crawl/ recrawl was called and the crawler is not yet running; false if not.

getBackgroundQueue

public ICrawlerQueue getBackgroundQueue()
Deprecated. 
Get the crawler's background queue (if it's a background crawler).
Returns:
the ICrawlerQueue the crawler belongs to, or null if it's not a background crawler or the crawler doesn't belong to a queue.

setTestMode

public void setTestMode(boolean flag)
Deprecated. 
Turn test mode on or off (in test mode the result receiver won't be called).
Parameters:
flag - a boolean true if testmode should be turned on, false if testmode is turned off (normal mode).

getTestMode

public boolean getTestMode()
Deprecated. 
Get the testmode flag.
Returns:
a boolean true if testmode is turned on, false if testmode is turned off (normal mode).

getDocumentsInAccess

public java.lang.String[] getDocumentsInAccess()
Deprecated. 
Get the documents the crawler is currently accessing.
Returns:
the documents the crawler is currently accessing.

Copyright © 2004 by SAP AG. All Rights Reserved.