SAP NetWeaver '04
Parameters determining the behaviour of a crawl.
Copyright (c) SAP AG 2003
| Inner Class Summary | |
static class |
IXCrawlerParameters.LogLevel
Log levels for crawler log files |
static class |
IXCrawlerParameters.ModificationCheckMode
Modes for checking whether a resource was modified |
| Method Summary | |
boolean |
getCrawlHidden()
Check whether hidden resources are included in the crawl. |
boolean |
getCrawlSystem()
Check whether system resources are included in the crawl. |
boolean |
getCrawlVersions()
Check whether versions of resources are included in the crawl. |
java.lang.String |
getDescription()
Get the description of the parameter set. |
long |
getDocumentTimeoutInSeconds()
Get the document timeout in seconds. |
int |
getErrorCacheCapacity()
Get the capacity of the cache for the error-set. |
IPropertyName |
getExcludedHrefPropertyName()
Get the name of the property that holds those HREFs of a web-repository resource that are restricted by robot-rules. |
boolean |
getFindAllDocsInDepth()
Check whether resources are found on the shortest possible path (there may be multiple paths in a web-repository). |
int |
getFinishedCacheCapacity()
Get the capacity of the cache for the finished-set. |
boolean |
getFollowLinks()
Check whether links are followed. |
int |
getFoundCacheCapacity()
Get the capacity of the cache for the found-set. |
IPropertyName |
getHrefPropertyName()
Get the name of the property which holds the HREFs of a resource from a web-repository. |
java.lang.String |
getLogFilePath()
Get the path to the crawler log file. |
int |
getMaxBacklogFiles()
Get the maximum number of old crawler log files. |
int |
getMaxDepth()
Get the maximum depth of the crawl process (0 is unlimited). |
long |
getMaxLogFileSizeInBytes()
Get the maximum size of the crawler log file in bytes. |
IXCrawlerParameters.LogLevel |
getMaxLogLevel()
Get the maximum log level. |
IXCrawlerParameters.ModificationCheckMode |
getModificationCheckMode()
Get the mode for checking whether a resource was modified. |
int |
getOldCacheCapacity()
Get the capacity of the cache for the old-set. |
int |
getPostprocessedCacheCapacity()
Get the capacity of the cache for the postprocessed-set. |
int |
getPostprocessingCacheCapacity()
Get the capacity of the cache for the postprocessing-set. |
int |
getProviderCount()
Get the number of provider threads. |
int |
getProvidingCacheCapacity()
Get the capacity of the cache for the providing-set. |
long |
getRequestDelayInMilliseconds()
Get the number of milliseconds every crawler thread waits after retrieving a resource from a repository, to reduce the load on the underlying persistence layer (e.g. database) or channel (e.g. network). |
boolean |
getRespectRobots()
Check whether the robot-rules of web-servers are respected. |
com.sapportals.wcm.service.resourcefilter.IResourceFilter[] |
getResultFilters()
Get the resource filters which are applied to the result of the crawl but do not narrow the scope. |
int |
getRetrieverCount()
Get the number of retriever threads. |
int |
getRetrievingCacheCapacity()
Get the capacity of the cache for the retrieving-set. |
com.sapportals.wcm.service.resourcefilter.IResourceFilter[] |
getScopeFilters()
Get the resource filters which narrow the scope of the crawl. |
long |
getSleepDistanceInMilliseconds()
Get the number of milliseconds between two sleep-periods of a crawler-thread. |
long |
getSleepDurationInMilliseconds()
Get the duration of a sleep-period of a crawler-thread in milliseconds. |
boolean |
getTest()
Check whether the crawler runs in test-mode (results are not passed to the result receivers). |
int |
getTodoCacheCapacity()
Get the capacity of the cache for the todo-set. |
boolean |
getUseChecksum()
Check whether a checksum is used to determine whether a resource has changed. |
boolean |
getUseETag()
Check whether the ETag is used to determine whether a resource has changed. |
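
As a rough illustration of how a caller might read a few of these getters, the sketch below uses a hypothetical stand-in interface, `CrawlSettings`, that mirrors only `getMaxDepth()`, `getFollowLinks()`, `getCrawlHidden()`, and `getCrawlSystem()`. It is not the SAP interface. The helper class `DepthRule`, its method names, and the inclusive comparison of depth against the limit are assumptions made for illustration; only the rule that a maximum depth of 0 means unlimited comes from the summary above.

```java
// Hypothetical stand-in mirroring four getters from the summary above.
// It is NOT the SAP interface; it only makes this sketch self-contained.
interface CrawlSettings {
    int getMaxDepth();          // 0 is documented as "unlimited"
    boolean getFollowLinks();
    boolean getCrawlHidden();
    boolean getCrawlSystem();
}

// Illustrative helper; the name and the inclusive comparison are assumptions.
final class DepthRule {
    private DepthRule() {}

    // Returns true if a resource at the given depth is still inside the limit.
    static boolean withinDepth(CrawlSettings s, int depth) {
        int max = s.getMaxDepth();
        return max == 0 || depth <= max;
    }

    // Builds a one-line, human-readable summary of the settings.
    static String describe(CrawlSettings s) {
        String depth = s.getMaxDepth() == 0 ? "unlimited" : String.valueOf(s.getMaxDepth());
        return "maxDepth=" + depth
            + ", followLinks=" + s.getFollowLinks()
            + ", crawlHidden=" + s.getCrawlHidden()
            + ", crawlSystem=" + s.getCrawlSystem();
    }

    // Fixed-value implementation used only for demonstration.
    static CrawlSettings fixed(final int maxDepth, final boolean follow,
                               final boolean hidden, final boolean system) {
        return new CrawlSettings() {
            public int getMaxDepth() { return maxDepth; }
            public boolean getFollowLinks() { return follow; }
            public boolean getCrawlHidden() { return hidden; }
            public boolean getCrawlSystem() { return system; }
        };
    }

    public static void main(String[] args) {
        CrawlSettings s = fixed(0, true, false, false);
        System.out.println(describe(s));
        System.out.println(withinDepth(s, 250));
    }
}
```

Treating 0 as a special case in a single place keeps the "unlimited" convention from leaking into every caller that compares depths.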
| Method Detail |
public java.lang.String getDescription()
public int getMaxDepth()
public int getRetrieverCount()
public int getProviderCount()
public boolean getUseChecksum()
public boolean getUseETag()
public boolean getFollowLinks()
public boolean getCrawlVersions()
public boolean getCrawlHidden()
public boolean getCrawlSystem()
public IXCrawlerParameters.ModificationCheckMode getModificationCheckMode()
public long getRequestDelayInMilliseconds()
public boolean getFindAllDocsInDepth()
public boolean getRespectRobots()
public boolean getTest()
public com.sapportals.wcm.service.resourcefilter.IResourceFilter[] getScopeFilters()
public com.sapportals.wcm.service.resourcefilter.IResourceFilter[] getResultFilters()
public IPropertyName getHrefPropertyName()
public IPropertyName getExcludedHrefPropertyName()
public int getTodoCacheCapacity()
public int getRetrievingCacheCapacity()
public int getFoundCacheCapacity()
public int getProvidingCacheCapacity()
public int getFinishedCacheCapacity()
public int getOldCacheCapacity()
public int getPostprocessingCacheCapacity()
public int getPostprocessedCacheCapacity()
public int getErrorCacheCapacity()
public long getSleepDistanceInMilliseconds()
public long getSleepDurationInMilliseconds()
public long getMaxLogFileSizeInBytes()
public int getMaxBacklogFiles()
public java.lang.String getLogFilePath()
public IXCrawlerParameters.LogLevel getMaxLogLevel()
public long getDocumentTimeoutInSeconds()
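
The timing and logging getters lend themselves to some back-of-the-envelope arithmetic. The sketch below is only that: it assumes that the "sleep distance" is the working time between two sleep-periods, that the request delay is the only per-retrieval cost, and that log files are bounded by one active file plus the backlog files. None of these interpretations is confirmed by this page, and the class `CrawlBudget` and its method names are hypothetical.

```java
// Hypothetical helper; the interpretations encoded here are assumptions,
// not behaviour documented for the crawler.
final class CrawlBudget {
    private CrawlBudget() {}

    // Fraction of wall-clock time a thread is awake, assuming it works for
    // sleepDistanceMs, then sleeps for sleepDurationMs, and repeats.
    static double awakeFraction(long sleepDistanceMs, long sleepDurationMs) {
        long cycle = sleepDistanceMs + sleepDurationMs;
        return cycle <= 0 ? 1.0 : (double) sleepDistanceMs / cycle;
    }

    // Upper bound on retrievals per second across all retriever threads,
    // assuming the request delay is the only cost per retrieval.
    static double maxRetrievalsPerSecond(int retrieverCount, long requestDelayMs) {
        if (requestDelayMs <= 0) return Double.POSITIVE_INFINITY;
        return retrieverCount * 1000.0 / requestDelayMs;
    }

    // Worst-case disk usage of log files: the active file plus the backlog.
    static long maxLogBytes(long maxLogFileSizeBytes, int maxBacklogFiles) {
        return maxLogFileSizeBytes * (maxBacklogFiles + 1L);
    }

    public static void main(String[] args) {
        System.out.println(awakeFraction(9000, 1000));
        System.out.println(maxRetrievalsPerSecond(4, 200));
        System.out.println(maxLogBytes(1048576L, 4));
    }
}
```

In practice the values would come from `getSleepDistanceInMilliseconds()`, `getSleepDurationInMilliseconds()`, `getRetrieverCount()`, `getRequestDelayInMilliseconds()`, `getMaxLogFileSizeInBytes()`, and `getMaxBacklogFiles()`; the literals in `main` are made-up sample inputs.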