public class HRobotExclusionFilter extends ExclusionFilter
filterGroup
FILTER_ABORT, FILTER_EXCLUDE, FILTER_INCLUDE
Constructor and Description |
---|
HRobotExclusionFilter(LiveWebCache webCache,
String userAgent,
long maxCacheMS)
Construct a new HRobotExclusionFilter that uses webCache to retrieve
robots.txt documents. Filtering is based on userAgent, and cached
documents newer than maxCacheMS in the webCache are considered valid.
|
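The freshness rule described for maxCacheMS can be pictured with a small, self-contained sketch. This is not the library's code: the class RobotsCacheFreshness and the method isFresh are hypothetical names, and the strict "younger than" comparison is an assumption about how "newer than maxCacheMS" is interpreted.

```java
// Hypothetical helper, for illustration only; not part of the documented API.
final class RobotsCacheFreshness {
    private RobotsCacheFreshness() {}

    /**
     * Returns true when a cached robots.txt copy fetched at fetchedAtMs is
     * younger than maxCacheMS at time nowMs. All values are in milliseconds.
     * Assumption: a copy whose age equals maxCacheMS is treated as stale.
     * A negative age (fetch time ahead of nowMs) also counts as fresh here.
     */
    static boolean isFresh(long fetchedAtMs, long nowMs, long maxCacheMS) {
        long ageMs = nowMs - fetchedAtMs;
        return ageMs < maxCacheMS;
    }
}
```

Under these assumptions, a caller would refetch robots.txt from the live web only when isFresh returns false for the cached copy.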
Modifier and Type | Method and Description |
---|---|
int |
filterObject(CaptureSearchResult r)
Inspect the record and determine whether it should be included in the
results, excluded from them, or whether processing of new records should stop.
|
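A caller typically acts on the three outcomes named above (FILTER_INCLUDE, FILTER_EXCLUDE, FILTER_ABORT). The sketch below illustrates that contract with stand-in types: FilterLoopSketch and apply are hypothetical, the numeric constant values are assumptions chosen for the example, and a plain ToIntFunction stands in for the filter so the block compiles without the library on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical illustration of consuming filter verdicts; not the library's code.
final class FilterLoopSketch {
    // Stand-in constants; the numeric values are assumptions for this sketch.
    static final int FILTER_INCLUDE = 0;
    static final int FILTER_EXCLUDE = 1;
    static final int FILTER_ABORT = 2;

    private FilterLoopSketch() {}

    /**
     * Runs the filter over records in order. Included records are kept,
     * excluded records are skipped, and an abort verdict stops processing
     * before any later record is examined.
     */
    static <T> List<T> apply(List<T> records, ToIntFunction<T> filter) {
        List<T> kept = new ArrayList<>();
        for (T record : records) {
            int verdict = filter.applyAsInt(record);
            if (verdict == FILTER_ABORT) {
                break; // stop processing new records
            }
            if (verdict == FILTER_INCLUDE) {
                kept.add(record);
            }
            // FILTER_EXCLUDE: drop this record and continue
        }
        return kept;
    }
}
```

With the real classes, the filter argument would correspond to a call such as filterObject(r) on an HRobotExclusionFilter instance, applied to each CaptureSearchResult in turn.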
setFilterGroup
public HRobotExclusionFilter(LiveWebCache webCache, String userAgent, long maxCacheMS)
webCache - LiveWebCache from which documents can be retrieved
userAgent - String user agent to use for requests to the live web
maxCacheMS - long number of milliseconds to cache documents in the LiveWebCache
public int filterObject(CaptureSearchResult r)
ObjectFilter
r - Object which should be checked for inclusion/exclusion or abort
Copyright © 2005–2017 IIPC. All rights reserved.