public class DupeTimestampLastBestStatusFilter extends DupeTimestampBestStatusFilter

DupeTimestampBestStatusFilter that returns the last best capture instead of the first one. Support of noCollapsePrefix complicates processing, so this may be slightly slower than DupeTimestampBestStatusFilter.
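
The difference between "first best" and "last best" can be illustrated with a minimal sketch. It assumes that "best" means the numerically lowest HTTP status and that a tie goes to the later capture. The Capture class and both method names are hypothetical and are not part of the library.

```java
import java.util.Arrays;
import java.util.List;

final class LastBestSketch {
    /** Hypothetical stand-in for a CDX line: timestamp plus HTTP status. */
    static final class Capture {
        final String timestamp;
        final int status;

        Capture(String timestamp, int status) {
            this.timestamp = timestamp;
            this.status = status;
        }
    }

    /** First best: a later capture wins only if its status is strictly lower. */
    static Capture firstBest(List<Capture> group) {
        Capture best = null;
        for (Capture c : group) {
            if (best == null || c.status < best.status) {
                best = c;
            }
        }
        return best;
    }

    /** Last best: a later capture also wins on a tie (assumed tie rule). */
    static Capture lastBest(List<Capture> group) {
        Capture best = null;
        for (Capture c : group) {
            if (best == null || c.status <= best.status) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // One group of captures, already sorted by timestamp.
        List<Capture> group = Arrays.asList(
                new Capture("20170101000001", 200),
                new Capture("20170101000002", 302),
                new Capture("20170101000003", 200));
        System.out.println(firstBest(group).timestamp); // 20170101000001
        System.out.println(lastBest(group).timestamp);  // 20170101000003
    }
}
```

The only difference between the two methods is the comparison operator, which is why the last best capture cannot be known until the whole group has been seen.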
Note that the semantics of writeLine(CDXLine) are slightly different from other processors. It returns 1 if some (non-pass-through) CDX line is written out, and that line is not necessarily the same as the CDX line passed as the argument. The count of ones will therefore be one less than with other processors.
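
One way to read the note above is that a line is held back until its group is known to be complete, so each call reports the previous group's line rather than the one just passed in. The sketch below only illustrates that return-value pattern under this assumption. DeferredWriteSketch, its explicit group argument, and the omission of any status comparison are simplifications, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class DeferredWriteSketch {
    final List<String> output = new ArrayList<>();
    private String currentGroup;
    private String held; // last line seen in the current group

    /** Returns 1 if a line (the previous group's held line) was written, 0 otherwise. */
    int writeLine(String group, String line) {
        int written = 0;
        if (held != null && !group.equals(currentGroup)) {
            output.add(held); // written line differs from the argument
            written = 1;
        }
        currentGroup = group;
        held = line;
        return written;
    }

    /** Writes the line still held for the final group. */
    void end() {
        if (held != null) {
            output.add(held);
            held = null;
        }
    }

    public static void main(String[] args) {
        DeferredWriteSketch p = new DeferredWriteSketch();
        int ones = 0;
        ones += p.writeLine("20170101", "a");
        ones += p.writeLine("20170101", "b");
        ones += p.writeLine("20170102", "c");
        ones += p.writeLine("20170103", "d");
        p.end();
        System.out.println(ones);     // 2
        System.out.println(p.output); // [b, c, d]
    }
}
```

Three lines are written in total, but only two calls return 1, because the final group's line is written from end().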
Modifier and Type | Field and Description |
---|---|
protected org.archive.format.cdx.CDXLine | bestLine: Keeps the best CDX line so far within a group. |
protected List<org.archive.format.cdx.CDXLine> | pendingPassThroughs: Keeps a list of CDXLines that match noCollapsePrefix, but cannot be written yet because their timestamps are larger than bestLine.timestamp. |
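
The interplay of the two fields can be sketched as follows. This is an illustration under several assumptions: input arrives sorted by timestamp, only a single group is processed, noCollapsePrefix is matched against a file name field, and "best" means the lowest status with ties going to the later line. The Line class and the accept and endGroup methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class PassThroughBufferSketch {
    /** Hypothetical stand-in for a CDX line. */
    static final class Line {
        final String timestamp;
        final int status;
        final String filename;

        Line(String timestamp, int status, String filename) {
            this.timestamp = timestamp;
            this.status = status;
            this.filename = filename;
        }
    }

    final List<Line> output = new ArrayList<>();
    private final List<Line> pendingPassThroughs = new ArrayList<>();
    private final String[] noCollapsePrefix;
    private Line bestLine;

    PassThroughBufferSketch(String[] noCollapsePrefix) {
        this.noCollapsePrefix = noCollapsePrefix;
    }

    private boolean passThrough(Line line) {
        for (String prefix : noCollapsePrefix) {
            if (line.filename.startsWith(prefix)) return true;
        }
        return false;
    }

    /** Handles one line of the current group. */
    void accept(Line line) {
        if (passThrough(line)) {
            if (bestLine == null) {
                output.add(line);              // nothing is held back, write now
            } else {
                pendingPassThroughs.add(line); // later than bestLine, must wait
            }
        } else if (bestLine == null || line.status <= bestLine.status) {
            // The old best is dropped and the new best is later than every
            // pending pass-through, so the buffer can be written out first.
            flushPassThrough();
            bestLine = line;
        }
        // Otherwise the line is worse than bestLine and is dropped.
    }

    /** Writes out all pending pass-throughs and clears the buffer. */
    void flushPassThrough() {
        output.addAll(pendingPassThroughs);
        pendingPassThroughs.clear();
    }

    /** Closes the group: best line first, then the pass-throughs behind it. */
    void endGroup() {
        if (bestLine != null) {
            output.add(bestLine);
            bestLine = null;
        }
        flushPassThrough();
    }

    public static void main(String[] args) {
        PassThroughBufferSketch s = new PassThroughBufferSketch(new String[] {"live-"});
        s.accept(new Line("20170101000001", 200, "crawl-1.warc.gz"));
        s.accept(new Line("20170101000002", 200, "live-1.warc.gz"));
        s.accept(new Line("20170101000003", 200, "crawl-2.warc.gz"));
        s.accept(new Line("20170101000004", 200, "live-2.warc.gz"));
        s.endGroup();
        for (Line l : s.output) {
            System.out.println(l.timestamp + " " + l.filename);
        }
    }
}
```

In this run the first crawl line is superseded by the third, so the output keeps timestamp order: the buffered pass-through, then the best line, then the remaining pass-through.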
bestHttpCode, lastTimestamp, noCollapsePrefix, timestampDedupLength
inner
Constructor and Description |
---|
DupeTimestampLastBestStatusFilter(BaseProcessor output, int timestampDedupLength, String[] noCollapsePrefix) |
Modifier and Type | Method and Description |
---|---|
void | end(): Called at the end. |
protected void | flushPassThrough(): Writes out all pending pass-throughs and clears the pass-through buffer. |
protected String | groupKey(org.archive.format.cdx.CDXLine line): Returns the group key of line. |
int | writeLine(org.archive.format.cdx.CDXLine line): Process line. |
include, isBlocked, noCollapse, passThrough
begin, modifyOutputFormat, trackLine, writeResumeKey
protected org.archive.format.cdx.CDXLine bestLine
Keeps the best CDX line so far within a group.

protected List<org.archive.format.cdx.CDXLine> pendingPassThroughs
Keeps a list of CDXLines that match noCollapsePrefix, but cannot be written yet because their timestamps are larger than bestLine.timestamp.

public DupeTimestampLastBestStatusFilter(BaseProcessor output, int timestampDedupLength, String[] noCollapsePrefix)
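
protected final void flushPassThrough()
Write out all pending pass-throughs, and clear pass-through buffer.

protected final String groupKey(org.archive.format.cdx.CDXLine line)
Return group key of line.
Parameters: line - CDX line
Returns: timestampDedupLength digits of timestamp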
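
As a rough illustration of the group key, the sketch below takes the leading timestampDedupLength characters of a timestamp string. That the digits are taken from the start of the timestamp is an assumption here, and GroupKeySketch with its plain String argument is a hypothetical stand-in for the real method, which takes a CDXLine.

```java
final class GroupKeySketch {
    /** Returns at most dedupLength leading characters of the timestamp. */
    static String groupKey(String timestamp, int dedupLength) {
        return timestamp.substring(0, Math.min(dedupLength, timestamp.length()));
    }

    public static void main(String[] args) {
        System.out.println(groupKey("20170314152233", 8)); // 20170314
        System.out.println(groupKey("20170314152233", 4)); // 2017
        System.out.println(groupKey("2017", 8));           // 2017
    }
}
```

With 14-digit timestamps, a length of 8 would group captures by day and a length of 4 by year.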

public int writeLine(org.archive.format.cdx.CDXLine line)
Description copied from interface: BaseProcessor
Process line.
Specified by: writeLine in interface BaseProcessor
Overrides: writeLine in class DupeTimestampBestStatusFilter
Parameters: line - CDXLine
Returns: 1 if line is sent to output, 0 otherwise.

public void end()
Description copied from interface: BaseProcessor
Called at the end. end() is invoked on the nested processor.
Specified by: end in interface BaseProcessor
Overrides: end in class WrappedProcessor

Copyright © 2005–2017 IIPC. All rights reserved.