org.archive.wayback.replay.charset (OpenWayback 2.1.0 API)

Interface Summary
Interface	Description
EncodingSniffer	A step of character encoding sniffing.

Class Summary
Class	Description
BaseEncodingSniffer	Implements common utility methods for EncodingSniffer.
ByteOrderMarkSniffer	`EncodingSniffer` that peeks at the content for Byte Order Mark bytes.
CharsetDetector	Abstract class containing common methods for determining the character encoding of a text Resource, most of which should be refactored into a Util package.
ContentTypeHeaderSniffer	`EncodingSniffer` that obtains the character encoding from the `Content-Type` HTTP header.
PrescanMetadataSniffer	`EncodingSniffer` that pre-scans the byte stream for a `<meta http-equiv="content-type" ... >` tag.
RotatingCharsetDetector
StandardCharsetDetector	`CharsetDetector` that roughly follows the steps prescribed by the WHATWG recommendation, with the following simplifications: (1) no support for inheriting the parent browsing context's character encoding (this information is not readily available to Wayback); (2) the default is fixed to `UTF-8`, regardless of the user's locale (a crawler's locale information is not readily available to Wayback); (3) no support for confidence, and thus no support for encoding switching (this is more about `CharsetDetector`'s design). CHANGE 1.8.1 2014-07-07: added BOM detection as the first step.
UniversalChardetSniffer	`EncodingSniffer` that runs `UniversalDetector` on the content.
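
The chained sniffing idea behind the classes above can be illustrated with a small, dependency-free sketch. It is not the OpenWayback implementation: `SniffChainSketch`, `Step`, and every method name below are hypothetical, and the real `EncodingSniffer` signature may differ. The summary only states that BOM detection comes first and that the default is `UTF-8`; the order of the header and `<meta>` steps here is an assumption, and the statistical `UniversalDetector` step is left out to avoid an external dependency.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: all names here are hypothetical, NOT the OpenWayback API.
final class SniffChainSketch {

    /** One sniffing step: may propose a charset name, or decline with empty. */
    interface Step {
        Optional<String> sniff(byte[] head, String contentType);
    }

    // Matches "charset=NAME" in a Content-Type header value.
    private static final Pattern HEADER_CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([^\\s;\"']+)", Pattern.CASE_INSENSITIVE);

    // Crude stand-in for a real prescan: the first charset= inside a <meta> tag.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta\\b[^>]*?charset\\s*=\\s*[\"']?([^\\s;\"'/>]+)", Pattern.CASE_INSENSITIVE);

    /** Looks for a UTF-8 or UTF-16 Byte Order Mark at the start of the content. */
    static Optional<String> sniffBom(byte[] b, String contentType) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF && (b[1] & 0xFF) == 0xBB
                && (b[2] & 0xFF) == 0xBF) {
            return Optional.of("UTF-8");
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return Optional.of("UTF-16BE");
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return Optional.of("UTF-16LE");
        }
        return Optional.empty();
    }

    /** Reads the charset parameter of the Content-Type header, if any. */
    static Optional<String> sniffHeader(byte[] head, String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        Matcher m = HEADER_CHARSET.matcher(contentType);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Scans the leading bytes for a <meta ... charset=...> declaration. */
    static Optional<String> sniffMeta(byte[] head, String contentType) {
        // ISO-8859-1 maps every byte to one char, so decoding cannot fail.
        String text = new String(head, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Converts a proposed name to a Charset; empty if illegal or unsupported. */
    static Optional<Charset> toCharset(String name) {
        try {
            return Charset.isSupported(name)
                    ? Optional.of(Charset.forName(name)) : Optional.empty();
        } catch (IllegalArgumentException e) { // includes IllegalCharsetNameException
            return Optional.empty();
        }
    }

    /** Runs the steps in order; the first usable answer wins, else UTF-8. */
    static Charset detect(byte[] head, String contentType) {
        Step[] steps = {
            SniffChainSketch::sniffBom,    // first step, per the 1.8.1 change note
            SniffChainSketch::sniffHeader, // assumed order
            SniffChainSketch::sniffMeta    // assumed order
        };
        for (Step step : steps) {
            Optional<Charset> cs = step.sniff(head, contentType)
                    .flatMap(SniffChainSketch::toCharset);
            if (cs.isPresent()) {
                return cs.get();
            }
        }
        return StandardCharsets.UTF_8; // fixed default, regardless of locale
    }
}
```

In this sketch a step that proposes an unknown or malformed charset name is treated the same as a step that found nothing, so detection simply falls through to the next step. For example, `SniffChainSketch.detect(bytes, "text/html; charset=ISO-8859-1")` returns ISO-8859-1 unless `bytes` starts with a BOM.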

Package org.archive.wayback.replay.charset