public class StandardCharsetDetector extends CharsetDetector
A CharsetDetector that roughly follows the steps prescribed by the WHAT-WG recommendation, with the following simplifications:
- UTF-8, regardless of the user's locale (a crawler's locale information is not readily available to Wayback)
- (... CharsetDetector's design)

CHANGE 1.8.1 2014-07-07: added BOM detection as the first step.
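The ordering described above (BOM first, then declared charset, then a UTF-8 fallback) can be illustrated with a small standalone sketch. This is not the Wayback implementation: the class `CharsetSniffSketch` and its methods are hypothetical, it uses only the JDK, it assumes the HTTP `Content-Type` header is the next source consulted after the BOM, and it omits the HTML prescan step of the WHAT-WG algorithm entirely.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Wayback.
public class CharsetSniffSketch {
    // Assumed fallback, mirroring the "UTF-8 regardless of locale" simplification.
    public static final String FALLBACK = "UTF-8";

    // Matches a charset parameter (quoted or bare) in a Content-Type value.
    private static final Pattern CHARSET_PARAM =
        Pattern.compile("charset\\s*=\\s*\"?([^\\s;\"]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the charset implied by a leading byte order mark, or null if none is present. */
    public static String fromBom(byte[] head) {
        if (head.length >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }

    /** Returns the upper-cased charset named in a Content-Type value, or null if absent. */
    public static String fromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        Matcher m = CHARSET_PARAM.matcher(contentType);
        return m.find() ? m.group(1).toUpperCase(Locale.ROOT) : null;
    }

    /** Tries the BOM first, then the header, then falls back to UTF-8. */
    public static String detect(byte[] head, String contentType) {
        String charset = fromBom(head);
        if (charset == null) {
            charset = fromContentType(contentType);
        }
        return charset != null ? charset : FALLBACK;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        byte[] plain = {'h', 'i'};
        System.out.println(detect(withBom, "text/html; charset=ISO-8859-1"));
        System.out.println(detect(plain, "text/html; charset=ISO-8859-1"));
        System.out.println(detect(plain, "text/html"));
    }
}
```

Checking the BOM before any declared charset means a payload that starts with a byte order mark is decoded accordingly even when the header names a different encoding.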
DEFAULT_CHARSET
Constructor and Description |
---|
StandardCharsetDetector() |
Modifier and Type | Method and Description |
---|---|
String | getCharset(Resource httpHeadersResource, Resource payloadResource, WaybackRequest wbRequest) |
getCharset
public String getCharset(Resource httpHeadersResource, Resource payloadResource, WaybackRequest wbRequest) throws IOException
getCharset in class CharsetDetector
httpHeadersResource - resource with HTTP headers to consider
payloadResource - resource with payload to consider (presumably text)
nul
IOException - if there are problems reading the Resource

Copyright © 2005–2015 IIPC. All rights reserved.