class documentation
class Fragments_HTML_Fallback(Fragments):
Extract text from HTML of no specific source into fragments.
Method | __init__ |
Pass the document bytestring to the constructor. Nothing is parsed yet; you call accepts(), then suitableness(), then possibly fragments(). See the example use in decide(). |
Method | accepts |
Whether we would consider parsing this document at all. Often this amounts to "is this the right file type". |
Method | fragments |
No metadata at all, just text split by |
Method | suitableness |
Mostly just says we're a bad example but we'll try; our accepts() is the real filter here |
Instance Variable | docbytes |
Undocumented |
Instance Variable | etree |
Undocumented |
Inherited from Fragments:
Instance Variable | debug |
Undocumented |
Pass the document bytestring to the constructor. Nothing is parsed yet; you call accepts(), then suitableness(), then possibly fragments(). See the example use in decide().
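The call order described above (construct, accepts(), suitableness(), then possibly fragments()) can be illustrated with a small stand-in class. This is only a sketch under stated assumptions: HypotheticalHTMLFallback and _TextCollector are hypothetical names, not the wetsuite implementation; the HTML check in accepts(), the meaning of the suitableness() score (here, larger means a worse fit), the list of strings returned by fragments(), and the choice of block-level tags to split on are all assumptions made for illustration. It uses the standard library's html.parser rather than whatever parser the real class uses.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Hypothetical helper: collects text, starting a new chunk at block-level tags."""

    BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3", "h4", "h5", "h6", "br", "tr"}

    def __init__(self):
        super().__init__()
        self.chunks = [""]

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.chunks.append("")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.chunks.append("")

    def handle_data(self, data):
        self.chunks[-1] += data


class HypotheticalHTMLFallback:
    """Stand-in that mimics the documented call order; NOT the wetsuite class."""

    def __init__(self, docbytes):
        # Only store the input; nothing is parsed yet.
        self.docbytes = docbytes

    def accepts(self):
        # Assumption: a cheap check for something that looks like HTML.
        head = self.docbytes[:1000].lower()
        return b"<html" in head or b"<!doctype html" in head

    def suitableness(self):
        # Assumption: a score where a larger number means a worse fit,
        # so a fallback reports a deliberately poor value.
        return 500

    def fragments(self):
        collector = _TextCollector()
        collector.feed(self.docbytes.decode("utf-8", errors="replace"))
        collector.close()
        # Text only, no metadata: drop empty chunks and normalise whitespace.
        return [" ".join(c.split()) for c in collector.chunks if c.strip()]


doc = b"<html><body><h1>Title</h1><p>First paragraph.</p><p>Second one.</p></body></html>"
candidate = HypotheticalHTMLFallback(doc)
if candidate.accepts():
    score = candidate.suitableness()
    texts = candidate.fragments()
```

In this sketch the constructor does no work, accepts() acts as the real filter, and suitableness() merely signals that a more specific splitter should be preferred when one is available.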
overrides wetsuite.helpers.split.Fragments.accepts
Whether we would consider parsing this document at all. Often this amounts to "is this the right file type".