class documentation
class Fragments_HTML_Fallback(Fragments):
Constructor: Fragments_HTML_Fallback(docbytes, debug)
Extract text from HTML from a non-specific source into fragments.
| Method | __init__ | Hand the document bytestring into this. Nothing happens yet; you call accepts(), then suitableness(), then possibly fragments() -- see example use in decide(). |
| Method | accepts | Whether we would consider parsing this at all. Often this amounts to "is this the right file type?" |
| Method | fragments | No metadata at all, just text split by |
| Method | suitableness | Mostly just says we're a bad example but we'll try; our accepts() is the real filter here |
| Instance Variable | docbytes | Undocumented |
| Instance Variable | etree | Undocumented |
Inherited from Fragments:
| Instance Variable | debug | Undocumented |
overrides wetsuite.helpers.split.Fragments.accepts
Whether we would consider parsing this at all. Often this amounts to "is this the right file type?"
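The call order described for __init__ (construct, then accepts(), then suitableness(), then possibly fragments()) can be illustrated with a minimal sketch. The class below is a hypothetical stand-in, not the wetsuite implementation: the name FallbackSketch, the helper extract(), the HTML sniffing in accepts(), the score value 500, and the return types of all three methods are assumptions made only so the example runs with the standard library.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects non-empty text nodes, in document order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


class FallbackSketch:
    """Hypothetical stand-in that mimics the three-step protocol only."""

    def __init__(self, docbytes, debug=False):
        # Only store the input; nothing is parsed at construction time.
        self.docbytes = docbytes
        self.debug = debug

    def accepts(self):
        # Assumption: a cheap "does this look like HTML?" check.
        head = self.docbytes[:1000].lower()
        return b"<html" in head or b"<!doctype html" in head

    def suitableness(self):
        # Assumption: a numeric score; this fallback always rates itself poorly.
        return 500

    def fragments(self):
        # Assumption: plain text strings with no metadata attached.
        collector = _TextCollector()
        collector.feed(self.docbytes.decode("utf-8", errors="replace"))
        collector.close()
        return collector.chunks


def extract(docbytes):
    """Run the protocol in order; return None when the input is rejected."""
    candidate = FallbackSketch(docbytes)
    if not candidate.accepts():
        return None
    score = candidate.suitableness()
    return score, candidate.fragments()
```

In real use, the decision between several Fragments subclasses is made in decide(), which the documentation above points to; this sketch only shows the order of calls on a single candidate.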