class documentation

class Fragments_HTML_Fallback(Fragments):

Extract text from HTML that comes from a non-specific source, and split it into fragments.

Method __init__ Pass the document bytestring to the constructor. Nothing is parsed yet; you call accepts(), then suitableness(), then possibly fragments() -- see example use in decide().
Method accepts Whether we would consider parsing this document at all. Often this amounts to "is this the right file type".
Method fragments No metadata at all, just text split by
Method suitableness Mostly just reports that this parser is a poor match but will still try; our accepts() is the real filter here.
Instance Variable docbytes Undocumented
Instance Variable etree Undocumented

Inherited from Fragments:

Instance Variable debug Undocumented
def __init__(self, docbytes, debug=False):

Pass the document bytestring to the constructor. Nothing is parsed yet; you call accepts(), then suitableness(), then possibly fragments() -- see example use in decide().
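
The call order described above can be sketched as follows. This is a minimal illustration only: FallbackSketch and extract() are hypothetical stand-ins written with the standard library, not the real Fragments_HTML_Fallback or decide(). The return types (a boolean from accepts(), a number from suitableness(), a list of strings from fragments()) are assumptions, since this page does not document them.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects non-empty text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.texts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.texts.append(text)


class FallbackSketch:
    """Hypothetical stand-in with the same three-step interface (not the real class)."""

    def __init__(self, docbytes, debug=False):
        # Only store the input; nothing is parsed at construction time.
        self.docbytes = docbytes
        self.debug = debug

    def accepts(self):
        # Assumption: a cheap "does this look like HTML at all" check.
        head = self.docbytes[:1024].lower()
        return b"<html" in head or b"<!doctype html" in head

    def suitableness(self):
        # Assumption: a low score, so that more specific parsers would be preferred.
        return 1

    def fragments(self):
        # Assumption: plain text pieces with no metadata attached.
        collector = _TextCollector()
        collector.feed(self.docbytes.decode("utf-8", errors="replace"))
        collector.close()
        return collector.texts


def extract(docbytes):
    """Hypothetical caller: accepts(), then suitableness(), then fragments()."""
    parser = FallbackSketch(docbytes)
    if not parser.accepts():
        return None
    parser.suitableness()  # a real caller would use this to rank competing parsers
    return parser.fragments()
```

In this sketch, extract() returns None for input that accepts() rejects, which keeps the filtering decision in accepts() and leaves suitableness() as a ranking hint only.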

def accepts(self):

Whether we would consider parsing this document at all. Often this amounts to "is this the right file type".

def fragments(self):

No metadata at all, just text split by

def suitableness(self):

Mostly just reports that this parser is a poor match but will still try; our accepts() is the real filter here.

docbytes =

Undocumented

etree =

Undocumented