class documentation

class Fragments_XML_OP_Handelingen(Fragments):

Turn Handelingen (Dutch parliamentary proceedings) in XML form (from KOOP's BUS) into fragments.

Method __init__ Hand the document bytestring into this. Nothing happens yet; you call accepts(), then suitableness(), then possibly fragments() -- see example use in decide().
Method accepts Whether we would consider parsing this document at all. Often, "is this the right file type?".
Method fragments Yields a tuple for each fragment.
Method suitableness A score for how specific this class is to the document (e.g. 5, 50, 500, 5000).
Instance Variable docbytes Undocumented
Instance Variable startpaths Undocumented
Instance Variable tree Undocumented

Inherited from Fragments:

Instance Variable debug Undocumented

def __init__(self, docbytes, debug=False):

Hand the document bytestring into this. Nothing happens yet; you call accepts(), then suitableness(), then possibly fragments() -- see example use in decide().
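The call order described above can be sketched as follows. This is only an illustration under stated assumptions: `FragmentsStandIn` and `first_pass` are hypothetical names, the stand-in does not reproduce the real class's checks, and the tuple layout yielded by `fragments()` is invented because it is not documented here.

```python
class FragmentsStandIn:
    """Hypothetical stand-in that mimics the documented call order; not the real class."""

    def __init__(self, docbytes, debug=False):
        # Only store the input; nothing is parsed at construction time.
        self.docbytes = docbytes
        self.debug = debug

    def accepts(self):
        # Cheap check (assumption): does this look like XML at all?
        return self.docbytes.lstrip().startswith(b"<")

    def suitableness(self):
        # Arbitrary illustrative score, not a value taken from the real class.
        return 50

    def fragments(self):
        # Invented tuple layout, for illustration only.
        yield ({"note": "stand-in"}, self.docbytes.decode("utf-8"))


def first_pass(docbytes):
    """Call accepts(), then suitableness(), then fragments(), in that order."""
    handler = FragmentsStandIn(docbytes)
    if not handler.accepts():
        return None  # wrong kind of document; do not go any further
    score = handler.suitableness()
    return score, list(handler.fragments())
```

The point of the ordering is that the constructor and `accepts()` stay cheap, so the more expensive work in `fragments()` only runs for documents the class is willing to handle.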

def accepts(self):

Whether we would consider parsing this document at all. Often, "is this the right file type?".

def fragments(self):

Yields a tuple for each fragment.

def suitableness(self):

e.g.

  • 5: I recognize that's PDF, from OP, and specifically Stcrt so I probably know how to fetch out the text fairly well
  • 50: I recognize that's PDF, from OP, so I may do better than entirely generic
  • 500: I recognize that's PDF, I will do something generic (because I am a fallback for PDFs)
  • 5000: I recognize that's PDF, but I'm specific and it's probably a bad idea if I do something generic

The idea is that with multiple of these, we can find the thing that (says) is most specific to this document.
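A minimal sketch of how such scores could be compared across several candidate classes. It assumes that a lower score means "more specific", which is how the example values above read, and that classes whose accepts() is false are skipped. The candidate names and the `pick_most_specific` helper are hypothetical and not part of this module.

```python
def pick_most_specific(candidates):
    """Return the name of the accepted candidate with the lowest score, or None.

    candidates: iterable of (name, accepts, score) tuples, where accepts is the
    result of accepts() and score is the result of suitableness().
    """
    accepted = [(score, name) for name, accepts, score in candidates if accepts]
    if not accepted:
        return None
    # Assumption: the lowest score is the most specific match.
    return min(accepted)[1]


# Hypothetical candidates mirroring the example scores above.
candidates = [
    ("generic_pdf", True, 500),
    ("op_pdf", True, 50),
    ("op_stcrt_pdf", True, 5),
    ("xml_handelingen", False, 5000),
]
best = pick_most_specific(candidates)
```

With these inputs, `best` is `"op_stcrt_pdf"`, the candidate that claims the most specific knowledge of the document.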
docbytes =

Undocumented

startpaths =

Undocumented

tree =

Undocumented