dateutil.parser.parserinfo
wetsuite.helpers.date.DutchParserInfo
- Specific configuration for dateutil for Dutch month and weekday names
wetsuite.datacollect.koop_frbr.FRBRFetcher
- Helper class to fetch data from an area of https://repository.overheid.nl/frbr/ (see the constructor's docstring for more). In theory the bulk service could do functionally the same and would be more efficient for both sides, yet almost no SSH tool seems able to negotiate with the way it is configured (SFTP imitating anonymous FTP, which is a great idea in theory).
wetsuite.datacollect.sru.SRUBase
- Very minimal SRU implementation, just enough to access the KOOP repositories.
wetsuite.datacollect.koop_sru.BWB
- SRU endpoint for the Basis Wetten Bestand repository
wetsuite.datacollect.koop_sru.CVDR
- SRU endpoint for the CVDR (Centrale Voorziening Decentrale Regelgeving) repository
wetsuite.datacollect.koop_sru.EuropeseRichtlijnen
- Note: Broken/untested
wetsuite.datacollect.koop_sru.LokaleBekendmakingen
- SRU endpoint for the bekendmakingen repository
wetsuite.datacollect.koop_sru.OfficielePublicaties
- SRU endpoint for the OfficielePublicaties repository
wetsuite.datacollect.koop_sru.PLOOI
- SRU endpoint for the Platform Open Overheidsinformatie repository
wetsuite.datacollect.koop_sru.PUCOpenData
- Publicatieplatform UitvoeringsContent, see https://puc.overheid.nl/
wetsuite.datacollect.koop_sru.SamenwerkendeCatalogi
- SRU endpoint for the Samenwerkende Catalogi repository
wetsuite.datacollect.koop_sru.StatenGeneraalDigitaal
- SRU endpoint for the Staten-Generaal Digitaal repository
wetsuite.datacollect.koop_sru.TuchtRecht
- SRU endpoint for the TuchtRecht repository
wetsuite.datacollect.koop_sru.WetgevingsKalender
- SRU endpoint for wetgevingskalender, see e.g. https://wetgevingskalender.overheid.nl/
wetsuite.datasets.Dataset
- If you're looking for details about the specific dataset, look at the .description
wetsuite.helpers.collocation.Collocation
- A basic collocation calculator class.
wetsuite.helpers.etree.debug_color
- Takes XML, parses it, reindents it, applies strip_namespaces, and returns a class that will render it in color in a Jupyter notebook (using pygments).
wetsuite.helpers.localdata.LocalKV
- A key-value store backed by a local filesystem; it is a wrapper around sqlite3.
wetsuite.helpers.localdata.MsgpackKV
- Like LocalKV, but the value can be a nested Python type (serialized via msgpack)
wetsuite.helpers.notebook.etree_visualize_selection
- Produces a colorized representation of a selection within an XML document. (Works only within IPython/Jupyter-style notebooks, via an HTML representation.)
wetsuite.helpers.notebook.ProgressBar
- A sequence-iterating progress bar (like tqdm) that supports both notebooks and console, and prefers notebook over console style in notebooks.
wetsuite.helpers.spacy.notebook_content_visualisation
- Python notebook visualisation to give some visual idea of contents: marks out-of-vocabulary tokens red, and highlights the more interesting words (by POS).
wetsuite.helpers.split.Fragments
- Abstractish base class explaining the purpose of implementing this
wetsuite.helpers.split.Fragments_HTML_BUS_kamer
- Turn kamer-related HTMLs (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_HTML_CVDR
- Turn CVDR in HTML form into fragments
wetsuite.helpers.split.Fragments_HTML_Fallback
- Extract text from HTML from non-specific source into fragments
wetsuite.helpers.split.Fragments_HTML_Geschillencommissie
- Turn HTML pages from degeschillencommissie.nl into fragments
wetsuite.helpers.split.Fragments_HTML_OP_Bgr
- Turn blad gemeenschappelijke regeling in HTML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_HTML_OP_Gmb
- Turn gemeenteblad in HTML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_HTML_OP_Prb
- Turn provincieblad in HTML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_HTML_OP_Stb
- Turn staatsblad in HTML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_HTML_OP_Stcrt
- Turn staatscourant in HTML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_HTML_OP_Trb
- Turn tractatenblad in HTML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_HTML_OP_Wsb
- Turn waterschapsblad in HTML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_HTML_Tuchtrecht
- Turn Tuchtrecht HTML pages into fragments
wetsuite.helpers.split.Fragments_PDF_Fallback
- Extract text from PDF from non-specific source into fragments
wetsuite.helpers.split.Fragments_XML_BUS_Kamer
- Turn other kamer XMLs (from KOOP's BUS) into fragments (TODO: re-check which these are)
wetsuite.helpers.split.Fragments_XML_BWB
- Turn BWB in XML form into fragments
wetsuite.helpers.split.Fragments_XML_CVDR
- Turn CVDR in XML form into fragments
wetsuite.helpers.split.Fragments_XML_Fallback
- Extract text from XML from non-specific source into fragments
wetsuite.helpers.split.Fragments_XML_OP_Bgr
- Turn blad gemeenschappelijke regeling in XML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_XML_OP_Gmb
- Turn gemeenteblad in XML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_XML_OP_Handelingen
- Turn handelingen in XML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_XML_OP_Prb
- Turn provincieblad in XML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_XML_OP_Stb
- Turn staatsblad in XML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_XML_OP_Stcrt
- Turn staatscourant in XML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_XML_OP_Trb
- Turn tractatenblad in XML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_XML_OP_Wsb
- Turn waterschapsblad in XML form (from KOOP's BUS) into fragments
wetsuite.helpers.split.Fragments_XML_Rechtspraak
- Turn rechtspraak.nl's open-rechtspraak XML form into fragments
wetsuite.helpers.split.SplitDebug
- A notebook-style formatter that does little more than take a list of 3-tuples (meant for the output of fragments()) and print them in a table.
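
The LocalKV entry in this index describes a key-value store that wraps sqlite3. As a rough illustration of that general idea, here is a minimal sketch using only the standard library. The class name `TinyKV`, its method names, and the table layout are hypothetical and chosen for this example; they are not wetsuite's actual API or schema.

```python
import sqlite3


class TinyKV:
    """Hypothetical minimal key-value store on top of sqlite3 (NOT wetsuite's LocalKV)."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the example self-contained; pass a filename to persist to disk
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)"
        )

    def put(self, key, value):
        # INSERT OR REPLACE overwrites any existing row with the same key;
        # using the connection as a context manager commits on success
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
            )

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return default if row is None else row[0]

    def keys(self):
        return [r[0] for r in self.conn.execute("SELECT key FROM kv ORDER BY key")]


store = TinyKV()
store.put("a", "1")
store.put("b", "2")
store.put("a", "3")  # overwrites the earlier value for "a"
```

A store along the lines of MsgpackKV would presumably add a serialization step around `put` and `get`, so that nested Python values can be stored in the same kind of table; that step is left out here.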