module documentation

Fetches from the APIs provided by opendata.tweedekamer.nl

The APIs are described at https://opendata.tweedekamer.nl/documentatie/odata-api though so far we implement and use mostly the Atom/SyncFeed API, not the OData one.

The full information model is fairly complex, see https://opendata.tweedekamer.nl/documentatie/informatiemodel

The data almost certainly comes from a relational database and is exposed in basically the same way, with not only references but also many-to-many tables.

Our initial need was simple, so this only fetches a few parts and has no dependencies. If you want a much more complete implementation and a more pleasant presentation, look at https://github.com/openkamer/openkamer

It is unclear how to do certain things with this interface, e.g. listing the items in a kamerstukdossier (though we can get those via e.g. https://zoek.officielebekendmakingen.nl/dossier/36267).

Function entry_dicts Returns a list of dicts, one for each <entry> node in a syncfeed etree.
Function fetch_all Fetches all feed items of a single soort.
Function fetch_resource Note that requesting a resource that does not exist causes a 500 Internal Server Error, which should get thrown as an exception (VERIFY)
Function merge_etrees Merges a list of documents (etree documents, as fetch_all gives you) into a single etree document. Tries to pick up only the interesting data.
Constant SYNCFEED_BASE Base URL for a few different fetches (mostly /Feed)
Variable resource_types Undocumented
Function _entry_dict_from_node Helper for entry_dicts.
def entry_dicts(feed_etree):
Parameters
feed_etree: an etree object for a syncfeed list. Mostly made for the output of merge_etrees.
Returns
A list of dicts, one for each <entry> node in that etree. Most values are strings, while e.g. links are (rel, url) pairs.
def fetch_all(soort='Persoon', break_actually=False, timeout=60):

Fetches all feed items of a single soort.

Returns items from what might be multiple documents, because this API has an "and here is a link for more items from the same search" feature. Keep in mind that for some categories of things, this can be a _lot_ of fetches and data.

Parameters
soort: what object type to fetch everything for. For the available values, see e.g. https://opendata.tweedekamer.nl/documentatie/introductie Note that if you misspell the soort, it returns an empty list rather than raising an error.
break_actually: break after the first fetch, mostly for faster debugging and testing
timeout: Undocumented
Returns

A list of etree objects, which are also stripped of namespaces (atom for the wrapper, tweedekamer for <content>).

This list is not immediately useful, and you probably want to feed it into merge_etrees to make a single large document (some types are hundreds of megabytes, though).
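
The "link for more items" behaviour described above can be sketched roughly as follows. This is not the module's actual code: follow_feed, strip_namespaces, the fetch callable and max_pages are hypothetical names, and it assumes the continuation link is an Atom <link rel="next" href="..."> directly under the feed root. max_pages=1 plays the role that break_actually has in fetch_all.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Remove the '{namespace}' prefix from every tag, in place."""
    for elem in root.iter():
        if isinstance(elem.tag, str) and '}' in elem.tag:
            elem.tag = elem.tag.split('}', 1)[1]
    return root


def follow_feed(first_url, fetch, max_pages=None):
    """Fetch first_url, then keep following rel="next" links until none is left.

    fetch is any callable that takes a URL and returns the document as bytes
    (hypothetical; the real module does its own HTTP fetching).
    Returns a list of namespace-stripped root elements, one per fetched document.
    """
    trees = []
    url = first_url
    while url is not None:
        root = strip_namespaces(ET.fromstring(fetch(url)))
        trees.append(root)
        if max_pages is not None and len(trees) >= max_pages:
            break
        # Assumption: the continuation is a <link rel="next"> child of the root.
        url = None
        for link in root.findall('link'):
            if link.get('rel') == 'next':
                url = link.get('href')
                break
    return trees
```

Passing the fetcher in as a callable keeps the loop testable without network access, for example by handing it the __getitem__ of a dict that maps URLs to canned documents.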

def fetch_resource(resource_id):

Note that requesting a resource that does not exist causes a 500 Internal Server Error, which should get thrown as an exception (VERIFY)
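
If callers would rather get None than an exception for a missing resource, a wrapper along these lines might do. It is only a sketch: fetch_or_none and its opener parameter are hypothetical, it assumes the fetch goes through urllib (where a 500 response raises urllib.error.HTTPError), and whether fetch_resource itself raises this way is exactly what the VERIFY above leaves open.

```python
import urllib.error
import urllib.request


def fetch_or_none(url, opener=urllib.request.urlopen, timeout=60):
    """Return the response body as bytes, or None if the server answers 500.

    Any other HTTP error is re-raised unchanged. opener is injectable so the
    function can be exercised without touching the network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as e:
        if e.code == 500:
            # Assumption taken from the note above: 500 means "no such resource".
            return None
        raise
```

Treating every 500 as "does not exist" also hides genuine server failures, so this is only reasonable if that trade-off is acceptable.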

def merge_etrees(trees):

Merges a list of documents (etree documents, as fetch_all gives you) into a single etree document. Tries to pick up only the interesting data.
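
As a rough illustration of the idea (not the module's implementation), merging could amount to collecting the <entry> nodes of every fetched document under one new root. merge_feeds and keep_tag are hypothetical names, and "the interesting data" is assumed here to mean just the namespace-stripped <entry> children.

```python
import xml.etree.ElementTree as ET


def merge_feeds(trees, keep_tag='entry'):
    """Collect every keep_tag child of each feed root under one new <feed> root.

    Feed-level nodes such as <title> or paging <link> elements are dropped.
    """
    merged = ET.Element('feed')
    for root in trees:
        # findall returns a list, so appending while looping is safe.
        for node in root.findall(keep_tag):
            # With xml.etree the node stays in its old parent as well;
            # lxml would move it instead.
            merged.append(node)
    return merged
```

Because nothing is copied, the merged document costs little extra memory beyond the input trees, which matters for the larger types.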

SYNCFEED_BASE: str =

Base URL for a few different fetches (mostly /Feed)

Value
'https://gegevensmagazijn.tweedekamer.nl/SyncFeed/2.0/'
resource_types: tuple[str, ...] =

Undocumented

def _entry_dict_from_node(entry_node):

Helper for entry_dicts.

Given a single etree node (that came from an <entry>), returns the contained information in a dict. This is mostly key-value (elem.tag, elem.text) but flattens a few details.
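
A minimal sketch of that flattening, under stated assumptions: entry_to_dict is a hypothetical stand-in rather than the real helper, the node is assumed to be namespace-stripped already, and collecting links as a list of (rel, href) pairs under a 'link' key is a guess at the "few details" mentioned above.

```python
import xml.etree.ElementTree as ET


def entry_to_dict(entry_node):
    """Flatten one <entry> element into a dict.

    Plain children become tag -> text; <link> children are gathered as
    (rel, href) pairs. Nested children such as <content> would need their
    own handling and are stored here with whatever direct text they have.
    """
    ret = {}
    for child in entry_node:
        if child.tag == 'link':
            ret.setdefault('link', []).append((child.get('rel'), child.get('href')))
        else:
            ret[child.tag] = child.text
    return ret


# Usage with a made-up entry:
node = ET.fromstring(
    '<entry><title>abc</title><id>x1</id>'
    '<link rel="next" href="http://example.invalid/2"/></entry>'
)
info = entry_to_dict(node)
```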