wetsuite.datacollect.tweedekamer

module documentation

Fetches from the APIs provided by opendata.tweedekamer.nl

The APIs are described at https://opendata.tweedekamer.nl/documentatie/odata-api, though so far we implement and use mostly the Atom/SyncFeed API, not the OData one.

The full information model is fairly complex; see https://opendata.tweedekamer.nl/documentatie/informatiemodel

The data almost certainly comes from a relational database and is exposed in much the same shape: you get not only references between objects but also the many-to-many tables that link them.

Our initial need was simple, so this only fetches a few parts, with no dependencies. If you want a much more complete implementation and a more pleasant presentation, look at https://github.com/openkamer/openkamer

It is unclear how to do certain things with this interface, e.g. listing the items in a kamerstukdossier (though we can get those another way, e.g. via https://zoek.officielebekendmakingen.nl/dossier/36267).

Function	`entry_dicts`	No summary
Function	`fetch_all`	Fetches all feed items of a single soort.
Function	`fetch_resource`	Note that if the resource does not exist, the server responds with a 500 Internal Server Error, which should get raised as an exception (VERIFY).
Function	`merge_etrees`	Merges a list of documents (etree documents, as fetch_all gives you) into a single etree document. Tries to pick up only the interesting data.
Constant	`SYNCFEED_BASE`	Base URL for a few different fetches (mostly /Feed)
Variable	`resource_types`	Undocumented
Function	`_entry_dict_from_node`	Helper for `entry_dicts`.

def entry_dicts(feed_etree):

Parameters
feed_etree	an etree object for a SyncFeed list, mostly made for the output of `merge_etrees`.
Returns
A list of dicts, one for each <entry> node in that etree. Most values are strings, while e.g. links are (rel, url) pairs.

def fetch_all(soort='Persoon', break_actually=False, timeout=60):

Fetches all feed items of a single soort.

Returns items from what might be multiple documents, because this API has an "and here is a link for more items from the same search" feature. Keep in mind that for some categories of things, this can be a _lot_ of fetches and data.
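To illustrate the "link for more items" behaviour, here is a minimal sketch of following continuation links until none is left. It is not this module's implementation: it assumes the feed marks the next page with an Atom `<link rel="next" href="...">` directly under the root, and `fetch_all_pages`, `fetch_bytes`, `max_pages` and the `PAGES` test data are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def fetch_all_pages(first_url, fetch_bytes, max_pages=None):
    """Follow rel="next" links, returning one parsed root element per page.

    fetch_bytes is a hypothetical callable (url -> bytes of an Atom document),
    so the sketch can run without network access.
    """
    trees = []
    url = first_url
    while url is not None:
        root = ET.fromstring(fetch_bytes(url))
        trees.append(root)
        # Optional early stop, similar in spirit to break_actually
        if max_pages is not None and len(trees) >= max_pages:
            break
        # Assumption: the continuation is an Atom link with rel="next"
        url = None
        for link in root.findall(ATOM_NS + "link"):
            if link.get("rel") == "next":
                url = link.get("href")
                break
    return trees


# Two in-memory pages standing in for real responses (made-up data)
PAGES = {
    "page1": b'<feed xmlns="http://www.w3.org/2005/Atom">'
             b'<link rel="next" href="page2"/><entry><id>a</id></entry></feed>',
    "page2": b'<feed xmlns="http://www.w3.org/2005/Atom">'
             b'<entry><id>b</id></entry></feed>',
}

trees = fetch_all_pages("page1", PAGES.__getitem__)
```

Because every page costs one request, a cap such as `max_pages` is a cheap way to keep test runs short.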

Parameters
soort	what object type to fetch everything for. For the available values, see e.g. https://opendata.tweedekamer.nl/documentatie/introductie. Note that if you misspell the soort, it returns an empty list rather than erroring out.
break_actually	break after the first fetch, mostly for faster debugging and testing
timeout	Undocumented
Returns
A list of etree objects, stripped of namespaces (atom for the wrapper, tweedekamer for <content>). This is not immediately useful on its own; you probably want to feed it into `merge_etrees` to make a single large document (though some types are hundreds of megabytes).
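As an illustration of what "stripped of namespaces" means in practice, the following sketch removes the `{uri}` prefix that ElementTree puts on namespaced tag names. It is an assumption about the general technique, not a copy of what this module does; `strip_namespaces` is a hypothetical helper, the second namespace URI is a placeholder, and attribute names are left untouched.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Remove '{uri}' prefixes from element tag names, in place."""
    for elem in root.iter():
        # ElementTree stores namespaced tags as '{uri}localname'
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root


# Made-up document; the tk namespace URI is a placeholder, not the real one
doc = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom" xmlns:tk="http://example.org/tk">'
    '<entry><content><tk:persoon><tk:achternaam>Jansen</tk:achternaam>'
    '</tk:persoon></content></entry>'
    '</feed>'
)
strip_namespaces(doc)

# After stripping, plain paths work without namespace maps
achternaam = doc.find("entry/content/persoon/achternaam").text
```

The main benefit is that later code can use short paths like `entry/content/persoon` instead of carrying namespace URIs around.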

def fetch_resource(resource_id):

Note that if the resource does not exist, the server responds with a 500 Internal Server Error, which should get raised as an exception (VERIFY).

def merge_etrees(trees):

Merges a list of documents (etree documents, as fetch_all gives you) into a single etree document. Tries to pick up only the interesting data.
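A rough sketch of what such a merge could look like, assuming namespaces were already stripped and that the "interesting data" is the set of `<entry>` elements. Both are assumptions for the sake of the example, and `merge_feed_trees` is a hypothetical name, not this module's function.

```python
import copy
import xml.etree.ElementTree as ET


def merge_feed_trees(trees):
    """Collect every <entry> from each tree under one new <feed> root."""
    merged = ET.Element("feed")
    for tree in trees:
        for entry in tree.iter("entry"):
            # Copy, so the input trees do not share nodes with the result
            merged.append(copy.deepcopy(entry))
    return merged


# Two tiny namespace-free feeds (made-up data)
page_a = ET.fromstring(
    "<feed><title>page 1</title><entry><id>a</id></entry></feed>"
)
page_b = ET.fromstring(
    "<feed><title>page 2</title><entry><id>b</id></entry>"
    "<entry><id>c</id></entry></feed>"
)

merged = merge_feed_trees([page_a, page_b])
merged_ids = [entry.findtext("id") for entry in merged.findall("entry")]
```

Per-page wrapper elements such as `<title>` are dropped here, which is one way of keeping only the parts that matter for later processing.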

SYNCFEED_BASE: str =

Base URL for a few different fetches (mostly /Feed)

Value

'https://gegevensmagazijn.tweedekamer.nl/SyncFeed/2.0/'

resource_types: tuple[str, ...] =

Undocumented

def _entry_dict_from_node(entry_node):

Helper for entry_dicts.

Given a single etree node (that came from an <entry>), returns the contained information in a dict. This is mostly key-value (elem.tag, elem.text) but flattens a few details.
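The general idea can be sketched as follows. Which details get flattened is an assumption here: this example stores links as (rel, url) pairs under a hypothetical `links` key and lifts the fields of the object inside `<content>` to the top level. `entry_to_dict` and the sample entry are made up for illustration and assume namespaces were already stripped.

```python
import xml.etree.ElementTree as ET


def entry_to_dict(entry):
    """Turn one namespace-free <entry> element into a flat dict."""
    result = {"links": []}
    for child in entry:
        if child.tag == "link":
            # Links carry their data in attributes, not in text
            result["links"].append((child.get("rel"), child.get("href")))
        elif child.tag == "content":
            # Assumed flattening: lift the wrapped object's fields up
            for obj in child:
                for field in obj:
                    result[field.tag] = field.text
        else:
            # Default case: plain (tag, text) pairs
            result[child.tag] = child.text
    return result


# Made-up entry for illustration
sample = ET.fromstring(
    "<entry>"
    "<title>abc</title>"
    "<id>1</id>"
    '<link rel="next" href="http://example.org/x"/>'
    "<content><persoon><achternaam>Jansen</achternaam></persoon></content>"
    "</entry>"
)

entry_dict = entry_to_dict(sample)
```

Note that lifting fields to the top level can silently overwrite a wrapper key of the same name, so a real implementation may want to prefix or nest them instead.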

wetsuite.datacollect.tweedekamer_nl