Fetches data from rechtspraak.nl's API at https://data.rechtspraak.nl/
Note that the data.rechtspraak.nl/uitspraken/zoeken API is primarily for ranges (see the description in search), as it does _not_ allow text searches like the web interface does.
(There is an API behind https://uitspraken.rechtspraak.nl/api/zoek that is actually much better, yet it is probably not intended to be used like this, and there is no reason to assume it will not change over time.)
Note that many of the parse_* functions parse fixed or mostly-fixed value lists that might be useful in a supporting role, but the more interesting thing here is search.
Function | parse |
Parse uitspraak content XMLs - the type you get when you stick an ECLI onto https://data.rechtspraak.nl/uitspraken/content?id= |
Function | parse |
Parse the 'instanties' value list (which is probably mostly static) |
Function | parse |
Parse the 'buitenlandse instanties' value list (which is probably mostly static) |
Function | parse |
Parse the 'niet-nederlandse uitspraken' value list |
Function | parse |
Parse the 'proceduresoorten' value list (which is probably mostly static) |
Function | parse |
Parse the 'rechtsgebieden' value list (which is probably mostly static), the data of which seems to be a depth-2 tree. |
Function | parse |
Takes search result etree (as given by search()), and returns a list of dicts like: |
Function | search |
Post a search to the public API on data.rechtspraak.nl, based on a dict of parameters. |
Constant | BASE |
base URL for search as well as value lists |
Function | _para |
Given the open-rechtspraak XML, specifically the uitspraak or conclusie node under the root, tries to give text in paragraph-sized chunks at a time (actually determined by the document structure). |
Constant | _FORMELE |
Undocumented |
Constant | _INSTANTIES |
Undocumented |
Constant | _INSTANTIES |
Undocumented |
Constant | _NIET |
Undocumented |
Constant | _PROCEDURESOORTEN |
Undocumented |
Constant | _RECHTSGEBIEDEN |
Undocumented |
Parse uitspraak content XMLs - the type you get when you stick an ECLI onto https://data.rechtspraak.nl/uitspraken/content?id=
Tries to give you metadata and text (CONSIDER: separating those).
There is an example use in the notebook repo (e.g. dataset_intro_by_doing__rechtspraaknl_raw).
TODO: actually read the schema - see https://www.rechtspraak.nl/Uitspraken/paginas/open-data.aspx
Returns | |
a dict like |
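As a minimal sketch of the "stick an ECLI onto the content URL" step described above (not the module's own code), the following builds that URL and optionally fetches and parses it. The names `CONTENT_URL`, `content_url` and `fetch_content_xml` are hypothetical, and the standard library's ElementTree is an assumption; the module may use a different XML library.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# URL taken from the description above; the constant name is hypothetical.
CONTENT_URL = "https://data.rechtspraak.nl/uitspraken/content"


def content_url(ecli):
    """Build the content URL for one ECLI, keeping the colons readable."""
    return CONTENT_URL + "?" + urllib.parse.urlencode({"id": ecli}, safe=":")


def fetch_content_xml(ecli, timeout=30):
    """Fetch and parse the content XML for one ECLI (performs a network request)."""
    with urllib.request.urlopen(content_url(ecli), timeout=timeout) as response:
        return ET.fromstring(response.read())


print(content_url("ECLI:NL:GHARL:2022:7129"))
```

The parsed root could then be handed to a parser such as parse_content(); what that function expects exactly (bytes or a parsed tree) is not stated here, so check before relying on this.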
Parse the 'instanties' value list (which is probably mostly static)
Returns | |
a list of flat dicts, with keys Naam, Afkorting, Type, BeginDate, Identifier, for example: {'Identifier': 'http://psi.rechtspraak.nl/AG DH', 'Naam': "Ambtenarengerecht 's-Gravenhage", 'Afkorting': 'AGSGR', 'Type': 'AndereGerechtelijkeInstantie', 'BeginDate': '1913-01-01'}, |
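Since the result is a flat list, a lookup table is often more convenient in practice. The sketch below indexes such a list by one of its keys; `index_by` is a hypothetical helper, and the sample record is the one from the example above.

```python
def index_by(records, key="Identifier"):
    """Map record[key] -> record; later records overwrite earlier ones on duplicate keys."""
    index = {}
    for record in records:
        if key in record:  # skip records that lack the key rather than fail
            index[record[key]] = record
    return index


sample = [
    {
        "Identifier": "http://psi.rechtspraak.nl/AG DH",
        "Naam": "Ambtenarengerecht 's-Gravenhage",
        "Afkorting": "AGSGR",
        "Type": "AndereGerechtelijkeInstantie",
        "BeginDate": "1913-01-01",
    },
]

by_abbreviation = index_by(sample, key="Afkorting")
print(by_abbreviation["AGSGR"]["Naam"])
```

Whether Afkorting is unique across the whole list is an assumption you should verify; Identifier is probably the safer key.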
Parse the 'buitenlandse instanties' value list (which is probably mostly static)
Returns | |
a list of flat dicts, with keys Naam, Identifier, Afkorting, Type, BeginDate, for example: {'Identifier': 'http://psi.rechtspraak.nl/instantie/ES/#AudienciaNacionalNationaalHof', 'Naam': 'Audiencia Nacional (Nationaal Hof)', 'Afkorting': 'XX', 'Type': 'BuitenlandseInstantie', 'BeginDate': '1950-01-01'} |
Parse the 'niet-nederlandse uitspraken' value list
Returns | |
a list of items like: {'id': 'ECLI:CE:ECHR:2000:0921JUD003224096', 'ljn': ['AD4213']}, {'id': 'ECLI:EU:C:2000:679', 'ljn': ['AD4227']}, {'id': 'ECLI:EU:C:2000:689', 'ljn': ['AD4228']}, {'id': 'ECLI:EU:C:2001:112', 'ljn': ['AD4244', 'AL3652']}, |
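One plausible use of this list is translating an older LJN into its ECLI. The sketch below inverts items of the shape shown above; `ljn_to_ecli` is a hypothetical helper, not part of the module.

```python
def ljn_to_ecli(items):
    """Build an LJN -> ECLI lookup from items shaped like {'id': ..., 'ljn': [...]}."""
    lookup = {}
    for item in items:
        for ljn in item.get("ljn", []):  # tolerate items without an 'ljn' key
            lookup[ljn] = item["id"]
    return lookup


sample_items = [
    {"id": "ECLI:EU:C:2000:689", "ljn": ["AD4228"]},
    {"id": "ECLI:EU:C:2001:112", "ljn": ["AD4244", "AL3652"]},
]

lookup = ljn_to_ecli(sample_items)
print(lookup["AL3652"])
```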
Parse the 'proceduresoorten' value list (which is probably mostly static)
Returns | |
A list of flat dicts, with keys Naam, Identifier, for example: {'Identifier': 'http://psi.rechtspraak.nl/procedure#artikel81ROzaken', 'Naam': 'Artikel 81 RO-zaken'} |
Parse the 'rechtsgebieden' value list (which is probably mostly static), the data of which seems to be a depth-2 tree.
Returns | |
as a dict with items like: 'http://psi.rechtspraak.nl/rechtsgebied#bestuursrecht': ['Bestuursrecht'], and: 'http://psi.rechtspraak.nl/rechtsgebied#bestuursrecht_ambtenarenrecht': ['Ambtenarenrecht', 'Bestuursrecht'], Where
|
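To make the depth-2 tree explicit, the sketch below regroups such a dict by top-level name. It assumes, based only on the two example items above, that a one-element list marks a top-level rechtsgebied and that a two-element list is [own name, parent name]; `group_by_parent` is a hypothetical helper.

```python
def group_by_parent(rechtsgebieden):
    """Regroup {identifier: [name] or [name, parent]} into {parent: [child names]}."""
    tree = {}
    for names in rechtsgebieden.values():
        if len(names) == 1:
            # Assumed: a single name is a top-level entry
            tree.setdefault(names[0], [])
        else:
            # Assumed: the first name is the entry itself, the last is its parent
            tree.setdefault(names[-1], []).append(names[0])
    return tree


sample = {
    "http://psi.rechtspraak.nl/rechtsgebied#bestuursrecht": ["Bestuursrecht"],
    "http://psi.rechtspraak.nl/rechtsgebied#bestuursrecht_ambtenarenrecht": [
        "Ambtenarenrecht",
        "Bestuursrecht",
    ],
}

print(group_by_parent(sample))
```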
Takes search result etree (as given by search()), and returns a list of dicts like:
{ 'ecli': 'ECLI:NL:GHARL:2022:7129', 'title': 'ECLI:NL:GHARL:2022:7129, Gerechtshof Arnhem-Leeuwarden, 16-08-2022, 200.272.381/01', 'summary': 'some text made shorter for this docstring example', 'updated': '2023-01-01T13:29:23Z', 'link': 'https://uitspraken.rechtspraak.nl/InzienDocument?id=ECLI:NL:GHARL:2022:7129', 'xml': 'https://data.rechtspraak.nl/uitspraken/content?id=ECLI:NL:GHARL:2022:7129', }
Notes:
- 'xml' is augmented based on the ecli and does not come from the search results
- keys may be missing (in practice probably just summary?)
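Because keys may be missing, code that consumes these dicts is safer with `.get()` than with direct indexing. A small sketch, where `describe` is a hypothetical helper and the fallback strings are arbitrary choices:

```python
def describe(result):
    """Return a one-line description of a search result dict, tolerating missing keys."""
    ecli = result.get("ecli", "(no ecli)")
    title = result.get("title", ecli)  # fall back to the ECLI when there is no title
    summary = result.get("summary")    # reportedly the key most likely to be absent
    if summary:
        return title + ": " + summary
    return title


with_summary = {"ecli": "ECLI:NL:GHARL:2022:7129", "title": "Some title", "summary": "some text"}
without_summary = {"ecli": "ECLI:NL:GHARL:2022:7129"}

print(describe(with_summary))
print(describe(without_summary))
```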
Post a search to the public API on data.rechtspraak.nl, based on a dict of parameters.
See also:
- https://www.rechtspraak.nl/SiteCollectionDocuments/Technische-documentatie-Open-Data-van-de-Rechtspraak.pdf
Note that when you give it nonsensical parameters, like date=2022-02-30, the service won't return valid XML, so the XML parse raises an exception.
Parameters | |
params | parameters like:
These are handed to urlencode, so could be either a list of tuples, or a dict, but because you are likely to repeat variables to specify ranges, 'list of tuples' should be your habit, e.g.: [ ("modified", "2023-01-01"), ("modified", "2023-01-05") ] |
Returns | |
etree object for the search (or raises an exception) CONSIDER: returning only the urls |
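The sketch below illustrates why a list of tuples is the better habit and how a caller might guard against the invalid-XML case mentioned above. It is not the module's implementation: `SEARCH_URL`, `build_search_url` and `parse_search_response` are hypothetical names, the URL is the one named in the module description, and the standard library's ElementTree stands in for whichever etree the module actually uses.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# URL taken from the module description; the constant name is hypothetical.
SEARCH_URL = "https://data.rechtspraak.nl/uitspraken/zoeken"


def build_search_url(params):
    """Encode params (list of tuples or dict) onto the search URL."""
    return SEARCH_URL + "?" + urllib.parse.urlencode(params)


def parse_search_response(xml_bytes):
    """Parse a response body; return None when the service did not send valid XML."""
    try:
        return ET.fromstring(xml_bytes)
    except ET.ParseError:
        return None


# A list of tuples can repeat a variable, which is how a range is specified.
range_params = [("modified", "2023-01-01"), ("modified", "2023-01-05")]
print(build_search_url(range_params))

# A dict can hold each key only once, so it cannot express the same range.
print(build_search_url({"modified": "2023-01-05"}))
```

Returning None on a parse failure is a design choice for this sketch only; search() itself is documented to raise an exception in that case.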
Given the open-rechtspraak XML, specifically the uitspraak or conclusie node under the root, tries to give text in paragraph-sized chunks at a time (actually determined by the document structure).
Mainly used by parse_content()
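The general idea of structure-driven chunking can be sketched as follows. This is a simplified illustration, not the actual logic of this function: it treats every direct child of the given node as one chunk, and the element names in the sample are made up for the example (real documents are namespaced and more deeply nested).

```python
import xml.etree.ElementTree as ET


def paragraph_chunks(node):
    """Return the text of each direct child of node as one whitespace-normalized chunk."""
    chunks = []
    for child in node:
        # itertext() also collects text inside nested inline elements
        text = " ".join("".join(child.itertext()).split())
        if text:  # skip children that contain only whitespace
            chunks.append(text)
    return chunks


# Made-up sample; element names are illustrative only.
sample = ET.fromstring(
    "<uitspraak>"
    "<para>Eerste alinea.</para>"
    "<para>Tweede <emphasis>alinea</emphasis>.</para>"
    "<para>   </para>"
    "</uitspraak>"
)

print(paragraph_chunks(sample))
```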
Undocumented
Value |
|
Undocumented
Value |
|
Undocumented
Value |
|