Schematron validation
Lino can generate the XML of a Peppol document but is currently unable to validate it. Here is why.
All code snippets on this page (lines starting with >>>) are tested as part of
our development workflow. The following snippet initializes a demo project to
use throughout this page.
>>> from lino_book.projects.cosi1.startup import *
Let’s get our latest sales invoice and call XMLMaker.make_xml_file() on it.
(The following snippet is from Outbound documents, but here we will focus on
validation.)
>>> ar = rt.login()
>>> qs = trading.VatProductInvoice.objects.filter(journal__ref="SLS")
>>> obj = qs.order_by("accounting_period__year", "number").last()
>>> obj
VatProductInvoice #177 ('SLS 15/2015')
We have an invoice, and now we can call its XMLMaker.make_xml_file() method to
render its XML file:
>>> xmlfile = obj.make_xml_file(ar)
Make .../cosi1/media/xml/2015/SLS-2015-15.xml from SLS 15/2015 ...
The jinja.XmlMaker.xml_validator_file() points to the file
PEPPOL-EN16931-UBL.sch, which is an unmodified copy from
https://docs.peppol.eu/poacc/billing/3.0/:
>>> obj.xml_validator_file
PosixPath('.../lino_xl/lib/vat/XSD/PEPPOL-EN16931-UBL.sch')
Right now the jinja.XmlMaker.make_xml_file() method does nothing when the
validator file ends with “.sch”, because we haven’t yet found a way to run
Schematron validation under Python. If you look at the code, you can see that
we tried lxml and saxon.
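As a rough illustration of the behaviour described above, here is a minimal sketch of a dispatch on the validator file suffix. It is not Lino’s actual code: the function name pick_validation_step and the returned labels are hypothetical, and we assume for the sake of the example that “.xsd” files are the kind that does get validated.

```python
from pathlib import Path


def pick_validation_step(validator_file):
    """Return a label for the validation step (hypothetical helper, not Lino API)."""
    suffix = Path(validator_file).suffix.lower()
    if suffix == ".xsd":
        # Assumption: XML Schema files are validated.
        return "xsd"
    if suffix == ".sch":
        # Schematron files: no validation is run for now.
        return "skip"
    raise ValueError(f"Unsupported validator file: {validator_file}")


print(pick_validation_step("PEPPOL-EN16931-UBL.sch"))
```

With such a dispatch, adding Schematron support later would only mean replacing the “skip” branch with a real validation call.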
The third and most promising approach, Robbert Harms’ pyschematron package, is
tested in the following snippet. The remaining tests in this document are
skipped unless you have pyschematron installed.
>>> from importlib.util import find_spec
>>> if not find_spec('pyschematron'):
...     pytest.skip('this doctest requires pyschematron')
>>> from pyschematron import validate_document
>>> from lxml import etree
>>> result = validate_document(xmlfile, obj.xml_validator_file)
>>> result.is_valid()
True
>>> print(etree.tostring(result.get_svrl(), pretty_print=True).decode(), end='')
...
<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl" xmlns:sch="http://purl.oclc.org/dsdl/schematron" xmlns:xs="http://www.w3.org/2001/XMLSchema" schemaVersion="iso">
<svrl:metadata xmlns:dct="http://purl.org/dc/terms/" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:pysch="https://github.com/robbert-harms/pyschematron">
<dct:creator>
<dct:agent>
<skos:prefLabel>PySchematron 1.1.6</skos:prefLabel>
</dct:agent>
</dct:creator>
<dct:created>...</dct:created>
<dct:source>
<rdf:Description>
<dct:creator>
<dct:Agent>
<skos:prefLabel>PySchematron 1.1.6</skos:prefLabel>
</dct:Agent>
</dct:creator>
<dct:created>...</dct:created>
</rdf:Description>
</dct:source>
</svrl:metadata>
</svrl:schematron-output>
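The report shown above contains only metadata. When a document violates a rule, an SVRL report normally carries svrl:failed-assert elements. The following is a minimal sketch of how such a report could be inspected using only the standard library. The sample SVRL string is hand-written for illustration and is not output from Lino or pyschematron; the helper name failed_asserts is hypothetical.

```python
import xml.etree.ElementTree as ET

SVRL_NS = {"svrl": "http://purl.oclc.org/dsdl/svrl"}

# Hand-written sample for illustration only (rule id and message are made up).
SAMPLE_SVRL = """\
<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl">
  <svrl:failed-assert id="EXAMPLE-01" flag="fatal" location="/Invoice" test="cbc:ID">
    <svrl:text>Example message: an invoice needs an identifier.</svrl:text>
  </svrl:failed-assert>
</svrl:schematron-output>
"""

# A report without any failed assertions.
EMPTY_SVRL = '<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl"/>'


def failed_asserts(svrl_text):
    """Return a list of (id, location, message) for each svrl:failed-assert."""
    root = ET.fromstring(svrl_text)
    found = []
    for node in root.iter("{http://purl.oclc.org/dsdl/svrl}failed-assert"):
        text = node.findtext("svrl:text", default="", namespaces=SVRL_NS)
        # Collapse whitespace so that multi-line messages become one line.
        found.append((node.get("id"), node.get("location"), " ".join(text.split())))
    return found


for rule_id, location, message in failed_asserts(SAMPLE_SVRL):
    print(f"{rule_id} at {location}: {message}")
```

An empty list from such a helper would correspond to a report like the one above, where no assertion failed.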
We agree with Robbert when he writes in his README file: “In the future we hope to expand this library with an XSLT transformation based processing. Unfortunately XSLT transformations require an XSLT processor, which is currently not available in Python for XSLT >= 2.0.”
There are other people who would like to validate XML using Schematron in Python without needing a Java machine.