Schematron validation
Lino can generate the XML of a Peppol document but is currently unable to validate it. Here is why.
Note
Code snippets in this document (lines starting with >>>) get tested as part of our development workflow. The following initialization snippet tells you which demo project is being used.
>>> from lino_book.projects.cosi1.startup import *
Let’s get our latest sales invoice and call XMLMaker.make_xml_file() on it. (The following snippet is from Outbound documents, but here we will focus on validation.)
>>> ar = rt.login()
>>> qs = trading.VatProductInvoice.objects.filter(journal__ref="SLS")
>>> obj = qs.order_by("accounting_period__year", "number").last()
>>> obj
VatProductInvoice #177 ('SLS 15/2015')
We have an invoice and now we can call its XMLMaker.make_xml_file() method to render its XML file and then validate it:
>>> xmlfile, url = obj.make_xml_file(ar)
Make .../cosi1/media/xml/2015/SLS-177.xml from SLS 15/2015 ...
Validate SLS-177.xml against .../lino_xl/lib/vat/XSD/PEPPOL-EN16931-UBL.sch ...
We can see that jinja.XmlMaker.xml_validator_file() uses the file PEPPOL-EN16931-UBL.sch, which is an unmodified copy from https://docs.peppol.eu/poacc/billing/3.0/.
The logger message lies a bit: right now the jinja.XmlMaker.make_xml_file() method does nothing when the validator file ends with “.sch”. This is because we haven’t yet found a way to run Schematron validation under Python. If you look at the code, you can see that we tried lxml and saxon.
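The behaviour described above can be pictured as a dispatch on the suffix of the validator file. The following is only a minimal sketch and not Lino's actual code: the function name `choose_validation` and the labels it returns are hypothetical, and we assume that ".xsd" files are the case where validation does run.

```python
from pathlib import Path


def choose_validation(validator_file):
    """Return a label for the kind of validation to run, or None to skip.

    Hypothetical helper for illustration only; not part of Lino.
    """
    suffix = Path(validator_file).suffix.lower()
    if suffix == ".sch":
        # Schematron: no usable validator under Python yet, so skip.
        return None
    if suffix == ".xsd":
        # Assumption: XML Schema files are validated as usual.
        return "xsd"
    raise ValueError(f"Unsupported validator file: {validator_file}")


print(choose_validation("PEPPOL-EN16931-UBL.sch"))
```

With such a dispatch, a ".sch" validator file is accepted without error but triggers no validation, which is why the "Validate ..." log message above is misleading.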
The third and most promising approach, Robbert Harms’ pyschematron package, is tested in the following snippet.
>>> from importlib.util import find_spec
>>> if not find_spec('pyschematron'):
...     pytest.skip('this doctest requires pyschematron')
>>> from pyschematron import validate_document
>>> from lxml import etree
>>> result = validate_document(xmlfile, obj.xml_validator_file)
>>> result.is_valid()
True
>>> print(etree.tostring(result.get_svrl(), pretty_print=True).decode(), end='')
...
<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl" xmlns:sch="http://purl.oclc.org/dsdl/schematron" xmlns:xs="http://www.w3.org/2001/XMLSchema" schemaVersion="iso">
<svrl:metadata xmlns:dct="http://purl.org/dc/terms/" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:pysch="https://github.com/robbert-harms/pyschematron">
<dct:creator>
<dct:agent>
<skos:prefLabel>PySchematron 1.1.6</skos:prefLabel>
</dct:agent>
</dct:creator>
<dct:created>...</dct:created>
<dct:source>
<rdf:Description>
<dct:creator>
<dct:Agent>
<skos:prefLabel>PySchematron 1.1.6</skos:prefLabel>
</dct:Agent>
</dct:creator>
<dct:created>...</dct:created>
</rdf:Description>
</dct:source>
</svrl:metadata>
</svrl:schematron-output>
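One way to read an SVRL report is to count its `svrl:fired-rule` and `svrl:failed-assert` elements, which are standard SVRL element names. The following sketch uses only the standard library; the helper name `summarize_svrl` and the sample report are hypothetical and written by hand for illustration, not produced by pyschematron.

```python
import xml.etree.ElementTree as ET

SVRL_NS = {"svrl": "http://purl.oclc.org/dsdl/svrl"}


def summarize_svrl(svrl_xml):
    """Count fired rules and failed assertions in an SVRL report string."""
    root = ET.fromstring(svrl_xml)
    return {
        "fired_rules": len(root.findall(".//svrl:fired-rule", SVRL_NS)),
        "failed_asserts": len(root.findall(".//svrl:failed-assert", SVRL_NS)),
    }


# Hypothetical hand-written report: one fired rule, one failed assertion.
SAMPLE = """\
<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl">
  <svrl:fired-rule context="/Invoice"/>
  <svrl:failed-assert test="cbc:ID" location="/Invoice">
    <svrl:text>An invoice must have an ID.</svrl:text>
  </svrl:failed-assert>
</svrl:schematron-output>
"""

print(summarize_svrl(SAMPLE))
```

The same function could be given the string returned by `etree.tostring(result.get_svrl()).decode()`. The report as printed above contains only an `svrl:metadata` element, so both counts would be zero for it.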
We agree with Robbert, who writes in his README file: “In the future we hope to expand this library with an XSLT transformation based processing. Unfortunately XSLT transformations require an XSLT processor, which is currently not available in Python for XSLT >= 2.0.”
There are other people who would like to validate XML using Schematron in Python without Java.