Reaching back to 2008. Yes. Decade(s) ago. Python 2.5.
I was reminded of it when a former colleague pinged me about https://github.com/slott56/TigerShark.
Yes, it's an X12/EDI message parsing library from -- well -- decades ago.
What is all this about?
Short answer: Parsing X12 EDI messages, which have an obscure-as-hell structure.
Long Answer: EDI (Electronic Data Interchange) is a way for business enterprises and government agencies to exchange data in well-defined formats. See https://www.edibasics.com/what-is-edi/.
It sounds so simple and generic. It's so old, it predates HTML, XML, JSON, etc. Therefore, the formats are -- well -- weird.
There's a "standard", X12, that defines these messages. See https://x12.org and https://www.stedi.com/edi/x12.
But. The exchange of messages used to be done through proprietary networks and software. Therefore, compliance with the standard is sometimes incomplete. (Remember, this is old.)
Back In The Day
Some history
- Python as Config Language -- Forget XML and INI files
- Two Python Config-File Design Patterns
- Configuration File Scalability -- Who Knew? (Revised)
- Technical Debt, the Cost of Cheap and "Get This Done ACAP"
- Synchronicity and Document Object Models
- POPO and GOPS - Plain Old Python Objects and Good Old Python Syntax
I wrote an X12 parser in Python.
It transforms X12 text into Plain Old Python Objects (POPO.)
Back in the day (2008) this was targeted for Python 2.5.
It's Time
Nowadays, this does not need to be quite so complicated.
Modern Python has a few changes since release 2.5. Two are central to this project:
- type annotations
- classes maintain the order of the definitions
These are the backbone of dataclasses (and pydantic and attrs.)
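A tiny illustration of why these two features matter, using only the standard library. The class and field names here are made up for the example; they are not from TigerShark.

```python
from dataclasses import dataclass, fields


@dataclass
class Example:
    # Hypothetical fields; the point is only the annotations and their order.
    first: str
    second: int
    third: float = 0.0


# fields() reports the annotated attributes in the order they were declared,
# along with the annotated type of each one.
names = [f.name for f in fields(Example)]
types = [f.type for f in fields(Example)]
print(names)
print(types)
```

The declared order and the declared types are available at run time, which is what lets a positional format be mapped onto named attributes without a separate description of the layout.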
I believe there are two parts to the rewrite.
- Create dataclass-like class definitions for segments and loops. These generally come from the non-Python configuration files used elsewhere. The Python is built from this. Once.
- Create a generic parser protocol that can extract the segments, loops, and atomic fields from the X12 messages. This becomes a superclass feature of all X12 components.
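Here is a minimal sketch of how the two parts might fit together. It is not the TigerShark design. The `Segment` superclass, the `DEM` segment, and its fields are all hypothetical, and the `*` element separator is assumed for the example (real messages declare their own delimiters). Loops and optional elements are left out entirely.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


class Segment:
    """Hypothetical superclass: every segment class inherits the parser."""

    segment_id: ClassVar[str]

    @classmethod
    def parse(cls, text: str, element_sep: str = "*"):
        tag, *values = text.split(element_sep)
        if tag != cls.segment_id:
            raise ValueError(f"expected {cls.segment_id!r}, got {tag!r}")
        # Field declaration order matches element position; the annotation
        # (str, int, ...) doubles as the conversion function.
        kwargs = {f.name: f.type(v) for f, v in zip(fields(cls), values)}
        return cls(**kwargs)


@dataclass
class Demo(Segment):
    """A made-up segment, standing in for a generated class definition."""

    segment_id: ClassVar[str] = "DEM"
    name: str
    count: int


demo = Demo.parse("DEM*widget*3")
print(demo)
```

The segment class carries only the field definitions. The parsing lives in one place, the superclass, so there is a single level of interpretation between the X12 text and the Python object.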
This should be much simpler than the old version. Which was very complicated. The old release had two levels of interpretation of the X12 content:
- Generic segments and loops
- A Pythonic Façade over the generic structure
I think this was (and continues to be) a bad idea.
(Progress will be flaky. I have a book to write, also.)
First Things First
Some updates to reflect Python 3.11 and better GitHub practices. I'll make the documentation more visible as a first step. I may rewrite the diagrams in PlantUML, also.
Just a few small cleanups before throwing the entire thing away and beginning again.