Say, for instance, I have a data set that is self-describing. The first few well-structured records define data type IDs, each with a name and a record length. They are followed by content records, each of which starts with a data type ID and contains a variable amount of data, depending on that ID.

It would be easy enough to describe the definition records using BNF, EBNF, or ABNF, but how would one concisely describe the content records, whose length is defined in the definition records?

Here is an example of describing the classic NetCDF data format with a BNF-like notation, but not concisely, because the lengths of the data records are not specified as a function of the data in the earlier dim and var definitions.

Mike Godin

1 Answer

Are you asking how to define the content of the content records? You made it clear that they're already defined in terms of the amount of data. If each data type ID implies not only a data length but also a data structure, it's straightforward, even in BNF, with one set of productions for each data type ID. Is that what you mean? (It's even likely to be LR(1).)
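To make the question concrete, here is a minimal Python sketch of the table-driven equivalent of "one set of productions per data type ID": the definition records are parsed first, and the resulting table decides how many bytes each content record consumes. The byte layout is purely an assumption for illustration (a one-byte definition count; each definition is a one-byte ID, a one-byte name length, the name, and a two-byte big-endian payload length; each content record is a one-byte ID followed by the payload). It is not NetCDF, and the function name `parse` is hypothetical.

```python
import struct


def parse(buf):
    """Parse a hypothetical self-describing byte stream.

    Returns (defs, records), where defs maps type ID -> (name, payload_len)
    and records is a list of (name, payload) tuples.
    """
    pos = 0
    (n_defs,) = struct.unpack_from(">B", buf, pos)
    pos += 1

    # Definition records: fixed structure, describable in plain BNF.
    defs = {}
    for _ in range(n_defs):
        type_id, name_len = struct.unpack_from(">BB", buf, pos)
        pos += 2
        name = buf[pos:pos + name_len].decode("ascii")
        pos += name_len
        (payload_len,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        defs[type_id] = (name, payload_len)

    # Content records: the length comes from the table built above,
    # which is the context-sensitive part a plain grammar cannot express.
    records = []
    while pos < len(buf):
        type_id = buf[pos]
        pos += 1
        if type_id not in defs:
            raise ValueError(f"undefined type ID {type_id} at offset {pos - 1}")
        name, payload_len = defs[type_id]
        payload = buf[pos:pos + payload_len]
        if len(payload) != payload_len:
            raise ValueError(f"truncated record for type ID {type_id}")
        pos += payload_len
        records.append((name, payload))
    return defs, records


# Two definitions ("temp": 2 bytes, "id": 3 bytes), then two content records.
sample = (
    bytes([2])
    + bytes([1, 4]) + b"temp" + struct.pack(">H", 2)
    + bytes([2, 2]) + b"id" + struct.pack(">H", 3)
    + bytes([1]) + b"\x00\x10"
    + bytes([2]) + b"abc"
)
defs, records = parse(sample)
```

If the set of type IDs were fixed in advance, each entry of `defs` would correspond to one production for that ID; because the IDs arrive in the data itself, the table has to be built at parse time.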

I am the creator of an Expert System, named XTRAN, that manipulates over 30 computer languages, as well as data and text. I got tired of writing parsers, so I created a parsing engine that executes EBNF at parse time, and I feed it the EBNF via the Expert System's rules language. Since EBNF itself is meta, the schema I use to parse and store it for execution at parse time is meta-meta.

XTRAN's rules language also provides a data base capability in which a data base is in-memory, content-addressable, and stored as a sparse matrix. It's effectively an n-space, with each cell addressed via a list of subscripts, with each subscript being either elided, an integer, or a text string. So I can construct the scenario you describe quickly, by storing the data descriptions in the same data base that contains the content records. It's loosely analogous to a relational data base describing its schema via its own contents.
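As a rough illustration of that idea (not XTRAN's actual implementation or syntax), a sparse, content-addressable n-space can be sketched in Python as a dictionary keyed by tuples of subscripts. The class name `SparseStore`, its methods, and the treatment of an elided subscript as a wildcard (`None`) in queries are all assumptions made for this sketch.

```python
class SparseStore:
    """Sparse n-space: each cell is addressed by a tuple of subscripts."""

    def __init__(self):
        self._cells = {}  # only cells that were stored take up space

    def put(self, subscripts, value):
        self._cells[tuple(subscripts)] = value

    def get(self, subscripts):
        return self._cells[tuple(subscripts)]

    def match(self, pattern):
        """Return (subscripts, value) pairs matching the pattern.

        None in the pattern stands for an elided subscript (assumption:
        it matches any value at that position).
        """
        pattern = tuple(pattern)
        return [
            (key, value)
            for key, value in self._cells.items()
            if len(key) == len(pattern)
            and all(p is None or p == k for p, k in zip(pattern, key))
        ]


store = SparseStore()

# Data descriptions and content records live in the same store.
store.put(("def", 1, "name"), "temp")
store.put(("def", 1, "length"), 2)
store.put(("rec", 0, "type"), 1)
store.put(("rec", 0, "data"), b"\x00\x10")

# Look up the expected length of record 0 via its own type definition.
rec_type = store.get(("rec", 0, "type"))
expected_len = store.get(("def", rec_type, "length"))
```

Here the `"def"` cells play the role of the schema and the `"rec"` cells hold the content, so a record's length is found by querying the same store that holds the record.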

FWIW, we call XTRAN's rules language meta-code, because it's a language that can manipulate other languages (as well as itself).