What is the definition of a "valid program"?

Question

ISO/IEC 9899:202x (E) working draft — December 11, 2020 N2596, footnote 9:

... an implementation is free to produce any number of diagnostic messages, often referred to as warnings, as long as a valid program is still correctly translated. It can also successfully translate an invalid program.

Searching the standard for a definition of "valid / invalid program" gives no results. In fact, footnote 9 is the only place where "valid / invalid program" is mentioned.

Note: yes, the footnote is non-normative:

In ISO standards, notes are without exception non-normative.

Source: https://www.iso.org/schema/isosts/v1.0/doc/n-6ew0.html.

However, people do frequently use the term "valid / invalid program".

Can someone please help to suggest / deduce the definition (relative to the standard) of the term "valid program"?

The question may look silly at first glance. However, there are cases where people have different understandings of the term "valid program", and misinterpretations follow.

My guess: valid program -- a program which does not violate any syntax rule or constraint.

Note: "semantics rule" is intentionally not included in this definition because per Rice's theorem "non-trivial semantic properties of programs are undecidable".
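
To make the distinction in this guess concrete, here is a minimal sketch. The names `increment` and `increment_is_defined` are hypothetical and chosen only for illustration. The comment shows a constraint violation that must be diagnosed at translation time, while the function below it violates no syntax rule or constraint, yet its behavior still depends on a run-time value.

```c
#include <limits.h>

/* A constraint violation, kept inside a comment so that this file still
 * translates: the operands of % must have integer type (C11 6.5.5p2),
 * so a conforming implementation must diagnose a line such as
 *
 *     int r = 1.5 % 2;
 */

/* Hypothetical example: no syntax rule or constraint is violated here,
 * so the guessed definition would call this "valid". Whether the
 * behavior is defined depends on the run-time value, because x + 1
 * overflows when x == INT_MAX, which is undefined behavior. */
int increment(int x)
{
    return x + 1;
}

/* Hypothetical helper: reports whether increment(x) has defined behavior. */
int increment_is_defined(int x)
{
    return x < INT_MAX;
}
```

Under the guessed definition, a translator only needs to look at the source text to classify it, which is what keeps the property decidable.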

Is such a definition appropriate? If not, what is the appropriate definition?

score 1 · Answer 1 · answered Feb 16 '22 at 21:21


At least in older versions of the Standard, a Conforming C Program is any source text which is accepted by at least one Conforming C Implementation somewhere in the universe. Given that conforming implementations are allowed to extend the language to accept almost any arbitrary source text, including programs that contain constraint violations, provided that they only accept the latter after having issued at least one diagnostic, the question of whether any particular source text is a Conforming C Program is determined by the existence or non-existence of implementations that accept it, rather than by any trait of the source text itself.

answered Feb 16 '22 at 21:21

supercat


It seems that "valid program", "invalid program" are non-standard terms. The standard ones are "strictly conforming program", "conforming program", "non-conforming program" (though not mentioned). – pmor Feb 18 '22 at 00:52
Meaning that a program P can be conforming (i.e. acceptable) for conforming implementation I1, but non-conforming (i.e. not acceptable) for conforming implementation I2. – pmor Feb 18 '22 at 01:02
@pmor: The Standard defines the term "strictly conforming program" narrowly enough to deliberately exclude any programs that wouldn't be supportable on all hosted implementations (it also excludes all non-trivial programs for freestanding implementations), and the term "conforming C program" broadly enough to avoid excluding any useful programs. There is no category of programs that should be expected to run on most implementations, though not necessarily all. – supercat Feb 18 '22 at 08:18

Ext3h · Answer 2 · 2021-11-06T10:53:56.607

-1

Your assumption that a valid program may not violate any constraints is correct. So is your assumption that correctness is impossibly hard to prove via static analysis and can only be attested for a specific execution pass.

It's the definition of the "invalid program" which is fuzzy. A program can still be valid for a limited set of inputs, so you can't label the program invalid entirely. Only programs which are invalid for every possible input are invalid as a whole. Likewise, only a program which is valid for every possible input is "truly valid". In reality, there is hardly any non-trivial program which would not have edge cases where it's still invalid.

To sum that up into a formal definition:

A program is valid if there is at least a single possible input for which no constraints are violated.

A program is invalid only if it violates constraints for all possible inputs.

And please don't confuse valid/invalid with correct/incorrect. The criterion for the latter is correctness for all possible inputs.
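
As a rough illustration of the "valid for some inputs" idea, here is a sketch that assumes "violates constraints" is read loosely as "runs into undefined behavior at run time" (the standard's own constraints are checked at translation time and do not depend on input). The names `divide` and `divide_is_defined` are hypothetical.

```c
#include <limits.h>

/* Hypothetical example: translates without any diagnostic being
 * required, but the result is only defined for some inputs. */
int divide(int a, int b)
{
    return a / b;
}

/* Hypothetical helper: reports whether divide(a, b) has defined behavior.
 * Integer division is undefined when the divisor is 0, and also when the
 * quotient is not representable, as in INT_MIN / -1. */
int divide_is_defined(int a, int b)
{
    if (b == 0)
        return 0;
    if (a == INT_MIN && b == -1)
        return 0;
    return 1;
}
```

By the definition above, a program built around `divide` would count as valid because inputs such as `(6, 3)` exist, even though `(1, 0)` does not have defined behavior.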

edited Nov 06 '21 at 10:53

answered Nov 06 '21 at 10:44

Ext3h


This definition is not very useful because any program that does not violate syntax rules or constraints can trivially be transformed to a valid (by your definition) program. Add `if (argc>=1001) return 0;` or some such nonsense at the beginning of the program and bam, there is input for which it violates nothing. Why should this program be treated differently from a program that lacks such a check at the beginning? – n. m. could be an AI Nov 06 '21 at 11:22
Because it actually is a valid program for some input. It's the fine line between discarding all undefined behavior and its side effects, and still having to keep that one well defined path alive. I'm fully aware that's counterintuitive, but that's just how it works if the language definition is built around making stuff explicitly undefined instead of illegal. – Ext3h Nov 06 '21 at 11:39
A language with no undefined behavior doesn't have this issue. If it's syntactically correct, it's also valid. E.g. declaring a division by zero to throw an exception rather than being undefined makes a huge difference. Same for defining the consequences of messing up memory management, and all the other issues languages like C# or Java formally clarified. – Ext3h Nov 06 '21 at 11:44
Absence of undefined behaviour on uninteresting inputs is an uninteresting property. Sure you can define it, but what would you do with it? – n. m. could be an AI Nov 06 '21 at 12:09
“A program is valid if there is at least a single possible input for which no constraints are violated” makes validity undecidable as, if there were a function to determine validity, a program with no input could call that function to ask about itself and then violate a constraint if and only if the function reports the program is valid. – Eric Postpischil Nov 06 '21 at 12:36
Validity *is undecidable* for an arbitrary program. That's not a mistake, but the rationale behind the requirement that you must not reject any valid program, even if you can't decide. Which has the consequence that you have to accept even entirely invalid programs unless you can prove them as such. – Ext3h Nov 06 '21 at 12:52
Even if a program violates a constraint, an implementation would be allowed to issue a diagnostic and then process the program usefully anyhow. If there exists an implementation anywhere in the universe that would process a given source text in such fashion, the source text is a Conforming C Program. – supercat Feb 16 '22 at 21:11
@supercat Actually no, a C compiler implementation which ignores constraints (read as: It implements non-standard extensions) does not render the source conforming to any of the mainline C standards. That paragraph only expresses that implementing extensions doesn't render the implementation non-conforming as long as all valid programs are still accepted. – Ext3h Feb 17 '22 at 07:35
@Ext3h: From N1570 5.1.1.3: "A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint, even if the behavior is also explicitly specified as undefined or implementation-defined." From whence do you get the notion that an implementation would not be allowed to usefully process a program that contains constraint violations? This was the compromise reached between e.g.... – supercat Feb 17 '22 at 15:51
@Ext3h: ...people who recognized that zero-sized arrays within structures were useful, and made it practical to do things that were otherwise not possible, and those who thought they were evil because they violated their sense of aesthetics. If a program used a zero-sized array, the compiler would issue a diagnostic, the programmer would ignore it, and everything would proceed as it should; were it not for such compromises, the Standard would have been soundly rejected. – supercat Feb 17 '22 at 15:54
@Ext3h: The goal of the Standard wasn't to distinguish between valid and invalid programs, but rather between programs that *all* implementations would be required to process usefully, and everything else. It would have been useful for it to recognize categories of programs that many implementations should be expected to process usefully, but which could not be accommodated universally, but C89 deliberately avoided doing so because that would either imply that some useful programs were "invalid", or imply that programmers shouldn't feel any obligation to support obscure platforms. – supercat Feb 17 '22 at 16:10
