It was suggested that monozygotic (MZ) twin studies could be used to prove causation. That won't completely work if the study is observational, e.g.:
While MZ co-twin control studies can provide more accurate estimates of the returns to education than analyses of single individuals, these studies do not entirely obviate the need to control for differences between university students and non-students that predate university attendance and might account for income differentials and even non-monetary outcomes.
So you'd have to keep the twins in the same environment right up until the separation of interest happens, which raises serious doubts about feasibility. For instance:
If it can be established that MZ twins are significantly different on dimensions related to university attendance that are also related to later life outcomes, then the MZ co-twin control method will not completely account for relevant confounders in estimating the returns to education, both financial and social.
Anyway, even for educational differences that are larger (and easier to measure) than merely good vs. bad high school, the income differences between MZ twins are fairly small:
Further, when MZ twins who are discordant for level of education are used to estimate the returns to education by comparing the income of the twin with more education to the income of the twin with less education, researchers often report estimates around 5% (Miller et al., 2006). Estimates of the returns to education vary based on the cohort examined, level of country development, demographic variables included in the model, types of corrections made to adjust for errors in measurement, and numerous other variables, but in all, the estimates often fall in the 3–15% range (Ashenfelter et al., 1999).
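To make the co-twin comparison in that quote concrete, here is a minimal sketch of a within-pair estimate: regress the twins' difference in log income on their difference in years of schooling, with no intercept. The function names, the toy data-generating rule, and every number below are my own illustrative assumptions, not anything taken from the cited studies.

```python
import math


def within_pair_return(pairs):
    """Per-year return to schooling estimated from education-discordant twin pairs.

    Each pair is ((years_a, income_a), (years_b, income_b)).
    This is a no-intercept regression of the within-pair difference in
    log income on the within-pair difference in years of schooling.
    """
    num = 0.0
    den = 0.0
    for (years_a, income_a), (years_b, income_b) in pairs:
        d_years = years_a - years_b
        d_log_income = math.log(income_a) - math.log(income_b)
        num += d_years * d_log_income
        den += d_years ** 2
    if den == 0:
        raise ValueError("no pair is discordant for education")
    return num / den


def toy_income(family_factor, years, true_return=0.05):
    # Hypothetical rule: income depends only on a factor shared by both
    # twins (family, genes) and on years of schooling.
    return family_factor * math.exp(true_return * years)


# Made-up pairs, NOT real data: (shared family factor, years twin A, years twin B)
toy_families = [(30000, 16, 12), (45000, 12, 14), (25000, 18, 16)]
toy_pairs = [
    ((a, toy_income(f, a)), (b, toy_income(f, b)))
    for f, a, b in toy_families
]

print(round(within_pair_return(toy_pairs), 4))
```

In this toy setup the shared family factor cancels out of each difference, so the estimate recovers the 5% per year that was built in. The cancellation only covers what the twins share: anything that differs within a pair before the schooling decision would stay in the difference and bias the estimate, which is exactly the caveat raised in the first two quotes.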
Although you've narrowed your question to the US, ironically the only MZ study I was able to find about educational quality around the age you're interested in is from Japan:
Using information on 1045 pairs of Japanese monozygotic twins, we examined differences in education by considering both the years of schooling (quantity) and the reputation of the last attended school (quality). We found that a difference in learning performance at 15 years of age is one of the key factors determining the differences [...]
Elsewhere, Ono (2007) shows that placement in a higher-quality college in
Japan has led to higher earnings. All other things being equal, in CTKJV [China, Taiwan,
South Korea, Japan and Vietnam], the
higher the school ranking, the more difficult it is to gain entry, and, therefore, only
students who are more competitive and/or the subject of greater educational
investment succeed in the required entrance exams.
Therefore, investment in a
child’s education, such as private tutoring (Ono, 2007; Dang, 2007; Ryu and
Kang, 2013), especially in early life, has become important. Thus, parental
decisions could have a greater influence on the educational outcomes of children
in these countries.
Which is why I also mentioned in my comments that the [structure of the] country['s educational system] makes a difference.
Unlike in the abstract, in their discussion of the results the authors of the Japanese study also emphasize what I said earlier about the difficulty of drawing conclusions, because MZ studies don't make clear what led to the difference in the quality of the high schools the twins attended:
Differences in learning performance at age 15 and educational attainment. In
both the OLS and the probit models, we found that difference in learning performance
at age 15 was the more influential factor in explaining the educational
differences between MZ twins. In all estimations, the sign of difference in learning
performance at age 15 was always positive when statistically significant,
implying that better learning performance at age 15 led to higher educational
attainment/investment. Interestingly, this challenges the conventional hypothesis
that MZ twins are strictly identical. By the age of 15, there were significant differences
between twins in their learning performance. This finding suggests the need
for further study to investigate the development of variances between twins
throughout their lifetimes.
As to the more general controversy, the introduction of another Japanese study is, ironically, mostly about US studies of school-quality impact, though not strictly about high school:
Since the Coleman Report, a congressional report on educational equality in the
United States, was released in 1966 (Coleman et al. 1966), one of the challenges for
economists in understanding more efficient resource allocation in schools is the rigorous
measurement of the impact of school inputs, which is sometimes called “school quality,”
on student achievements. The reason this Report became controversial is that Coleman
and his colleagues showed that school quality was only weakly associated with student
achievements, after controlling for characteristics of parents and communities. The
conclusion derived from the Report has been widely publicized as “money (= investments
in school quality) does not matter” for education.
Subsequently, an influential analysis conducted by Hanushek (1989, 1997)
summarized a large body of empirical evidence on the relationship between school inputs
and student achievements, by counting the results on each side of the debate. Because
there were more studies that reported an insignificant effect of school expenditures than
those showing a positive effect, Hanushek (1997) concluded his analysis with a very
often cited phrase, “there is not a strong or consistent relationship between student
performance and school resources” (p. 141). In other words, Hanushek supported the
central conclusion of the Coleman Report—school quality does not matter.
However, many economists have questioned Hanushek’s study. In particular, Hedges
et al. (1994) pointed out that Hanushek’s analysis was designed to give a small
probability of falsely detecting a significant effect of education inputs but not to avoid the
possibility of failing to detect such an effect if in fact it exists. Their meta-analysis, using
the same data as Hanushek, confirmed that school resources yielded general
improvement in student achievements, although publication bias remained a research
concern. Some experimental evaluations of school resources have also shown that school
resources do matter for education (Krueger 1999; Angrist & Lavy 1999; Case & Deaton
1999). More specifically, an assignment to a smaller class appeared to substantially raise
student achievements in Tennessee (Krueger 1999) and in Israel (Angrist & Lavy 1999),
while the same was true for a lower student–teacher ratio in South Africa (Case & Deaton
1999).
Further, Card & Krueger (1996) placed more emphasis on the education production
function, treating earnings as a dependent variable, rather than standardized test scores.
They claimed that test scores are good predictors of what students learned at school but
not their success in the labor market in later life. Their survey concluded that there is a
significant relationship between school resources and earnings. However, fewer studies
have explored whether school quality has a significant impact on both subsequent
earnings and student achievements, and it remains a paradox that many studies find only a
weak link between earnings and student achievements (e.g., Murane, Willet & Levy
1995).
It is noteworthy that, in many cases, the effect size of school resources is substantially
larger for students from low-income families and for minorities. Taken as a whole, a
consensus has emerged from more recent research that school quality raises student
achievements, particularly for students from low income families, although Hanushek’s
(2003) argument: “commonly used input policies are almost certainly inferior to altered
incentives within schools” (p. 64) must deserve greater attention.
Measuring the effects of school quality rigorously is, however, difficult, owing to data
and methodological limitations. Most prior studies have used the variations in school
inputs across schools, which may create a methodological difficulty in estimating the
impact of those school inputs on student achievements. For example, parents with a
strong motivation for their children’s education are more likely to choose good schools
and are willing to pay for expensive tuition and other mandatory fees. Those parents are
more likely to be involved in parental associations and other political activities to
improve school quality. In such situations, a positive relationship between school
resources and student achievements may be attributable to unobserved parental characteristics. It is often difficult to isolate the effects of such unobservable factors from
those of school inputs. In other words, children who attend a school that is rich in
resources are probably more likely to have greater advantages than others before
schooling.
Some of the most
sophisticated studies to address selection bias are the experimental evaluations conducted
by Krueger (1999), Angrist & Lavy (1999) and Case & Deaton (1999), as mentioned
above. Krueger’s study was designed as a randomized experiment, while the
quasi-experimental studies of Angrist & Lavy (1999) and Case & Deaton (1999) took
advantage of the situation in which assignments to school inputs were randomly
determined by chance.
In addition to those experimental studies, some recent research has used samples of
siblings (Altonji & Dunn 1996; Lindahl & Regner 2005) and twins (Behrman et al. 1996),
which can be regarded to some extent as natural experiments. These studies attempted to
compare the difference in educational experiences between siblings or twins and to
control the common unobserved family endowments and/or genetic makeups. Behrman
et al. (1996) and Lindahl & Regner (2005) used college data, while Altonji & Dunn
(1996) focused on secondary school. There is a large body of research that compares
siblings or twins in different years of education (e.g., Ashenfelter & Krueger, 1994), but
there is little research that examines the effect of school quality.
So, given the issues with MZ twin studies, let's try our luck with the [quasi-]experimental studies cited above, which are more limited in scope but hopefully methodologically better, namely:
- Angrist, J.D. & Lavy, V. (1999). Using Maimonides' rule to estimate the effect of class
size on scholastic achievement. The Quarterly Journal of Economics, 114,
533-575.
The twelfth century rabbinic scholar Maimonides proposed a maximum class
size of 40. This same maximum induces a nonlinear and nonmonotonic relationship
between grade enrollment and class size in Israeli public schools today.
Maimonides’ rule of 40 is used here to construct instrumental variables estimates
of effects of class size on test scores. The resulting identification strategy can be
viewed as an application of Donald Campbell’s regression-discontinuity design to
the class-size question. The estimates show that reducing class size induces a
significant and substantial increase in test scores for fourth and fifth graders,
although not for third graders.
- Case, A. & Deaton, A. (1999). School inputs and educational outcomes in South Africa.
The Quarterly Journal of Economics, 114, 1047-1084.
We examine the relationship between educational inputs—primarily pupil-teacher ratios—and school outcomes in South Africa immediately before the end of apartheid government. Black households were severely limited in their residential choice under apartheid and attended schools for which funding decisions were made centrally, by White-controlled entities over which they had no control. The allocations resulted in marked disparities in average class sizes. Controlling for household background variables, we find strong and significant effects of pupil-teacher ratios on enrollment, on educational achievement, and on test scores for numeracy.
- Krueger, A.B. (1999). Experimental estimates of education production functions. The
Quarterly Journal of Economics, 114, 497-532.
This paper analyzes data on 11,600 students and their teachers who were
randomly assigned to different size classes from kindergarten through third grade. Statistical methods are used to adjust for nonrandom attrition and transitions between classes. The main conclusions are (1) on average, performance on
standardized tests increases by four percentile points the first year students attend small classes; (2) the test score advantage of students in small classes expands by about one percentile point per year in subsequent years; (3) teacher aides and measured teacher characteristics have little effect; (4) class size has a
larger effect for minority students and those on free lunch; (5) Hawthorne effects were unlikely.
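As an aside on the Angrist & Lavy entry above: the "nonlinear and nonmonotonic relationship" between enrollment and class size is easy to see once the rule is written out. The sketch below uses the usual statement of Maimonides' rule as I understand it (split a grade into the fewest classes of at most 40 pupils and divide enrollment evenly). The function name and the example enrollments are my own choices, not the authors'.

```python
def maimonides_class_size(enrollment, cap=40):
    """Predicted average class size when no class may exceed `cap` pupils.

    A grade cohort is split into the smallest number of classes that
    keeps every class at or below the cap, and pupils are spread evenly.
    """
    if enrollment < 1:
        raise ValueError("enrollment must be at least 1")
    n_classes = (enrollment - 1) // cap + 1
    return enrollment / n_classes


# Class size drops sharply each time enrollment crosses a multiple of the cap.
for enrollment in (40, 41, 80, 81, 120, 121):
    print(enrollment, maimonides_class_size(enrollment))
```

A cohort of 40 gets one class of 40, while a cohort of 41 gets two classes averaging 20.5. Those jumps at 41, 81, 121 and so on are the discontinuities the abstract refers to when it describes the identification strategy as a regression-discontinuity design.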
So apparently smaller classes have better outcomes (but on measures taken fairly soon after the intervention); that's the actual finding of the latter group of studies. That's pretty far from what you were asking, but it's the only thing that can be easily established [quasi-]experimentally.