The claim is based on this: http://ebn.bmj.com/content/8/2/39.full
bmjupdates+ uses the same, explicit and reproducible quality filters as Evidence-Based Medicine (http://hiru.mcmaster.ca/ebmj/Ebmp_p.htm) and Evidence-Based Nursing. Applying these criteria to each article in over 110 premier clinical journals (about 50 000 articles per year), about 3000 articles (6%) pass muster—that is, have adequate methods to support their conclusions for key aspects of clinical care.
Keep in mind that this is bmjupdates promoting their own service, and part of the point of that service is to filter for a few of the best papers rather than to include everything that might be useful. This is sane and sensible, since clinicians cannot read 50K papers per year.
I'm going to withhold judgement on whether "most" (meaning 50%+) is justified, but I will address the 6% figure.
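For concreteness, the 6% is just the ratio of the two round numbers quoted above (roughly 3000 passing out of roughly 50 000 screened per year):

    # Quick check of the quoted round numbers (both are approximate).
    articles_screened = 50_000  # ~110 journals per year
    articles_passing = 3_000    # articles that pass the quality filters

    pass_rate = articles_passing / articles_screened
    print(f"pass: {pass_rate:.0%}, filtered out: {1 - pass_rate:.0%}")
    # -> pass: 6%, filtered out: 94%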
It is not true that 94% are too badly designed to improve patient care.
To be more specific, only about 3000 papers per year pass these filters, and they do not provide a breakdown of why any particular paper failed. The full criteria are reproduced below, with a small sketch of how one of them applies just after the quoted list:
Criteria for Review and Selection for Abstracting:
General
All English-language original and review articles in an issue of a candidate journal are considered for abstracting if they concern topics important to the clinical practice of internal medicine, general and family practice, surgery, psychiatry, paediatrics, or obstetrics and gynaecology. Access to foreign-language journals is provided through the systematic reviews we abstract, especially those in the Cochrane Library, which summarises articles taken from over 800 journals in several languages.
Prevention or treatment; quality improvement
• Random allocation of participants to interventions
• Outcome measures of known or probable clinical importance for ≥80% of the participants who entered the investigation.
Diagnosis
• Inclusion of a spectrum of participants, some (but not all) of whom have the disorder or derangement of interest
• Each participant must receive the new test and the diagnostic standard test
• Either an objective diagnostic standard or a contemporary clinical
diagnostic standard with demonstrably reproducible criteria for any
subjectively interpreted component
• Interpretation of the test without knowledge of the diagnostic standard result
• Interpretation of the diagnostic standard without knowledge of the test result.
Prognosis
• An inception cohort of persons, all initially free of the
outcome of interest
• Follow-up of ≥80% of patients until the occurrence of either a major study end point or the end of the study.
Causation
• Observations concerning the relation between exposures and
putative clinical outcome
• Prospective data collection with clearly
identified comparison group(s) for those at risk for the outcome of
interest (in descending order of preference from randomised controlled
trials, quasi-randomised controlled trials, nonrandomised controlled
trials, cohort studies with case by case matching or statistical
adjustment to create comparable groups, to nested case control
studies)
• Masking of observers of outcomes to exposures (this
criterion is assumed to be met if the outcome is objective).
Economics of health care programmes or intervention
• The economic
question must compare alternative courses of action in real or
hypothetical patients
• The alternative diagnostic or therapeutic
services or quality improvement strategies must be compared on the
basis of both the outcomes they produce (effectiveness) and the
resources they consume (costs)
• Evidence of effectiveness must come
from a study (or studies) that meets criteria for diagnosis,
treatment, quality assurance, or review articles
• Results should be
presented in terms of the incremental or additional costs and outcomes
incurred and a sensitivity analysis should be done.
Clinical prediction guides
• The guide must be generated in 1 set of
patients (training set) and validated in an independent set of real
not hypothetical patients (test set), and must pertain to treatment,
diagnosis, prognosis, or causation.
Differential diagnosis
• A cohort of patients who present with a
similar, initially undiagnosed but reproducibly defined clinical
problem
• Clinical setting is explicitly described
• Ascertainment of diagnosis for 80% of patients using a reproducible diagnostic workup strategy and follow-up until patients are diagnosed, or follow-up of 1 month for acute disorders or ≥1 year for chronic or relapsing disorders.
Systematic reviews
• The clinical topic being reviewed must be clearly
stated; there must be a description of how the evidence on this topic
was tracked down, from what sources, and with what inclusion and
exclusion criteria
• ≥1 article included in the review must meet the
above-noted criteria for treatment, diagnosis, prognosis, causation,
quality improvement, or the economics of health care programmes.
Source: Purpose and procedure
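To make it concrete how mechanical these filters are, here is a rough sketch in Python of the simplest one (prevention or treatment) as a checklist. This is purely my own illustration: the field names are mine, and the actual screening is done by human reviewers, not software.

    from dataclasses import dataclass

    @dataclass
    class TrialReport:
        # Properties a screener might note about a prevention/treatment paper.
        # Field names are my own shorthand, not the source's terminology.
        randomised_allocation: bool          # participants randomly allocated to interventions
        clinically_important_outcomes: bool  # outcomes of known or probable clinical importance
        followup_fraction: float             # fraction of entrants whose outcomes were measured

    def passes_treatment_filter(report: TrialReport) -> bool:
        # The two quoted criteria: random allocation, and outcome measures of
        # known or probable clinical importance for >=80% of participants
        # who entered the investigation.
        return (
            report.randomised_allocation
            and report.clinically_important_outcomes
            and report.followup_fraction >= 0.80
        )

    # A randomised trial that lost 25% of its participants to follow-up fails
    # the filter, even though it may still contain useful information.
    print(passes_treatment_filter(TrialReport(True, True, 0.75)))  # False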
That does not mean that every paper which fails to pass these filters is terrible or useless.
A paper may still be informative or useful, but it may be harder to parse, its results may be harder to compare with other results, or the subject being studied may make it impossible to meet the criteria above.
For example, a paper which is simply a case study of an individual patient with a rare form of brain damage may be quite informative to neurologists, but there's no way you could ethically randomly inflict brain damage on a cohort of people.
These filters simply allow us to find the papers which are the most informative and the most easily used in combination with other data. They might be described as the most systematically useful, but they're not the only useful ones.