I was waiting for somebody with a much better understanding of linguistic theory to take a stab at this. But no one has come forward and, in the meantime, many people have started questioning the pertinence of the question itself, so let me give it a try:
First off (contrary to what many of the comments above have suggested): dialectal accents are extremely well defined in linguistics. There is such a thing as an "objective" classification of regional English accents, based on well-defined phonetic criteria.
Two of the most common (and strongest) differences between English accents are:
Pronunciation of rhotic consonants, i.e. whether /r/ is sounded after a vowel. E.g.: words like 'father' and 'farther' will sound more or less alike depending on the speaker's regional accent.
Pronunciation of diphthongs. E.g.: 'low' vs 'loud' vs 'lout' etc.
Many research papers mention a "neutral regional accent", defined mainly with regard to these two characteristics (moderately rhotic consonants, vowels that tend toward pure vowels, and diphthongs that tend to get monophthongized).
At the same time, the standard Western view on good singing diction encourages pure vowels and clearly enunciated consonants, removing many of the degrees of freedom differentiating between regional accents.
Thus, there is an objective, phonetics-based rationale for singing accents' tendency to converge toward a "neutral" accent (perhaps misidentified as "American", due to some of the neutral features of Western and Northeastern American dialects, compared to the strongly non-rhotic UK and Australian dialects).
While I would expect there to be some scientific literature detailing the topic, this is not my field and all I was able to find through a cursory search on Google Scholar was this musicology article:
"Vocal Diction" -- In a Nutshell, by T. Campbell Young. London, 1932
Ancient as it may be, its musicological and phonetic content seems likely to hold up by modern scientific standards.
Its lengthy technical description of diction standards of sung English is introduced by the following general remark:
"It is equally true to say that language, in song, has been standardized to such an extent that it has become universal and homogeneous. It follows naturally that when words and music are allied, the former must be pronounced in such a way as to conform with the accepted principles of good singing."
This, along with the above notes on the phonetics of regional accents, will hopefully do as a placeholder answer until somebody with much deeper knowledge of linguistics than I have cares to step in and give some stronger references.