-1

I'm reading a list of chemical symbols. Since there are 118 elements, a select case construct would have 119 cases. Is there a better way to do this? Some elements begin with the same letter, such as C, Ca, Cd, Co, so perhaps reading three A1 variables instead of one A3 variable could ease the process somewhat.

program case_test
implicit none

character(len=3)   :: input
integer            :: i
real               :: mass

write(*,*) "Give me a symbol"

read(*,"(A3)") input

select case (input)
  case("H")
    mass = 1.008
  case("He")
    mass = 4.003
! 116 other checks
  case default
    write(*,*) "Unknown element ", input
    stop
end select

write(*,*) "atom mass = ", mass
end program case_test
Kurzd
  • 99
  • 1
  • 4
    I don't understand what your metric for complexity is here. Lines of code? Readability? Ultimately, you have to specify 118 different parameters for 118 different 2-letter strings. – Ross Jul 20 '17 at 22:15
  • Readability is one concern, as is picking a good way to do this. Just a `case` construct is easy to think of but perhaps there's a faster, more readable way I wouldn't be able to come up with on my own. – Kurzd Aug 03 '17 at 04:15

2 Answers2

2

In general I like SELECTED CASE, but this seems more straight forward...

MODULE Element_Definintions
IMPLICIT NONE

PUBLIC

INTEGER, PARAMETER, PUBLIC :: Max_Elements = 118

TYPE Elements_Type
  character(len=3)   :: Name
  integer            :: i
  real               :: mass
END TYPE Elements_Type

TYPE(Elements_Type), DIMENSION(Max_Elements), PUBLIC :: Elements

CONTAINS

SUBROUTINE Init_Elements
IMPLICIT NONE

Element(1).Name = "H"
Element(1).Num  = 1
Element(1).Mass = 1.008

Element(2).Name = "He"
Element(2).Num  = 2
Element(2).Mass = 4.008

!...
Element(118).Name = ""

RETURN
END SUBROUTINE Init_Elements

END MODULE Element_Definintions

Then the program...

program case_test
USE Element_Definintions
implicit none
character(len=3) :: input
LOGICAL          :: Found = .FALSE.

CALL Init_Elements()

write(*,*) "Give me a symbol"

read(*,"(A3)") input

DO I = 1, Max_Elements
  IF(Input(1:LEN_TRIM(Input)) == Element(I).Name(1:LEN_TRIM(Element(I).Name)) THEN
    FOUND= .TRUE.
    EXIT
  ELSE
    CYCLE
  ENDIF
ENDDO

IF(Found) THEN
  write(*,*) 'atom mass of "',Element(I).Name,'" = ', mass
ELSE
  write(*,*) 'Unknown element "', input,'"'
ENDIF

END program case_test
Holmz
  • 714
  • 7
  • 14
  • Depending on how often lookups are done, you might consider sorting the names alphabetically, then using a binary search to find the matching record. Otherwise, a hash table relating name to index may also give you much better search performance. No point in doing either until you profile the application to see if performance is an issue. Still, there's room for optimization if needed. – arclight Aug 04 '17 at 04:58
  • `Element(1).Name` should be `Element(1)%Name`, etc ? – jiadong Mar 25 '21 at 01:59
  • Yeah the % and "." Are interchangable in some compilers, so it is easy to move away from the standard. – Holmz Mar 25 '21 at 21:53
2

There is a small, finite set of element symbols (I believe 118 at last count) which can fit in a not-too-long text string. Fortran isn't the best language for text processing (I'll let Perl and SNOBOL fight over that...) but modern Fortran has improved matters a bit.

There are some assumptions based in the following code. First, we expect the user to type the element symbol in using mixed case. You could correct case but for this example I decided to just throw errors if the first character in user input wasn't upper case and the second character wasn't a space or lower case. This is done using the verify intrinsic which was added to the language in Fortran 95 (this is literally the first time I've used it).

The index() intrinsic will give you the starting location of the first match of a substring inside a larger string counting from 1 or will return 0 if no match is found.

The elblob string contains every element symbol separated by an underscore. Single letter elements retain a trailing space to match a character(len=2) variable. Two asterisks are stuck at the front of elblob so that every element starts on a character divisible by three. This is a dumb bit of magic which takes advantage of what we know about atomic numbers - they are unique, sequential, integers which completely fill the range from 1 to 118 (or whatever the top element is these days).

Another sneaky bit that probably solves a non-problem is the use of adjustl() to ensure the first character in elseek is not a space. It probably can't be just because of the way Fortran's read() works but I'm paranoid so I put it in there. The worst it'll do is burn a few cycles not doing anything. Take it out and see what happens.

By sanity-checking the user input to keep out stray '_'s and '*'s, we can be assured that element symbols will correctly match and that we can get the real atomic number by dividing the match position returned by index() by three. Carbon won't accidentally match calcium because carbon's search string is 'C ', not 'C', an effect of Fortran's fixed-length strings. If elseek was defined as character(len=:), allocatable, we might have problems but by using dumb old fixed-length space-padded strings we can use their dumb old behavior to our advantage.

!> Return an element's atomic number based on its symbol.
program elements
    use iso_fortran_env, only: input_unit, output_unit
    implicit none

    character(len=*), parameter :: alpha_u = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    character(len=*), parameter :: alpha_l = 'abcdefghijklmnopqrstuvwxyz'

    character(len=*), parameter :: elblob =                             &
    '**H _He_Li_Be_B _C _N _O _F _Ne_Na_Mg_Al_Si_P _S _Cl_Ar_K _Ca_Sc_' &
 // 'Ti_V _Cr_Mn_Fe_Co_Ni_Cu_Zn_Ga_Ge_As_Se_Br_Kr_Rb_Sr_Y _Zr_Nb_Mo_'   &
 // 'Tc_Ru_Rh_Pd_Ag_Cd_In_Sn_Sb_Te_I _Xe_Cs_Ba_La_Ce_Pr_Nd_Pm_Sm_Eu_'   &
 // 'Gd_Tb_Dy_Ho_Er_Tm_Yb_Lu_Hf_Ta_W _Re_Os_Ir_Pt_Au_Hg_Tl_Pb_Bi_Po_'   &
 // 'At_Rn_Fr_Ra_Ac_Th_Pa_U _Np_Pu_Am_Cm_Bk_Cf_Es_Fm_Md_No_Lr_Rf_Db_'   &
 // 'Sg_Bh_Hs_Mt_Ds_Rg_Cn_Nh_Fl_Mc_Lv_Ts_Og'

    character(len=2) :: elseek
    character(len=1) :: c
    integer :: atomic_number

404 format("Sorry, I couldn't find ", '"', A, '"')
200 format("The element ", A, " has an atomic number of ", I0)
500 format('"', A, '" must be ', A, ' case letter')

    continue

    write(output_unit, '(A)') "Give me an element's symbol "            &
        // "(like H or Na)"

    read(input_unit, '(A2)') elseek

    ! Left-justify; eliminates any leading space
    ! (Q: is leading space even possible?)
    elseek = adjustl(elseek)

    c = elseek(1:1)
    if (verify(c, alpha_u) > 0) then
        write(output_unit, 500) c, 'an upper'
        stop(1)
    end if

    c = elseek(2:2)
    if (verify(c, alpha_l // ' ') > 0) then
        write(output_unit, 500) c, 'a lower'
        stop(2)
    end if

    atomic_number = index(elblob, elseek)
    if (atomic_number < 1) then
        write(output_unit, 404) elseek
    else
        atomic_number = atomic_number / 3
        write(output_unit, 200) elseek, atomic_number
    end if

end program elements

Anyway, I tested this for all of 30 seconds. Don't use it for anything safety-critical. If this is for homework, at least read and understand the code and rewrite it as your own so you don't get flagged by Turnitin.

This is not the most robust solution but it is short and simple and meets the requirements as written. It doesn't require a hash table or a search tree or any data structure more complicated than a string. It wouldn't take much to dumb it down to work under FORTRAN77, but that way lies madness...

arclight
  • 1,608
  • 1
  • 15
  • 19