I wrote that example in the docs, so this is partly my fault; I'll try to answer the question.
The example doesn't illustrate the difference very well because it's using shapes and not text.
Changing the example to use text and shapes shows that there are some differences between these functions:
#lang racket
(require pict)

;; 1. hb-append: align the bottoms of the bounding boxes
(inset
 (cbl-superimpose
  (hb-append 10
             (frame (text "g" "Helvetica" 30))
             (rectangle 10 10 #:border-width 2))
  (hline 200 2))
 10)
(blank 1 30)
;; 2. hbl-append: align on the baseline of the bottom line of text
(inset
 (cbl-superimpose
  (hbl-append 10
              (frame (text "g" "Helvetica" 30))
              (rectangle 10 10 #:border-width 2))
  (hline 200 2))
 10)
(blank 1 30)
;; 3. ht-append: align the tops of the bounding boxes
(inset
 (ctl-superimpose
  (ht-append 10
             (frame (text "i" "Helvetica" 30))
             (rectangle 10 10 #:border-width 2))
  (hline 200 2))
 10)
(blank 1 30)
;; 4. htl-append: align on the baseline of the top line of text
(inset
 (ctl-superimpose
  (htl-append 10
              (frame (text "i" "Helvetica" 30))
              (rectangle 10 10 #:border-width 2))
  (hline 200 2))
 10)
If you run this example you'll get four pictures, one per case. Depending on the letters, the alignment differs because of ascenders and descenders: "g" has a descender that drops below the baseline, while "i" does not. It would probably be more useful for the docs to show an example like this one, with text.
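To see where the difference comes from, it can help to print the metrics that the baseline-aware combiners look at. This is only a small sketch: the helper name metrics is made up for illustration, and the exact numbers depend on which font your system substitutes for "Helvetica".

```racket
#lang racket
(require pict)

;; Hypothetical helper (not part of pict): collect a pict's height,
;; ascent and descent so they can be compared side by side.
(define (metrics p)
  (list (pict-height p) (pict-ascent p) (pict-descent p)))

(define g    (text "g" "Helvetica" 30))
(define rect (rectangle 10 10 #:border-width 2))

;; Text carries font metrics, so its baseline normally sits above
;; the bottom of its bounding box.
(metrics g)
;; A plain shape carries no font metrics of its own.
(metrics rect)
```

Comparing the two printed lists should make it clearer why the b and bl (or t and tl) variants can place the same rectangle at different heights next to text.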
If you want to mix pictures and text, it often makes sense to use the baseline variants (the ones with an l in the name, such as hbl-append) to avoid an odd look where the picture sticks out below the text:
#lang racket
(require pict)

(hb-append 10
           (text "hug" "Helvetica" 30)
           (rectangle 20 20 #:border-width 2)
           (text "hug" "Helvetica" 30))
(hbl-append 10
            (text "hug" "Helvetica" 30)
            (rectangle 20 20 #:border-width 2)
            (text "hug" "Helvetica" 30))
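If you would rather check the placement numerically than by eye, here is a rough sketch that uses lb-find to ask where the rectangle's bottom edge ends up in each combined pict. The names word, rect and rect-bottom are made up for this illustration, and I am assuming the rectangle reports a descent of 0, which I believe is the default for plain shapes.

```racket
#lang racket
(require pict)

(define word (text "hug" "Helvetica" 30))
(define rect (rectangle 20 20 #:border-width 2))

;; Hypothetical helper: y coordinate of the rectangle's bottom edge
;; inside a combined pict that contains this exact rect.
(define (rect-bottom combined)
  (define-values (x y) (lb-find combined rect))
  y)

(define bottom-aligned   (hb-append 10 word rect))
(define baseline-aligned (hbl-append 10 word rect))

;; Print both positions, plus the text's descent for comparison.
(rect-bottom bottom-aligned)
(rect-bottom baseline-aligned)
(pict-descent word)
```

Under that assumption, the two positions should differ by roughly the text's descent, which is the amount the rectangle "sticks out" below the baseline in the hb-append version.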