Generally, vertex heights are generated procedurally in shaders.
In computer graphics, "procedurally" means computed by some mathematical algorithm rather than stored as data. Perlin noise is one of the methods used for this kind of procedural generation. There are several strategies. A common one is to keep the height map small and produce the extra height variation procedurally. This is done because the height map is a texture, and sampling textures uses memory bandwidth.
Tessellation shaders are used alongside this for adaptive tessellation. You can think of it as a kind of level-of-detail mechanism. The smoothness of the terrain depends on how many triangles are used to represent each patch. Depending on the distance of a patch from the camera, developers can decide the tessellation level on the fly and generate more triangles for patches close to the viewer. This is a way to improve the detail of the terrain. Everything here happens on the GPU, so it is very efficient.
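The distance-based decision can be sketched as a small function. This is only an illustration written in Python; in practice the same arithmetic would usually run per patch in the tessellation control (hull) shader. The function name, the `near` and `far` distances, and the linear falloff are all assumptions. The default maximum of 64 matches the minimum value that OpenGL 4 implementations must support for `GL_MAX_TESS_GEN_LEVEL`, but your hardware limit may differ.

```python
def tessellation_level(distance: float,
                       near: float = 10.0,
                       far: float = 500.0,
                       min_level: float = 1.0,
                       max_level: float = 64.0) -> float:
    """Pick a tessellation level for a patch from its distance to the camera.

    Patches at or closer than `near` get `max_level`, patches at or beyond
    `far` get `min_level`, and levels fall off linearly in between.
    All default values are illustrative assumptions.
    """
    if far <= near:
        raise ValueError("far must be greater than near")
    # Normalize the distance to [0, 1] and clamp it.
    t = (distance - near) / (far - near)
    t = min(max(t, 0.0), 1.0)
    # Linear interpolation from max_level (close) to min_level (distant).
    return max_level + (min_level - max_level) * t


if __name__ == "__main__":
    for d in (0.0, 10.0, 255.0, 500.0, 1000.0):
        print(d, tessellation_level(d))
```

A linear falloff is the simplest choice. Other mappings, for example ones based on the projected screen-space size of a patch edge, are also used and can give a more even triangle density on screen.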
Before tessellation shaders were available, there were algorithms like ROAM (Real-time Optimally Adapting Meshes) that did adaptive tessellation on the CPU.
Please take a look at the Virtual Terrain Project at http://vterrain.org/. You will see state-of-the-art terrain techniques there.