The inclusion of the laplacian is needed to ensure scale invariance.
While the scale adapted Harris detector did have very good repeatability (in terms of location of the detection), the scale selection remained a problem. They noticed that
during our experiments we noticed that the adapted Harris function
rarely attains maxima in 3D space. Therefore, we propose to use a
different function, the Laplacian, for scale maxima detection.
(From Indexing based on scale invariant interest points)
This is explained in better detail in his thesis:
In our experiments (cf. section 3.2.4) we noticed that the scale
adapted Harris function rarely attains maxima over scales in a
scale-space representation. If too few interest points are detected,
the image is not reliably represented. Therefore, we abandoned the
idea of searching 3D maxima of the Harris function. Furthermore, the
experiments showed that LoG function enables the highest percentage of
correct characteristic scales to be found. Therefore, we propose to
use the Laplacian to select the scales for points extracted with the
Harris detector. Harris-Laplace detector uses the Harris function (cf.
equation 4.1) to localize points in each level of the scale-space
representation. Next, it selects the points, for which the
Laplacian-of-Gaussian (cf. equation 4.2) attains a maximum over
scale. In this way we combine these two methods to obtain a reliable
interest point detector invariant to signicant scale changes.
I don't have any intuitive explanation for why the Harris function doesn't give many maxima over scales, but empirically, they found this to be the case. Looks like nothing's stopping you from using the Harris scale space maximum, but you'd probably just get many fewer detections.