The rpart author allowed me to use his answer, which I paste below:
library(rpart)
library(titanic)   # provides the titanic_train data set

train <- titanic_train
names(train) <- tolower(names(train))  # I'm lazy
train$pclass <- factor(train$pclass)

fit1 <- rpart(survived ~ pclass + sex, data = train)                   # default: method="anova"
fit2 <- rpart(survived ~ pclass + sex, data = train, method = "class")
fit1
n= 891

node), split, n, deviance, yval
      * denotes terminal node

1) root 891 210.727300 0.3838384
  2) sex=male 577 88.409010 0.1889081
    4) pclass=2,3 455 54.997800 0.1406593 *
    5) pclass=1 122 28.401640 0.3688525 *
  3) sex=female 314 60.105100 0.7420382
    6) pclass=3 144 36.000000 0.5000000 *
    7) pclass=1,2 170 8.523529 0.9470588 *
fit2
n= 891

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 891 342 0 (0.6161616 0.3838384)
  2) sex=male 577 109 0 (0.8110919 0.1889081) *
  3) sex=female 314 81 1 (0.2579618 0.7420382) *
The issue: when you choose "classification" as the method, either explicitly as I did above or implicitly by making the outcome a factor, you have declared that the loss function is a simple correct/incorrect count for alive/dead. For males, the survival rate is 0.189, which is < 0.5, so they are classified as 0. The next split below gives rates of 0.14 and 0.37, both of which are < 0.5, so both child nodes are also classified as 0. By the criterion you chose, the second split does not improve the model: with or without it, every male is predicted to be a "0", so there is no need for the split.
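The arithmetic can be checked from the printouts above. The survivor counts below are derived by multiplying each node's n by its yval in fit1 (109 also appears as the "loss" of the male node in fit2); they are copied here as literals:

```r
# Survivor counts, derived from n * yval in fit1's printout
# (109 is also printed as the "loss" of the male node in fit2).
males_survived <- 109   # node 2) sex=male, n=577
p23_survived   <- 64    # node 4) pclass=2,3, n=455
p1_survived    <- 45    # node 5) pclass=1,   n=122

# Every one of these nodes predicts 0 (all rates < 0.5), so the
# misclassification loss of a node is simply its survivor count.
loss_without_split <- males_survived
loss_with_split    <- p23_survived + p1_survived
loss_without_split == loss_with_split   # TRUE: the split saves nothing
```

The split partitions the 109 surviving males between its two children, but since both children still predict 0, the total number of errors is unchanged.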
The same holds for the females: the overall rate and both subclass rates are >= 0.5, so the second split does not improve prediction, again according to the criterion you selected.
When I leave the response continuous, the criterion is instead the mean squared error, and the further splits do count as an improvement.
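Under the anova method, the deviance column in fit1's printout is the sum of squared residuals, so we can verify directly from those numbers that each further split lowers it:

```r
# Deviance values copied from fit1's printout above.
dev_male        <- 88.409010               # node 2) sex=male
dev_male_kids   <- 54.997800 + 28.401640   # nodes 4) and 5)
dev_female      <- 60.105100               # node 3) sex=female
dev_female_kids <- 36.000000 + 8.523529    # nodes 6) and 7)

dev_male_kids   < dev_male     # TRUE: the pclass split reduces SSE for males
dev_female_kids < dev_female   # TRUE: and for females
```

That reduction in sum of squares is exactly what the anova splitting criterion rewards, which is why fit1 keeps the pclass splits that fit2 discards.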