This is not always necessary, but the idea is that if the categorical attribute covers the whole space (i.e. your dummy variables represent every possible value of the attribute), then the last dummy variable can be perfectly predicted from the other N-1 dummies:
last_dummy = 1 if sum(dummies[:N-1]) == 0 else 0
This introduces perfect collinearity among your dummy variables (a very undesirable thing in linear/logistic regression), which is why it is called the dummy variable trap.
Usually, the way to fix this problem is simply to remove one dummy column (any will do; it does not have to be the last one). This removes the source of collinearity and, since the dropped dummy could be predicted from the rest anyway, there is no loss of information from the original dataset.
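For example, with pandas you can drop one dummy directly at encoding time via drop_first=True (a minimal sketch; the column name "color" and its values are made up for illustration):

    import pandas as pd

    # Hypothetical categorical column with three possible values
    df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

    # drop_first=True drops one dummy column (here 'color_blue'),
    # avoiding the dummy variable trap with no loss of information
    dummies = pd.get_dummies(df["color"], prefix="color", drop_first=True)
    print(dummies)

scikit-learn's OneHotEncoder offers the same behavior through its drop='first' option, which is convenient when the encoding needs to live inside a modeling pipeline.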