Regarding this part of the question:
"And why do they define its formula like that?"
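Explanation: The functional margin does not give the actual distance from a point to the separating plane/line, because its value changes whenever w and b are rescaled, even though the plane/line itself stays the same.
For instance, consider the following equations. They all describe the same line, but the functional margin of any given point differs between them (a limitation of the functional margin):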
2*x + 3*y + 1 = 0
4*x + 6*y + 2 = 0
20*x + 30*y + 10 = 0
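So the functional margin only gives an idea of how confident our classification is; it is not a concrete distance. To get an actual distance, it has to be normalized by the norm of w, which is what the geometric margin does.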
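To make the scaling issue concrete, here is a minimal Python sketch. The test point (1, 1), its label +1, and the helper names `functional_margin` and `geometric_margin` are my own assumptions for illustration; they are not taken from the lecture notes. It evaluates the three equations above at that point and compares the functional margin with the geometric margin (the functional margin divided by the norm of w).

```python
import math

def functional_margin(w, b, x, y):
    # y * (w . x + b) for a 2-D point x with label y in {+1, -1}
    return y * (w[0] * x[0] + w[1] * x[1] + b)

def geometric_margin(w, b, x, y):
    # Functional margin normalized by ||w||, i.e. a signed distance to the line
    return functional_margin(w, b, x, y) / math.hypot(w[0], w[1])

# The three equations from above, written as (w, b) pairs
lines = [((2, 3), 1), ((4, 6), 2), ((20, 30), 10)]

# Hypothetical example point and label, chosen only for illustration
point, label = (1.0, 1.0), 1

for w, b in lines:
    fm = functional_margin(w, b, point, label)
    gm = geometric_margin(w, b, point, label)
    print(f"w={w}, b={b}: functional={fm}, geometric={gm:.4f}")
```

The functional margin comes out as 6, 12 and 60 for the same point and the same line, while the geometric margin is the same in all three cases.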
For more details, see the following passage from Andrew Ng's lecture notes:
If y^(i) = 1, then for the functional margin to be large (i.e., for our prediction to be confident and correct), we need w^T x + b to be a large positive number. Conversely, if y^(i) = −1, then for the functional margin to be large, we need w^T x + b to be a large negative number. Moreover, if y^(i)(w^T x + b) > 0, then our prediction on this example is correct. (Check this yourself.) Hence, a large functional margin represents a confident and a correct prediction.
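The "check this yourself" step can be tried numerically. The sketch below is only an illustration under my own assumptions: the line 2*x + 3*y + 1 = 0 as the classifier, three hand-picked labeled points, and a hypothetical `predict` helper that returns the sign of w^T x + b. It shows that the functional margin is positive exactly when the prediction matches the label.

```python
def functional_margin(w, b, x, y):
    # y * (w . x + b) for a 2-D point x with label y in {+1, -1}
    return y * (w[0] * x[0] + w[1] * x[1] + b)

def predict(w, b, x):
    # Hypothetical classifier: sign of w . x + b
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

w, b = (2.0, 3.0), 1.0

# (point, true label) pairs chosen for illustration; the last one is mislabeled
# relative to the classifier, so its margin should be negative
examples = [((1.0, 1.0), 1), ((-1.0, -1.0), -1), ((-1.0, -1.0), 1)]

for x, y in examples:
    margin = functional_margin(w, b, x, y)
    correct = predict(w, b, x) == y
    print(f"x={x}, y={y}: margin={margin}, correct={correct}")
```

The first two examples have margins 6 and 4 and are classified correctly; the third has margin -4 and is misclassified.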