The tag is the label you want to apply to the token, for instance O, PERSON, LOCATION, ORGANIZATION, or PROGRAMMING_LANGUAGE. O means the token is not part of an entity.
A feature is an aspect of the token stream you want the CRF Classifier to use in its decision.
Consider the sentence "I went to France last summer."
The tags, one per token (the final period counts as a token), would be [O O O LOCATION O O O].
For instance, a feature could be the word itself: "word=France".
A feature could be the two words immediately before the current word in the sequence: "word_n-2_n-1=went to".
Or a feature could be something like the shape of the word: "shape=Xxxxxx".
The point of the features is that the CRF Classifier can find patterns, for instance that words with particular shapes tend to be O, or that particular words tend to belong to particular classes.
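To make the three example features concrete, here is a minimal Python sketch that computes them for one token of the sentence above. It only illustrates the idea and is not the classifier's own feature code; the function names word_shape and token_features are made up for this example, and the shape rule (uppercase to X, lowercase to x, digit to d) is an assumed simplification.

```python
def word_shape(word):
    """Map each character to a coarse class (assumed rule: X, x, d, or itself)."""
    out = []
    for ch in word:
        if ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        elif ch.isdigit():
            out.append("d")
        else:
            out.append(ch)  # punctuation and other characters are kept as-is
    return "".join(out)


def token_features(tokens, i):
    """Return the example features for the token at position i (hypothetical helper)."""
    feats = ["word=" + tokens[i], "shape=" + word_shape(tokens[i])]
    # The previous-two-words feature only exists from the third token onward.
    if i >= 2:
        feats.append("word_n-2_n-1=" + tokens[i - 2] + " " + tokens[i - 1])
    return feats


tokens = ["I", "went", "to", "France", "last", "summer", "."]
tags = ["O", "O", "O", "LOCATION", "O", "O", "O"]

print(tags[3], token_features(tokens, 3))
# prints: LOCATION ['word=France', 'shape=Xxxxxx', 'word_n-2_n-1=went to']
```

During training the classifier sees feature lists like this next to the correct tag for every token, which is how it picks up regularities such as capitalized shapes often being entities.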
You do not need custom features if you simply want to add new categories such as PROGRAMMING_LANGUAGE or OPERATING_SYSTEM. You just need training data so the system can learn how to label tokens appropriately.
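As a rough illustration of what such training data can look like, the sketch below writes one token and its tag per line. The tab-separated two-column layout is an assumption (it is a common convention for CRF taggers, but check the format your toolkit expects), the sentence and its labels are invented for the example, and to_training_lines is a hypothetical helper.

```python
def to_training_lines(tokens, tags):
    """Pair each token with its tag as 'token<TAB>tag' (assumed file layout)."""
    if len(tokens) != len(tags):
        raise ValueError("every token needs exactly one tag")
    return [tok + "\t" + tag for tok, tag in zip(tokens, tags)]


# Invented example sentence using the new categories.
tokens = ["I", "write", "Python", "on", "Linux", "."]
tags = ["O", "O", "PROGRAMMING_LANGUAGE", "O", "OPERATING_SYSTEM", "O"]

for line in to_training_lines(tokens, tags):
    print(line)
```

The new categories appear only in the tag column; nothing about the features has to change for the classifier to learn them.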