It's not completely clear what you're asking, as you seem to have several concepts mixed up together. (Word2Vec gives vectors per word, not per character; word-embeddings are a kind of feature-extraction on words, rather than an alternative to 'feature extraction'; etc. So I suspect your current understanding isn't quite right yet.)
"Feature extraction" is a very general term, meaning any and all ways of taking your original data (such as a sentence) and creating a numerical representation that's good for other kinds of calculation or downstream machine-learning.
One simple way to turn a corpus of sentences into numerical data is to use a "one-hot" encoding of which words appear in each sentence. For example, if you have the two sentences...
['A', 'pen', 'will', 'need', 'ink']
['I', 'have', 'a', 'pen']
...then you have 7 unique case-flattened words...
['a', 'pen', 'will', 'need', 'ink', 'i', 'have']
...and you could "one-hot" the two sentences as a 1-or-0 for each word they contain, and thus get the 7-dimensional vectors:
[1, 1, 1, 1, 1, 0, 0] # A pen will need ink
[1, 1, 0, 0, 0, 1, 1] # I have a pen
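A minimal sketch of that encoding in plain Python (the vocabulary ordering here is arbitrary, just matching the list above):

vocab = ['a', 'pen', 'will', 'need', 'ink', 'i', 'have']

def one_hot(sentence):
    # 1 if the vocabulary word appears in the (case-flattened) sentence, else 0
    words = [w.lower() for w in sentence]
    return [1 if v in words else 0 for v in vocab]

print(one_hot(['A', 'pen', 'will', 'need', 'ink']))  # [1, 1, 1, 1, 1, 0, 0]
print(one_hot(['I', 'have', 'a', 'pen']))            # [1, 1, 0, 0, 0, 1, 1]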
Even with this simple encoding, you can now compare sentences mathematically: a Euclidean-distance or cosine-distance calculation between those two vectors will give you a single summary distance number. Sentences with no shared words will have a high 'distance', and those with many shared words will have a small 'distance'.
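For example, a minimal sketch of a cosine-distance calculation between the two vectors above, in plain Python:

import math

def cosine_distance(u, v):
    # cosine distance = 1 - (u . v) / (|u| * |v|)
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return 1 - dot / (norm_u * norm_v)

sent_1 = [1, 1, 1, 1, 1, 0, 0]  # A pen will need ink
sent_2 = [1, 1, 0, 0, 0, 1, 1]  # I have a pen

print(cosine_distance(sent_1, sent_2))  # ~0.553: the sentences share 2 words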
Other very similar feature-encodings of these sentences might use counts of each word (so a word that appeared more than once gets a number higher than 1), or weighted counts (where words get an extra significance factor by some measure, such as the common "TF-IDF" calculation, so values can range anywhere from 0.0 upward, including above 1.0).
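If you happen to be using scikit-learn, its CountVectorizer and TfidfVectorizer classes implement exactly these count and TF-IDF encodings. A minimal sketch (the token_pattern override just keeps 1-letter words like 'a' and 'i', which scikit-learn drops by default):

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

sentences = ['A pen will need ink', 'I have a pen']

# word-count features: one column per vocabulary word
counts = CountVectorizer(token_pattern=r'(?u)\b\w+\b').fit_transform(sentences)
print(counts.toarray())

# TF-IDF features: counts reweighted by how distinctive each word is
tfidf = TfidfVectorizer(token_pattern=r'(?u)\b\w+\b').fit_transform(sentences)
print(tfidf.toarray())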
Note that you can't encode a single sentence as a vector that's just as wide as its own words, such as turning "I have a pen" into a 4-dimensional [1, 1, 1, 1] vector. Such a vector wouldn't be comparable to any other sentence's vector. All the sentences need to be converted to vectors of the same dimensionality, and in "one-hot" (or other simple "bag of words") encodings, that dimensionality equals the total vocabulary known across all sentences.
Word2Vec is a way to turn individual words into "dense" embeddings, with fewer dimensions but many non-zero floating-point values in those dimensions. This is in contrast to sparse embeddings like the one-hot vectors above, which have many dimensions that are mostly zero. The 7-dimensional sparse embedding of 'pen' alone, from above, would be:
[0, 1, 0, 0, 0, 0, 0] # 'pen'
If you trained a 2-dimensional Word2Vec model, 'pen' might instead have a dense embedding like:
[0.236, -0.711] # 'pen'
All 7 words would then have their own 2-dimensional dense embeddings. For example (all values made up):
[-0.101, 0.271] # 'a'
[0.236, -0.711] # 'pen'
[0.302, 0.293] # 'will'
[0.672, -0.026] # 'need'
[-0.198, -0.203] # 'ink'
[0.734, -0.345] # 'i'
[0.288, -0.549] # 'have'
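If you want real values rather than made-up ones, a minimal sketch using the gensim library (one common Word2Vec implementation; the tiny dimensionality and corpus here are for illustration only):

from gensim.models import Word2Vec

sentences = [['a', 'pen', 'will', 'need', 'ink'],
             ['i', 'have', 'a', 'pen']]

# vector_size=2 only for illustration; real models typically use 100+ dimensions,
# and need far more than two sentences of training data to learn useful vectors
model = Word2Vec(sentences, vector_size=2, min_count=1, seed=1)
print(model.wv['pen'])  # a 2-dimensional dense vector of floats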
If you have Word2Vec vectors, then one simple alternative way to make a vector for a longer text, like a sentence, is to average together the word-vectors for all the words in the sentence. So, instead of a 7-dimensional sparse vector for the sentence, like:
[1, 1, 0, 0, 0, 1, 1] # I have a pen
...you'd get a single 2-dimensional dense vector like:
[ 0.28925, -0.3335 ] # I have a pen
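A minimal sketch of that averaging, using numpy and the made-up 2-dimensional vectors above:

import numpy as np

word_vecs = {  # the made-up dense embeddings from above
    'a': [-0.101, 0.271],
    'pen': [0.236, -0.711],
    'i': [0.734, -0.345],
    'have': [0.288, -0.549],
}

sentence = ['i', 'have', 'a', 'pen']
print(np.mean([word_vecs[w] for w in sentence], axis=0))  # [ 0.28925 -0.3335 ]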
And again, different sentences may be usefully compared to each other by the distance between these dense-embedding features. Or the features might work well as training data for a downstream machine-learning process.
So, this is a form of "feature extraction" that uses Word2Vec instead of simple word-counts. There are many other, more sophisticated ways to turn text into vectors; they could all count as kinds of "feature extraction".
Which works best for your needs will depend on your data and ultimate goals. Often the simplest techniques work best, especially once you have a lot of data. But there are few absolute certainties, and you often need to just try many alternatives, and test how well they do in some quantitative, repeatable scoring evaluation, to find which is best for your project.