I want to create a document-term matrix using base R (without additional packages such as tm). The data is structured as follows:
Doc1: the test was to test the test
Doc2: we did prepare the exam to test the exam
Doc3: was the test the exam
Doc4: the exam we did prepare was to test the test
Doc5: we were successful so we all passed the exam
What I want to get is the following:
    Term Doc1 Doc2 Doc3 Doc4 Doc5 DF
1    all    0    0    0    0    1  1
2    did    0    1    0    1    0  2
3   exam    0    2    1    1    1  4
4 passed    0    0    0    0    1  1
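
One possible base-R approach is sketched below. It assumes the documents are held in a named character vector, are already lower-case, and contain no punctuation, so splitting on whitespace is enough. The names `docs`, `tokens`, `terms`, `counts` and `dtm` are illustrative, not from the question, and DF is taken to mean the number of documents that contain the term.

```r
# The five documents from the question, as a named character vector.
docs <- c(
  Doc1 = "the test was to test the test",
  Doc2 = "we did prepare the exam to test the exam",
  Doc3 = "was the test the exam",
  Doc4 = "the exam we did prepare was to test the test",
  Doc5 = "we were successful so we all passed the exam"
)

# Split each document on whitespace (assumes lower-case text, no punctuation).
tokens <- strsplit(docs, "\\s+")

# Sorted vocabulary across all documents.
terms <- sort(unique(unlist(tokens)))

# Count every vocabulary term in each document. Using a factor with fixed
# levels makes absent terms show up as 0; sapply() binds the per-document
# counts into a terms-by-documents matrix.
counts <- sapply(tokens, function(tok) table(factor(tok, levels = terms)))

# Assemble the result. DF is assumed to be the document frequency,
# i.e. the number of documents in which the term occurs at least once.
dtm <- data.frame(Term = terms, counts, DF = rowSums(counts > 0),
                  row.names = NULL, stringsAsFactors = FALSE)

head(dtm, 4)
```

Real text would usually need `tolower()` and punctuation stripping (for example with `gsub()`) before the `strsplit()` step.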