I want to create a small search engine for tweets. I have a txt file with 20000 tweets. The file format looks like this:
TommyFrench1
851
85170333395811123
Lurgan, Moira, Armagh. Derry
This week we are double delight on first goalscorers on the four Champions League matches in shop. ChampionsLeague
Im_Aarkay
175
851703414300037122
Paris
@ChampionsLeague @AS_Monaco @AS_Monaco_EN Nopes, it's when City knocked outta Champions league. .
... etc.
The first line is the username, the second is the number of followers, the third is the tweet id, the fourth is the location, and the last is the text of the tweet.
I think that every tweet should be a document, so I should end up with 20000 documents, each with 5 fields (username, followers, id, location, text).
How can I do the indexing?
I have seen some tutorials, but I didn't find anything similar.
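To make the layout I have in mind concrete, here is a minimal sketch using only the standard library (no Lucene) of how the lines could be grouped into five-field records. The names `TweetParser`, `Tweet` and `parse` are hypothetical, and the sketch assumes that every tweet occupies exactly five consecutive lines with no blank lines in between, which may not hold for the real file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: groups the lines of the tweet file into five-field records.
final class TweetParser {

    // One tweet, i.e. one future document with five fields.
    static final class Tweet {
        final String username;
        final String followers;
        final String id;
        final String location;
        final String text;

        Tweet(String username, String followers, String id, String location, String text) {
            this.username = username;
            this.followers = followers;
            this.id = id;
            this.location = location;
            this.text = text;
        }
    }

    // Assumption: each tweet is exactly five consecutive lines, with no separator lines.
    static List<Tweet> parse(List<String> lines) {
        List<Tweet> tweets = new ArrayList<>();
        // Step through the input five lines at a time; an incomplete trailing record is skipped.
        for (int i = 0; i + 5 <= lines.size(); i += 5) {
            tweets.add(new Tweet(
                    lines.get(i),       // username
                    lines.get(i + 1),   // followers
                    lines.get(i + 2),   // id
                    lines.get(i + 3),   // location
                    lines.get(i + 4))); // text
        }
        return tweets;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "Im_Aarkay",
                "175",
                "851703414300037122",
                "Paris",
                "Nopes, it's when City knocked outta Champions league.");
        for (Tweet t : parse(sample)) {
            System.out.println(t.username + " | " + t.followers + " | " + t.id
                    + " | " + t.location + " | " + t.text);
        }
    }
}
```

Each `Tweet` produced this way would then correspond to one document in the index, with one field per member.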
EDIT: Here is my code.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Paths;
import java.text.ParseException;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
public class MyProgram {
    public static void main(String[] args) throws IOException, ParseException {
        FileReader fileReader = new FileReader(new File("myfile.txt"));
        BufferedReader br = new BufferedReader(fileReader);
        String line = null;
        String indexPath = "C:\\Desktop\\myfolder";
        Directory dir = FSDirectory.open(Paths.get(indexPath));
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        IndexWriter writer = new IndexWriter(dir, iwc);
        while ((line = br.readLine()) != null) {
            // reading lines until the end of the file
            Document doc = new Document();
            String username = br.readLine();
            doc.add(new Field("username", username, Field.Store.YES, Field.Index.ANALYZED)); // adding title field
            String followers = br.readLine();
            doc.add(new Field("followers", followers, Field.Store.YES, Field.Index.ANALYZED));
            String id = br.readLine();
            doc.add(new Field("id", id, Field.Store.YES, Field.Index.ANALYZED));
            String location = br.readLine();
            doc.add(new Field("location", location, Field.Store.YES, Field.Index.ANALYZED));
            String text = br.readLine();
            doc.add(new Field("text", text, Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(doc); // writing new document to the index
            br.readLine();
        }
    }
}
I'm getting the following error:
Index cannot be resolved or is not a field
How can I fix this?