I am running into an issue while indexing my data in a batch job.
I want to index a list of Article objects, with @IndexedEmbedded on the members I need to pull information from. Article gets additional information from two other beans: Page and Articlefulltext.
The batch updates the database correctly and adds new Document objects to my Lucene index through the Hibernate Search annotations. However, the added documents have incomplete fields. It seems that Hibernate Search doesn't see all the annotations.
When I look at the resulting Lucene index with Luke, I see fields for both the Article and Page objects, but none for Articlefulltext. The data in my database is correct, which means the persist() operation itself works.
I really need some help here, because I don't see what the difference is between my Page and Articlefulltext mappings.
The weird thing is that if I use a MassIndexer, it correctly adds the Article + Page + Articlefulltext data to the Lucene index. But I don't want to rebuild an index of millions of documents each time I make a big update.
I set the log4j logging level to debug for Hibernate Search and Lucene, but the logs don't give me much information.
Here is the code for my beans and for the batch.
Thanks in advance for your help,
Article.java :
@Entity
@Table(name = "article", catalog = "test")
@Indexed(index = "articleText")
@Analyzer(impl = FrenchAnalyzer.class)
public class Article implements java.io.Serializable {

    @Id
    @GeneratedValue(strategy = IDENTITY)
    @Column(name = "id", unique = true, nullable = false)
    @DocumentId
    private Integer id;

    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "firstpageid", nullable = false)
    @IndexedEmbedded
    private Page page;

    @Column(name = "heading", length = 300)
    @Field(name = "title", index = Index.YES, store = Store.YES)
    @Boost(2.5f)
    private String heading;

    @Column(name = "subheading", length = 300)
    private String subheading;

    @OneToOne(fetch = FetchType.LAZY, mappedBy = "article")
    @IndexedEmbedded
    private Articlefulltext articlefulltext;

    [... bean methods etc ...]
Page.java
@Entity
@Table(name = "page", catalog = "test")
public class Page implements java.io.Serializable {

    private Integer id;

    @IndexedEmbedded
    private Issue issue;

    @ContainedIn
    private Set<Article> articles = new HashSet<Article>(0);

    [... bean method ...]
Articlefulltext.java
@Entity
@Table(name = "articlefulltext", catalog = "test")
@Analyzer(impl = FrenchAnalyzer.class)
public class Articlefulltext implements java.io.Serializable {

    @GenericGenerator(name = "generator", strategy = "foreign", parameters = @Parameter(name = "property", value = "article"))
    @Id
    @GeneratedValue(generator = "generator")
    @Column(name = "aid", unique = true, nullable = false)
    private int aid;

    @OneToOne(fetch = FetchType.LAZY)
    @PrimaryKeyJoinColumn
    @ContainedIn
    private Article article;

    // This field is not added to the resulting Document! I put a log statement in
    // FulltextSplitBridge, and it is never called during the batch process. If I use a
    // MassIndexer instead, FulltextSplitBridge is called for each Articlefulltext.
    @Column(name = "fulltextcontents", nullable = false)
    @Field(store = Store.YES, index = Index.YES, analyzer = @Analyzer(impl = FrenchAnalyzer.class), bridge = @FieldBridge(impl = FulltextSplitBridge.class))
    private String fulltextcontents;

    [... bean method ...]
And here is the code used to update both the database and the Lucene index.
Batch source code :
FullTextEntityManager em = null;

@Override
protected void executeInternal(JobExecutionContext arg0) throws JobExecutionException {
    ApplicationContext ap = null;
    EntityManagerFactory emf = null;
    EntityTransaction tx = null;
    try {
        ap = (ApplicationContext) arg0.getScheduler().getContext().get("applicationContext");
        emf = (EntityManagerFactory) ap.getBean("entityManagerFactory", EntityManagerFactory.class);
        em = Search.getFullTextEntityManager(emf.createEntityManager());
        tx = em.getTransaction();
        tx.begin();
        // [... em.persist() some things which aren't Lucene related, so I skip them ....]
        for (File xmlFile : xmlList) {
            Reel reel = new Reel(title, reelpath);
            em.persist(reel);
            Article article = new Article();
            // [... set Article fields, so I skip them ....]
            Articlefulltext ft = new Articlefulltext();
            // [... set Articlefulltext fields, so I skip them ....]
            ft.setArticle(article);
            ft.setFulltextcontents(bufferBlock.toString());
            em.persist(ft); // I persist ft before article because of FK issues
            em.persist(article); // here the annotations update the Lucene index, but fulltextcontents is not indexed (see my first post)
            if (nbFileDone % 50 == 0) {
                // flush a batch of inserts and release memory:
                em.flush();
                em.clear();
            }
        }
        tx.commit();
    } catch (Exception e) {
        // tx is still null if the failure happened before the transaction was created
        if (tx != null && tx.isActive()) {
            tx.rollback();
        }
        // rethrow instead of swallowing, so a failed run is visible to the scheduler
        throw new JobExecutionException(e);
    } finally {
        // close the entity manager on both the success and the failure path
        if (em != null) {
            em.close();
        }
    }
}
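
A possible lead, not verified: the batch above only calls ft.setArticle(article). Assuming the skipped setters do not also set the Article side of the association, article.articlefulltext is still null in memory when the Article is persisted and indexed. The plain-Java sketch below contains no Hibernate at all; the class InverseSideSketch, the simplified beans and the fieldsSeenFrom helper are hypothetical stand-ins that only mimic an indexer walking from Article into an embedded association. It may also explain why a MassIndexer, which loads the entities back from the database, sees the full text while the batch does not.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java stand-in (no Hibernate): shows what is reachable from Article in memory. */
public class InverseSideSketch {

    // Hypothetical, simplified versions of the two beans.
    static class Article {
        String heading;
        Fulltext articlefulltext; // inverse side (mappedBy = "article" in the real bean)
    }

    static class Fulltext {
        Article article;          // owning side
        String fulltextcontents;
    }

    // Mimics an indexer that starts at Article and follows the embedded association.
    static List<String> fieldsSeenFrom(Article article) {
        List<String> fields = new ArrayList<>();
        fields.add("title");
        if (article.articlefulltext != null) {
            fields.add("articlefulltext.fulltextcontents");
        }
        return fields;
    }

    public static void main(String[] args) {
        Article article = new Article();
        Fulltext ft = new Fulltext();

        ft.article = article;                 // only the owning side is set, as in the batch
        System.out.println(fieldsSeenFrom(article));

        article.articlefulltext = ft;         // inverse side set as well
        System.out.println(fieldsSeenFrom(article));
    }
}
```

With only the owning side set, the walk from Article never reaches the full text; once both sides are set, it does. If this is what happens in the batch, setting the Article side as well (for example through a setter such as article.setArticlefulltext(ft), assuming the bean exposes one) before em.persist(article) would be the first thing to try.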