Below I describe how to implement a batch process in Alfresco.
Before going into details, I would also suggest integrating this process with Activiti workflows (or jBPM if you prefer).
As described later, the process sends events to notify listeners of the job's progress. A listener of these events can call Jenkins directly.
Instead of calling Jenkins directly, the listener can update a workflow. In this case the logic to call Jenkins is implemented in a workflow task. This makes it easier to separate the logic of the batch process from the logic of the notifier. Moreover, the workflow can also store information about the progress of the job, which any interested party can then poll.
Long running process:
I do not know which version of Alfresco you are using, so I will describe a solution for version 4.1.
Alfresco supports long-running batch processes mainly through the classes and interfaces in the package org.alfresco.repo.batch:
BatchProcessWorkProvider
BatchProcessor
BatchProcessor.BatchProcessWorker
BatchMonitor
BatchMonitorEvent
You will need to provide implementations of two interfaces: BatchProcessWorkProvider and BatchProcessor.BatchProcessWorker.
Both interfaces are reproduced below. The first returns the workloads and the second defines what a worker does.
BatchProcessWorkProvider:
/**
* An interface that provides work loads to the {@link BatchProcessor}.
*
* @author Derek Hulley
* @since 3.4
*/
public interface BatchProcessWorkProvider<T>
{
/**
* Get an estimate of the total number of objects that will be provided by this instance.
* Instances can provide accurate answers on each call, but only if the answer can be
* provided quickly and efficiently; usually it is enough to cache the result after
* providing an initial estimate.
*
* @return a total work size estimate
*/
int getTotalEstimatedWorkSize();
/**
* Get the next lot of work for the batch processor. Implementations should return
* the largest number of entries possible; the {@link BatchProcessor} will keep calling
* this method until it has enough work for the individual worker threads to process
* or until the work load is empty.
*
* @return the next set of work object to process or an empty collection
* if there is no more work remaining.
*/
Collection<T> getNextWork();
}
BatchProcessWorker:
/**
* An interface for workers to be invoked by the {@link BatchProcessor}.
*/
public interface BatchProcessWorker<T>
{
/**
* Gets an identifier for the given entry (for monitoring / logging purposes).
*
* @param entry
* the entry
* @return the identifier
*/
public String getIdentifier(T entry);
/**
* Callback to allow thread initialization before the work entries are
* {@link #process(Object) processed}. Typically, this will include authenticating
* as a valid user and disabling or enabling any system flags that might affect the
* entry processing.
*/
public void beforeProcess() throws Throwable;
/**
* Processes the given entry.
*
* @param entry
* the entry
* @throws Throwable
* on any error
*/
public void process(T entry) throws Throwable;
/**
* Callback to allow thread cleanup after the work entries have been
* {@link #process(Object) processed}.
* Typically, this will involve cleanup of authentication and resetting any
* system flags previously set.
* <p/>
* This call is made regardless of the outcome of the entry processing.
*/
public void afterProcess() throws Throwable;
}
In practice, BatchProcessWorkProvider returns a collection of "work to do" items (the type parameter T). The "work to do" is a class that you need to provide. In your case this class can carry the information needed to extract a subset of the files from the remote system. The process method will use this information to actually do the job. As an example, in your case we can use a class named ImportFiles as T.
Your BatchProcessWorkProvider should partition the list of files into a collection of ImportFiles of a reasonable size.
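The partitioning described above could look roughly like the sketch below. It is only an illustration under some assumptions: ImportFiles is modelled as a chunk of remote paths, the class name ImportFilesWorkProvider and the chunkSize parameter are made up, and the interface is redeclared locally as a stand-in so the code compiles without Alfresco (in a real module you would import org.alfresco.repo.batch.BatchProcessWorkProvider instead).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Stand-in with the same two methods as org.alfresco.repo.batch.BatchProcessWorkProvider,
// declared here only so the sketch compiles without Alfresco on the classpath.
interface BatchProcessWorkProvider<T> {
    int getTotalEstimatedWorkSize();
    Collection<T> getNextWork();
}

// Hypothetical "work to do" class: one chunk of remote file paths.
class ImportFiles {
    private final List<String> paths;

    ImportFiles(List<String> paths) {
        // Defensive copy, so the chunk does not depend on the caller's list.
        this.paths = Collections.unmodifiableList(new ArrayList<String>(paths));
    }

    List<String> getPaths() {
        return paths;
    }
}

// Splits a known list of remote paths into chunks of at most chunkSize paths.
class ImportFilesWorkProvider implements BatchProcessWorkProvider<ImportFiles> {
    private final List<String> allPaths;
    private final int chunkSize;
    private int nextIndex = 0;

    ImportFilesWorkProvider(List<String> allPaths, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.allPaths = new ArrayList<String>(allPaths);
        this.chunkSize = chunkSize;
    }

    // Total number of chunks, rounded up.
    public synchronized int getTotalEstimatedWorkSize() {
        return (allPaths.size() + chunkSize - 1) / chunkSize;
    }

    // Returns every remaining chunk on the first call and an empty list afterwards.
    public synchronized Collection<ImportFiles> getNextWork() {
        List<ImportFiles> work = new ArrayList<ImportFiles>();
        while (nextIndex < allPaths.size()) {
            int end = Math.min(nextIndex + chunkSize, allPaths.size());
            work.add(new ImportFiles(allPaths.subList(nextIndex, end)));
            nextIndex = end;
        }
        return work;
    }
}
```

If the list of remote files is too large to hold in memory, getNextWork could instead fetch one page of paths per call; the contract only requires that it eventually returns an empty collection.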
The most important method in BatchProcessWorker is
public void process(ImportFiles filesToImport) throws Throwable;
This is the method that you have to implement. For the other methods there is an adapter class, BatchProcessor.BatchProcessWorkerAdaptor, that provides default implementations.
The process method receives an ImportFiles as its parameter and can use it to find the files on the remote servers and import them.
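As a rough sketch of such a worker: the code below redeclares a minimal stand-in for the adapter class (no-op beforeProcess/afterProcess, toString-based getIdentifier) so that it compiles without Alfresco; in a real module you would extend the adapter nested in BatchProcessor instead. RemoteFileSource and RepositoryImporter are hypothetical interfaces invented for this example, as is the shape of ImportFiles.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in for the adapter nested in org.alfresco.repo.batch.BatchProcessor,
// declared here only so the sketch compiles without Alfresco.
abstract class BatchProcessWorkerAdaptor<T> {
    public String getIdentifier(T entry) { return entry.toString(); }
    public void beforeProcess() throws Throwable { }
    public abstract void process(T entry) throws Throwable;
    public void afterProcess() throws Throwable { }
}

// Hypothetical "work to do" class: one chunk of remote file paths.
class ImportFiles {
    private final List<String> paths;

    ImportFiles(List<String> paths) {
        this.paths = Collections.unmodifiableList(new ArrayList<String>(paths));
    }

    List<String> getPaths() { return paths; }
}

// Hypothetical gateway to the remote system.
interface RemoteFileSource {
    byte[] fetch(String path) throws Exception;
}

// Hypothetical sink that writes content into the repository.
interface RepositoryImporter {
    void store(String path, byte[] content) throws Exception;
}

// Worker that imports every file of one chunk.
class ImportFilesWorker extends BatchProcessWorkerAdaptor<ImportFiles> {
    private final RemoteFileSource source;
    private final RepositoryImporter importer;

    ImportFilesWorker(RemoteFileSource source, RepositoryImporter importer) {
        this.source = source;
        this.importer = importer;
    }

    // Short label used for monitoring and logging.
    @Override
    public String getIdentifier(ImportFiles entry) {
        return "ImportFiles(" + entry.getPaths().size() + " paths)";
    }

    // Fetches each file of the chunk and stores it; any exception propagates
    // to the caller, which decides whether to retry or record a failure.
    @Override
    public void process(ImportFiles filesToImport) throws Throwable {
        for (String path : filesToImport.getPaths()) {
            importer.store(path, source.fetch(path));
        }
    }
}
```

Since the batch processor runs process inside a retrying transaction, the method may be invoked more than once for the same entry, so it is safer to make the import idempotent.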
Finally, you need to instantiate a BatchProcessor:
try {
    final RetryingTransactionHelper retryingTransactionHelper = transactionService.getRetryingTransactionHelper();
    BatchProcessor<ImportFiles> batchProcessor = new BatchProcessor<ImportFiles>(processName,
            retryingTransactionHelper, workProvider, threads, batchSize,
            applicationEventPublisher, logger, loggingInterval);
    batchProcessor.process(worker, true);
}
catch (LockAcquisitionException e) {
    /* Manage exception */
}
Where
processName: a description of the long running process
workProvider: an instance of the BatchProcessWorkProvider
threads: the number of worker threads (in parallel)
batchSize: the number of entries to process in the same transaction
logger: the logger to use for reporting the progress
loggingInterval: the number of entries to process before reporting progress
retryingTransactionHelper: the helper class that retries the transaction if it fails because of a concurrent update (optimistic locking) or a deadlock condition.
applicationEventPublisher: an instance of the Spring ApplicationEventPublisher, which is usually (in Alfresco as well) the Spring ApplicationContext.
To notify Jenkins, you can publish events through the applicationEventPublisher, which is standard Spring functionality. The following link describes how to use it:
Spring events
An event can be sent, for example, by the method
process(ImportFiles filesToImport)
described above.
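To make the event flow concrete, here is a minimal sketch that uses only JDK types, so it runs without Spring. All names in it (ImportProgressEvent, ProgressListener, SimplePublisher, ProgressTracker) are hypothetical. With Spring, the event would extend org.springframework.context.ApplicationEvent (itself a java.util.EventObject), the listener would implement ApplicationListener, and SimplePublisher would be replaced by the applicationEventPublisher.

```java
import java.util.ArrayList;
import java.util.EventObject;
import java.util.List;

// Hypothetical progress event published after a chunk has been imported.
class ImportProgressEvent extends EventObject {
    private static final long serialVersionUID = 1L;
    private final int filesImported;

    ImportProgressEvent(Object source, int filesImported) {
        super(source);
        this.filesImported = filesImported;
    }

    int getFilesImported() { return filesImported; }
}

// Stand-in for a Spring ApplicationListener of ImportProgressEvent.
interface ProgressListener {
    void onApplicationEvent(ImportProgressEvent event);
}

// Stand-in for the Spring ApplicationEventPublisher: delivers each event
// synchronously to every registered listener.
class SimplePublisher {
    private final List<ProgressListener> listeners = new ArrayList<ProgressListener>();

    synchronized void addListener(ProgressListener listener) {
        listeners.add(listener);
    }

    synchronized void publishEvent(ImportProgressEvent event) {
        for (ProgressListener listener : listeners) {
            listener.onApplicationEvent(event);
        }
    }
}

// Listener that keeps a running total. A real one would call Jenkins or
// update the workflow here. Synchronized because workers run in parallel.
class ProgressTracker implements ProgressListener {
    private int total;

    public synchronized void onApplicationEvent(ImportProgressEvent event) {
        total += event.getFilesImported();
    }

    synchronized int getTotal() { return total; }
}
```

Inside process(ImportFiles filesToImport) the worker would then publish one event per imported chunk. Keep in mind that an entry retried by the transaction helper may publish its event more than once.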