6

I have a prototype storm app that reads a STOMP stream and stores the output on HBase. It works, but is not very flexible and I'm trying to get it set up in a more consistent way with the rest of our apps, but not having much luck figuring out how the current way of working with Storm. We use spring-jms classes, but instead of using them in the standard spring way, they are being created at run time, and setting dependencies manually.

This project: https://github.com/granthenke/storm-spring looked promising, but it hasn't been touched in a couple years and doesn't build properly since the storm jars have been taken into apache incubator and repackaged.

Is there something I'm missing, or is it not worth my while to get these things integrated?

liam
  • 3,830
  • 6
  • 32
  • 31
  • 2
    You mean you want to configure Storm with a Spring-based configuration for spouts / bolts and properties? – zenbeni Jul 07 '14 at 09:26
  • Exactly. We are using spring-jms and some other stuff in there, and the builder pattern that storm uses and the IOC pattern spring uses are a bit at odds. I'm fine with taking the hit that the extra XML will introduce if it keeps our codebase consistent with the rest of our java projects. – liam Jul 07 '14 at 20:52

4 Answers4

7

@zenbeni has answered this question but I wanna tell you about my implementation, it's hard to make spouts/bolts as spring beans. But to use other spring spring beans inside your spouts/bolts you could declare a global variable & in your execute method check whtether variable is null or not. If it's null you have to get bean from application context. Create a class which contains a method to initialize beans if it's not initialized already. Look ApplicationContextAware interface for more information(Spring bean reuse).

Example Code:

Bolt Class:

public class Class1 implements IRichBolt{
    Class2 class2Object;

    public void prepare() {
        if (class2Object== null) {          
            class2Object= (Class2) Util
                .initializeContext("class2");
        }
    }
}

Util Class for initializing Beans if not initialized already:

public class Util{
    public static Object initializeContext(String beanName) {
        Object bean = null;
        try {
            synchronized (Util.class) {
                if (ApplicationContextUtil.getAppContext() == null) {
                    ApplicationContext appContext = new ClassPathXmlApplicationContext("beans.xml");
                    bean = ApplicationContextUtil.getAppContext().getBean(
                        beanName);
                } else {
                    bean = ApplicationContextUtil.getAppContext().getBean(
                        beanName);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        return bean;
    }
}

Listener for Application Context Change:

@Component
public class ApplicationContextUtil implements ApplicationContextAware {

    private static ApplicationContext appContext;

    public void setApplicationContext(ApplicationContext applicationContext)
        throws BeansException {
        appContext = applicationContext;
    }

    public static ApplicationContext getAppContext() {
        return appContext;
    }
}

Note: Each Worker will initialize spring context because it's running in different JVM.

UPDATE

If you want to use a spring bean class in which you have some values previously assigned try this,

Note: Passing the current class to Bolt's constructor

Class (Topology Creation Class) which is already contains values:

public class StormTopologyClass implements ITopologyBuilder, Serializable {
    public Map<String, String> attributes = new HashMap<String, String>();

    TopologyBuilder builder=new TopologyBuilder();
    builder.setBolt("Class1",new Class1(this));
    builder.createTopology();
}

Bolt by using single argument constructor:

public class Class1 implements IRichBolt{
    StormTopologyClass topology;

    public Class1 (StormTopologyClass topology) {
        this.topology = topology;
    }
}

Now you could use attributes variable & it's values in bolt class.

Ajeesh
  • 1,572
  • 3
  • 19
  • 32
  • Great comment. This is what we are going through right now. – markthegrea Jan 19 '16 at 21:25
  • @markthegrea you can do initialization in prepare method because execute method will run each time a new tuple comes in. – Ajeesh Jan 20 '16 at 05:12
  • Yes, that is what we discovered, but we have come to the conclusion that Spring is just the incorrect paradigm for Bolts. These bolts can be moved to many jvms and such and the beauty of Spring are the singletons. So, why use Spring when you have to create 20 copies of an object. Also, you have to start spring up in each instance and that takes time. – markthegrea Jan 22 '16 at 15:08
  • @markthegrea That's right, for each worker spring beans will be initialized so sharing of data between workers are not possible, that's because each worker runs in different jvm. – Ajeesh Jan 29 '16 at 09:51
6

In fact, storm-spring seems to be what you are looking for but it is not updated and have limitations (cannot define tasks on bolts / spouts for instance, etc). Maybe you should roll your own integration?

Don't forget your target: a cluster with many workers. How does spring behave when you will deploy your topology with storm api (rebalance for instance) on one more worker? Does it mean it has to instanciate a new Spring context on the worker JVM at startup before Storm deploys the targeted bolts / spouts and defines the executors?

IMHO if you define only Storm components in a Spring configuration it should work (startup configuration for the topology then storm only manages the objects) but if you rely on Spring to manage other components (it seems so with spring-jms), then it could become messy on topology rebalances for instance (singleton per worker / jvm? Or the whole topology?).

It is up to you to decide if it is worth the trouble, my concern with a Spring configuration is that you easily forget the storm topology (it seems it is one JVM but can be many more). Personally I define my own singletons per class-loader (static final for instance or with double check locking if I need deferred instanciation), as it does not hide the (medium-high) complexity.

zenbeni
  • 7,019
  • 3
  • 29
  • 60
  • Good point. I'm accepting this one and I'll re-evaluate how we're putting the app together. – liam Jul 08 '14 at 16:35
  • @zenbeni I have some service layer classes which needs to be injected in storm topology. I could not create them normally because they in turn have dependencies which are to be autowired by spring. Now, what can I do? – JavaTechnical Feb 14 '15 at 11:07
1

I realize that this is very after the fact, but did you think of using Apache camel for the JMS connection handling? Camel isn't IOC or DI, but it does model enterprise integration patterns. Maybe that's what you are (were?) looking for?

Nick.

Nick
  • 11
  • 2
-1

Maybe this tutorial can help you.

http://spring.io/guides/gs/messaging-stomp-websocket/

Dani
  • 4,001
  • 7
  • 36
  • 60
  • That has nothing to do with the question I asked, which is integrating JMS with Storm for computation. – liam Jul 03 '14 at 21:35
  • Ok, I am sorry, understood wrong. See StompConnect. http://docs.codehaus.org/display/STOMP/StompConnect – Dani Jul 04 '14 at 05:18