0

I have been struggling with this problem for the past few days.

Requirement In our application, we are uploading a file through UI developed through nodejs and then the file records are processed via amazon simple work flow (SWF). The call to amazon SWF happens through a Spring App which the nodejs app will call upon processing the file. The requirement is that, for every file being processed, the application needs to create a log file, detailing what happened to the records as they were processed.

How I have implemented? In the spring application, which triggers the SWF, I created a FileLogger class which will maintain a static StringBuffer variable. This fileLogger class is set to workflow scope, meaning, the class will be created for every execution of the workflow and destroyed at the end of it. As the file is processed, I would keep appending the logs to the StringBuffer in the FileLogger class and at the end of the processing, would write to a file and save it.

Problem description This solution seemed to be working fine as long as we had only one instance of the application running. As soon as we deployed the application into multiple amazon ec-2 instances, it appears that the incomplete logs are saved in the file. Further looking into it revealed that, every instance of the application is having its own stringBuffer to maintain the log and when we are writing to the application only reads one of the stringbuffers contents and hence the incomplete logs. The log pattern, needless to say, is random. I observed we would have N instances of StringBuffer if we deploy N instances of the application.

Here is the FileLogger class

private static final Logger logger = LoggerFactory.getLogger(FileLogger.class);

    //private static final Logger logger = LoggerFactory.getLogger(FileLogger.class);   
    private static final SimpleDateFormat logFileDateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
    private static final SimpleDateFormat fileNameDateFormat = new SimpleDateFormat("yyyyMMddHHmmss");
    private static final String COLON = ":";
    private static StringBuffer logAppender = null;

    public synchronized void debug(Date date, String logMessage){
        if(appConfig.getLogLevel().equalsIgnoreCase(LogLevel.DEBUG.name())){
            this.log(LogLevel.DEBUG.name(), date, logMessage);
        }
    }

    public synchronized void info(Date date, String logMessage){
        if(appConfig.getLogLevel().equalsIgnoreCase(LogLevel.INFO.name()) || 
                appConfig.getLogLevel().equalsIgnoreCase(LogLevel.DEBUG.name())){
            this.log(LogLevel.INFO.name(), date, logMessage);
        }
    }

    public synchronized void error(Date date, String logMessage){
        if(appConfig.getLogLevel().equalsIgnoreCase(LogLevel.ERROR.name()) ||
                appConfig.getLogLevel().equalsIgnoreCase(LogLevel.DEBUG.name())){
            this.log(LogLevel.ERROR.name(), date, logMessage);
        }
    }

    private synchronized void log(String logLevel, Date date, String logMessage){
        logger.info("logAppender Hashcode: "+logAppender.hashCode());
        if(!logLevel.equalsIgnoreCase(LogLevel.NONE.name())){
            //StringBuffer logAppender = getLogAppender();
            getLogAppender().append(getLogAppender().hashCode());
            getLogAppender().append(COLON);
            getLogAppender().append(getFormattedDate(date, logFileDateFormat));
            getLogAppender().append(COLON);
            getLogAppender().append(logLevel);
            getLogAppender().append(COLON);
            getLogAppender().append(logMessage);
            getLogAppender().append(System.getProperty("line.separator"));
        }
    }
private synchronized StringBuffer getLogAppender(){
        logger.info("Getting logAppender .."+logAppender);
        if(logAppender == null){
            logger.info("Log appender is null");
            logAppender = new StringBuffer();
        }
        return logAppender;
    }

Question How do I make sure there is only one instance of the StringBuffer (logAppender) across multiple instances of my application to which I can keep appending the log and then read at the end and write the contents to the file before saving it?

cooler
  • 753
  • 4
  • 9
  • 18
  • Why is your class called a `FileLogger` when it appears not to do anything with files? I'm *deeply* suspicious of logging which doesn't actually persist anything for a long time... it means if you get a sudden crash (which is when you really want logs) you lose everything. Why aren't you *actually* logging to a file? And how would you *expect* there to be a single `StringBuffer` shared across multiple JVMs? That's simply not how objects work... – Jon Skeet Mar 26 '15 at 21:36
  • thanks @JonSkeet for a quick response. I have named it fileLogger because ultimately, the logs are going to be written to a file. This class also has a method saveLogFile() which I have omitted for privacy reasons, which actually reads the StringBuffer and writes to the file. To answer, why I am using StringBuffer, because I didn't want to write to a database or to a file, every time as that could be a heavy operation when the file contains a few 100 thousand records - meaning a more than a few 100 thousands times writing operation to the file or database. So just wanted to try some in memory. – cooler Mar 26 '15 at 21:41
  • and @JonSkeet, may be I am totally clueless about how the app behaves in a multi instance set up. As per you, every instance will have an instance of FileLogger class and hence a instance of the StringBuffer logAppender. Is that correct? I thought amazon ec2 will make sure I have only one instance somehow, since that is a static field, but I should be wrong, because that is not the case? – cooler Mar 26 '15 at 21:47
  • 1
    Unless Amazon is doing something very fancy, there'll be one static field per VM - or more accurately, per classloader. – Jon Skeet Mar 26 '15 at 21:49
  • So what would be a reasonable solution to this. We can not write directly to the file, because the file exist in amazon s3 bucket and amazon doesn't allow us to do that. We can neither use a database, as for every file, we would have to write more than a million lines in the database table, read and then write to the file. Can you suggest a solution that is scalable in future? – cooler Mar 27 '15 at 19:02
  • I'd suggest logging to a local file, then syncing it periodically. – Jon Skeet Mar 27 '15 at 19:03
  • If you need it in a single file, you'll have to write your own log server or use an existing one. – laurent Mar 28 '15 at 08:50

2 Answers2

1

First of all, you are using à static field in a non static method : this is bad practice, either use static methods or use singleton with non static field.

Second, if you are worried about performance, you'd better remove all those synchronized when they are not needed. You only need it on the log() method. Error(), warn() and info() do not access shared data in a way that requires synchronization (read only). Also, why call getLogAppender() multiple times in log() ?

Testing the log level with equalsIgnoreCase() is also inefficient, although it will not be noticeable. And you have a bug in error() : it does not log if level is INFO.

Now, to your point : like said by other before, JVMs do not share user data between instances. And the JVMs are probably not event on the same OS instance. There is no easy way out. Logging to a DB may not be that bad for performances if you configure it well : you need no transaction isolation at all since you do not update any existing data. You could also try to use a log server : unix-like OSes have plenty of solutions to offer. Or when you flush your buffer, flush from all your server instances.

laurent
  • 674
  • 9
  • 14
0

I just wanted to come back and mention that as a solution, I had finally chosen Amazon's elastiCache (redis implementation) for storing my logs temporarily and then at the end of the operation, read all from cache and write to a file in amazon s3. Hope this will help.

cooler
  • 753
  • 4
  • 9
  • 18