Wednesday, March 30, 2011

Automatically delete old Tomcat access log files

In the post Enable Tomcat access log we saw how Tomcat can easily be configured to log all HTTP requests coming to our server.
For some reason, Tomcat doesn’t have a mechanism that also makes sure to delete old access log files. This is a bit weird, because logging frameworks such as Log4j have a built-in mechanism for rotating and deleting old log files, and I assumed Tomcat would offer something similar for its access logs.
Anyway, since Tomcat doesn’t take care of all these access log files that keep being added, and since a typical production server may produce several gigabytes of access logs per day, your disk might run out of space pretty quickly.
We will build a small class that handles the deletion of old access log files.
First, we need to get the following information:
  • Tomcat access log files directory.
  • Access log file structure. The default structure is: localhost_access_log.yyyy-MM-dd.txt
  • Access log date format. The default format is: yyyy-MM-dd
  • Number of access log files we would like to keep as backup. For example, if we choose 10, the 10 most recent access log files are kept and everything older is deleted.
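To make the file structure and date format more concrete, here is a minimal sketch of how a date can be pulled out of an access log file name. The class name AccessLogNameDemo and its helper methods are made up for this illustration; only the default file structure and date format come from the list above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Tomcat
class AccessLogNameDemo {
  // Defaults taken from the list above; dots are escaped because this is used as a regex
  static final String FILE_PATTERN = "localhost_access_log\\.yyyy-MM-dd\\.txt";
  static final String DATE_FORMAT = "yyyy-MM-dd";

  // Replaces the date format inside the file pattern with a capturing group
  static Pattern toRegex(String filePattern, String dateFormat) {
    return Pattern.compile(filePattern.replace(dateFormat, "(.+?)"));
  }

  // Returns the date encoded in the file name, or null if the name
  // is not an access log file or its date part cannot be parsed
  static Date extractDate(String fileName) {
    Matcher m = toRegex(FILE_PATTERN, DATE_FORMAT).matcher(fileName);
    if (!m.matches()) {
      return null;
    }
    SimpleDateFormat sdf = new SimpleDateFormat(DATE_FORMAT);
    sdf.setLenient(false); // reject impossible dates such as 2011-13-45
    try {
      return sdf.parse(m.group(1));
    }
    catch (ParseException pe) {
      return null;
    }
  }

  public static void main(String[] args) {
    // An access log file: prints the parsed date
    System.out.println(extractDate("localhost_access_log.2011-03-30.txt"));
    // Some other log file: prints null
    System.out.println(extractDate("catalina.2011-03-30.log"));
  }
}
```

If your access log valve uses a different prefix, suffix or date format, only the two constants should need to change.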
The idea is very simple:
  • We get all access log files in the directory.
  • We extract the date of each log file.
  • We sort all the files by date, in ascending order.
  • We delete the oldest files, making sure to keep the requested number of backups.
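The steps above can be sketched without touching the disk. The following is only an illustration: the class OldestFilesDemo and its method are invented for this sketch, and it sorts the file names as plain strings, which works only because all names share the same prefix and the yyyy-MM-dd format sorts chronologically. The real class parses the dates instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class illustrating the selection step only
class OldestFilesDemo {
  // Returns the file names that would be deleted, keeping the newest numBackups files
  static List<String> selectForDeletion(List<String> fileNames, int numBackups) {
    List<String> sorted = new ArrayList<String>(fileNames);
    Collections.sort(sorted); // oldest first, assuming yyyy-MM-dd in the name
    int numToDelete = Math.max(0, sorted.size() - numBackups);
    return new ArrayList<String>(sorted.subList(0, numToDelete));
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList(
        "localhost_access_log.2011-03-30.txt",
        "localhost_access_log.2011-03-28.txt",
        "localhost_access_log.2011-03-29.txt");
    // Keep the 2 newest files
    System.out.println(selectForDeletion(names, 2));
    // prints [localhost_access_log.2011-03-28.txt]
  }
}
```

Note that when there are fewer files than the requested number of backups, nothing is selected for deletion.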
Let’s have a look at our access log deletion class. Its code is quite simple:
package com.bashan.blog.log;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import java.io.File;
import java.io.FilenameFilter;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
/**
* @author Bashan
*/
public class TomcatAccessLogCleaner extends TimerTask {
  private static final Log log = LogFactory.getLog(TomcatAccessLogCleaner.class);
  private static final String DEFAULT_LOG_FILE_PATTERN = "localhost_access_log\\.yyyy-MM-dd\\.txt";
  private static final String DEFAULT_DATE_FORMAT = "yyyy-MM-dd";
  private String dateFormat;
  private Pattern logFilePat;
  private File logFileDir;
  private int numBackups;

  public TomcatAccessLogCleaner(File logFileDir, int numBackups) {
    this(logFileDir, DEFAULT_LOG_FILE_PATTERN, DEFAULT_DATE_FORMAT, numBackups);
  }

  public TomcatAccessLogCleaner(File logFileDir, String logFilePattern, String dateFormat, int numBackups) {
    this.dateFormat = dateFormat;
    this.logFileDir = logFileDir;
    // Turn the date part of the file pattern into a capturing group
    String pat = logFilePattern.replace(dateFormat, "(.+?)");
    logFilePat = Pattern.compile(pat);
    this.numBackups = numBackups;
  }
 
  public void clean() {
    log.info("Starting to clean old Tomcat access logs. Number of backups to keep: " + numBackups);
    File[] files = logFileDir.listFiles(new FilenameFilter() {
      public boolean accept(File dir, String file) {
        return logFilePat.matcher(file).matches();
      }
    });
    // listFiles() returns null if the directory does not exist or cannot be read
    if (files == null) {
      log.warn("Cannot list access log directory: " + logFileDir);
      return;
    }
    List<LogFile> logFiles = new ArrayList<LogFile>(files.length);
    for (File file : files) {
      try {
        LogFile logFile = new LogFile(file, logFilePat, dateFormat);
        logFiles.add(logFile);
      }
      catch (ParseException pe) {
        // The name matched the pattern but its date part is not a valid date: skip the file
        log.debug("Skipping file with unparsable date: " + file.getName());
      }
    }

    // Oldest files first
    Collections.sort(logFiles, new Comparator<LogFile>() {
      @Override
      public int compare(LogFile o1, LogFile o2) {
        return o1.getLogDate().compareTo(o2.getLogDate());
      }
    });

    int numFilesToClean = logFiles.size() - numBackups;
    int removed = 0;
    for (int i = 0; i < numFilesToClean; i++) {
      LogFile logFile = logFiles.get(i);
      log.debug("Deleting access log file: " + logFile.getFile());
      if (logFile.getFile().delete()) {
        removed++;
      }
      else {
        log.warn("Failed deleting log file: " + logFile.getFile());
      }
    }
    log.info("Finished cleaning old Tomcat access logs. Total log files: " +
        logFiles.size() + ". Deleted: " + removed + " of " + Math.max(0, numFilesToClean));
  }
 
  public static class LogFile {
    private File file;
    private Date logDate;

    public LogFile(File file, Pattern pattern, String dateFormat) throws ParseException {
      Matcher matcher = pattern.matcher(file.getName());
      if (!matcher.matches()) {
        // Never leave the fields null: sorting by date would fail later
        throw new ParseException("Not an access log file: " + file.getName(), 0);
      }
      // Group 1 is the date part of the file name
      String dateStr = matcher.group(1);
      SimpleDateFormat sdf = new SimpleDateFormat(dateFormat);
      logDate = sdf.parse(dateStr);
      this.file = file;
    }

    public File getFile() {
      return file;
    }

    public void setFile(File file) {
      this.file = file;
    }

    public Date getLogDate() {
      return logDate;
    }

    public void setLogDate(Date logDate) {
      this.logDate = logDate;
    }
  }

  // Called by the timer on every scheduled execution
  public void run() {
    clean();
  }
}
Note that TomcatAccessLogCleaner extends TimerTask. This makes it easy to use the class with a java.util.Timer, so that it runs at a fixed interval and cleans the Tomcat access log files.
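One thing to keep in mind when running a task on a java.util.Timer: if the task throws an uncaught exception, the timer thread terminates and the task never runs again. A small wrapper can guard against that. The sketch below is only an illustration; SafeTimerTask is a name invented here, and since TimerTask implements Runnable, a TomcatAccessLogCleaner instance could be passed in as the delegate.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper: runs a delegate and swallows runtime exceptions
// so that a single failure does not kill the timer thread
class SafeTimerTask extends TimerTask {
  private final Runnable delegate;
  private final AtomicInteger failures = new AtomicInteger();

  SafeTimerTask(Runnable delegate) {
    this.delegate = delegate;
  }

  @Override
  public void run() {
    try {
      delegate.run();
    }
    catch (RuntimeException e) {
      failures.incrementAndGet();
      System.err.println("Scheduled task failed: " + e);
    }
  }

  int getFailures() {
    return failures.get();
  }

  public static void main(String[] args) throws InterruptedException {
    // A delegate that always fails, standing in for a cleaner that hits an error
    SafeTimerTask task = new SafeTimerTask(new Runnable() {
      public void run() {
        throw new IllegalStateException("simulated failure");
      }
    });
    Timer timer = new Timer(true);
    timer.scheduleAtFixedRate(task, 0, 100);
    Thread.sleep(350);
    timer.cancel();
    // The task kept being invoked despite failing every time
    System.out.println("Failures survived: " + task.getFailures());
  }
}
```

In a real application you would probably log the exception with your logging framework rather than print it to System.err.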
You can download this class here.
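Let’s have a look at an example showing how this class can be used in a servlet. The cleaner is scheduled once, when the servlet loads, and then runs automatically every 24 hours and cleans old access log files. Don’t forget that you have to map the servlet in your web.xml. Note that a servlet is normally loaded on its first request, so for the cleaner to start together with the web application, the servlet declaration also needs a load-on-startup element: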
package com.bashan.blog.log;
import javax.servlet.ServletConfig;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import java.io.File;
import java.util.Timer;
/**
* @author Bashan
*/
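public class TomcatAccessLogCleanerServlet extends HttpServlet {
  private Timer timer;

  public void init(ServletConfig config) throws ServletException {
    super.init(config);
    // Daemon timer, so it does not keep the JVM alive on shutdown
    timer = new Timer(true);
    TomcatAccessLogCleaner tomcatAccessLogCleaner = new TomcatAccessLogCleaner(new File("c:\\tomcat access log dir"), 10);
    timer.scheduleAtFixedRate(tomcatAccessLogCleaner, 0, 1000L * 60 * 60 * 24);
  }

  public void destroy() {
    // Stop the timer when the web application is stopped or undeployed
    timer.cancel();
    super.destroy();
  }
}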
You can download the servlet here.

6 comments:

  1. I am very poor in tomcat so pl let me know where to keep this code

    package com.bashan.blog.log;
    import javax.servlet.ServletConfig;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import java.io.File;
    import java.util.Timer;
    /**
    * @author Bashan
    */
    public class TomcatAccessLogCleanerServlet extends HttpServlet {
      public void init(ServletConfig config) throws ServletException {
        super.init(config);
        Timer time = new Timer();
        TomcatAccessLogCleaner tomcatAccessLogCleaner = new TomcatAccessLogCleaner(new File("c:\\tomcat access log dir"), 10);
        time.scheduleAtFixedRate(tomcatAccessLogCleaner, 0, 1000 * 60 * 60 * 24);
      }
    }

  2. I downloaded the servlet and class files. Now please let me know the path where I should keep those 2 files.

  3. You can simply put a full url in some conf file.

  4. Don't you think it would be better to invoke this from a ServletContextListener rather than from a servlet?

  5. This code should be located in a servlet class. You can read more on servlets on the web.
