Sunday, May 17, 2009

Recursively Zipping a directory in Java – Extending capabilities

In the post Recursively Zipping a directory in Java we saw how Java can be used to zip all the contents of a directory, including its subdirectories. In this post we will extend the ZipUtil class (discussed in that post) with more capabilities:

  • Accept a FileFilter, so that only the desired files in a directory are zipped.
  • Get notifications on the zip process:
    • Zip start.
    • Zip end.
    • Before processing a directory.
    • Before processing a file.

These features give us better control over the zip process. The “Zip start” and “Zip end” notifications are not strictly necessary, since the calling code already knows when it starts zipping a directory and when the zipping has ended. These methods are added as a refinement.

To implement these requirements, the ZipUtil class is converted to a ZipDir class, and all its static methods become non-static methods. This is required because, as convenient as static methods are for supplying utility functions, they cannot be overridden in a subclass, so they do not give us a real benefit of object-oriented programming: inheritance.

The logic behind the zip process remains much the same as in the previous zip-related post, but instead of one loop iterating over both directories and files, there are now two loops: the first iterates over the subdirectories and the second iterates over the files in the current directory. This is needed because we want to select the files in a directory using a FileFilter, while subdirectories must still be traversed even when the filter accepts only files. FileFilter in Java is a simple interface with a single method to implement: accept(File pathname). The accept method determines whether a file should be included in the result of calling the listFiles method of the File class.
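
As a small illustration of how accept(File pathname) drives listFiles, here is a minimal sketch that lists only the regular files in one directory whose names end with a given suffix. The class name FileFilterSketch and the method listBySuffix are hypothetical names chosen for this example; they are not part of ZipDir.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of ZipDir.
class FileFilterSketch {

  // Returns the sorted names of the regular files directly inside dir
  // whose name ends with the given suffix.
  static List<String> listBySuffix(File dir, final String suffix) {
    FileFilter filter = new FileFilter() {
      public boolean accept(File pathname) {
        // listFiles calls accept once per entry and keeps the entry only if this returns true
        return pathname.isFile() && pathname.getName().endsWith(suffix);
      }
    };
    File[] matches = dir.listFiles(filter);
    List<String> names = new ArrayList<String>();
    if (matches != null) { // listFiles returns null if dir is not a directory or cannot be read
      for (File match : matches) {
        names.add(match.getName());
      }
    }
    Collections.sort(names);
    return names;
  }

  public static void main(String[] args) {
    // Prints the names of the .ico files in the directory given as the first argument
    File dir = new File(args.length > 0 ? args[0] : ".");
    System.out.println(listBySuffix(dir, ".ico"));
  }
}
```

Because the filter itself rejects directories, a second listFiles call with a directory-only filter is what keeps the recursion going, which is the reason for the two loops described above.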

In addition, four methods are added to the ZipDir class:

  • beforeZip(): This method is called before the zip process starts.
  • afterZip(): This method is called after the zip process ends.
  • beforeProcessDirectory(File dir): This method is called before processing every new directory. If it returns false for a specific directory, that directory and all its subdirectories will not be processed.
  • beforeProcessFile(File file): This method is called before zipping every new file. If it returns false, that file will not be zipped.
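
The way these hooks fit into the traversal can be sketched without any zipping at all. The following is a minimal sketch, not the ZipDir class itself: a hypothetical HookedWalker class visits a directory tree, asks beforeProcessDirectory and beforeProcessFile for permission, and records the names of the files it was allowed to visit. The names HookedWalker, SkippingWalker, beforeWalk and afterWalk are assumptions made for this example; beforeWalk and afterWalk stand in for beforeZip and afterZip.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration; it mirrors the hook methods but does no zipping.
class HookedWalker {
  protected final List<String> visited = new ArrayList<String>();

  // Walks root recursively and returns the names of the files that the hooks let through.
  public List<String> walk(File root) {
    beforeWalk();
    visit(root);
    afterWalk();
    return visited;
  }

  private void visit(File dir) {
    File[] entries = dir.listFiles();
    if (entries == null) { // not a directory, or it cannot be read
      return;
    }
    for (File entry : entries) {
      if (entry.isDirectory()) {
        // Returning false here skips the directory and everything below it
        if (beforeProcessDirectory(entry)) {
          visit(entry);
        }
      } else if (beforeProcessFile(entry)) {
        visited.add(entry.getName());
      }
    }
  }

  // Default hooks accept everything and do nothing; subclasses override what they need.
  protected boolean beforeProcessDirectory(File dir) { return true; }
  protected boolean beforeProcessFile(File file) { return true; }
  protected void beforeWalk() { }
  protected void afterWalk() { }
}

// Example subclass: skips directories named "tmp" and files ending with ".bak".
class SkippingWalker extends HookedWalker {
  @Override
  protected boolean beforeProcessDirectory(File dir) {
    return !dir.getName().equals("tmp");
  }

  @Override
  protected boolean beforeProcessFile(File file) {
    return !file.getName().endsWith(".bak");
  }
}
```

Calling new SkippingWalker().walk(new File("/somedir")) would return the names of all files under /somedir except those inside any tmp directory and those ending with .bak. The base class never needs to know about these rules, which is the benefit of making the methods non-static.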

After these changes, the new ZipDir class looks like this:

package com.bashan.blog.zip;
import java.io.*;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;
public class ZipDir {
  private static final int BUFFER_SIZE = 1024 * 4;
  protected String dir;
  protected FileFilter fileFilter;
  protected ZipOutputStream zipOutputStream;
  int relativePos;
  public ZipDir(String dir, FileFilter fileFilter, OutputStream outputStream) {
    this.dir = dir;
    this.fileFilter = fileFilter;
    this.zipOutputStream = new ZipOutputStream(outputStream);
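    // Entry names are relative to dir; File normalizes away any trailing separator
    relativePos = new File(dir).getPath().length() + 1;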
  }
  public ZipDir(String dir, OutputStream outputStream) {
    this(dir, null, outputStream);
  }
  public ZipDir(String dir, FileFilter fileFilter, String filenameZipOut) throws IOException {
    this(dir, fileFilter, new FileOutputStream(filenameZipOut));
  }
  public ZipDir(String dir, String filenameZipOut) throws IOException {
    this(dir, null, filenameZipOut);
  }
  public void zip() throws IOException {
    try {
      beforeZip();
      zip(new File(dir));
      afterZip();
    }
    finally {
      if (zipOutputStream != null) {
        zipOutputStream.close();
      }
    }
  }
  private void zip(File dir) throws IOException {
    if (dir.exists() && dir.isDirectory()) {
      File[] dirs = dir.listFiles(new FileFilter() {
        public boolean accept(File file) {
          return file.isDirectory();
        }
      });
      for (File currentDir : dirs) {
        if (beforeProcessDirectory(currentDir))
        {
          zip(currentDir);
        }
      }
      File[] files = dir.listFiles(fileFilter);
      for (File file : files) {
        if (file.isDirectory()) {
          continue;
        }
        if (beforeProcessFile(file))
        {
          InputStream in = null;
          try {
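            // Entry names are relative to the root directory and use '/' on every platform
            String filename = file.getPath().substring(relativePos).replace(File.separatorChar, '/');
            zipOutputStream.putNextEntry(new ZipEntry(filename));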
            in = new FileInputStream(file);
            int len;
            byte[] buf = new byte[BUFFER_SIZE];
            while ((len = in.read(buf)) != -1) {
              zipOutputStream.write(buf, 0, len);
            }
            zipOutputStream.closeEntry();
          }
          finally {
            if (in != null) {
              in.close();
            }
          }
        }
      }
    }
  }
  protected boolean beforeProcessDirectory(File dir)
  {
    return true;
  }
  protected boolean beforeProcessFile(File file)
  {
    return true;
  }
  protected void beforeZip()
  {
  }
  protected void afterZip()
  {
  }
}
Here are some examples of using the ZipDir class:

Zip all files in a directory. No filtering used:

new ZipDir("/somedir", "/somezip.zip").zip();

Zip all icon files in a directory, showing an example of using a FileFilter:

    new ZipDir("/somedir",
        new FileFilter() {
          public boolean accept(File file) {
            return file.isFile() && file.getPath().endsWith(".ico");
          }
        }, "/somezip.zip").zip();

Zip all files in a directory, excluding directories whose path ends with the string “def”. This shows the usage of the beforeProcessDirectory(File dir) method:

    new ZipDir("/somedir", "/somezip.zip")
    {
      @Override
      protected boolean beforeProcessDirectory(File dir)
      {
        return !dir.getPath().endsWith("def");
      }
    }.zip();

The above examples use Java anonymous classes both to implement a FileFilter and to extend the ZipDir class. We can also extend the ZipDir class in the traditional way to give it more capabilities as we wish. For example, this extension of the ZipDir class collects some simple statistics on the zipping process:

package com.bashan.blog.zip;
import java.io.File;
import java.io.FileFilter;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Date;
public class StatZipDir extends ZipDir {
  protected int numDirs;
  protected int numFiles;
  protected Date zipStart;
  protected Date zipEnd;
  public StatZipDir(String dir, FileFilter fileFilter, OutputStream outputStream) {
    super(dir, fileFilter, outputStream);
  }
  public StatZipDir(String dir, OutputStream outputStream) {
    super(dir, outputStream);
  }
  public StatZipDir(String dir, FileFilter fileFilter, String filenameZipOut) throws IOException {
    super(dir, fileFilter, filenameZipOut);
  }
  public StatZipDir(String dir, String filenameZipOut) throws IOException {
    super(dir, filenameZipOut);
  }
  public int getNumDirs() {
    return numDirs;
  }
  public int getNumFiles() {
    return numFiles;
  }
  public Date getZipStart() {
    return zipStart;
  }
  public Date getZipEnd() {
    return zipEnd;
  }
  @Override
  protected boolean beforeProcessDirectory(File dir) {
    numDirs++;
    return super.beforeProcessDirectory(dir);
  }
  @Override
  protected boolean beforeProcessFile(File file) {
    numFiles++;
    return super.beforeProcessFile(file);
  }
  @Override
  protected void beforeZip() {
    zipStart = new Date();
  }
  @Override
  protected void afterZip() {
    zipEnd = new Date();
  }
  public static void main(String[] args) throws IOException {
    StatZipDir statZipDir = new StatZipDir("/someDir", "somezip.zip");
    statZipDir.zip();
    System.out.println("Total zipped directories: " + statZipDir.getNumDirs());
    System.out.println("Total zipped files: " + statZipDir.getNumFiles());
    System.out.println("Total zip time: " + (statZipDir.getZipEnd().getTime() -
        statZipDir.getZipStart().getTime()) / 1000 + " seconds");
  }
}
At the bottom of this class there is a small test program that prints the number of directories and files that were zipped, as well as the total time in seconds that the zip process took.

You can find the code for the ZipDir class here and the code for the StatZipDir class here.
