.NET: Decompressing zip file entries into memory

I knew that the J# libraries in .NET had zip file support, but I couldn't find any samples that showed how to decompress the files into memory. The hard part, of course, is that the J# stream objects aren't the same as the .NET stream objects. If you're a Java programmer looking for a familiar library, that's great, but I'm not, so I had to do a little finagling.

The first thing you need to do is add a reference to the vjslib assembly, which brings in .NET classes in Java namespaces, e.g. java.io. The one we care most about is java.util.zip, which includes ZipFile and ZipEntry. We also need java.util for the Enumeration class and java.io for the InputStream class. With these in place, we can enumerate a zip file:

using System;      // Console
using System.Text; // ASCIIEncoding
using java.util;   // these three all from vjslib assembly
using java.util.zip;
using java.io;
...
static void Main(string[] args) {
  if( args.Length != 1 ) {
    Console.WriteLine("Usage: dumpzipfileoftextfiles <file>");
    return;
  }

  // we're assuming a zip file full of ASCII text files here
  string filename = args[0];
  ZipFile zip = new ZipFile(filename);

  try {
    // enumerate entries in the zip file
    // NOTE: can't enum via foreach -- Java objects don't support it
    Enumeration entries = zip.entries();
    while( entries.hasMoreElements() ) {
      ZipEntry entry = (ZipEntry)entries.nextElement();

      // read text bytes into an ASCII string
      byte[] bytes = ReadZipBytes(zip, entry);
      string s = ASCIIEncoding.ASCII.GetString(bytes);

      // do something w/ the text
      string entryname = entry.getName();
      Console.WriteLine("{0}:\r\n{1}\r\n", entryname, s);
    }
  }
  finally {
    if( zip != null ) { zip.close(); }
  }
}

Notice the use of the Enumeration object so we can enumerate in the Java style and the use of the ZipFile and ZipEntry types. This is all stuff you could find in readily available online samples (I did). The interesting bit is the ReadZipBytes method:

static byte[] ReadZipBytes(ZipFile zip, ZipEntry entry) {
  // read contents of text stream into bytes
  InputStream instream = zip.getInputStream(entry);
  int size = (int)entry.getSize();
  sbyte[] sbytes = new sbyte[size];

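  // read all the bytes into memory; a single read() call may return
  // fewer bytes than requested, so loop until the buffer is full
  // or the stream reports end-of-data
  int offset = 0;
  try {
    while( offset < size ) {
      int read = instream.read(sbytes, offset, size - offset);
      if( read == -1 ) { break; }
      offset += read;
    }
  }
  finally {
    instream.close();
  }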

  // this is the magic cast for converting signed bytes
  // to unsigned bytes for use with the rest of .NET, e.g.
  // Encoding.GetString(byte[]) or new MemoryStream(byte[])
  return (byte[])(object)sbytes;
}

For those of you familiar with Java, I'm just reading the zip file entry data into an array of signed bytes. However, most .NET APIs want unsigned bytes, e.g. "Encoding.GetString(byte[])" or "new MemoryStream(byte[])", which means you've got to convert an array of signed bytes (sbyte[]) to an array of unsigned bytes (byte[]). Unfortunately, just casting directly doesn't work (the compiler complains). Even more unfortunately, I could find nothing in the Convert or BitConverter classes to perform this feat of magic, and the conversion code I wrote myself was dog slow, so I asked around internally.

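Luckily, James Manning, an MS SDE, had the answer: cast the signed byte array to an object first and then to an unsigned byte array. The C# compiler rejects the direct cast, but the runtime treats sbyte[] and byte[] as compatible, so going through object succeeds and nothing is copied. Thank goodness James knew that, because I didn't find anything on this topic. Hopefully future generations will find this missive.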
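To see the cast in isolation, here is a minimal sketch that needs no J# reference at all. The SignedBytes class and its Reinterpret and Copy method names are made up for illustration; they are not part of the sample download. Reinterpret is the cast-through-object trick from ReadZipBytes, and Copy is a copying alternative built on Buffer.BlockCopy, for cases where you want a separate byte[] rather than a second view of the same array.

```csharp
using System;

// Hypothetical helper class, for illustration only.
static class SignedBytes {
  // Reinterpret without copying: the C# compiler rejects (byte[])sbytes,
  // but the cast succeeds at run time when it goes through object.
  // The result refers to the SAME array as the argument.
  public static byte[] Reinterpret(sbyte[] sbytes) {
    return (byte[])(object)sbytes;
  }

  // Copying alternative: Buffer.BlockCopy moves the raw bytes from one
  // primitive array to another, so the result is an independent byte[].
  public static byte[] Copy(sbyte[] sbytes) {
    byte[] bytes = new byte[sbytes.Length];
    Buffer.BlockCopy(sbytes, 0, bytes, 0, sbytes.Length);
    return bytes;
  }
}
```

With sbyte[] signed = { -1, 0, 72, 105 }, SignedBytes.Reinterpret(signed)[0] is 255, i.e. the same bits read as unsigned. Because Reinterpret returns the same array object, a later write to signed shows up in the reinterpreted array too, while the array returned by Copy is unaffected.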

You can download the sample if you like. Enjoy.



17 comments on this post

Aaron:


Why J# and its decompression rather than 2.0's decompression?

Sunday, Feb 4, 2007, 11:01 AM


Mark Allanson:


Surely this begs the question, why wasn't zip support written as a primary dotnetfx class but instead tucked away hidden inside the J# libs?

Sunday, Feb 4, 2007, 11:41 AM


Ross:


..or why you wouldn't just use SharpZipLib
http://sharpdevelop.net/OpenSource/SharpZipLib/Default.aspx

It is GPL but has an exception that allows use in closed-source commercial apps.

Sunday, Feb 4, 2007, 11:58 AM


Chris Sells:


How does .NET 2.0 provide for zip file decompression? That's new to me.

And why would I use an external library when I can use one that's built in?

Sunday, Feb 4, 2007, 7:49 PM


Martin Bennedik:


.NET 2.0 has support for gzip via System.IO.Compression. I think this is used for HTTP compression in ASP.NET.

.NET 3.0 has support for Zip files in System.IO.Packaging. I think this is used for XAML package files.

Monday, Feb 5, 2007, 1:33 AM


Mike Goatly:


I suspect Aaron is talking about the classes in the System.IO.Compression namespace, although this doesn't output actual zip files; it just compresses/decompresses data to/from a stream.

Monday, Feb 5, 2007, 5:48 AM


Anonymous Coward:


None of the mentioned "zip supports" actually works when it comes to all zips. For example ones that you find in some old archive tapes or diskettes or even backup CD's from 10 years ago.

This issue was important to me once, so I used the .NET wrapper for InfoZip. However, I now notice on Wikipedia that even it doesn't deal with all the shrinking/compression methods zip has used in the past.

Monday, Feb 5, 2007, 11:49 AM


David Stone:


And you would use an external library because dropping in an assembly is a much better deployment option than installing the VJ# runtime. ;)

Monday, Feb 5, 2007, 11:47 PM


Chris Sells:


It's my understanding that the packaging stuff can't be used with vanilla zip files.

And I thought the J# libraries came with .NET 2.0 out of the box?

Thursday, Feb 8, 2007, 10:40 PM


Aaron:


.NET 2.0 has System.IO.Compression. I think Dino Chiesa wrote the zip archive code: http://blogs.msdn.com/dotnetinterop/archive/2006/04/05/.NET-System.IO.Compression-and-zip-files.aspx
Which is better than needing to distribute the Java(!$#%)(er, fun stuff).

Friday, Feb 16, 2007, 7:01 PM


Norfy:


Excellent, up to now I've had to use Buffer.BlockCopy

Tuesday, Feb 20, 2007, 12:17 AM


Felix Baxter:


Has anyone done the reverse of this: compressing into a zip file in memory? In other words, I have C# Byte[] values in my db and want to put them in a zip file before writing to disk.

Thanks,

Thursday, Jan 31, 2008, 6:16 AM


DotNetZip does in-memory transforms:


http://blogs.msdn.com/dotnetinterop/archive/2008/02/05/dotnetzip-open-source-zip-library-for-net-applications-revs-to-v1-3.aspx

Tuesday, Mar 18, 2008, 11:51 AM




