FILE_GZIP

The FILE_GZIP procedure compresses one or more input files using the DEFLATE compression algorithm and saves the compressed data in the GZIP file format, either to a new file or to memory. The compression is done using the ZLIB library, written by Jean-loup Gailly and Mark Adler.

This routine is written in the IDL language. Its source code can be found in the file file_gzip.pro in the lib subdirectory of the IDL distribution.

Examples

Here, we copy a single ASCII text file from the IDL examples directory into our current working directory, compress the copy with FILE_GZIP, and delete the uncompressed copy:

file = FILEPATH('irreg_grid2.txt', SUBDIR=['examples','data'])

FILE_COPY, file, 'irreg_grid2.txt'

FILE_GZIP, 'irreg_grid2.txt', /DELETE, /VERBOSE

IDL prints:

% Compress ./irreg_grid2.txt 86.9%

As another example, we can compress a file to a buffer, potentially send the buffer to a different IDL process, and then use ZLIB_UNCOMPRESS to expand the buffer.

file = FILEPATH('irreg_grid2.txt', SUBDIR=['examples','data'])

FILE_GZIP, file, BUFFER=buffer

HELP, buffer

; ...send the buffer somewhere else, then uncompress it...

data = ZLIB_UNCOMPRESS(buffer, /GZIP)

HELP, data

IDL prints:

BUFFER BYTE = Array[1870]

DATA BYTE = Array[14213]

Syntax

FILE_GZIP, File [, FileOut] [, BUFFER=variable] [, /CLOSE] [, COUNT=variable] [, /DELETE] [, NBYTES=value] [, OFFSET=value] [, /VERBOSE]

Arguments

File

Set this argument to a string or array of strings giving the file or files to compress. If File is an array, IDL compresses each file independently and saves it in a separate file.

FileOut

Set this optional argument to a string or array of strings giving the output filenames (including the full path). FileOut must have the same number of elements as File. If you omit FileOut, IDL constructs each output filename by appending the suffix ".gz" to the input filename and saves the compressed file in the same directory as the original file.

You do not typically need to supply the FileOut argument unless you want to create the compressed file in a different directory. If you do supply FileOut, be sure to use the full name of the original file, including the original extension. Otherwise, when the file is uncompressed, it will be given the wrong name. For example, if File is named "myfile.txt", be sure to specify FileOut as "myfile.txt.gz".
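As a minimal sketch of writing the compressed file to a different directory, the following builds FileOut from the original base name plus ".gz". It assumes that the IDL temporary directory (as returned by FILEPATH with /TMP) is writable; the variable names are illustrative only.

```idl
; Locate an example file shipped with IDL
file = FILEPATH('irreg_grid2.txt', SUBDIR=['examples','data'])
; Keep the original name and extension, then append ".gz"
fileOut = FILEPATH(FILE_BASENAME(file) + '.gz', /TMP)
FILE_GZIP, file, fileOut
; FILE_TEST returns 1 if the compressed file now exists
PRINT, FILE_TEST(fileOut)
```

Using FILE_BASENAME keeps the original extension in the output name, so the file recovers its correct name when it is uncompressed.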

Keywords

BUFFER

Set this keyword to a named variable in which IDL will store the compressed data instead of writing it to a file. If this keyword is present, do not provide the FileOut argument. The BUFFER keyword can only be used with a single input file, not multiple files. If there is no data to return, BUFFER will contain a scalar 0; otherwise, BUFFER will be a byte array.

Note: The BUFFER keyword cannot be used with the CLOSE keyword. In other words, you cannot stream compressed data from a file to a memory buffer.
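Because BUFFER may come back as a scalar 0 rather than a byte array, it can be worth checking the result before expanding it. The following is one possible sketch, assuming the example file shipped with IDL is available; the ISA check is just one way to make the test.

```idl
file = FILEPATH('irreg_grid2.txt', SUBDIR=['examples','data'])
FILE_GZIP, file, BUFFER=buffer
; BUFFER is a byte array when data was returned, or a scalar 0 otherwise
if ISA(buffer, /ARRAY) then data = ZLIB_UNCOMPRESS(buffer, /GZIP) $
  else PRINT, 'No compressed data was returned'
```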

CLOSE

By default, the FILE_GZIP routine closes the input and output files when the routine finishes. However, if you use the NBYTES or OFFSET keywords, you can set CLOSE=0 to keep the files open. You can then call FILE_GZIP again with the same file names, and IDL will continue reading and writing at the current offset within the files. When you finish reading and writing, you must then call FILE_GZIP once more with /CLOSE.

COUNT

Set this keyword to a named variable in which to return the number of bytes that were compressed. If File is an array, COUNT will be an array of integers.

Tip: If you have set the NBYTES keyword to read only a portion of the input file, then the returned COUNT value will be equal to NBYTES until IDL reaches the end of the file. When the end is reached, the returned COUNT value will be less than NBYTES. You can use this behavior to determine if you have reached the end of the input file and should then call FILE_GZIP with /CLOSE.
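The end-of-file test described in this tip might be sketched as follows. The example assumes that a file named test_data.dat already exists in the current directory (a hypothetical input here) and that a chunk size of 4096 bytes is acceptable; it uses the same command-line continuation style as the other examples on this page.

```idl
nbytes = 4096
; Keep compressing chunks until a call returns fewer than NBYTES bytes
repeat begin & $
  FILE_GZIP, 'test_data.dat', 'test_data.dat.gz', CLOSE=0, $
    COUNT=count, NBYTES=nbytes & $
endrep until (count lt nbytes)
; Close the input and output files
FILE_GZIP, 'test_data.dat', 'test_data.dat.gz', /CLOSE
```

If the file size is an exact multiple of the chunk size, the final pass through the loop returns a COUNT of 0, which also ends the loop.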

DELETE

By default, the FILE_GZIP routine preserves the original files. Set the DELETE keyword to delete the original files.

Note: When multiple files are being compressed, IDL deletes files as they are compressed. If an error occurs, any original files that have been compressed will already have been deleted.
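If losing originals after a partial failure is a concern, one alternative is to compress without /DELETE and remove the originals only after confirming that every compressed file exists. This is only a sketch: the file names below are hypothetical, and the check confirms that the output files exist, not that their contents are valid.

```idl
; Hypothetical input files; replace with your own paths
files = ['run1.dat', 'run2.dat']
; Compress without /DELETE so the originals survive an error
FILE_GZIP, files
; By default each output name is the input name plus ".gz"
gzFiles = files + '.gz'
; Delete the originals only if every compressed file exists
if ARRAY_EQUAL(FILE_TEST(gzFiles), 1) then FILE_DELETE, files
```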

NBYTES

Set the NBYTES keyword to an integer giving the number of bytes to compress. By default, FILE_GZIP compresses all of the data within the file. If the file is already open (with CLOSE=0 in a previous call) then by default NBYTES is set to read all of the remaining data within the file.

Note: You cannot use the NBYTES keyword with multiple files.

OFFSET

Set the OFFSET keyword to an integer giving the offset within File at which to start reading and compressing data. By default, FILE_GZIP starts at the beginning of the file (or at the current file position if CLOSE=0 was used to keep the file open in a previous FILE_GZIP call).

Note: You cannot use the OFFSET keyword with multiple files.

VERBOSE

Set this keyword to print a status message for each file as it is compressed, as shown in the first example above.

Additional Examples

Compress a Huge File Using Chunks

You can use the CLOSE, COUNT, and NBYTES keywords to compress a huge file in small chunks. For example:

; Create a random data file

data = BYTSCL(RANDOMU(1,20000), TOP=10)

OPENW, lun, 'test_data.dat', /GET_LUN

WRITEU, lun, data

FREE_LUN, lun

 

nbuffer = 4096

count = nbuffer

while (count ne 0) do begin & $

FILE_GZIP, 'test_data.dat', 'test_data.dat.gz', CLOSE=0, $

COUNT=count, NBYTES=nbuffer & $

print, ' count =',count & $

endwhile

FILE_GZIP, 'test_data.dat', 'test_data.dat.gz', /CLOSE

IDL prints:

count = 4096

count = 4096

count = 4096

count = 4096

count = 3616

count = 0
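To confirm that the chunked output is a valid GZIP file, you could expand it with FILE_GUNZIP and compare it against the original. This is a sketch that assumes the two files created above are still in the current directory; the name test_data_copy.dat is a hypothetical output name chosen so that the original is not overwritten.

```idl
; Expand the chunk-compressed file under a new name
FILE_GUNZIP, 'test_data.dat.gz', 'test_data_copy.dat'
; Read both files back as byte arrays
original = READ_BINARY('test_data.dat')
restored = READ_BINARY('test_data_copy.dat')
; ARRAY_EQUAL returns 1 if the two arrays match element for element
PRINT, ARRAY_EQUAL(original, restored)
```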

Compress a Portion of a File

You can use the NBYTES and OFFSET keywords to compress only a portion of a file, and use the BUFFER keyword to return that compressed portion in memory:

; Create a random data file

data = BYTSCL(RANDOMU(1,20000), TOP=10)

OPENW, lun, 'test_data.dat', /GET_LUN

WRITEU, lun, data

FREE_LUN, lun

 

FILE_GZIP, 'test_data.dat', BUFFER=buffer, $

NBYTES=10, OFFSET=1000

HELP, buffer

 

; Uncompress the buffer and verify the result

PRINT, ZLIB_UNCOMPRESS(buffer, /GZIP)

PRINT, data[1000:1009]

IDL prints:

BUFFER BYTE = Array[36]

0 4 2 3 3 1 1 9 6 1

0 4 2 3 3 1 1 9 6 1

Version History

8.2.3: Introduced

See Also

FILE_GUNZIP, FILE_TAR, FILE_UNTAR, FILE_ZIP, FILE_UNZIP, ZLIB_COMPRESS, ZLIB_UNCOMPRESS