I wrote my own archive format, so let me give you some advice. If you need any specific pointers, just ask and I'll chip in some more.

The Basics

You need:
  • A header record.
  • File list records.
  • A way to save file names easily.


First, decide whether this is the kind of archive you regenerate from scratch every time you add files (so the internal order can change each time), or one you need to be able to add new files to in place. If you regenerate it every time, you can sort the records by file name and use a binary search to find items.
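As a rough illustration of the sorted-records idea, here is a minimal lookup sketch. The record layout (TFileRecord and its fields) and the function name FindRecord are my assumptions for the example, not part of any real format, and it assumes the list was sorted with the same CompareText ordering that the search uses.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses
  SysUtils;

type
  // Hypothetical file-list record; the field names are assumptions.
  TFileRecord = record
    Name: string;    // relative path stored in the archive
    Offset: Int64;   // where the file's data starts
    Size: Int64;     // compressed length in bytes
  end;
  TFileRecords = array of TFileRecord;

// Returns the index of Name in Records, or -1 if it is not present.
// Records must already be sorted by Name using CompareText.
function FindRecord(const Records: TFileRecords; const Name: string): Integer;
var
  Lo, Hi, Mid, C: Integer;
begin
  Result := -1;
  Lo := 0;
  Hi := High(Records);
  while Lo <= Hi do
  begin
    Mid := Lo + (Hi - Lo) div 2;
    C := CompareText(Records[Mid].Name, Name);
    if C = 0 then
    begin
      Result := Mid;
      Exit;
    end
    else if C < 0 then
      Lo := Mid + 1   // target sorts after the middle record
    else
      Hi := Mid - 1;  // target sorts before the middle record
  end;
end;
```

CompareText ignores case for ASCII letters, which suits Windows-style file names; if your names are case-sensitive, sort and search with CompareStr instead.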

How to Save Directories

The obvious way is to use a recursive search. Since you know the source directory, you can easily turn the absolute file names you get back into relative ones. That doesn't settle how you store the directories internally, though. Frankly, the easiest way is to store the relative path, e.g. "my dir\file.txt", as the file name within the data structure, but it isn't as elegant as what something like WinRAR or WinZip does.
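To make the absolute-to-relative step concrete, here is a small sketch. ToArchiveName is a hypothetical helper name of mine; it just strips the source directory prefix and leaves anything outside that directory untouched.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses
  SysUtils;

// Hypothetical helper: turns an absolute file name into the name stored
// in the archive, relative to SourceDir.
function ToArchiveName(const SourceDir, AbsoluteName: string): string;
var
  Base: string;
begin
  Base := IncludeTrailingPathDelimiter(SourceDir);
  // Case-insensitive prefix check, as on Windows file systems.
  if SameText(Copy(AbsoluteName, 1, Length(Base)), Base) then
    Result := Copy(AbsoluteName, Length(Base) + 1, MaxInt)
  else
    Result := AbsoluteName;  // not under SourceDir; leave unchanged
end;
```

With a source directory of C:\src, the file C:\src\my dir\file.txt would be stored as my dir\file.txt.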

Here's a fast recursive search I wrote as an example:
Code:
procedure TIAWriter.AddFiles(const directory: string; Options: TAddFilesOptions
 = [afoRecurse, afoIgnoreHidden]);
var
  SearchRec: TSearchRec;
  Dir: string;

  // Walks one subdirectory; sub is a path relative to Dir.
  procedure SearchSubDir(const sub: string);
  var SearchRec2: TSearchRec;
      temp: string;
  begin
    if FindFirst(Dir+sub+'\*.*',faAnyFile,SearchRec2) = 0 then
    begin
      repeat
        if (afoIgnoreHidden in Options) and (SearchRec2.Attr and faHidden <> 0) then
          Continue;
        // temp is the name relative to Dir, which is what the archive stores.
        temp := IncludeTrailingPathDelimiter(sub) + SearchRec2.Name;
        if (SearchRec2.Name <> '.') and (SearchRec2.Name <> '..') then
          if (SearchRec2.Attr and faDirectory = 0) then
            AddFile(Dir + temp, temp, afoEncrypt in Options)
          else
            SearchSubDir(temp);
      until FindNext(SearchRec2) <> 0;
      // Release the search handle for this subdirectory.
      FindClose(SearchRec2);
    end;
  end;

begin
  fHeader.arcEncrypted := afoEncrypt in Options;
  Dir := IncludeTrailingPathDelimiter(directory);
  if FindFirst(Dir+'*.*',faAnyFile,SearchRec) = 0 then
  begin
    repeat
      if (afoIgnoreHidden in Options) and (SearchRec.Attr and faHidden <> 0) then
        Continue;
      if (SearchRec.Name <> '.') and (SearchRec.Name <> '..') then
        if (SearchRec.Attr and faDirectory = 0) then
          AddFile(Dir + SearchRec.Name,SearchRec.Name,afoEncrypt in Options)
        else if afoRecurse in Options then
          SearchSubDir(SearchRec.Name);
    until FindNext(SearchRec) <> 0;
    // Only close the handle if FindFirst actually opened one.
    FindClose(SearchRec);
  end;
end;

Cheating in the file records and offsets...

This is really easy, and I'm glad I figured it out early when writing my format.

The trick is to structure the archive like so:
Code:
HEADER
----------------
FILES
----------------
FILE RECORDS

The second trick is to assemble the archive in a single pass. Create your stream, write a blank header, and then compress each file and write it to the stream. As you go, build the file list: for each file, store its offset in the stream (taken before you write the file) and its compressed length. When you're done writing the files, store the current stream position in the header's file list offset field. Then write the file list, seek back to the beginning, write your real header ... and close.

It's quite simple really. If it's hard to follow I'll give you a numbered list.
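
The steps above can be sketched roughly as follows. This is not my actual writer, just a minimal illustration: the record layouts (TArcHeader, TArcEntry), the ARC_MAGIC value and the WriteArchive routine are all made up for the example, files are stored uncompressed to keep it short, and Dest is assumed to be a fresh, seekable stream positioned at 0.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses
  SysUtils, Classes;

const
  ARC_MAGIC = $31435241;  // arbitrary signature, assumed for this sketch

type
  // Hypothetical header layout; field names and sizes are assumptions.
  TArcHeader = packed record
    Magic: LongWord;
    FileCount: LongInt;
    ListOffset: Int64;   // stream position where the file records start
  end;

  // Hypothetical fixed part of a file record; the name bytes follow it.
  TArcEntry = packed record
    Offset: Int64;       // where the file's data starts
    Size: Int64;         // stored (compressed) length
    NameLen: LongInt;    // length of the UTF-8 name that follows
  end;

procedure WriteArchive(Dest: TStream; const Names: array of string;
  const Sources: array of TStream);
var
  Header: TArcHeader;
  Entries: array of TArcEntry;
  RawName: UTF8String;
  I: Integer;
begin
  // 1. Write a blank header to reserve its space.
  FillChar(Header, SizeOf(Header), 0);
  Dest.WriteBuffer(Header, SizeOf(Header));

  // 2. Write each file, noting its offset before the write.
  SetLength(Entries, Length(Names));
  for I := 0 to High(Names) do
  begin
    Entries[I].Offset := Dest.Position;
    // A real archive would compress here; a Count of 0 makes CopyFrom
    // copy the whole source stream as-is.
    Dest.CopyFrom(Sources[I], 0);
    Entries[I].Size := Dest.Position - Entries[I].Offset;
  end;

  // 3. Remember where the file list starts, then write it.
  Header.Magic := ARC_MAGIC;
  Header.FileCount := Length(Names);
  Header.ListOffset := Dest.Position;
  for I := 0 to High(Names) do
  begin
    RawName := UTF8Encode(Names[I]);
    Entries[I].NameLen := Length(RawName);
    Dest.WriteBuffer(Entries[I], SizeOf(TArcEntry));
    if Entries[I].NameLen > 0 then
      Dest.WriteBuffer(RawName[1], Entries[I].NameLen);
  end;

  // 4. Seek back to the start and overwrite the blank header.
  Dest.Position := 0;
  Dest.WriteBuffer(Header, SizeOf(Header));
end;
```

A reader then only needs the header to find the file list, and a file record to seek straight to that file's data, which is why the offsets are recorded before each write.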

Misc. Tips

Don't:
  • Make the archive solid, because every seek you do will require decompressing the archive. Each additional write will also take slightly longer as the archive grows.
  • Be indecisive about your requirements. Late changes are excruciatingly difficult to factor into your code if they involve a major change of approach, and can mean a full rewrite if you're sloppy.


Edit: Forgot about the code mangling if HTML wasn't disabled ... so I fixed that and disabled it.