Zip

🤖 AI-Generated Content

This documentation was generated with AI assistance and is still being audited. Some, or potentially a lot, of this information may be inaccurate.

provide.foundation.archive.zip

Classes

ZipArchive

Bases: BaseArchive

ZIP archive implementation.

Creates and extracts ZIP archives with optional compression. Supports adding files to existing archives.

Security Note - Password Handling

The password parameter is used only to decrypt existing encrypted ZIP files during extraction and reading. It does NOT encrypt files when an archive is created, because the stdlib zipfile module cannot write encrypted archives; zipfile.ZipFile.setpassword() only sets the default password used when reading encrypted members. To create encrypted ZIP archives, use a third-party library such as pyzipper, which supports AES encryption.
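
The behaviour described in this note can be checked with the standard library alone. The sketch below does not use ZipArchive; the helper names write_with_password and encrypted_members are hypothetical and exist only for this illustration. It calls setpassword() before writing, then confirms that no member carries the "encrypted" flag bit and that the data reads back without a password.

```python
import tempfile
import zipfile
from pathlib import Path


def write_with_password(zip_path: Path, password: bytes) -> None:
    """Write one member after calling setpassword(); stdlib does not encrypt it."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.setpassword(password)  # only affects reading encrypted members
        zf.writestr("secret.txt", "not actually protected")


def encrypted_members(zip_path: Path) -> list[str]:
    """Return names of members whose 'encrypted' flag (bit 0 of flag_bits) is set."""
    with zipfile.ZipFile(zip_path, "r") as zf:
        return [info.filename for info in zf.infolist() if info.flag_bits & 0x1]


with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    write_with_password(demo_zip, b"hunter2")
    flagged = encrypted_members(demo_zip)

    # Reading back without supplying any password succeeds,
    # because nothing was encrypted in the first place.
    with zipfile.ZipFile(demo_zip, "r") as zf:
        plaintext = zf.read("secret.txt")
```

Here flagged is an empty list and plaintext holds the original bytes, which is why a password set on ZipArchive should not be treated as protection for newly created archives.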

Attributes:

Name               Type          Description
compression_level  int           ZIP compression level 0-9 (0=store/no compression, 9=best)
compression_type   int           Compression type (zipfile.ZIP_DEFLATED, etc.)
password           bytes | None  Password for decrypting existing encrypted archives (read-only)
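
To illustrate how compression_type and compression_level interact, here is a minimal stdlib-only sketch. The zipped_size helper and the payload are assumptions made for this example, not part of ZipArchive; the compression and compresslevel arguments are the same ones the class forwards to zipfile.ZipFile in create().

```python
from __future__ import annotations

import io
import zipfile

# Highly repetitive data, so the effect of compression is easy to see.
payload = b"provide.foundation " * 5000


def zipped_size(data: bytes, compression: int, level: int | None) -> int:
    """Return the stored (compressed) size of a single member written with these settings."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression, compresslevel=level) as zf:
        zf.writestr("payload.bin", data)
    with zipfile.ZipFile(buffer, "r") as zf:
        return zf.getinfo("payload.bin").compress_size


stored_size = zipped_size(payload, zipfile.ZIP_STORED, None)    # no compression at all
level0_size = zipped_size(payload, zipfile.ZIP_DEFLATED, 0)     # deflate container, no compression
level9_size = zipped_size(payload, zipfile.ZIP_DEFLATED, 9)     # best compression
```

With ZIP_STORED the level is ignored and the member occupies exactly len(payload) bytes. With ZIP_DEFLATED, level 0 saves nothing, while level 9 shrinks this repetitive payload substantially.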

Functions
add_file
add_file(
    archive: Path, file: Path, arcname: str | None = None
) -> None

Add file to existing ZIP archive.

Parameters:

Name     Type        Description                              Default
archive  Path        ZIP archive file path                    required
file     Path        File to add                              required
arcname  str | None  Name in archive (defaults to file name)  None

Raises:

Type          Description
ArchiveError  If adding file fails

Source code in provide/foundation/archive/zip.py
def add_file(self, archive: Path, file: Path, arcname: str | None = None) -> None:
    """Add file to existing ZIP archive.

    Args:
        archive: ZIP archive file path
        file: File to add
        arcname: Name in archive (defaults to file name)

    Raises:
        ArchiveError: If adding file fails

    """
    try:
        with zipfile.ZipFile(archive, "a", compression=self.compression_type) as zf:
            if self.password:
                zf.setpassword(self.password)

            zf.write(file, arcname or file.name)

        log.debug(f"Added {file} to ZIP archive {archive}")

    except OSError as e:
        raise ArchiveIOError(f"Failed to add file to ZIP (I/O error): {e}") from e
    except Exception as e:
        raise ArchiveError(f"Failed to add file to ZIP: {e}") from e
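
As a rough illustration of what appending in mode "a" does, the following sketch uses only the standard library rather than ZipArchive itself; the file names are made up for the example. It also shows one consequence worth knowing: appending an arcname that already exists adds a second entry and emits a UserWarning instead of replacing the first.

```python
import tempfile
import warnings
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    archive = root / "bundle.zip"
    note = root / "note.txt"
    note.write_text("first version")

    # Start with an archive that holds one member.
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(note, note.name)

    # Append the same file under a different arcname, using mode "a".
    with zipfile.ZipFile(archive, "a", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(note, "docs/note.txt")

    # Appending an arcname that is already present does not overwrite it.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        with zipfile.ZipFile(archive, "a", compression=zipfile.ZIP_DEFLATED) as zf:
            zf.write(note, "note.txt")

    with zipfile.ZipFile(archive, "r") as zf:
        names = zf.namelist()

    duplicate_warned = any(issubclass(w.category, UserWarning) for w in caught)
```

Callers that need replace semantics would have to check list_contents first or rebuild the archive, since the ZIP format only supports appending here.
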
create
create(source: Path, output: Path) -> Path

Create ZIP archive from source.

Parameters:

Name    Type  Description                          Default
source  Path  Source file or directory to archive  required
output  Path  Output ZIP file path                 required

Returns:

Type  Description
Path  Path to created archive

Raises:

Type          Description
ArchiveError  If archive creation fails

Note

Files are NOT encrypted during creation even if password is set. The stdlib zipfile module does not support creating encrypted archives. Use pyzipper or similar for AES-encrypted ZIP creation.

Source code in provide/foundation/archive/zip.py
def create(self, source: Path, output: Path) -> Path:
    """Create ZIP archive from source.

    Args:
        source: Source file or directory to archive
        output: Output ZIP file path

    Returns:
        Path to created archive

    Raises:
        ArchiveError: If archive creation fails

    Note:
        Files are NOT encrypted during creation even if password is set.
        The stdlib zipfile module does not support creating encrypted archives.
        Use pyzipper or similar for AES-encrypted ZIP creation.

    """
    try:
        ensure_parent_dir(output)

        with zipfile.ZipFile(
            output,
            "w",
            compression=self.compression_type,
            compresslevel=self.compression_level,
        ) as zf:
            if self.password:
                zf.setpassword(self.password)

            if source.is_dir():
                # Add all files in directory
                for item in sorted(source.rglob("*")):
                    if item.is_file():
                        arcname = item.relative_to(source)
                        zf.write(item, arcname)
            else:
                # Add single file
                zf.write(source, source.name)

        log.debug(f"Created ZIP archive: {output}")
        return output

    except OSError as e:
        raise ArchiveIOError(f"Failed to create ZIP archive (I/O error): {e}") from e
    except Exception as e:
        raise ArchiveError(f"Failed to create ZIP archive: {e}") from e
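
The directory branch above can be sketched with the standard library alone. The zip_tree function below is a hypothetical stand-in that mirrors the same loop (sorted rglob, files only, arcnames relative to the source directory); it is not the library's API. Because only files are written, empty directories do not appear in the resulting archive.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_tree(source: Path, output: Path) -> Path:
    """Archive every file under source, using paths relative to source as arcnames."""
    output.parent.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(output, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for item in sorted(source.rglob("*")):
            if item.is_file():  # directories themselves are skipped
                zf.write(item, item.relative_to(source))
    return output


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "project"
    (source / "pkg").mkdir(parents=True)
    (source / "empty").mkdir()  # contains no files, so it is not archived
    (source / "pkg" / "mod.py").write_text("x = 1\n")
    (source / "README.md").write_text("# demo\n")

    # The output lives outside the source tree so it is not picked up by rglob.
    created = zip_tree(source, root / "out" / "project.zip")
    with zipfile.ZipFile(created) as zf:
        members = zf.namelist()
```

The member names use forward slashes and carry no leading "project/" prefix, so extracting the archive recreates the contents of the directory rather than the directory itself.
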
extract
extract(
    archive: Path,
    output: Path,
    limits: ArchiveLimits | None = None,
) -> Path

Extract ZIP archive to output directory with decompression bomb protection.

Parameters:

Name     Type                  Description                                               Default
archive  Path                  ZIP archive file path                                     required
output   Path                  Output directory path                                     required
limits   ArchiveLimits | None  Optional extraction limits (uses DEFAULT_LIMITS if None)  None

Returns:

Type  Description
Path  Path to extraction directory

Raises:

Type          Description
ArchiveError  If extraction fails, archive contains unsafe paths, or exceeds limits

Source code in provide/foundation/archive/zip.py
def extract(self, archive: Path, output: Path, limits: ArchiveLimits | None = None) -> Path:
    """Extract ZIP archive to output directory with decompression bomb protection.

    Args:
        archive: ZIP archive file path
        output: Output directory path
        limits: Optional extraction limits (uses DEFAULT_LIMITS if None)

    Returns:
        Path to extraction directory

    Raises:
        ArchiveError: If extraction fails, archive contains unsafe paths, or exceeds limits

    """
    if limits is None:
        limits = DEFAULT_LIMITS

    try:
        output.mkdir(parents=True, exist_ok=True)

        # Initialize extraction tracker
        tracker = ExtractionTracker(limits)
        tracker.set_compressed_size(get_archive_size(archive))

        with zipfile.ZipFile(archive, "r") as zf:
            if self.password:
                zf.setpassword(self.password)

            # Validate all members before extraction
            self._validate_zip_members(zf, output, tracker)

            # Check overall compression ratio
            tracker.check_compression_ratio()

            # Extract all (all members have been security-checked above)
            zf.extractall(output)

        log.debug(f"Extracted ZIP archive to: {output}")
        return output

    except (ArchiveError, ArchiveValidationError):
        raise
    except zipfile.BadZipFile as e:
        raise ArchiveFormatError(f"Invalid or corrupted ZIP archive: {e}") from e
    except OSError as e:
        raise ArchiveIOError(f"Failed to extract ZIP archive (I/O error): {e}") from e
    except Exception as e:
        raise ArchiveError(f"Failed to extract ZIP archive: {e}") from e
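
The helpers used above (ExtractionTracker, _validate_zip_members, check_compression_ratio) are not shown on this page, so the following is only a sketch of the general validate-before-extract pattern, not the library's implementation. UnsafeArchiveError, check_members, build_zip, and both default limits are assumptions invented for this example.

```python
from __future__ import annotations

import io
import zipfile
from pathlib import Path


class UnsafeArchiveError(Exception):
    """Hypothetical error type for this sketch (not the library's ArchiveError)."""


def check_members(
    zf: zipfile.ZipFile,
    output: Path,
    compressed_size: int,
    max_total_size: int = 100 * 1024 * 1024,  # assumed limit, not DEFAULT_LIMITS
    max_ratio: float = 100.0,  # assumed limit, not DEFAULT_LIMITS
) -> int:
    """Check member paths and declared sizes; return the total declared size."""
    base = output.resolve()
    total = 0
    for info in zf.infolist():
        # Reject absolute names and "../" components that leave the output directory.
        target = (base / info.filename).resolve()
        if not target.is_relative_to(base):
            raise UnsafeArchiveError(f"Path escapes output directory: {info.filename}")
        total += info.file_size  # declared size from the archive headers
        if total > max_total_size:
            raise UnsafeArchiveError("Declared uncompressed size exceeds limit")
    if compressed_size > 0 and total / compressed_size > max_ratio:
        raise UnsafeArchiveError("Compression ratio exceeds limit")
    return total


def build_zip(members: dict[str, bytes], compression: int = zipfile.ZIP_STORED) -> bytes:
    """Build a small in-memory archive for the demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buffer.getvalue()


safe_bytes = build_zip({"ok.txt": b"data", "dir/inner.txt": b"data"})
with zipfile.ZipFile(io.BytesIO(safe_bytes)) as zf:
    safe_total = check_members(zf, Path("out"), len(safe_bytes))
```

Sizes read from infolist() are whatever the archive declares, so a check like this is a first filter only; it does not by itself bound the bytes actually written during extraction.
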
extract_file
extract_file(
    archive: Path, member: str, output: Path
) -> Path

Extract single file from ZIP archive.

Parameters:

Name     Type  Description                    Default
archive  Path  ZIP archive file path          required
member   str   Name of file in archive        required
output   Path  Output directory or file path  required

Returns:

Type  Description
Path  Path to extracted file

Raises:

Type          Description
ArchiveError  If extraction fails or member path is unsafe

Source code in provide/foundation/archive/zip.py
def extract_file(self, archive: Path, member: str, output: Path) -> Path:
    """Extract single file from ZIP archive.

    Args:
        archive: ZIP archive file path
        member: Name of file in archive
        output: Output directory or file path

    Returns:
        Path to extracted file

    Raises:
        ArchiveError: If extraction fails or member path is unsafe

    """
    try:
        with zipfile.ZipFile(archive, "r") as zf:
            if self.password:
                zf.setpassword(self.password)

            # Enhanced security check
            extract_base = output if output.is_dir() else output.parent
            self._validate_member_path(extract_base, member)

            # Check for symlinks
            info = zf.getinfo(member)
            if info.external_attr:
                self._validate_symlink_if_present(zf, extract_base, info)

            if output.is_dir():
                zf.extract(member, output)
                return output / member
            ensure_parent_dir(output)
            with zf.open(member) as source, output.open("wb") as target:
                target.write(source.read())
            return output

    except (ArchiveError, ArchiveValidationError):
        raise
    except zipfile.BadZipFile as e:
        raise ArchiveFormatError(f"Invalid or corrupted ZIP archive: {e}") from e
    except OSError as e:
        raise ArchiveIOError(f"Failed to extract file from ZIP (I/O error): {e}") from e
    except Exception as e:
        raise ArchiveError(f"Failed to extract file from ZIP: {e}") from e
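
The two output modes behave differently: an existing directory receives the member under its full archive path, while any other path is treated as the destination file itself. The sketch below shows both with the standard library only. extract_member_to is a hypothetical helper, and it streams with shutil.copyfileobj, which is a substitution for the read-then-write shown in the source above, chosen so that large members are not loaded into memory at once. No path validation is performed here.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def extract_member_to(archive: Path, member: str, target: Path) -> Path:
    """Copy one member to an explicit file path, streaming in chunks."""
    target.parent.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive, "r") as zf:
        with zf.open(member) as source, target.open("wb") as sink:
            shutil.copyfileobj(source, sink)
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    archive = root / "data.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("reports/2024/summary.txt", "totals")

    # File-path form: the member lands exactly at the given path.
    renamed = extract_member_to(
        archive, "reports/2024/summary.txt", root / "flat" / "summary.txt"
    )
    renamed_text = renamed.read_text()

    # Directory form: stdlib extract() recreates the member's subdirectories.
    out_dir = root / "tree"
    out_dir.mkdir()
    with zipfile.ZipFile(archive, "r") as zf:
        zf.extract("reports/2024/summary.txt", out_dir)
    nested_exists = (out_dir / "reports" / "2024" / "summary.txt").is_file()
```
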
list_contents
list_contents(archive: Path) -> list[str]

List contents of ZIP archive.

Parameters:

Name     Type  Description            Default
archive  Path  ZIP archive file path  required

Returns:

Type       Description
list[str]  List of file paths in archive

Raises:

Type          Description
ArchiveError  If listing fails

Source code in provide/foundation/archive/zip.py
def list_contents(self, archive: Path) -> list[str]:
    """List contents of ZIP archive.

    Args:
        archive: ZIP archive file path

    Returns:
        List of file paths in archive

    Raises:
        ArchiveError: If listing fails

    """
    try:
        with zipfile.ZipFile(archive, "r") as zf:
            return sorted(zf.namelist())
    except zipfile.BadZipFile as e:
        raise ArchiveFormatError(f"Invalid or corrupted ZIP archive: {e}") from e
    except OSError as e:
        raise ArchiveIOError(f"Failed to list ZIP contents (I/O error): {e}") from e
    except Exception as e:
        raise ArchiveError(f"Failed to list ZIP contents: {e}") from e
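
Since list_contents returns sorted(zf.namelist()), the result includes explicit directory entries (names ending in "/") when the archive contains them. A small stdlib-only sketch, with made-up member names, shows the difference between that listing and a files-only view built from infolist():

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("b.txt", "b")
    zf.writestr("docs/", "")  # explicit directory entry
    zf.writestr("docs/a.txt", "a")

with zipfile.ZipFile(buffer, "r") as zf:
    # Same expression the method uses: every entry name, sorted.
    everything = sorted(zf.namelist())
    # Filtering with ZipInfo.is_dir() drops the directory entries.
    files_only = sorted(info.filename for info in zf.infolist() if not info.is_dir())
```

Here everything is ["b.txt", "docs/", "docs/a.txt"], while files_only omits "docs/".
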
validate
validate(archive: Path) -> bool

Validate ZIP archive integrity.

Parameters:

Name     Type  Description            Default
archive  Path  ZIP archive file path  required

Returns:

Type  Description
bool  True if archive is valid, False otherwise

Note: This method intentionally catches all exceptions and returns False. This is NOT an error suppression case - returning False on any exception is the expected validation behavior. Do NOT replace this with @resilient decorator.

Source code in provide/foundation/archive/zip.py
def validate(self, archive: Path) -> bool:
    """Validate ZIP archive integrity.

    Args:
        archive: ZIP archive file path

    Returns:
        True if archive is valid, False otherwise

    Note: This method intentionally catches all exceptions and returns False.
    This is NOT an error suppression case - returning False on any exception
    is the expected validation behavior. Do NOT replace this with @resilient decorator.
    """
    try:
        with zipfile.ZipFile(archive, "r") as zf:
            # Test the archive
            result = zf.testzip()
            return result is None  # None means no bad files
    except Exception:  # nosec B110
        # Broad catch is intentional for validation: any error means invalid archive.
        # Possible exceptions: zipfile.BadZipFile, OSError, PermissionError, etc.
        return False
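
To show the kinds of input that produce False, here is a stdlib-only sketch built around a hypothetical is_valid_zip function that follows the same testzip() logic as the method above. It covers a well-formed archive, a file that is not a ZIP at all, a missing path, and an archive whose stored payload was altered so the CRC check fails.

```python
import tempfile
import zipfile
from pathlib import Path


def is_valid_zip(path: Path) -> bool:
    """Return True only if the file opens as a ZIP and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path, "r") as zf:
            return zf.testzip() is None  # None means no bad members
    except Exception:
        # Any failure to open or read counts as "not valid".
        return False


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    good = root / "good.zip"
    with zipfile.ZipFile(good, "w") as zf:  # default ZIP_STORED keeps data verbatim
        zf.writestr("a.txt", "hello")

    fake = root / "fake.zip"
    fake.write_bytes(b"this is plain text, not a ZIP")

    # Alter the stored payload so it no longer matches the recorded CRC.
    corrupt = root / "corrupt.zip"
    corrupt.write_bytes(good.read_bytes().replace(b"hello", b"jello", 1))

    results = {
        "good": is_valid_zip(good),
        "fake": is_valid_zip(fake),
        "missing": is_valid_zip(root / "missing.zip"),
        "corrupt": is_valid_zip(corrupt),
    }
```

Only the "good" entry is True. Note that testzip() verifies CRCs and headers; it says nothing about whether member paths are safe to extract.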

Functions