How to Log Exceptions¶
Properly logging exceptions is crucial for debugging and operational visibility. provide.foundation provides simple and powerful ways to capture error context with full stack traces and structured metadata.
Overview¶
Exception logging serves multiple purposes: - Debugging - Understand what went wrong and where - Monitoring - Alert on error rates and patterns - Auditing - Track failures for compliance - Context - Preserve relevant data for post-mortem analysis
Foundation's structured logging ensures exceptions are logged with rich context, making them easy to search, filter, and analyze in log aggregation systems.
Basic Exception Logging¶
Using logger.exception()¶
The logger.exception() method is the preferred way to log exceptions. It should be called from within an except block and automatically captures the full stack trace.
from provide.foundation import logger
def risky_operation():
raise ValueError("Something went wrong")
try:
risky_operation()
except Exception:
logger.exception(
"Operation failed unexpectedly",
operation_name="risky_operation",
user_id="user_xyz",
)
Output includes:
- Event message: "Operation failed unexpectedly"
- Structured fields: operation_name, user_id
- Full stack trace with file names and line numbers
- Exception type and message
Using logger.error() with exc_info¶
For more control, you can use logger.error() and pass exc_info=True to include the traceback.
try:
risky_operation()
except Exception as e:
logger.error(
"Operation failed",
exc_info=True,
error_type=type(e).__name__,
error_details=str(e),
)
Exception Logging Patterns¶
Pattern 1: Log and Re-raise¶
Log the exception for visibility, then re-raise it for upstream handling:
def process_payment(transaction):
"""Process payment with exception logging."""
try:
payment_gateway.charge(transaction)
except PaymentError as e:
logger.exception(
"Payment processing failed",
transaction_id=transaction.id,
amount=transaction.amount,
error_code=e.code,
)
raise # Re-raise for caller to handle
When to use: When you want visibility at this layer but need upstream code to handle the error.
Pattern 2: Log and Transform¶
Log the original exception, then raise a different exception type:
from provide.foundation.errors import DatabaseError
def save_user(user_data):
"""Save user with exception transformation."""
try:
db.insert("users", user_data)
except ConnectionError as e:
logger.exception(
"Database connection failed during user save",
user_email=user_data.get("email"),
)
raise DatabaseError("Failed to save user") from e
When to use: When converting low-level exceptions to domain-specific exceptions.
Pattern 3: Log and Handle¶
Log the exception and handle it completely (don't re-raise):
def send_notification(user_id, message):
"""Send notification with graceful failure."""
try:
notification_service.send(user_id, message)
except NotificationError as e:
logger.exception(
"Failed to send notification",
user_id=user_id,
notification_type=message.type,
)
# Don't raise - notification failure shouldn't break the flow
return False
return True
When to use: When the operation is optional or has a sensible fallback.
Pattern 4: Log with Retry Context¶
Log exceptions within retry loops to track retry attempts:
from provide.foundation.resilience import retry
@retry(NetworkError, max_attempts=3, base_delay=1.0)
def call_external_api(endpoint):
"""Call API with retry and exception logging."""
try:
response = requests.get(endpoint, timeout=5)
response.raise_for_status()
return response.json()
except requests.RequestException as e:
logger.exception(
"API call failed",
endpoint=endpoint,
status_code=getattr(e.response, 'status_code', None),
attempt="will_retry", # Retry decorator will retry
)
raise
Adding Context to Exceptions¶
User Context¶
Include user information for debugging user-specific issues:
def process_user_action(user_id, action):
"""Process action with user context."""
try:
result = perform_action(action)
except Exception:
logger.exception(
"User action failed",
user_id=user_id,
user_email=get_user_email(user_id),
action_type=action.type,
action_id=action.id,
session_id=get_current_session_id(),
)
raise
Request Context¶
Capture HTTP request details for API debugging:
def handle_api_request(request):
"""Handle API request with request context."""
try:
response = process_request(request)
except Exception:
logger.exception(
"API request processing failed",
method=request.method,
path=request.path,
request_id=request.headers.get("X-Request-ID"),
user_agent=request.headers.get("User-Agent"),
client_ip=request.remote_addr,
)
raise
Business Context¶
Add domain-specific information:
def finalize_order(order):
"""Finalize order with business context."""
try:
payment_result = process_payment(order)
inventory_result = reserve_inventory(order)
ship_order(order)
except Exception:
logger.exception(
"Order finalization failed",
order_id=order.id,
customer_id=order.customer_id,
total_amount=order.total,
order_status=order.status,
payment_status=getattr(payment_result, 'status', 'unknown'),
inventory_status=getattr(inventory_result, 'status', 'unknown'),
)
raise
Correlation IDs¶
Use correlation IDs to track requests across multiple services:
import uuid
from contextvars import ContextVar
# Global context variable for correlation ID
correlation_id: ContextVar[str] = ContextVar("correlation_id", default=None)
def set_correlation_id(cid=None):
"""Set correlation ID for current context."""
if cid is None:
cid = str(uuid.uuid4())
correlation_id.set(cid)
return cid
def get_correlation_id():
"""Get current correlation ID."""
return correlation_id.get()
def api_handler(request):
"""API handler with correlation ID."""
# Extract or generate correlation ID
cid = request.headers.get("X-Correlation-ID") or set_correlation_id()
try:
result = process_request(request)
except Exception:
logger.exception(
"Request processing failed",
correlation_id=cid,
request_path=request.path,
)
raise
return result
Benefits: - Track errors across microservices - Correlate logs from different systems - Debug distributed request flows
Exception Aggregation Patterns¶
Collecting Multiple Failures¶
When processing batches, collect all failures for comprehensive error reporting:
from typing import NamedTuple
class ProcessingResult(NamedTuple):
success_count: int
failure_count: int
errors: list
def process_batch(items):
"""Process batch and collect all failures."""
successes = 0
failures = 0
errors = []
for item in items:
try:
process_item(item)
successes += 1
except Exception as e:
failures += 1
errors.append({
"item_id": item.id,
"error_type": type(e).__name__,
"error_message": str(e),
})
logger.exception(
"Item processing failed",
item_id=item.id,
batch_position=items.index(item),
)
# Log batch summary
if failures > 0:
logger.error(
"Batch processing completed with failures",
total_items=len(items),
successes=successes,
failures=failures,
failure_rate=failures / len(items),
errors=errors[:5], # First 5 errors for visibility
)
return ProcessingResult(successes, failures, errors)
Error Rate Tracking¶
Monitor error rates over time:
from collections import deque
from datetime import datetime, timedelta
class ErrorRateTracker:
"""Track error rate over time window."""
def __init__(self, window_seconds=60):
self.window = timedelta(seconds=window_seconds)
self.errors = deque()
def record_error(self, exception):
"""Record an error occurrence."""
now = datetime.now()
self.errors.append((now, exception))
# Remove old errors outside window
cutoff = now - self.window
while self.errors and self.errors[0][0] < cutoff:
self.errors.popleft()
# Log if error rate is high
error_count = len(self.errors)
if error_count > 10: # More than 10 errors in window
logger.warning(
"High error rate detected",
error_count=error_count,
window_seconds=self.window.total_seconds(),
recent_errors=[
type(e).__name__ for _, e in list(self.errors)[-5:]
],
)
# Global tracker
error_tracker = ErrorRateTracker(window_seconds=60)
def monitored_operation():
"""Operation with error rate monitoring."""
try:
perform_operation()
except Exception as e:
error_tracker.record_error(e)
logger.exception("Operation failed")
raise
Custom Exception Handlers¶
Application-Wide Exception Handler¶
Set up a global exception handler for your application:
import sys
from provide.foundation import logger
def global_exception_handler(exc_type, exc_value, exc_traceback):
"""Handle uncaught exceptions globally."""
# Don't log KeyboardInterrupt
if issubclass(exc_type, KeyboardInterrupt):
sys.__excepthook__(exc_type, exc_value, exc_traceback)
return
logger.critical(
"Uncaught exception",
exc_type=exc_type.__name__,
exc_message=str(exc_value),
exc_info=(exc_type, exc_value, exc_traceback),
)
# Install global handler
sys.excepthook = global_exception_handler
Async Exception Handler¶
Handle exceptions in async code:
import asyncio
from provide.foundation import logger
def async_exception_handler(loop, context):
"""Handle exceptions in async tasks."""
exception = context.get("exception")
message = context.get("message", "Async exception occurred")
if exception:
logger.exception(
message,
task=context.get("task"),
future=context.get("future"),
)
else:
logger.error(message, context=context)
# Set async exception handler
loop = asyncio.get_event_loop()
loop.set_exception_handler(async_exception_handler)
Production Patterns¶
Exception with Metric Tracking¶
Log exceptions and track metrics:
from provide.foundation.metrics import Counter
# Define metrics
error_counter = Counter("app_errors_total", labels=["error_type", "operation"])
def tracked_operation(operation_name):
"""Operation with error tracking."""
try:
perform_operation()
except Exception as e:
error_type = type(e).__name__
# Track metric
error_counter.increment(labels={
"error_type": error_type,
"operation": operation_name,
})
# Log exception
logger.exception(
"Tracked operation failed",
operation=operation_name,
error_type=error_type,
)
raise
Exception with Alerting¶
Trigger alerts for critical errors:
def critical_operation():
"""Operation where failures trigger alerts."""
try:
result = perform_critical_task()
except Exception as e:
# Log with high severity
logger.critical(
"Critical operation failed - ALERT",
operation="critical_task",
severity="high",
alert_team=True, # Signal to monitoring system
error_type=type(e).__name__,
)
# Send immediate notification
send_pagerduty_alert(
f"Critical operation failed: {type(e).__name__}",
details=str(e),
)
raise
Exception Sanitization¶
Sanitize sensitive data before logging:
from provide.foundation.security import mask_secrets
def safe_exception_logging(user_data):
"""Log exceptions without exposing sensitive data."""
try:
process_user(user_data)
except Exception:
# Sanitize data before logging
safe_data = {
"user_id": user_data.get("user_id"),
"email": mask_email(user_data.get("email")),
"account_type": user_data.get("account_type"),
# Exclude password, API keys, etc.
}
logger.exception(
"User processing failed",
user_data=safe_data,
)
raise
def mask_email(email):
"""Mask email for logging."""
if not email or "@" not in email:
return "***"
username, domain = email.split("@")
return f"{username[0]}***@{domain}"
Best Practices¶
✅ DO: Always Preserve Stack Traces¶
# ✅ Good: Preserves full stack trace
try:
operation()
except Exception:
logger.exception("Operation failed")
raise
# ❌ Bad: Loses stack trace
try:
operation()
except Exception as e:
logger.error(f"Operation failed: {e}") # No traceback!
raise
✅ DO: Add Structured Context¶
# ✅ Good: Rich structured context
try:
process_order(order)
except Exception:
logger.exception(
"Order processing failed",
order_id=order.id,
customer_id=order.customer_id,
amount=order.total,
)
# ❌ Bad: String concatenation loses structure
try:
process_order(order)
except Exception:
logger.exception(
f"Order {order.id} for customer {order.customer_id} failed"
)
✅ DO: Use Appropriate Log Levels¶
# ✅ Good: Appropriate severity levels
try:
optional_operation()
except Exception:
logger.warning("Optional operation failed") # Not critical
try:
critical_operation()
except Exception:
logger.critical("Critical operation failed") # Needs immediate attention
❌ DON'T: Log the Same Exception Multiple Times¶
# ❌ Bad: Logs exception at every layer
def layer1():
try:
layer2()
except Exception:
logger.exception("Layer 1 failed")
raise
def layer2():
try:
operation()
except Exception:
logger.exception("Layer 2 failed") # Duplicate!
raise
# ✅ Good: Log once at the appropriate layer
def layer1():
try:
layer2()
except Exception:
logger.exception("Operation failed", layer="layer1")
raise
def layer2():
operation() # Let exception propagate
❌ DON'T: Swallow Exceptions Silently¶
# ❌ Bad: Silent failure
try:
important_operation()
except Exception:
pass # Lost forever!
# ✅ Good: At minimum, log it
try:
important_operation()
except Exception:
logger.exception("Operation failed but continuing")
# Explicitly choosing to continue
Next Steps¶
Related Guides¶
- Basic Logging: Core logging patterns
- Structured Events: Event-driven logging
Error Handling & Resilience¶
- Retry Patterns: Automatically retry failed operations
- Circuit Breakers: Prevent cascading failures
- Production Monitoring: Production-ready error handling
Examples¶
- See
examples/telemetry/05_exception_handling.pyfor comprehensive exception logging examples - See
examples/production/02_error_handling.pyfor production error patterns
Tip: Always log exceptions with logger.exception() or exc_info=True to preserve stack traces. Add structured context fields to make errors searchable and debuggable.