# Implementation Guide

This document provides a step-by-step guide for implementing the 15 CI/CD improvements to `soup stir`.
## Table of Contents
- Implementation Phases
- Phase 1: Quick Wins
- Phase 2: Core CI Features
- Phase 3: Enhanced UX
- Phase 4: Advanced Features
- Testing Strategy
- Rollout Plan
## Implementation Phases
The improvements are organized into 4 phases based on dependencies, complexity, and priority.
### Dependency Graph

```
Phase 1 (Foundation):
  #1 CI Detection ─┬──▶ #4 Format Flag ──▶ #3 JUnit XML
                   │
                   ├──▶ #7 Timestamps
                   │
                   └──▶ #14 Color Control

  #6 Parallelism ───────▶ (Independent)
  #2 JSON Output ───────▶ (Independent)

Phase 2 (Build on Foundation):
  #4 Format Flag ───────▶ #3 JUnit XML
                  └─────▶ (GitHub format, quiet)
  #5 Timeouts ──────────▶ (Executor changes)
  #8 Error Fields ──────▶ #2, #3 (Enhances JSON/XML)
  #10 Summary File ─────▶ (Independent)

Phase 3 (Enhancements):
  #11 Phase Timing ─────▶ #2, #3, #10 (Adds to outputs)
  #12 Progress % ───────▶ #4 (Plain format enhancement)
  #15 Fail-fast ────────▶ (Independent)

Phase 4 (Advanced):
  #9 Log Aggregation ───▶ (Complex, depends on stability)
  #13 Refresh Rate ─────▶ #4 (Table format option)
```
## Phase 1: Quick Wins (1-2 days)
Goal: Foundational improvements that provide immediate value.
### #1: CI Detection & Auto-Adaptation
Estimated Time: 3-4 hours
Files to Create:
- src/tofusoup/stir/detection.py (new)
Files to Modify:
- src/tofusoup/stir/cli.py
- src/tofusoup/stir/display.py
Steps:
1. Create the detection module (`detection.py`); a sketch follows this list.
2. Update CLI (`cli.py`):
   - Add `--ci`/`--no-ci` flags
   - Call `detect_display_mode()` early in the run
   - Pass the mode to display initialization
3. Update display (`display.py`):
   - Add conditional logic for CI mode
   - Implement line-by-line output as an alternative to the Live table
   - Keep the Live table as the default for interactive sessions
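A minimal sketch of what `detection.py` could contain. The helper names match those used in the testing notes below; the `DisplayMode` enum and the exact list of CI environment variables are assumptions, not the final API:

```python
# src/tofusoup/stir/detection.py (sketch; names are illustrative)
import os
import sys
from enum import Enum


class DisplayMode(Enum):
    INTERACTIVE = "interactive"  # Rich Live table
    CI = "ci"                    # line-by-line output


# Common CI environment variables; extend as needed.
_CI_ENV_VARS = ("CI", "GITHUB_ACTIONS", "GITLAB_CI", "JENKINS_URL", "BUILDKITE")


def is_ci_environment() -> bool:
    """Return True if any well-known CI environment variable is set."""
    return any(os.environ.get(var) for var in _CI_ENV_VARS)


def is_tty() -> bool:
    """Return True if stdout is attached to a terminal."""
    return sys.stdout.isatty()


def detect_display_mode(force_ci: bool | None = None) -> DisplayMode:
    """Resolve display mode from --ci/--no-ci, CI env vars, and TTY status."""
    if force_ci is True:
        return DisplayMode.CI
    if force_ci is False:
        return DisplayMode.INTERACTIVE
    if is_ci_environment() or not is_tty():
        return DisplayMode.CI
    return DisplayMode.INTERACTIVE
```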
Testing:
- Unit tests for is_ci_environment() with mocked env vars
- Unit tests for is_tty() with mocked stdout
- Integration test in actual CI (GitHub Actions)
- Manual test in TTY and non-TTY
Acceptance Criteria:
- ✅ CI environments auto-detected
- ✅ Line-by-line output in CI
- ✅ --ci forces CI mode
- ✅ --no-ci forces interactive mode
### #7: Timestamps
Estimated Time: 2-3 hours
Dependencies: #1 (CI Detection)
Files to Modify:
- src/tofusoup/stir/display.py
- src/tofusoup/stir/cli.py
Steps:
1. Add CLI flags:
   - `--timestamps`/`--no-timestamps`
   - `--timestamp-format=iso8601|relative|unix`
2. Implement timestamp generation (sketch below this list).
3. Integrate into display:
   - Auto-enable in CI mode
   - Prefix each line in line-by-line output
   - Don't show in table mode (it uses the Elapsed column instead)
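A possible shape for the timestamp helper; the format names follow the flag values above, while the function name and the call site shown in the comment are assumptions:

```python
# Timestamp formatting sketch.
import time
from datetime import datetime, timezone


def format_timestamp(fmt: str, start_time: float) -> str:
    now = time.time()
    if fmt == "iso8601":
        return datetime.now(timezone.utc).isoformat(timespec="seconds")
    if fmt == "relative":
        return f"+{now - start_time:8.1f}s"  # seconds since the run started
    if fmt == "unix":
        return f"{now:.0f}"
    raise ValueError(f"unknown timestamp format: {fmt}")


# Assumed call site in line-by-line output:
# console.print(f"[{format_timestamp(fmt, run_start)}] {message}")
```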
Testing:
- Test all timestamp formats
- Verify auto-enable in CI
- Verify disable with --no-timestamps
Acceptance Criteria:
- ✅ Timestamps auto-enabled in CI
- ✅ All formats work correctly
- ✅ Flags control behavior
### #14: Color Control
Estimated Time: 1-2 hours
Dependencies: #1 (CI Detection)
Files to Modify:
- src/tofusoup/stir/cli.py
- src/tofusoup/stir/display.py
Steps:
1. Add CLI flags:
   - `--color=auto|always|never`
   - `--no-color` (shorthand)
2. Check environment variables:
   - `NO_COLOR`
   - `FORCE_COLOR`
   - `SOUP_STIR_COLOR`
3. Configure the Rich console (sketch below this list).
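One way to wire the flag and environment variables into Rich, assuming a `make_console` helper. Note that Rich's `Console` already honors `NO_COLOR` in auto mode, so the explicit checks mainly serve `--color` and the `SOUP_STIR_COLOR` variable (whose handling is an assumption):

```python
# Sketch of color resolution.
import os
from rich.console import Console


def make_console(color: str = "auto") -> Console:
    """Build a Rich Console honoring --color and NO_COLOR/FORCE_COLOR."""
    if os.environ.get("NO_COLOR"):
        color = "never"
    elif os.environ.get("FORCE_COLOR"):
        color = "always"
    elif os.environ.get("SOUP_STIR_COLOR"):
        color = os.environ["SOUP_STIR_COLOR"]

    if color == "never":
        return Console(no_color=True)
    if color == "always":
        return Console(force_terminal=True)  # keep ANSI even when piped
    return Console()  # auto: Rich detects TTY and NO_COLOR itself
```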
Testing:
- Test with NO_COLOR=1
- Test with --color=never
- Test with --color=always in non-TTY
- Visual verification of colors
Acceptance Criteria:
- ✅ --color controls output
- ✅ NO_COLOR env var respected
- ✅ Auto-detection works
### #6: Parallelism Control
Estimated Time: 2-3 hours
Files to Modify:
- src/tofusoup/stir/cli.py
- src/tofusoup/stir/executor.py
- src/tofusoup/stir/config.py
Steps:
1. Add CLI flag: `--jobs=N`/`-j N`
   - Handle `N=0` or `auto` as the current behavior
   - Handle `N=1` as serial execution
2. Update the executor (sketch below this list).
3. Display parallelism:
   - Show at start: "Running N tests with parallelism=M..."
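A sketch of bounded parallelism, assuming the executor is asyncio-based; if it uses threads instead, `concurrent.futures.ThreadPoolExecutor(max_workers=...)` is the equivalent knob. All names here are illustrative:

```python
# Parallelism control sketch for an asyncio-based executor.
import asyncio
import os


def resolve_jobs(jobs: int | str) -> int:
    """Map --jobs to a worker count (0 or "auto" keeps the current behavior)."""
    if jobs in (0, "auto"):
        return os.cpu_count() or 4
    return int(jobs)


async def run_all(tests, run_one, jobs: int | str = "auto"):
    limit = asyncio.Semaphore(resolve_jobs(jobs))

    async def bounded(test):
        async with limit:  # at most N tests in flight; N=1 runs serially
            return await run_one(test)

    return await asyncio.gather(*(bounded(t) for t in tests))
```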
Testing:
- Test with -j 1 (serial)
- Test with -j 2
- Test with --jobs=auto
- Verify deterministic order with -j 1
Acceptance Criteria:
- ✅ -j N limits parallelism
- ✅ -j 1 runs serially
- ✅ Auto-detection works
- ✅ Parallelism displayed at start
### #2: JSON Output
Estimated Time: 3-4 hours
Files to Modify:
- src/tofusoup/stir/cli.py
- src/tofusoup/stir/models.py
- src/tofusoup/stir/reporting.py (new functions)
Steps:
1. Add CLI flags:
   - `--json`
   - `--json-pretty`
2. Create the JSON builder (sketch below this list).
3. Suppress other output when `--json` is set:
   - No live display
   - No summary panel
   - Only JSON to stdout
   - Errors to stderr
4. Output the JSON.
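A rough shape for the JSON builder. The summary fields shown are a subset, and the `TestResult` attribute names (`status`, dataclass layout) are assumptions; the real schema should follow the specification document:

```python
# JSON report builder sketch.
import json
from collections import Counter
from dataclasses import asdict  # assumes TestResult is a dataclass


def build_json_report(results, *, pretty: bool = False) -> str:
    by_status = Counter(r.status for r in results)
    report = {
        "summary": {
            "total": len(results),
            "passed": by_status.get("passed", 0),
            "failed": by_status.get("failed", 0),
            "skipped": by_status.get("skipped", 0),
        },
        "tests": [asdict(r) for r in results],
    }
    return json.dumps(report, indent=2 if pretty else None, default=str)


# In the CLI: print(build_json_report(results)) to stdout; errors go to stderr.
```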
Testing:
- Test JSON validity (json.loads(output))
- Test with jq
- Verify schema compliance
- Test error handling (errors go to stderr)
Acceptance Criteria:
- ✅ --json outputs valid JSON
- ✅ Schema matches specification
- ✅ No other output on stdout
- ✅ Pretty-print option works
## Phase 2: Core CI Features (3-5 days)

Goal: Essential CI/CD integrations.
### #4: Format Flag (Renderer System)
Estimated Time: 1 day
Dependencies: #1 (CI Detection)
Files to Create:
- src/tofusoup/stir/renderers/base.py
- src/tofusoup/stir/renderers/table.py
- src/tofusoup/stir/renderers/plain.py
- src/tofusoup/stir/renderers/json.py
- src/tofusoup/stir/renderers/github.py
- src/tofusoup/stir/renderers/quiet.py
- src/tofusoup/stir/renderers/__init__.py
Files to Modify:
- src/tofusoup/stir/cli.py
- src/tofusoup/stir/executor.py
- src/tofusoup/stir/display.py (refactor into table.py)
Steps:
1. Create the base renderer interface (sketch below this list).
2. Implement renderers:
   - `TableRenderer`: refactor the existing Live table code
   - `PlainRenderer`: line-by-line output from #1
   - `JSONRenderer`: JSON output from #2
   - `GitHubRenderer`: GitHub Actions annotations
   - `QuietRenderer`: minimal output
3. Create the registry.
4. Integrate into the CLI.
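A sketch of the base interface and registry. The method names (`start`, `update`, `complete`) are inferred from the renderer unit test later in this guide; treat them as placeholders until the real interface is settled:

```python
# src/tofusoup/stir/renderers/base.py (sketch; names are illustrative)
from abc import ABC, abstractmethod


class Renderer(ABC):
    def __init__(self, console, config: dict | None = None):
        self.console = console
        self.config = config or {}

    @abstractmethod
    def start(self, total: int) -> None:
        """Called once before the first test runs."""

    @abstractmethod
    def update(self, result) -> None:
        """Called whenever a test changes state or finishes."""

    @abstractmethod
    def complete(self, results: list) -> str | None:
        """Called after the run; may return final output (e.g. the JSON document)."""


# renderers/__init__.py registry sketch:
# RENDERERS = {"table": TableRenderer, "plain": PlainRenderer, "json": JSONRenderer,
#              "github": GitHubRenderer, "quiet": QuietRenderer}
# def get_renderer(name, console, config): return RENDERERS[name](console, config)
```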
Testing:
- Unit test each renderer independently
- Integration tests for each format
- Verify format switching works
- Test in CI for GitHub format
Acceptance Criteria:
- ✅ All 5 formats work
- ✅ Renderer system is extensible
- ✅ --format controls output
- ✅ Auto-detection uses appropriate format
### #3: JUnit XML Output
Estimated Time: 4-6 hours
Dependencies: #4 (Renderer system provides structure)
Files to Create:
- src/tofusoup/stir/junit.py (new module for XML generation)
Files to Modify:
- src/tofusoup/stir/cli.py
- src/tofusoup/stir/reporting.py
Steps:
1. Add CLI flags:
   - `--junit-xml=FILE`
   - `--junit-suite-name=NAME`
2. Create the XML builder:

   ```python
   def build_junit_xml(results: list[TestResult], ...) -> str:
       # Use xml.etree.ElementTree or lxml
       testsuites = ET.Element("testsuites", ...)
       testsuite = ET.SubElement(testsuites, "testsuite", ...)
       for result in results:
           testcase = ET.SubElement(testsuite, "testcase", ...)
           if result.failed_stage:
               failure = ET.SubElement(testcase, "failure", ...)
       return ET.tostring(testsuites, encoding="unicode")
   ```

3. Write to the file (sketch below this list).
4. Map fields correctly:
   - Use proper failure types
   - Include all metadata in system-out
   - Put errors in system-err
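Writing the file could look like the following; creating parent directories covers the last acceptance criterion. The `write_junit_xml` name is illustrative:

```python
# File-writing sketch for --junit-xml=FILE.
from pathlib import Path


def write_junit_xml(xml_text: str, path: str) -> None:
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)  # --junit-xml=reports/results.xml works
    out.write_text(xml_text, encoding="utf-8")
```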
Testing:
- Validate XML against the XSD schema
- Test in Jenkins
- Test in GitHub Actions (upload-artifact + test reporter)
- Test in GitLab CI
- Verify file creation with missing dirs

Acceptance Criteria:
- ✅ Valid JUnit XML generated
- ✅ Works in major CI systems
- ✅ All test states represented correctly
- ✅ Parent dirs created automatically
### #5: Timeout Controls
Estimated Time: 4-6 hours
Files to Modify:
- src/tofusoup/stir/cli.py
- src/tofusoup/stir/executor.py
Steps:
1. Add CLI flags:
   - `--timeout=SECONDS`
   - `--test-timeout=SECONDS`
2. Wrap test execution (sketch below this list).
3. Wrap suite execution.
4. Graceful termination:
   - SIGTERM → wait 5s → SIGKILL
   - Mark the test with TIMEOUT status
5. Update exit codes:
   - 124 for global timeout
   - 125 for test timeout
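A sketch of per-test timeout handling with graceful termination, assuming stages run as asyncio subprocesses; adapt it to the executor's actual process handling. The function name is illustrative:

```python
# Timeout and graceful-termination sketch.
import asyncio


async def run_with_timeout(cmd: list[str], timeout: float) -> int:
    proc = await asyncio.create_subprocess_exec(*cmd)
    try:
        return await asyncio.wait_for(proc.wait(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.terminate()                      # SIGTERM first
        try:
            await asyncio.wait_for(proc.wait(), timeout=5)
        except asyncio.TimeoutError:
            proc.kill()                       # then SIGKILL
            await proc.wait()
        raise                                 # executor marks the test as TIMEOUT
```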
Testing:
- Test with actual long-running terraform
- Test with sleep commands
- Verify graceful termination
- Verify exit codes
- Test timeout in JSON/XML output
Acceptance Criteria:
- ✅ Per-test timeout works
- ✅ Global timeout works
- ✅ Graceful termination attempted
- ✅ Exit codes correct
- ✅ Timeout status in outputs
### #8: Populate Error Fields
Estimated Time: 2-3 hours
Dependencies: None (enhances #2, #3)
Files to Modify:
- src/tofusoup/stir/executor.py
- src/tofusoup/stir/models.py (if needed)
Steps:
1. Track the failed stage.
2. Extract the error message.
3. Handle exceptions (see the sketch below).
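A possible approach. The `failed_stage` and `error_message` fields come from the acceptance criteria below; the stage list and helper name are assumptions:

```python
# Error-field population sketch.
def extract_error_message(stderr: str, max_len: int = 500) -> str:
    """Return the first error-looking line from terraform stderr."""
    for line in stderr.splitlines():
        if "Error:" in line:
            return line.strip()[:max_len]
    return stderr.strip().splitlines()[0][:max_len] if stderr.strip() else ""


# In the executor loop (assumed stage names):
# for stage in ("init", "plan", "apply", "destroy"):
#     rc, stderr = run_stage(stage)
#     if rc != 0:
#         result.failed_stage = stage
#         result.error_message = extract_error_message(stderr)
#         break
# Harness exceptions can be caught around the loop and recorded the same way,
# e.g. failed_stage = "harness".
```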
Testing:
- Test with failing init
- Test with failing apply
- Test with Python exception
- Verify fields populated in JSON
- Verify used in JUnit XML
Acceptance Criteria:
- ✅ failed_stage populated for all failures
- ✅ error_message contains first error
- ✅ Harness exceptions marked correctly
### #10: Summary File Output
Estimated Time: 2-3 hours
Files to Modify:
- src/tofusoup/stir/cli.py
- src/tofusoup/stir/reporting.py
Steps:
1. Add CLI flags:
   - `--summary-file=FILE`
   - `--summary-format=json|text|markdown`
2. Implement the formatters (sketch below this list).
3. Write to the file.
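A sketch of the three formatters and the file write; the markdown table layout and the `TestResult.status` attribute are illustrative assumptions:

```python
# Summary formatter sketch.
import json
from pathlib import Path


def build_summary(results, fmt: str) -> str:
    passed = sum(1 for r in results if r.status == "passed")
    failed = len(results) - passed
    if fmt == "json":
        return json.dumps({"total": len(results), "passed": passed, "failed": failed})
    if fmt == "markdown":
        return (
            "## soup stir results\n\n"
            "| Total | Passed | Failed |\n|---|---|---|\n"
            f"| {len(results)} | {passed} | {failed} |\n"
        )
    return f"total={len(results)} passed={passed} failed={failed}"  # text


def write_summary(results, path: str, fmt: str) -> None:
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(build_summary(results, fmt), encoding="utf-8")
```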
Testing:
- Test all 3 formats
- Verify file creation
- Test markdown rendering (GitHub/GitLab)

Acceptance Criteria:
- ✅ All formats work
- ✅ Files created correctly
- ✅ Markdown renders properly
## Phase 3: Enhanced UX (3-4 days)

### #11: Per-Phase Timing
Estimated Time: 4-5 hours
Files to Modify:
- src/tofusoup/stir/executor.py
- src/tofusoup/stir/models.py
- src/tofusoup/stir/display.py
Steps:
1. Track phase timestamps (sketch below this list).
2. Add the timings to `TestResult`.
3. Add CLI flag: `--show-phase-timing`
4. Display in output:
   - Terminal: breakdown after the test completes
   - JSON: include in the test object
   - JUnit: in system-out
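One way to capture per-phase durations with a monotonic clock, assuming `TestResult` gains a `phase_durations` dict (an assumption, not an existing field):

```python
# Phase-timing sketch.
import time
from contextlib import contextmanager


@contextmanager
def timed_phase(result, phase: str):
    """Record the wall-clock duration of one phase on the TestResult."""
    start = time.monotonic()
    try:
        yield
    finally:
        result.phase_durations[phase] = time.monotonic() - start


# Usage in the executor:
# with timed_phase(result, "apply"):
#     run_terraform("apply", ...)
```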
Testing:
- Verify timing accuracy
- Test display flag
- Verify in JSON output

Acceptance Criteria:
- ✅ Phase timings tracked
- ✅ Display flag works
- ✅ Included in JSON/XML
### #12: Progress Percentage
Estimated Time: 2-3 hours
Dependencies: #4 (Renderer system)
Files to Modify:
- src/tofusoup/stir/renderers/plain.py
- src/tofusoup/stir/cli.py
Steps:
1. Add CLI flags:
   - `--show-progress`
   - `--show-eta`
2. Calculate progress.
3. Calculate the ETA.
4. Display them (sketch below this list).
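A sketch of the progress line for the plain renderer; the ETA is a naive average-duration estimate and the function name is illustrative:

```python
# Progress and ETA sketch.
import time


def progress_line(completed: int, total: int, started_at: float) -> str:
    """started_at is a time.monotonic() value captured when the run began."""
    pct = 100 * completed / total if total else 100.0
    line = f"[{completed}/{total}] {pct:5.1f}%"
    if completed:  # ETA: average duration so far times remaining tests
        avg = (time.monotonic() - started_at) / completed
        line += f" (ETA ~{avg * (total - completed):.0f}s)"
    return line
```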
Testing:
- Test percentage calculation
- Test ETA estimation
- Verify auto-enable in CI

Acceptance Criteria:
- ✅ Progress % displayed
- ✅ ETA estimation works
- ✅ Auto-enabled in CI
### #15: Fail-Fast Mode
Estimated Time: 2-3 hours
Files to Modify:
- src/tofusoup/stir/executor.py
- src/tofusoup/stir/cli.py
Steps:
1. Add CLI flags:
   - `--fail-fast`
   - `--fail-threshold=N`
2. Check the failure count after each test (sketch below this list).
3. Mark skipped tests:
   - Tests not yet started are marked as skipped
   - Include the skip reason in the output
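A sketch of the stop condition; how pending tests are skipped and in-flight tests are allowed to finish depends on the executor, so that part is shown as comments with assumed names:

```python
# Fail-fast / fail-threshold sketch.
def should_stop(failures: int, fail_fast: bool, fail_threshold: int | None) -> bool:
    if fail_fast and failures >= 1:
        return True
    return fail_threshold is not None and failures >= fail_threshold


# After each completed test:
# if should_stop(failures, args.fail_fast, args.fail_threshold):
#     for test in pending_tests:  # not yet started
#         results.append(skipped_result(test, reason="fail-fast triggered"))
#     break                       # let running tests finish, start no new ones
```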
Testing:
- Test with --fail-fast
- Test with --fail-threshold=2
- Verify running tests complete
- Verify exit code still 1
Acceptance Criteria:
- ✅ Stops after first failure (fail-fast)
- ✅ Stops after N failures (threshold)
- ✅ Pending tests skipped
- ✅ Exit code correct
## Phase 4: Advanced Features (5-7 days)

### #9: Log Aggregation & Streaming
Estimated Time: 1-2 days
Files to Modify:
- src/tofusoup/stir/terraform.py
- src/tofusoup/stir/cli.py
- src/tofusoup/stir/executor.py
Steps:
1. Add CLI flags:
   - `--stream-logs`
   - `--aggregate-logs=FILE`
   - `--logs-dir=DIR`
2. Stream logs to stdout.
3. Aggregate logs into a single file (sketch below this list).
4. Support a custom logs directory.
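A sketch of aggregation, assuming each test already writes its own log file under the logs directory; the function name is illustrative:

```python
# Log aggregation sketch for --aggregate-logs=FILE.
from pathlib import Path


def aggregate_logs(log_files: list[Path], target: Path) -> None:
    """Concatenate per-test logs into one file with clear section headers."""
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as out:
        for log in log_files:
            out.write(f"\n===== {log.name} =====\n")
            out.write(log.read_text(encoding="utf-8", errors="replace"))
```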
Testing:
- Test log streaming with parallel tests
- Test log aggregation
- Verify custom logs dir

Acceptance Criteria:
- ✅ Logs stream to stdout
- ✅ Logs aggregate to file
- ✅ Custom dir works
### #13: Configurable Refresh Rate
Estimated Time: 1-2 hours
Files to Modify:
- src/tofusoup/stir/cli.py
- src/tofusoup/stir/display.py
Steps:
1. Add CLI flags:
   - `--refresh-rate=RATE`
   - `--no-refresh`
2. Update the Live display (sketch below this list).
3. Implement no-refresh:
   - Only update on status change
   - Use event-based updates instead of polling
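A sketch of wiring the rate into Rich's `Live`, which accepts `refresh_per_second` directly; the CI default shown is an assumption:

```python
# Refresh-rate sketch.
from rich.live import Live


def make_live(table, refresh_rate: float | None, ci_mode: bool) -> Live:
    if ci_mode and refresh_rate is None:
        refresh_rate = 0.5  # assumed: slow down automatically in CI
    return Live(table, refresh_per_second=refresh_rate or 4)
```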
Testing:
- Test different refresh rates
- Test --no-refresh
- Verify performance
Acceptance Criteria:
- ✅ Refresh rate configurable
- ✅ No-refresh mode works
- ✅ Auto-adjust in CI
## Testing Strategy

### Unit Tests
Create unit tests for each new module:
```python
# tests/stir/test_detection.py
import os
from unittest import mock

from tofusoup.stir.detection import is_ci_environment


def test_is_ci_environment_github():
    with mock.patch.dict(os.environ, {"GITHUB_ACTIONS": "true"}):
        assert is_ci_environment() is True


def test_is_ci_environment_none():
    with mock.patch.dict(os.environ, {}, clear=True):
        assert is_ci_environment() is False
```

```python
# tests/stir/test_renderers.py
import json


def test_json_renderer_valid_json():
    # console and results come from test fixtures
    renderer = JSONRenderer(console, config={})
    renderer.start(3)
    # ... execute ...
    output = renderer.complete(results)
    data = json.loads(output)
    assert "summary" in data
    assert "tests" in data
```
### Integration Tests
Test complete workflows:
```python
# tests/stir/test_integration.py
import json
import subprocess
import xml.etree.ElementTree as ET


def test_json_output_integration(tmp_path):
    """Test complete run with JSON output"""
    result = subprocess.run(
        ["soup", "stir", "--json", "tests/fixtures"],
        capture_output=True,
        text=True,
    )
    assert result.returncode == 0
    data = json.loads(result.stdout)
    assert data["summary"]["total"] > 0


def test_junit_xml_output(tmp_path):
    """Test complete run with JUnit XML"""
    xml_file = tmp_path / "results.xml"
    subprocess.run(
        ["soup", "stir", f"--junit-xml={xml_file}", "tests/fixtures"],
        check=True,
    )
    assert xml_file.exists()
    tree = ET.parse(xml_file)
    root = tree.getroot()
    assert root.tag == "testsuites"
```
### CI Testing
Test in actual CI environments:
```yaml
# .github/workflows/test-soup-stir.yml
name: Test soup stir CI features

on: [push, pull_request]

jobs:
  test-ci-detection:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Setup Python
        uses: actions/setup-python@v4
      - name: Install
        run: pip install -e .
      - name: Test CI detection
        run: |
          # Should auto-use plain format in CI
          soup stir tests/fixtures
      - name: Test JSON output
        run: |
          soup stir --json tests/fixtures > results.json
          jq . results.json  # Validate JSON
      - name: Test JUnit XML
        run: |
          soup stir --junit-xml=results.xml tests/fixtures
      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: test-results
          path: results.xml
```
### Manual Testing Checklist
Before each release:
- Test in interactive terminal (table format)
- Test in non-TTY (plain format)
- Test in GitHub Actions
- Test in GitLab CI
- Test with `--json` and pipe to `jq`
- Test with `--junit-xml` and upload to Jenkins
- Test all timeout scenarios
- Test with `--jobs=1` (serial)
- Test with `--fail-fast`
- Test with `--no-color`
## Rollout Plan

### Alpha Release (Internal Testing)
Target: Development team only
Features:
- Phase 1 complete (#1, #2, #6, #7, #14)
- Basic JSON output
- CI detection

Testing:
- Internal CI pipelines
- Developer machines
- Collect feedback
Duration: 1 week
### Beta Release (Early Adopters)
Target: Early adopters, selected users
Features:
- Phase 1 + Phase 2 complete
- All output formats
- JUnit XML
- Timeouts

Testing:
- Real-world CI environments
- Various test suites
- Performance testing
Duration: 2 weeks
### RC (Release Candidate)
Target: All users (opt-in)
Features:
- Phase 1-3 complete
- All enhancements
- Documentation complete

Testing:
- Wide deployment
- Monitor for issues
- Gather metrics
Duration: 1 week
### Stable Release
Target: All users (default)
Features:
- All phases complete
- Fully documented
- Thoroughly tested

Rollout:
- Announce in CHANGELOG
- Update documentation
- Blog post / release notes
## File Modification Summary

### New Files

```
src/tofusoup/stir/
├── detection.py         # CI detection logic
├── junit.py             # JUnit XML generation
└── renderers/
    ├── __init__.py
    ├── base.py
    ├── table.py
    ├── plain.py
    ├── json.py
    ├── github.py
    └── quiet.py
```
### Modified Files

```
src/tofusoup/stir/
├── cli.py               # All new CLI flags
├── config.py            # New constants
├── display.py           # Refactor into renderers
├── executor.py          # Timeouts, fail-fast, phase timing
├── models.py            # New fields in TestResult
├── reporting.py         # Summary file, JUnit XML
└── terraform.py         # Log streaming
```
### Test Files

```
tests/stir/
├── test_detection.py    # NEW
├── test_renderers.py    # NEW
├── test_junit.py        # NEW
├── test_timeouts.py     # NEW
├── test_integration.py  # Modified
└── fixtures/            # Test fixtures
    ├── passing-test/
    ├── failing-test/
    ├── timeout-test/
    └── empty-test/
```
## Migration Checklist
For users upgrading from previous versions:
- No breaking changes - existing commands work as-is
- New flags are opt-in
- CI detection improves logs automatically
- Documentation updated
- CHANGELOG includes all changes
- Migration guide (if needed)
- Deprecation notices (none currently)
Document Version: 1.0.0
Last Updated: 2025-11-02
Status: Draft