E2E Testing Strategy: Git Chat React App

Date: November 2025
Project: Git Chat - React + TypeScript Frontend
Location: /home/ubuntu/workplace/AhaiaApp/ide/web-react

Overview

This document outlines the essential E2E testing strategy for the Git Chat React application, prioritized by risk and user impact. These tests verify the full stack: React frontend → Socket.io → Node.js backend → command execution.

Top 5 Essential E2E Test Scenarios

1. Happy Path: Quick Command Execution

Priority: 🔴 Critical
Phase: 1 (MVP)
Expected Duration: ~8 seconds

test('User can execute quick command and see result', async ({ page }) => {
  // Visit app
  // Click first repository
  // Type "echo 'Hello World'"
  // Press Enter
  // Verify: User message appears (blue bubble, right side)
  // Wait 1.5s
  // Verify: Response appears as simple gray bubble (not job UI)
  // Verify: Output contains "Hello World"
})

What it tests:

✅ Repository list loads from backend API
✅ Navigation from contacts → chat
✅ User input and submission
✅ Socket.io connection and command sending
✅ Backend command execution
✅ Message rendering (user + system)
✅ Smart rendering: quick commands show as simple bubbles

Why critical: This is the most common user flow. If it breaks, the app is unusable.

2. Long Command with Job UI

Priority: 🔴 Critical
Phase: 1 (MVP)
Expected Duration: ~10 seconds

test('Long-running command shows job UI with live output', async ({ page }) => {
  // Visit app → Select repo
  // Type "sleep 3 && echo 'Done'"
  // Press Enter
  // Verify: Typing indicator appears
  // Wait 1.5s
  // Verify: Job UI appears (with cancel button, running status)
  // Verify: Output updates in real-time
  // Wait for completion
  // Verify: Success footer with duration
  // Verify: Cancel button removed
})

What it tests:

✅ Smart rendering: >1s commands upgrade to job UI
✅ Real-time Socket.io output streaming
✅ Job UI components (header, output, footer)
✅ Status transitions (running → completed)
✅ Duration calculation and display
✅ UI cleanup after completion

Why critical: Tests the core differentiator - smart message rendering logic and real-time streaming.
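
The upgrade rule that scenarios 1 and 2 exercise can be written down as a small pure function. The sketch below only illustrates the behaviour the tests expect; it is not taken from the app's source. The names `chooseRenderMode`, `RenderMode`, and `UPGRADE_THRESHOLD_MS` are hypothetical, and treating exactly 1000 ms as "long" is an assumption, since this document only distinguishes <1s from >1s.

```typescript
// Hypothetical model of the smart-rendering rule; not the app's actual code.
type RenderMode = 'pending' | 'bubble' | 'job';

// The 1-second upgrade threshold described in this document.
const UPGRADE_THRESHOLD_MS = 1000;

/**
 * @param elapsedMs    time since the command was submitted
 * @param finishedAtMs when the command finished, relative to submission,
 *                     or null if it is still running
 */
function chooseRenderMode(elapsedMs: number, finishedAtMs: number | null): RenderMode {
  // Finished before the threshold: render as a simple bubble.
  if (finishedAtMs !== null && finishedAtMs < UPGRADE_THRESHOLD_MS) {
    return 'bubble';
  }
  // Past the threshold (still running, or finished late): job UI.
  if (elapsedMs >= UPGRADE_THRESHOLD_MS) {
    return 'job';
  }
  // Still running and under the threshold: typing indicator only.
  return 'pending';
}

console.log(chooseRenderMode(1500, 50));   // quick echo, checked after 1.5s
console.log(chooseRenderMode(1500, null)); // sleep 3, checked after 1.5s
```

Under this model, waiting 1.5s before asserting (as both scenarios do) puts the check safely past the threshold: a quick command must already be a bubble, and a long one must already have been upgraded to the job UI.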

3. Job Cancellation

Priority: 🟡 High
Phase: 2 (Production)
Expected Duration: ~6 seconds

test('User can cancel running job', async ({ page }) => {
  // Select repo
  // Type "sleep 10"
  // Press Enter
  // Wait for job UI to appear
  // Click cancel button (X)
  // Verify: Job shows "Cancelled by user"
  // Verify: Cancel button disappears
  // Verify: No more output updates
})

What it tests:

✅ Socket.io bidirectional communication (client → server cancel)
✅ Backend process termination
✅ UI state updates on cancellation
✅ Proper cleanup (no further output or lingering running state after cancellation)

Why critical: Tests Socket.io bidirectional communication and user control.
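
The "no more output updates" check in this scenario amounts to a rule that a job ignores events once it has left the running state. A minimal sketch of that rule follows, assuming a reducer-style job model; the `Job`, `JobEvent`, and `applyEvent` names and the event kinds are hypothetical and may not match the app's actual Socket.io messages.

```typescript
// Hypothetical job model; status and event names are assumptions.
type JobStatus = 'running' | 'completed' | 'cancelled';

interface Job {
  status: JobStatus;
  output: string[];
}

type JobEvent =
  | { kind: 'output'; line: string }
  | { kind: 'exit' }
  | { kind: 'cancel' };

function applyEvent(job: Job, event: JobEvent): Job {
  // Once a job has left 'running', every later event is ignored.
  if (job.status !== 'running') {
    return job;
  }
  switch (event.kind) {
    case 'output':
      return { ...job, output: [...job.output, event.line] };
    case 'exit':
      return { ...job, status: 'completed' };
    case 'cancel':
      return { ...job, status: 'cancelled' };
  }
}

const incoming: JobEvent[] = [
  { kind: 'output', line: 'working...' },
  { kind: 'cancel' },
  { kind: 'output', line: 'late line' }, // arrives after the cancel
];
const startingJob: Job = { status: 'running', output: [] };
const finalJob = incoming.reduce(applyEvent, startingJob);

console.log(finalJob.status, finalJob.output.length); // cancelled 1
```

The late output line is dropped, which is the observable behaviour the E2E test looks for after clicking cancel.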

4. Repository Navigation

Priority: 🔴 Critical
Phase: 1 (MVP)
Expected Duration: ~10 seconds

test('User can switch between repositories', async ({ page }) => {
  // Load contacts page
  // Verify: Repository list loads
  // Click first repo
  // Verify: Chat page loads with correct repo name in header
  // Send command "pwd"
  // Verify: Output shows correct repo path
  // Click back button
  // Verify: Returns to contacts page
  // Click different repo
  // Verify: Chat context switched (different repo name)
  // Verify: Previous messages cleared
})

What it tests:

✅ React Router navigation
✅ Context API repository switching
✅ Message state isolation between repos
✅ Backend executes commands in correct repo
✅ UI updates reflect correct repository

Why critical: Tests routing, context switching, and state isolation. Multi-repo support is core functionality.

5. Error Handling and Recovery

Priority: 🟡 High
Phase: 2 (Production)
Expected Duration: ~8 seconds

test('App handles command errors gracefully', async ({ page }) => {
  // Select repo
  // Type invalid command "nonexistentcommand12345"
  // Press Enter
  // Wait for completion
  // Verify: Error message shown
  // Verify: Exit code displayed (likely 127)
  // Verify: App still responsive
  // Send another valid command "echo 'recovery'"
  // Verify: App continues working normally
})

What it tests:

✅ Failed command handling
✅ Exit code display
✅ App resilience (doesn't crash)
✅ Recovery after errors
✅ Error message clarity

Why critical: Ensures one error doesn't crash the app. Tests production resilience.
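
The "likely 127" in the scenario above comes from POSIX shell convention: 127 means the command was not found, 126 means it was found but could not be executed, and values above 128 mean the process was terminated by a signal (128 plus the signal number). The sketch below assumes the backend runs commands through a shell, which this document does not state; `describeExitCode` is a hypothetical helper, not part of the app.

```typescript
// Hypothetical helper mapping shell exit codes to readable messages.
function describeExitCode(code: number): string {
  if (code === 0) return 'success';
  if (code === 126) return 'command found but not executable';
  if (code === 127) return 'command not found';
  if (code > 128) return `terminated by signal ${code - 128}`;
  return `failed with exit code ${code}`;
}

console.log(describeExitCode(127)); // command not found
console.log(describeExitCode(137)); // terminated by signal 9
```

If the backend spawns processes directly rather than through a shell, a missing binary may surface as a spawn error instead of exit code 127, so the assertion in the test may need to accept either form.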

Coverage Map

User Flows Covered:

✅ View repository list
✅ Navigate to chat
✅ Send commands
✅ View results (both simple and job UI)
✅ Cancel jobs
✅ Switch repositories
✅ Handle errors

Technical Areas Covered:

✅ Socket.io connection and communication
✅ Backend command execution
✅ Smart rendering logic (<1s vs >1s)
✅ State management (Context API)
✅ React Router navigation
✅ Message rendering (user, job, system)
✅ Real-time output streaming
✅ Error handling
✅ UI interactions (buttons, input, navigation)

Risk Areas Covered:

Backend connectivity - All tests require working Socket.io
Message rendering - Tests both simple bubbles and job UI
Timing logic - Tests the 1-second upgrade threshold
State management - Tests context switches and message isolation
Error resilience - Ensures app doesn't crash on errors

Additional Scenarios (Not in Top 5)

Phase 3 (Polish):

Dark mode toggle
Mobile responsiveness (viewport config)
Auto-scroll behavior
Connection loss/reconnection
Multiple rapid commands
Textarea auto-resize
Very long output (>10,000 lines)
Special characters in commands
PWA installation flow

Implementation Phases

Phase 1 (MVP) - Tests #1, #2, #4

Goal: Prove basic functionality works
Duration: ~28 seconds
Coverage: Core user journeys

Deliverables:

✅ User can execute quick commands
✅ Job UI works for long commands
✅ Repository navigation works

Phase 2 (Production) - Tests #3, #5

Goal: Prove robustness and error handling
Duration: +14 seconds (total: 42 seconds)
Coverage: Edge cases and resilience

Deliverables:

✅ Job cancellation works
✅ Errors handled gracefully

Phase 3 (Polish) - Additional scenarios

Goal: Cover edge cases and polish
Duration: +30 seconds (total: 72 seconds)
Coverage: Nice-to-haves and edge cases

Test Structure

// e2e/critical-flows.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Critical User Journeys', () => {
  test.beforeEach(async ({ page }) => {
    // Ensure backend is running on port 8000
    // Frontend on port 3015
    await page.goto('http://localhost:3015');
    await expect(page.locator('h1')).toContainText('Chats');
  });

  test('1. Quick command execution', async ({ page }) => { /*...*/ });
  test('2. Long command with job UI', async ({ page }) => { /*...*/ });
  test('3. Job cancellation', async ({ page }) => { /*...*/ });
  test('4. Repository navigation', async ({ page }) => { /*...*/ });
  test('5. Error handling', async ({ page }) => { /*...*/ });
});

Expected Test Times

Following the terminal-only development philosophy:

Each test: < 15 seconds
Phase 1 suite: ~28 seconds
Phase 1 + 2 suites: ~42 seconds (cumulative)
Full suite: < 90 seconds (E2E threshold)

This ensures fast feedback loops even for E2E tests.
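
The figures above are plain sums of the per-scenario estimates. The following sketch redoes the arithmetic, using the expected durations and phase assignments stated in this document; the `Scenario` type and the variable names are illustrative only.

```typescript
// Expected durations (seconds) and phases as stated in the scenario headers.
interface Scenario {
  id: number;
  phase: 1 | 2;
  seconds: number;
}

const scenarios: Scenario[] = [
  { id: 1, phase: 1, seconds: 8 },
  { id: 2, phase: 1, seconds: 10 },
  { id: 3, phase: 2, seconds: 6 },
  { id: 4, phase: 1, seconds: 10 },
  { id: 5, phase: 2, seconds: 8 },
];

const phase3ExtraSeconds = 30;     // Phase 3 estimate for the additional scenarios
const fullSuiteBudgetSeconds = 90; // E2E threshold

// Sum the expected durations of all scenarios assigned to one phase.
function phaseSeconds(phase: 1 | 2): number {
  return scenarios
    .filter((s) => s.phase === phase)
    .reduce((total, s) => total + s.seconds, 0);
}

const phase1Total = phaseSeconds(1);                  // 8 + 10 + 10
const phase2Total = phase1Total + phaseSeconds(2);    // + 6 + 8
const fullTotal = phase2Total + phase3ExtraSeconds;   // + 30

console.log(phase1Total, phase2Total, fullTotal);     // 28 42 72
console.log(fullTotal < fullSuiteBudgetSeconds);      // true
```

The estimated full suite leaves 18 seconds of headroom under the 90-second threshold.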

Test Environment Requirements

Prerequisites:

Backend server running on port 8000
Frontend dev server on port 3015 (or production build)
At least 2 repositories in ~/workplace for navigation tests
Playwright installed and configured

Setup Commands:

# Terminal 1: Backend
cd /home/ubuntu/workplace/AhaiaApp/ide/web
node server.js > /tmp/backend.log 2>&1 &

# Terminal 2: Frontend
cd /home/ubuntu/workplace/AhaiaApp/ide/web-react
npm run dev > /tmp/frontend.log 2>&1 &

# Terminal 3: Run tests
npm run test:e2e

Cleanup:

# Kill servers after tests
fuser -k 8000/tcp  # Backend
fuser -k 3015/tcp  # Frontend

Playwright Configuration

// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './e2e',
  timeout: 30000, // 30s per test
  expect: {
    timeout: 5000, // 5s for assertions
  },
  fullyParallel: false, // Run sequentially (shared state)
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: 1, // Single worker (shared backend)
  reporter: 'list',
  use: {
    baseURL: 'http://localhost:3015',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
  },
  projects: [
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
    },
  ],
  webServer: [
    {
      command: 'cd ../web && node server.js',
      port: 8000,
      timeout: 10000,
      reuseExistingServer: true,
    },
    {
      command: 'npm run dev',
      port: 3015,
      timeout: 10000,
      reuseExistingServer: true,
    },
  ],
});

Test Selectors Best Practices

Use data-testid for stability:

// Component
<button data-testid="send-button">Send</button>

// Test
await page.getByTestId('send-button').click();

Avoid brittle selectors:

❌ .message-bubble:nth-child(2) - Breaks if UI changes
✅ page.getByRole('button', { name: 'Send' }) - Semantic
✅ page.getByTestId('job-output') - Explicit

Debugging E2E Tests

View test output:

npm run test:e2e > /tmp/e2e.log 2>&1
cat /tmp/e2e.log

Run in headed mode (requires X11 forwarding):

npm run test:e2e -- --headed

Run single test:

npx playwright test --grep "Quick command execution"

Debug with trace viewer:

npx playwright show-trace trace.zip

Check server logs during test:

tail -f /tmp/backend.log
tail -f /tmp/frontend.log

Success Metrics

You'll know E2E tests are working if:

✅ All Phase 1 tests pass in < 30 seconds
✅ Tests catch regressions before manual testing
✅ Can run tests in CI/CD pipeline
✅ Tests are not flaky (>95% pass rate on repeated runs of unchanged code)
✅ Tests fail with clear error messages

Red flags:

❌ Tests take > 90 seconds total
❌ Flaky tests (random failures)
❌ Tests pass but app is broken (false positives)
❌ Tests break on every UI change
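
The >95% pass-rate target can be checked mechanically from a history of suite runs on unchanged code. This is a minimal sketch of that check; `passRate`, `isStable`, and the sample histories are hypothetical, and how run results are collected (CI logs, a reporter, and so on) is left open.

```typescript
// Fraction of runs that passed; each entry is true for a passing run.
function passRate(runs: boolean[]): number {
  if (runs.length === 0) {
    throw new Error('no runs recorded');
  }
  const passed = runs.filter(Boolean).length;
  return passed / runs.length;
}

// A suite counts as stable when its pass rate is strictly above the target.
function isStable(runs: boolean[], minRate: number = 0.95): boolean {
  return passRate(runs) > minRate;
}

// Sample histories (made up for illustration).
const mostlyGreen: boolean[] = [...Array<boolean>(49).fill(true), false];      // 49 of 50 passed
const tooFlaky: boolean[] = [...Array<boolean>(18).fill(true), false, false];  // 18 of 20 passed

console.log(isStable(mostlyGreen)); // true  (98%)
console.log(isStable(tooFlaky));    // false (90%)
```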

Maintenance

When to update tests:

API contract changes
Major UI refactors
New critical features added
Selectors become brittle

Test review checklist:

Tests are deterministic (no random failures)
Tests are independent (can run in any order)
Tests clean up after themselves
Tests have clear failure messages
Tests are fast (< 15s each)

Integration with CI/CD

Pre-commit hook (optional):

#!/bin/sh
# .git/hooks/pre-commit (must be executable; assumes a test:e2e:quick script exists in package.json)
npm run test:e2e:quick  # Phase 1 tests only

Pre-deploy check:

npm run test:e2e  # All tests

GitHub Actions example:

- name: Run E2E Tests
  run: |
    npm run test:e2e
  env:
    CI: true

Related Documentation

/home/ubuntu/yap/note-to-next-agent.md - General development philosophy
/home/ubuntu/workplace/AhaiaApp/ide/web-react/README.md - Project architecture
/home/ubuntu/workplace/AhaiaApp/ide/web-react/GETTING_STARTED.md - Quick start guide
/home/ubuntu/workplace/AhaiaApp/music/docs/webapp-testing-design-doc.md - Testing strategy template

Notes for Future Developers

Always test against real backend - Mocking Socket.io in E2E defeats the purpose
Keep tests fast - Slow tests won't be run
Use stable selectors - data-testid or semantic roles
Test user journeys, not implementation - Test what users do, not how it works
Redirect output to /tmp - Don't clutter workspace with test logs
One assertion per test is a myth - Test complete user journeys

Quick Reference Commands

# Setup
npm install -D @playwright/test
npx playwright install

# Run all E2E tests
npm run test:e2e

# Run Phase 1 only (matches only if test or describe titles contain "Phase 1")
npm run test:e2e -- --grep "Phase 1"

# Run specific test
npx playwright test --grep "Quick command"

# Debug mode
npx playwright test --debug

# Update snapshots
npx playwright test --update-snapshots

# View test report
npx playwright show-report

Written for: Git Chat React refactoring project
Last Updated: November 2025
Status: Strategy defined, implementation pending
