How to Use GitHub Copilot for Test Generation: Improving Code Coverage with AI-Assisted Testing
Why Test Generation Is Copilot’s Most Practical Use Case
Most developers agree that testing is important. Most developers also agree they do not write enough tests. The gap between intention and practice exists because writing tests is tedious — especially for existing code where you need to understand the behavior, set up mocks, handle edge cases, and verify assertions.
GitHub Copilot bridges this gap. It reads your implementation code and generates corresponding tests, handling the boilerplate (imports, setup, teardown) and suggesting meaningful test cases that cover common patterns, edge cases, and error conditions.
For most codebases, Copilot can generate a first draft of tests that covers 60-80% of the necessary test cases. The developer’s job shifts from writing tests to reviewing and improving AI-generated tests — a significantly faster workflow.
This guide covers the practical techniques for using Copilot to improve your test coverage efficiently.
Step 1: Identify Coverage Gaps
Running Coverage Reports
Before generating tests, know what is untested:
```bash
# JavaScript/TypeScript (Jest)
npx jest --coverage

# Python (pytest)
pytest --cov=src --cov-report=html

# Go
go test -cover ./...
```
Prioritizing What to Test
Not all coverage gaps are equal. Prioritize:
High priority (test first):
- Business logic (calculations, rules, validations)
- Error handling paths (what happens when things fail)
- Authentication and authorization
- Data transformations (input/output mapping)
- Public API endpoints
Medium priority:
- Utility functions
- Configuration loading
- Data access layer
- Middleware and interceptors
Low priority (test last or skip):
- Simple getters/setters with no logic
- Framework-generated boilerplate
- Type definitions
- Constants
Creating the Test Plan
Coverage gaps identified:

1. src/services/OrderService.ts — 12% coverage. Missing: createOrder, calculateTotal, applyDiscount, validateInventory
2. src/middleware/auth.ts — 0% coverage. Missing: all functions
3. src/utils/validation.ts — 45% coverage. Missing: edge cases for email, phone, address validators
4. src/api/routes/orders.ts — 30% coverage. Missing: error handling paths, pagination edge cases
Step 2: Generate Unit Tests with Copilot
Method 1: Copilot Chat (/tests Command)
Open the file you want to test and use Copilot Chat:
/tests Generate unit tests for this file
Copilot generates a test file with:
- Proper imports and test framework setup
- Test suites organized by function
- Happy path tests for each public function
- Basic error case tests
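Output varies with your code and framework, but for the OrderService example used throughout this guide, the generated file typically resembles the following sketch (the constructor wiring and input shape here are assumptions, not Copilot's literal output):

```typescript
// Sketch of typical /tests output for the OrderService example (actual output varies).
import { OrderService } from '../OrderService';

describe('OrderService', () => {
  let service: OrderService;

  beforeEach(() => {
    service = new OrderService();
  });

  describe('createOrder', () => {
    it('creates an order from valid input', async () => {
      const order = await service.createOrder({
        userId: 'u1',
        items: [{ productId: 'p1', quantity: 2 }],
      });
      expect(order.id).toBeDefined();
      expect(order.status).toBe('pending');
    });

    it('rejects input with missing fields', async () => {
      await expect(service.createOrder({} as any)).rejects.toThrow();
    });
  });
});
```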
Method 2: Contextual Prompt in Chat
For more control, provide specific instructions:
"Generate comprehensive unit tests for the OrderService class in src/services/OrderService.ts. Cover: 1. createOrder: valid order, missing fields, invalid product ID 2. calculateTotal: normal items, discounted items, empty cart 3. applyDiscount: valid code, expired code, minimum order not met 4. validateInventory: all in stock, partial stock, out of stock Use Jest with TypeScript. Mock the database layer using the existing pattern in src/services/__tests__/UserService.test.ts. Use the test fixtures in src/__tests__/fixtures/."
Method 3: Inline Generation
Open an empty test file and write the describe block:
```typescript
describe('OrderService', () => {
  describe('createOrder', () => {
    // Copilot starts suggesting test cases here
  });
});
```
Copilot’s inline suggestions are context-aware — it reads the implementation file and generates relevant test cases as you type.
Method 4: Generate from Copilot Chat with File Reference
"@workspace Generate tests for #file:src/services/OrderService.ts Cover all public methods with happy path and error cases. Use the testing patterns from #file:src/services/__tests__/UserService.test.ts"
The @workspace and #file references give Copilot direct access to the implementation and existing test patterns.
Step 3: Add Edge Case Tests
Asking Copilot for Edge Cases
After generating basic tests, ask specifically for edge cases:
"What edge cases are not covered in these tests for the calculateTotal function? Consider: - Boundary values (zero, negative, very large numbers) - Type edge cases (null, undefined, NaN, empty string) - Collection edge cases (empty array, single item, 1000 items) - Concurrency (what if called simultaneously) - Precision (floating point math for currency)"
Copilot typically identifies 5-10 additional edge cases per function.
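Translated into code, a few of those categories look like this (a sketch, assuming calculateTotal is exported as a function and takes an array of { price, quantity } items):

```typescript
import { calculateTotal } from '../OrderService';

describe('calculateTotal edge cases', () => {
  it('returns 0 for an empty cart', () => {
    expect(calculateTotal([])).toBe(0);
  });

  it('throws on a negative quantity', () => {
    expect(() => calculateTotal([{ price: 10, quantity: -1 }])).toThrow();
  });

  it('avoids floating point drift for currency', () => {
    // 0.1 + 0.2 !== 0.3 in IEEE 754; totals should be exact to the cent
    expect(
      calculateTotal([{ price: 0.1, quantity: 1 }, { price: 0.2, quantity: 1 }])
    ).toBeCloseTo(0.3);
  });
});
```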
Common Edge Case Categories
For each function, systematically check:
| Category | Edge Cases | Example |
|---|---|---|
| Null/undefined | null input, undefined fields | createOrder(null) |
| Empty | empty string, empty array, empty object | calculateTotal([]) |
| Boundary | zero, max int, min int | applyDiscount(0) |
| Type mismatch | string where number expected | calculateTotal("abc") |
| Special characters | unicode, SQL injection strings | search("'; DROP TABLE--") |
| Async timing | timeout, concurrent calls | Two orders for last item |
| Precision | floating point, rounding | 0.1 + 0.2 !== 0.3 |
Generating Boundary Value Tests
"Generate boundary value tests for the validateAge function that accepts an integer age parameter. The valid range is 0-150. Generate tests for: -1, 0, 1, 74, 75, 149, 150, 151, null, undefined, NaN, Infinity, 3.5, and the string '25'."
Step 4: Generate Integration Tests
API Endpoint Tests
"Generate integration tests for the POST /api/orders endpoint. Use supertest with the existing app instance from src/app.ts. Test scenarios: 1. Successful order creation (201) 2. Missing required fields (400 with field-level errors) 3. Invalid product ID (404) 4. Insufficient inventory (409) 5. Unauthenticated request (401) 6. Unauthorized user role (403) 7. Rate limited request (429) 8. Database error (500) For each test: - Set up necessary test data (products, user) - Make the request with appropriate headers - Assert status code - Assert response body structure - Assert side effects (database state, events emitted)"
Database Integration Tests
"Generate integration tests for the OrderRepository. Use the test database configured in src/test/setup.ts. Test: 1. Create order and verify it exists in the database 2. Create order with line items (verify foreign keys) 3. Query orders with pagination 4. Update order status 5. Delete order (verify cascade to line items) 6. Concurrent order creation (verify inventory constraint) Each test should clean up after itself. Use transactions that roll back after each test."
Service Layer Integration Tests
"Generate tests that verify OrderService correctly integrates with PaymentService and InventoryService. Use partial mocks: mock PaymentService.processPayment but use the real InventoryService against the test database. Test the full flow: 1. Order succeeds (inventory decremented, payment processed) 2. Payment fails (inventory NOT decremented — verify rollback) 3. Inventory insufficient (payment NOT attempted) 4. Partial fulfillment (some items available, some not)"
Step 5: Review and Refine Generated Tests
The Test Quality Checklist
Generated tests often have these issues:
Issue 1: Tautological tests (testing nothing)
```typescript
// BAD: This test always passes
test('should return result', () => {
  const result = add(2, 3);
  expect(result).toBeDefined(); // Does not verify correctness
});

// GOOD: This test verifies actual behavior
test('should return sum of two numbers', () => {
  expect(add(2, 3)).toBe(5);
});
```
Issue 2: Testing implementation, not behavior
```typescript
// BAD: Brittle — breaks when implementation changes
test('should call database.query with SELECT', async () => {
  await userService.findById('123');
  expect(database.query).toHaveBeenCalledWith(
    'SELECT * FROM users WHERE id = $1', ['123']
  );
});

// GOOD: Tests behavior, not SQL query text
test('should return user by ID', async () => {
  const user = await userService.findById('123');
  expect(user.id).toBe('123');
  expect(user.name).toBeDefined();
});
```
Issue 3: Missing assertions
```typescript
// BAD: No assertion — test passes as long as the function does not throw,
// even if it does nothing or returns the wrong result
test('should process order', async () => {
  await orderService.process(mockOrder);
});

// GOOD: Verify the outcome
test('should process order and update status', async () => {
  const result = await orderService.process(mockOrder);
  expect(result.status).toBe('processed');
  expect(result.processedAt).toBeDefined();
});
```
Review Workflow
For each generated test file:
- Run the tests: Do they all pass? If not, are the failures due to test bugs or implementation bugs?
- Mutation test: Introduce a deliberate bug in the implementation. Does at least one test fail? If no test catches the bug, the tests are not testing the right things.
- Read each assertion: Is it testing something meaningful? Could the test pass even if the function is broken?
- Check mock setup: Are mocks realistic? Do they return data that matches what the real dependency would return?
- Verify cleanup: Do tests clean up after themselves? Running the suite twice should produce the same results.
Step 6: Establish a Testing Workflow
Pre-Commit: Generate Tests for New Code
Developer workflow:
1. Write the implementation
2. Ask Copilot: "Generate tests for the functions I just added"
3. Review and refine the generated tests
4. Run coverage to verify the new code is tested
5. Commit implementation and tests together
Sprint Planning: Allocate Test Coverage Time
Each sprint:
- 10% of sprint capacity allocated to test improvement
- Developer selects the lowest-coverage service from the report
- Uses Copilot to generate tests for that service
- Reviews, refines, and merges
- Coverage improves incrementally each sprint
Coverage Target Progression
Month 1: 40% → 55% (focus on critical business logic)
Month 2: 55% → 65% (add error handling paths)
Month 3: 65% → 75% (add edge cases and integration)
Month 4: 75% → 80% (fill remaining gaps)
Month 5+: Maintain 80%+ (test new code as it is written)
Following this progression to 80% coverage takes approximately 4-5 months with Copilot assistance (roughly 60% faster than manual test writing).
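To keep each month's gain from eroding, it helps to enforce the current floor in CI. With Jest, for example, coverageThreshold fails the run when coverage drops below the configured numbers (the values below match the Month 1 target; raise them as each goal is reached):

```typescript
// jest.config.ts: fail the test run if coverage falls below the current target.
import type { Config } from 'jest';

const config: Config = {
  collectCoverage: true,
  coverageThreshold: {
    global: {
      lines: 55,      // Month 1 target; bump toward 80 over time
      branches: 50,
    },
  },
};

export default config;
```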
Measuring Test Quality Beyond Coverage
Coverage Is Necessary but Not Sufficient
100% code coverage with weak assertions is worse than 70% coverage with strong assertions. Track these additional metrics:
| Metric | What It Measures | Target |
|---|---|---|
| Line coverage | Which lines execute during tests | 80%+ |
| Branch coverage | Which if/else branches are tested | 75%+ |
| Mutation score | What % of injected bugs are caught | 70%+ |
| Test-to-code ratio | Lines of test vs. lines of code | 1:1 to 2:1 |
| Test execution time | How long the suite takes | Under 5 min for unit tests |
Mutation Testing
Use mutation testing to verify test quality:
```bash
# JavaScript (Stryker)
npx stryker run

# Python (mutmut)
mutmut run

# Go (go-mutesting)
go-mutesting ./...
```
Mutation testing changes your code (mutates a + to -, removes a condition, changes a return value) and checks if any test fails. If no test catches the mutation, your tests have a gap.
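As a concrete illustration (hypothetical code), suppose the tool flips a minus to a plus. A healthy suite kills the mutant because some assertion depends on the correct result:

```typescript
// Original implementation
function applyDiscount(total: number, percent: number): number {
  return total - total * (percent / 100);
}

// Example mutant (generated by the tool, never committed): '-' flipped to '+'
//   return total + total * (percent / 100);

// This test kills the mutant: the correct result is 90, the mutant returns 110.
test('applies a percentage discount', () => {
  expect(applyDiscount(100, 10)).toBe(90);
});
```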
Frequently Asked Questions
Does Copilot generate tests that actually catch bugs?
Yes, when the tests have meaningful assertions. Copilot generates structurally correct tests, but you must verify that the assertions check real behavior. Review every assertion — accept the structure, validate the content.
Which testing framework does Copilot work best with?
Copilot works well with all major frameworks: Jest, Mocha, Vitest (JS/TS), pytest (Python), Go testing (Go), JUnit (Java), RSpec (Ruby). It adapts to your project’s existing test setup.
How do I handle test generation for private functions?
Test private functions through the public API that calls them. Ask Copilot: “Generate tests for the public methods of this class that exercise the private helper functions through the public interface.”
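For example (hypothetical names), a private normalizer is verified through the public method that uses it:

```typescript
class UserService {
  async register(email: string) {
    return { email: this.normalizeEmail(email) };
  }

  // Private helper: not directly testable, exercised via register()
  private normalizeEmail(email: string): string {
    return email.trim().toLowerCase();
  }
}

test('register normalizes the email via the private helper', async () => {
  const service = new UserService();
  const user = await service.register('  Alice@Example.COM ');
  expect(user.email).toBe('alice@example.com'); // proves normalizeEmail ran
});
```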
Can Copilot generate tests for legacy code with no existing tests?
Yes. This is one of Copilot’s strongest use cases. Point it at untested legacy code and ask for characterization tests (tests that document current behavior without judgment about correctness). These become the safety net for future refactoring.
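A characterization test simply records what the code does today (legacyPriceFor and its outputs below are hypothetical):

```typescript
import { legacyPriceFor } from '../pricing'; // hypothetical legacy function

test('legacyPriceFor: pins down current behavior for known inputs', () => {
  // Expected values are whatever the code returns today: observed output, not a spec.
  expect(legacyPriceFor('SKU-1', 1)).toBe(9.99);
  expect(legacyPriceFor('SKU-1', 0)).toBe(0); // documents the zero-quantity case
});
```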
Should I commit AI-generated tests as-is?
Never. Always review, run, and refine before committing. Generated tests are first drafts — they need the same review as generated production code.
How do I test async code with Copilot?
Copilot handles async well if you specify the framework. “Generate tests for this async function using Jest with async/await syntax. Test both the success and rejection paths of the Promise.”
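For example, Jest's resolves and rejects matchers cover both paths (fetchUser is a hypothetical async function):

```typescript
test('resolves with the user on success', async () => {
  await expect(fetchUser('u1')).resolves.toMatchObject({ id: 'u1' });
});

test('rejects for an unknown user', async () => {
  await expect(fetchUser('missing')).rejects.toThrow('not found');
});
```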