How to Use GitHub Copilot for Test Generation: Improving Code Coverage with AI-Assisted Testing

Why Test Generation Is Copilot’s Most Practical Use Case

Most developers agree that testing is important. Most developers also agree they do not write enough tests. The gap between intention and practice exists because writing tests is tedious — especially for existing code where you need to understand the behavior, set up mocks, handle edge cases, and verify assertions.

GitHub Copilot bridges this gap. It reads your implementation code and generates corresponding tests — handling the boilerplate (imports, setup, teardown) and suggesting test cases for common patterns, edge cases, and error conditions.

For most codebases, Copilot can generate a first draft of tests that covers 60-80% of the necessary test cases. The developer’s job shifts from writing tests to reviewing and improving AI-generated tests — a significantly faster workflow.

This guide covers the practical techniques for using Copilot to improve your test coverage efficiently.

Step 1: Identify Coverage Gaps

Running Coverage Reports

Before generating tests, know what is untested:

# JavaScript/TypeScript (Jest)
npx jest --coverage

# Python (pytest)
pytest --cov=src --cov-report=html

# Go
go test -cover ./...

Prioritizing What to Test

Not all coverage gaps are equal. Prioritize:

High priority (test first):

  • Business logic (calculations, rules, validations)
  • Error handling paths (what happens when things fail)
  • Authentication and authorization
  • Data transformations (input/output mapping)
  • Public API endpoints

Medium priority:

  • Utility functions
  • Configuration loading
  • Data access layer
  • Middleware and interceptors

Low priority (test last or skip):

  • Simple getters/setters with no logic
  • Framework-generated boilerplate
  • Type definitions
  • Constants

Creating the Test Plan

Coverage gaps identified:
1. src/services/OrderService.ts — 12% coverage
   Missing: createOrder, calculateTotal, applyDiscount,
   validateInventory
2. src/middleware/auth.ts — 0% coverage
   Missing: all functions
3. src/utils/validation.ts — 45% coverage
   Missing: edge cases for email, phone, address validators
4. src/api/routes/orders.ts — 30% coverage
   Missing: error handling paths, pagination edge cases

Step 2: Generate Unit Tests with Copilot

Method 1: Copilot Chat (/tests Command)

Open the file you want to test and use Copilot Chat:

/tests Generate unit tests for this file

Copilot generates a test file with:

  • Proper imports and test framework setup
  • Test suites organized by function
  • Happy path tests for each public function
  • Basic error case tests

Method 2: Contextual Prompt in Chat

For more control, provide specific instructions:

"Generate comprehensive unit tests for the OrderService class
in src/services/OrderService.ts. Cover:
1. createOrder: valid order, missing fields, invalid product ID
2. calculateTotal: normal items, discounted items, empty cart
3. applyDiscount: valid code, expired code, minimum order not met
4. validateInventory: all in stock, partial stock, out of stock

Use Jest with TypeScript. Mock the database layer using the
existing pattern in src/services/__tests__/UserService.test.ts.
Use the test fixtures in src/__tests__/fixtures/."

Method 3: Inline Generation

Open an empty test file and write the describe block:

describe('OrderService', () => {
  describe('createOrder', () => {
    // Copilot starts suggesting test cases here
  });
});

Copilot’s inline suggestions are context-aware — it reads the implementation file and generates relevant test cases as you type.

Method 4: Generate from Copilot Chat with File Reference

"@workspace Generate tests for #file:src/services/OrderService.ts
Cover all public methods with happy path and error cases.
Use the testing patterns from #file:src/services/__tests__/UserService.test.ts"

The @workspace and #file references give Copilot direct access to the implementation and existing test patterns.

Step 3: Add Edge Case Tests

Asking Copilot for Edge Cases

After generating basic tests, ask specifically for edge cases:

"What edge cases are not covered in these tests for
the calculateTotal function? Consider:
- Boundary values (zero, negative, very large numbers)
- Type edge cases (null, undefined, NaN, empty string)
- Collection edge cases (empty array, single item, 1000 items)
- Concurrency (what if called simultaneously)
- Precision (floating point math for currency)"

Copilot typically identifies 5-10 additional edge cases per function.

Common Edge Case Categories

For each function, systematically check:

Category            Edge cases                                Example
Null/undefined      null input, undefined fields              createOrder(null)
Empty               empty string, empty array, empty object   calculateTotal([])
Boundary            zero, max int, min int                    applyDiscount(0)
Type mismatch       string where number expected              calculateTotal("abc")
Special characters  unicode, SQL injection strings            search("'; DROP TABLE--")
Async timing        timeout, concurrent calls                 two orders for the last item
Precision           floating point, rounding                  0.1 + 0.2 !== 0.3
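The precision category deserves a concrete look, since it trips up currency code constantly. A minimal demonstration of the drift and the usual fix (integer cents):

```typescript
// Naive floating-point addition drifts:
const naive = 0.1 + 0.2;    // 0.30000000000000004
console.log(naive === 0.3); // false

// Common fix for currency: work in integer cents.
const cents = (dollars: number) => Math.round(dollars * 100);
console.log(cents(0.1) + cents(0.2) === cents(0.3)); // true
```

A test that asserts `calculateTotal([0.1, 0.2])` equals `0.3` with `toBe` will fail on a naive implementation; ask Copilot to generate precision tests explicitly so this class of bug surfaces.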

Generating Boundary Value Tests

"Generate boundary value tests for the validateAge function
that accepts an integer age parameter. The valid range is
0-150. Generate tests for: -1, 0, 1, 74, 75, 149, 150, 151,
null, undefined, NaN, Infinity, 3.5, and the string '25'."
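As a reference point for reviewing what Copilot produces from that prompt, here is what the boundary cases check, sketched as a plain table-driven loop. validateAge is a hypothetical implementation matching the stated 0-150 integer range; in a Jest suite this table would feed test.each:

```typescript
// Hypothetical validateAge: only integer ages in [0, 150] are valid.
function validateAge(age: unknown): boolean {
  return (
    typeof age === "number" &&
    Number.isInteger(age) && // rejects NaN, Infinity, 3.5
    age >= 0 &&
    age <= 150
  );
}

// The boundary cases from the prompt, as [input, expected] pairs.
const cases: Array<[unknown, boolean]> = [
  [-1, false], [0, true], [1, true], [74, true], [75, true],
  [149, true], [150, true], [151, false],
  [null, false], [undefined, false], [NaN, false], [Infinity, false],
  [3.5, false], ["25", false],
];

for (const [input, expected] of cases) {
  if (validateAge(input) !== expected) {
    throw new Error(`validateAge(${String(input)}) should be ${expected}`);
  }
}
console.log("all boundary cases pass");
```

Note the pattern: values just inside the range (0, 150), just outside it (-1, 151), and type-level surprises (NaN, "25") all get explicit rows.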

Step 4: Generate Integration Tests

API Endpoint Tests

"Generate integration tests for the POST /api/orders endpoint.
Use supertest with the existing app instance from src/app.ts.

Test scenarios:
1. Successful order creation (201)
2. Missing required fields (400 with field-level errors)
3. Invalid product ID (404)
4. Insufficient inventory (409)
5. Unauthenticated request (401)
6. Unauthorized user role (403)
7. Rate limited request (429)
8. Database error (500)

For each test:
- Set up necessary test data (products, user)
- Make the request with appropriate headers
- Assert status code
- Assert response body structure
- Assert side effects (database state, events emitted)"

Database Integration Tests

"Generate integration tests for the OrderRepository.
Use the test database configured in src/test/setup.ts.

Test:
1. Create order and verify it exists in the database
2. Create order with line items (verify foreign keys)
3. Query orders with pagination
4. Update order status
5. Delete order (verify cascade to line items)
6. Concurrent order creation (verify inventory constraint)

Each test should clean up after itself. Use transactions
that roll back after each test."

Service Layer Integration Tests

"Generate tests that verify OrderService correctly
integrates with PaymentService and InventoryService.

Use partial mocks: mock PaymentService.processPayment
but use the real InventoryService against the test database.

Test the full flow:
1. Order succeeds (inventory decremented, payment processed)
2. Payment fails (inventory NOT decremented — verify rollback)
3. Inventory insufficient (payment NOT attempted)
4. Partial fulfillment (some items available, some not)"
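Flow 2 (payment fails, inventory NOT decremented) is the one most worth verifying, because it only passes if the service implements a compensating rollback. A self-contained toy sketch of the pattern; the classes here are invented stand-ins for illustration, not the article's real services:

```typescript
// Real-ish inventory with reserve/release (the compensating action).
class InventoryService {
  private stock = new Map<string, number>([["sku-1", 5]]);
  reserve(sku: string, qty: number): void {
    const n = this.stock.get(sku) ?? 0;
    if (n < qty) throw new Error("insufficient inventory");
    this.stock.set(sku, n - qty);
  }
  release(sku: string, qty: number): void {
    this.stock.set(sku, (this.stock.get(sku) ?? 0) + qty);
  }
  available(sku: string): number {
    return this.stock.get(sku) ?? 0;
  }
}

// Partial mock: payment always fails, inventory is real.
const payment = {
  processPayment: async (): Promise<void> => {
    throw new Error("card declined");
  },
};

async function placeOrder(inv: InventoryService, sku: string, qty: number) {
  inv.reserve(sku, qty);
  try {
    await payment.processPayment();
  } catch (err) {
    inv.release(sku, qty); // compensating action: roll back the reservation
    throw err;
  }
}

const inv = new InventoryService();
placeOrder(inv, "sku-1", 2).catch(() => {
  console.log(inv.available("sku-1")); // 5 -> inventory was NOT decremented
});
```

Delete the release() call and the assertion on available() fails: that is exactly the regression this integration test exists to catch.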

Step 5: Review and Refine Generated Tests

The Test Quality Checklist

Generated tests often have these issues:

Issue 1: Tautological tests (testing nothing)

// BAD: Passes for any defined return value
test('should return result', () => {
  const result = add(2, 3);
  expect(result).toBeDefined(); // Does not verify correctness
});

// GOOD: This test verifies actual behavior
test('should return sum of two numbers', () => {
  expect(add(2, 3)).toBe(5);
});

Issue 2: Testing implementation, not behavior

// BAD: Brittle — breaks when implementation changes
test('should call database.query with SELECT', () => {
  userService.findById('123');
  expect(database.query).toHaveBeenCalledWith(
    'SELECT * FROM users WHERE id = $1', ['123']
  );
});

// GOOD: Tests behavior, not SQL query text
test('should return user by ID', async () => {
  const user = await userService.findById('123');
  expect(user.id).toBe('123');
  expect(user.name).toBeDefined();
});

Issue 3: Missing assertions

// BAD: No assertions. Passes whenever process() resolves, even with wrong results
test('should process order', async () => {
  await orderService.process(mockOrder);
});

// GOOD: Verify the outcome
test('should process order and update status', async () => {
  const result = await orderService.process(mockOrder);
  expect(result.status).toBe('processed');
  expect(result.processedAt).toBeDefined();
});

Review Workflow

For each generated test file:

  1. Run the tests: Do they all pass? If not, are the failures due to test bugs or implementation bugs?
  2. Mutation test: Introduce a deliberate bug in the implementation. Does at least one test fail? If no test catches the bug, the tests are not testing the right things.
  3. Read each assertion: Is it testing something meaningful? Could the test pass even if the function is broken?
  4. Check mock setup: Are mocks realistic? Do they return data that matches what the real dependency would return?
  5. Verify cleanup: Do tests clean up after themselves? Running the suite twice should produce the same results.
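Step 2 of the workflow can be demonstrated in miniature: a weak assertion survives a deliberate mutation, while a strong one catches it. A sketch using plain predicates in place of Jest matchers:

```typescript
// The real function and a deliberately mutated copy.
const add = (a: number, b: number) => a + b;
const addMutant = (a: number, b: number) => a - b; // injected bug: + became -

// Weak assertion (the toBeDefined style): checks only that something came back.
const weakPasses = (fn: (a: number, b: number) => number) =>
  fn(2, 3) !== undefined;

// Strong assertion (the toBe(5) style): checks the actual value.
const strongPasses = (fn: (a: number, b: number) => number) =>
  fn(2, 3) === 5;

console.log(weakPasses(add), weakPasses(addMutant));     // true true  -> mutant survives
console.log(strongPasses(add), strongPasses(addMutant)); // true false -> mutant caught
```

If every assertion in a generated test file is of the weak kind, the whole file can survive the mutation, and that file is not protecting you.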

Step 6: Establish a Testing Workflow

Pre-Commit: Generate Tests for New Code

Developer workflow:
1. Write the implementation
2. Ask Copilot: "Generate tests for the functions I just added"
3. Review and refine the generated tests
4. Run coverage to verify the new code is tested
5. Commit implementation and tests together

Sprint Planning: Allocate Test Coverage Time

Each sprint:
- 10% of sprint capacity allocated to test improvement
- Developer selects lowest-coverage service from the report
- Uses Copilot to generate tests for that service
- Reviews, refines, and merges
- Coverage improves incrementally each sprint

Coverage Target Progression

Month 1: 40% → 55% (focus on critical business logic)
Month 2: 55% → 65% (add error handling paths)
Month 3: 65% → 75% (add edge cases and integration)
Month 4: 75% → 80% (fill remaining gaps)
Month 5+: Maintain 80%+ (test new code as it is written)

Going from 0% to 80% coverage takes approximately 4-5 months with Copilot assistance (roughly 60% faster than manual test writing).

Measuring Test Quality Beyond Coverage

Coverage Is Necessary but Not Sufficient

100% code coverage with weak assertions is worse than 70% coverage with strong assertions. Track these additional metrics:

Metric               What it measures                     Target
Line coverage        Which lines execute during tests     80%+
Branch coverage      Which if/else branches are tested    75%+
Mutation score       What % of injected bugs are caught   70%+
Test-to-code ratio   Lines of test vs. lines of code      1:1 to 2:1
Test execution time  How long the suite takes             Under 5 min for unit tests
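The line and branch targets can be enforced in CI. A sketch of a jest.config.ts using Jest's standard coverageThreshold option; the numbers are this article's targets, not Jest defaults:

```typescript
// jest.config.ts — Jest fails the run when coverage drops below these thresholds.
export default {
  collectCoverage: true,
  coverageReporters: ["text", "json-summary", "html"],
  coverageThreshold: {
    global: {
      lines: 80,    // line coverage target from the table above
      branches: 75, // branch coverage target from the table above
    },
  },
};
```

With this in place, a pull request that adds untested code lowers the global percentage and fails the build, which keeps the maintenance phase (month 5+) honest.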

Mutation Testing

Use mutation testing to verify test quality:

# JavaScript (Stryker)
npx stryker run

# Python (mutmut)
mutmut run

# Go (go-mutesting)
go-mutesting ./...

Mutation testing changes your code (mutates a + to -, removes a condition, changes a return value) and checks if any test fails. If no test catches the mutation, your tests have a gap.

Frequently Asked Questions

Does Copilot generate tests that actually catch bugs?

Yes, when the tests have meaningful assertions. Copilot generates structurally correct tests, but you must verify that the assertions check real behavior. Review every assertion — accept the structure, validate the content.

Which testing framework does Copilot work best with?

Copilot works well with all major frameworks: Jest, Mocha, Vitest (JS/TS), pytest (Python), Go testing (Go), JUnit (Java), RSpec (Ruby). It adapts to your project’s existing test setup.

How do I handle test generation for private functions?

Test private functions through the public API that calls them. Ask Copilot: “Generate tests for the public methods of this class that exercise the private helper functions through the public interface.”

Can Copilot generate tests for legacy code with no existing tests?

Yes. This is one of Copilot’s strongest use cases. Point it at untested legacy code and ask for characterization tests (tests that document current behavior without judgment about correctness). These become the safety net for future refactoring.

Should I commit AI-generated tests as-is?

Never. Always review, run, and refine before committing. Generated tests are first drafts — they need the same review as generated production code.

How do I test async code with Copilot?

Copilot handles async well if you specify the framework. “Generate tests for this async function using Jest with async/await syntax. Test both the success and rejection paths of the Promise.”
