
Project P4: Software Testing

Learning Objectives

Students will be able to...

Instructions

In P1, you created requirements, UX storyboards and wireframes, and an architecture document for your project. In P2, you used an LLM to implement the frontend of your application based on these artifacts while stubbing the backend. In P3, you implemented the backend of your application and integrated it with the frontend. In P4, you will write unit tests for your frontend and backend files/classes.

Part 1: Implement the Tests

By the end of this sprint, you'll create unit tests for your application's features. You will also create a test script to run the tests on your local machine.

Use LLMs to build up your project's unit tests one test case at a time. For each test to be built, include the relevant user stories, wireframes, and architecture information in the LLM's context. The aim here is to do something akin to spec-driven development.

  • LLM-Driven Coding: Remember, use the LLMs as much as possible to generate the deliverables. You may not modify any generated graphics or code directly, only by prompting the LLM.
  • Git/GitHub: Use Git for version control for your team's source code and other artifacts. Create repositories in the course GitHub organization to push your work to. Use GitHub pull requests (PRs) to submit code to your team's project.
  • Code Review: Make sure that each PR is reviewed by at least one team member other than the author before it is accepted. Reviewers should use LLMs to assist them.
  • Collaborative Development: Each team member must take the lead on the creation of at least one substantive pull request. It is acceptable to pair program throughout the project.
  • Evolving Specs: It is expected that your project's specs will change and evolve as you work on the project. Be sure to manage the spec artifacts (e.g., with respect to version control and collaboration) the same way you would your code artifacts.
  • Be Vigilant: Watch out for nonsensical test cases or duplicate or significantly overlapping test cases. The LLM must not generate test cases for functions and functionalities that do not exist.
  • README: Add instructions regarding how to run the tests to the project README.md. If there are external dependencies (e.g., libraries/packages), be sure to mention those as well.
  • AI Chat Transcripts: For each PR, attach or link to all AI transcripts that produced the code and other artifacts being submitted in the PR.

For each function you aim to unit test, follow these steps:

  1. Write a test specification: A test specification is English-language text that describes the purpose of the function to be tested along with every program path that should be tested with a unique unit test. For each unit test, the document should describe the inputs to the function that are required to engage the desired program path and the expected output. Each specification should include a table of tests. Each row of the table should describe the purpose of the test, the test inputs to the function, and the test output that is expected if the test passes. You must write at least one unit test for every function. Your goal should be 100% statement coverage (or close to it), meaning that for each function to be tested, your unit tests will execute each line of code in the function at least once. Don't forget about exceptions. Note that you will need to use a tool to calculate code coverage — the LLM will not be able to do it.
  2. Implement the test spec by creating JavaScript (or TypeScript) unit tests: Ensure that your tests are isolated to the frontend or backend. That is, they should not test functionality that requires connecting across the network from the frontend to the backend or vice versa. If the function under test requires connecting to the other end, you must create mock objects that simulate the other end's public interface.
  3. Implement and run a test harness: Create scripts to set up the application and run the unit tests on your local machine. Ask the LLM to generate an npm script to set up the frontend or backend of your application, as needed, and then execute the tests with your testing framework.
  4. Fix bugs revealed by failed tests: When you ran your tests, did any of them fail? If so, ask your LLM for a plan on how it wants to fix the bugs (ask it for three alternative fixes). Choose the bug fix you like best and have the LLM make the change. Did your test case pass? Congratulations! If not, try again.

Testing Frameworks:

Part 2: Demo and Reflection Video

Once all team members have submitted their contributions, record a demonstration and reflection video as follows.

  • Record Teams Meeting: To create the video, gather your team together in a Teams meeting and record the meeting. Use screen sharing to capture app demos and slide presentations in the recording. Everyone must have their webcams on, so it can be seen that everyone is present and participating.
  • Test Tour and Demo: To begin the video, one team member must give a brief "nickel tour" of the team's test code and demonstrate that the tests run and pass. Limit the demo to no more than 8 minutes.
  • Reflection: Take turns with your teammates presenting answers to the reflection questions below. You may optionally use slides or your app as visual aids to support the reflections. Each team member must present the reflection for at least one question. As a team, agree ahead of time on what the response to each question will entail.

Reflection Questions:

  • How effective was the LLM in generating the test specification?
    What did you like about the result?
    What was wrong with what the LLM first generated?
    What were you able to fix easily?
    What problems were more difficult to fix?
  • How effective was the LLM in generating unit tests from the test specification?
    What did you like about the result?
    What was wrong with what the LLM first generated?
    What were you able to fix easily?
    What problems were more difficult to fix?
  • How did you verify that the LLM correctly did what you asked?
    How did you use the test framework or the LLM to help you understand if everything was done correctly?

How to Submit

  • Push all of the spec and code artifacts to your team's GitHub repository (or repositories).
  • Save your video to OneDrive (or Microsoft Clipchamp). Give instructors read permissions. Remove any link expiration (if applicable).
  • In Canvas, submit the URL(s) of the repository (or repositories) and the video.

Grading Rubric

Project assignments are graded as High-Pass/Low-Pass/Fail.

  • High-Pass: Pass score on all parts with High-Pass on half or more.
  • Low-Pass: Pass score on all parts.
  • Fail: Fails to meet requirements for Low-Pass.

Part 1 Rubric: Implement the Tests

  • High-Pass: Pass score on all subparts with High-Pass on half or more.
  • Low-Pass: Pass score on all subparts.
  • Fail: Fails to meet requirements for Low-Pass.

Subpart 1-1 Tests:

  • High-Pass: A suite of unit tests was implemented that achieves 100% statement coverage. The tests all pass. The tests can be run and statement coverage can be calculated by following the instructions in the top-level README.md file. Test specifications were included in the PRs and/or in the project artifacts.
  • Low-Pass: Unit tests mostly met the quantity and quality criteria, but a few glaring issues were evident. Statement coverage may not have been calculable. Test specifications may have been missing. Instructions in the README.md file mostly met expectations but contained a few glaring omissions or errors.
  • Fail: Fails to meet the requirements for Low-Pass.

Subpart 1-2 Collaborative AI-Assisted Development:

This subpart is graded individually for each team member.

  • High-Pass: The team member led at least one substantial PR that was reviewed and merged into the main branch. The PR included all relevant AI chat transcripts. The team member reviewed at least one PR created by another team member.
  • Low-Pass: The team member led a substantial PR, but there were a few glaring issues with the quality of the PR and/or with the AI chat transcripts, or the team member failed to review another team member's PR.
  • Fail: Fails to meet the requirements for Low-Pass.

Part 2 Rubric: Demo and Reflection Video

  • High-Pass: Pass score on all subparts with High-Pass on half or more.
  • Low-Pass: Pass score on all subparts.
  • Fail: Fails to meet requirements for Low-Pass.

Subpart 2-1 Test Tour and Demo:

  • High-Pass: Video tour and demonstration of the team's tests is presented such that the test code is well covered, the tests are run, and the demo is 8 minutes or less in length.
  • Low-Pass: Video tour and demonstration of the tests mostly meets expectations; however, there are a few glaring issues with the test-code coverage, and/or the demo is noticeably longer than 8 minutes (but not longer than 16).
  • Fail: Fails to meet the requirements for Low-Pass.

Subpart 2-2 Reflection Presentation:

This subpart is graded individually for each team member.

  • High-Pass: The team member presents answers to a subset of the reflection questions such that the presentation is clear, thoughtful, and connects to the team member's experiences.
  • Low-Pass: The team member presents answers to reflection questions; however, there are minor deficiencies with the quality of their presentation.
  • Fail: Fails to meet the requirements for Low-Pass.