SpecFlow is an open-source tool that aids Behavior-Driven Development (BDD) for .NET applications. It bridges the gap between technical and non-technical stakeholders by allowing developers to write executable specifications in the Gherkin language:
Given linkedin.com is opened in a browser
When a user logs in with incorrect password
Then the following error message is shown: "🤬"
🥒 Gherkin
A test engineer writes the needed glue code that can translate the Gherkin language into code, so a computer can execute it.
By leveraging the Gherkin language, SpecFlow enables clear communication and collaboration among team members, ultimately enhancing the quality and reliability of software projects.
Is SpecFlow dead?
Limited Updates: SpecFlow has seen a reduction in active development and fewer updates, as evidenced by the release history on GitHub, with the last stable release in May 2022 and the last beta version in February 2023.
Forked Project: Although SpecFlow is open-source, the name “SpecFlow” is trademarked by Tricentis. Consequently, the SpecFlow project was forked on GitHub and renamed ReqnRoll (which amusingly reminds me of a “Rick Roll”).
This looks dire for SpecFlow.
To upgrade or not to upgrade?
I asked ChatGPT 4o this question:
SpecFlow seems to be dead.
What would the reasons be to upgrade or not to upgrade to ReqnRoll?
Can you write it as short bullet points?
📝 AI prompt
and got the following answer:
Reasons to Upgrade to ReqnRoll
* Active Development: Regular updates and support.
* New Features: Enhanced functionality and tools.
* Active Community: More resources and community support.
* Better Compatibility: Improved integration with modern tools.
* Performance Boost: Faster and more efficient testing.

Reasons Not to Upgrade to ReqnRoll
* Migration Effort: Time-consuming and costly transition.
* Stability: SpecFlow might still meet current needs.
* Integration Issues: Potential disruptions with existing tools.
* Feature Gaps: ReqnRoll may lack some SpecFlow functionalities.
* Learning Curve: Need for retraining and adjustment.
🤖 AI response
I would like to add that:
Security Concerns: A test framework is not in production, so lack of security updates is not as big of a problem.
Proven Stability: The current version has been working for 2 years, so it will probably work for some time yet.
ReqnRoll is a new tool: Not all features have been migrated from SpecFlow to ReqnRoll yet. This is partially covered by the “Feature Gaps” point.
So, an upgrade is not urgent.
An important question is: “Why fix it, when it ain’t broken?”
Answer: Because everything else is upgraded, like .NET and dependencies. When something finally breaks and we have waited too long, an upgrade might not be possible anymore, and a lot may need to be re-coded. This is a very high risk, even though it is not yet urgent.
How to upgrade an existing test setup
The following approach can be used for:
SpecFlow → ReqnRoll
SpecFlow → something else
Something third → something fourth
I asked ChatGPT 4o again:
any suggestions on how to upgrade an existing test suite from one framework
to another? Please answer in steps and each step must be short.
📝 AI prompt
The key point here is, the more we use something, the better we understand it.
So, start with the basics and make a pilot migration on the most critical test cases to experience how they perform.
How to upgrade from one framework to another
When we have an older working test suite, then please keep it and use it.
I have learned from experience that a larger upgrade can end up blocking new tasks. Why? Because all the time is used on making the test suite work again.
When a larger update needs to happen, then create a copy of the test suite (be it a new branch, a fork, or a completely new repository).
Run both the legacy test suite and the new one.
When something breaks in the legacy test suite, then ask yourself: Can I fix it in max. 60 minutes? If yes, then fix it in the old test suite. If not, then migrate the test case to the new test suite and deprecate it in the legacy test suite.
When the test team has extra time (I know: imaginary thinking 😅), then migrate some test cases to the new test suite (start with the most critical ones).
Evaluate automatic migration vs. manual migration. Sometimes a script can migrate thousands of test cases in a short time (see the sketch after this list). Sometimes it is simply cheaper to let people copy-paste manually than to make a script work.
Deprecate the old test suite, when it’s not needed anymore.
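For the automatic option, even a small script can go a long way. Below is a minimal sketch of such a migration script in Python; the SpecFlow → ReqnRoll namespace swap and the tests folder path are illustrative assumptions, so adjust the replacement table to your own frameworks.
Python
# migrate.py - a minimal sketch of an automatic migration script.
# Assumption: C# binding files live under "tests/" and the main change
# is swapping the TechTalk.SpecFlow namespace for Reqnroll.
from pathlib import Path

replacements = {
    "TechTalk.SpecFlow": "Reqnroll",  # assumed namespace swap
}

for csFile in Path("tests").rglob("*.cs"):
    text = csFile.read_text(encoding="utf-8")
    for old, new in replacements.items():
        text = text.replace(old, new)
    csFile.write_text(text, encoding="utf-8")
    print(f"migrated: {csFile}")
💻 Code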
How to train the test team
Remember to train the team to use the new test suite.
1. Example test cases
Migrate some of the test cases yourself, so the team can be inspired by your code.
2. Unit-tests of helper classes as live-documentation
Remember to write unit-tests for your helper classes, to show examples of how the helper classes are supposed to be used (if you already have unit-tests in the legacy test suite, it will be easy to recreate the helper classes in the new test suite).
Proper unit-tests can be used as documentation! They also make it easier to migrate from one framework to another.
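As a minimal sketch of what that can look like (the DateHelper class and its method are hypothetical examples, not from a real suite):
Python
# dateHelper.py - a hypothetical helper class from a test suite
class DateHelper:
    @staticmethod
    def toIsoDate(day, month, year):
        return f"{year:04d}-{month:02d}-{day:02d}"

# The unit-test doubles as live-documentation: it shows exactly
# how DateHelper is supposed to be called and what it returns.
def test_toIsoDate_padsDayAndMonthWithZeros():
    assert DateHelper.toIsoDate(1, 7, 2024) == "2024-07-01"
💻 Code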
AI can translate code from one coding language to another (and from one test framework to another!). It can also help translate a piece of code to Gherkin, which is especially good for short, cryptic lines of code:
Please translate the following C# Code into Gherkin:

string result = $"{10:D5}";
📝 AI prompt
Translated into Gherkin:
Scenario: Formatting the integer 10 with leading zeros to ensure it is 5 digits long
Given an integer value of 10
When the integer is formatted using the pattern "D5" within an interpolated string
Then the result should be a string "00010"
🤖 AI response
How cool is that?
Conclusion
Selecting a framework is done by testing it out.
Changing a framework doesn't have to be done all at once.
Training the test team can be done by examples, unit-tests, and AI.
Our world is already changing fast. AI will make it change even faster. AI (with AI-learning) will also help us adapt faster.
Introduction on how to write a test before code
It can be very difficult to grasp how to write a test case before writing code. Many ask: “How can you test something that doesn't exist yet?”
It is because we need to view development from a different angle. Software is the logic that connects the expected inputs with the expected outputs.
In test driven development (TDD) we take one input and output and write them down as a test case. This test case will fail, since there is no code to support it (❌ RED). Then we can write the needed code/logic to make the test pass (✅ GREEN).
Then we take another input and output and write it down as another test case. This test case will most likely also fail (❌ RED). We will then update the code/logic to make the test pass (✅ GREEN). Then we need to make sure that all the previous test cases also pass. If not, then we need to refactor the code to make them pass (♻️ Refactor)
This is where ❌ RED → ✅ GREEN → ♻️ Refactor comes from.
Let’s try to implement a calendar. A calendar is something everybody can use, but it is not as easy to specify.
Choose either A or B:
A – Waterfall Method: Try to write down a specification for a calendar. (use between 5 and 15 min.)
B – Cowboy Coding: Try to code the calendar without any planning. (use between 5 and 15 min.)
Then review your specification with the following questions:
How easy was it?
How do you know, if you have all the critical parts?
What parts are non-critical?
How do you know a part is critical or not?
Fun observations
I prefer to call the Waterfall method “Waterfall Model Development (WMD)”, because the acronym also means Weapons of Mass Destruction, and that is not entirely false 😂
But Cowboy Coding is not much better: “Days of coding can save you minutes of planning” 😂😂
The Waterfall method and Cowboy coding both bring their own value to the table. TDD tries to combine them into tiny waterfalls, where each test case is its own tiny waterfall. (we could call TDD “Minimum Viable Waterfalls” 😂😂😂)
Minimum Viable Product
TDD is about making a minimum viable product, because we start with the most important test case and then add another test case, then another.
Of course, we might learn during development that something is more important than we previously thought. We then write it down as a test case, so it can be implemented. We can always remake the priority list (if we wrote the priorities down).
Our program will then grow, based on our experience and by each test case we will learn what is actually critical for the product. When all the critical parts are implemented, then we will have the minimum viable product.
Let’s try that.
TDD method: an alternative to waterfall and cowboy coding
Write down the most critical input and output that a calendar must handle, e.g. handling a date:
when we create the calendar with date "2024-07-12"
then we must be able to print it as "2024-07-12"
🥒 Gherkin
This test will fail, until we create the Calendar class:
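The original listing is not shown here, but a minimal first version of the class could look like this sketch:
Python
# The simplest Calendar that makes the first test pass:
# store the string, print the string.
class Calendar:
    def __init__(self, dateString):
        self.dateString = dateString

    def __str__(self):
        return self.dateString
💻 Code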
We don’t need to think about how to store the date values or anything yet. Currently the first test only needs to store a string value and be able to print it.
Adding Test Design Techniques
Then we add another critical input and output. This is where test design techniques are handy. Let's start with boundary-value analysis, where we test the minimum and maximum values:
when we create the calendar with date <input_date>
then we must be able to print it as <expected_data>

Examples:
description  | input_date   | expected_data
"min. date"  | "2024-01-01" | "2024-01-01"
"max. date"  | "2024-01-31" | "2024-01-31"
🥒 Gherkin
This will pass, so the code doesn’t need to be updated.
Extend with a negative test
But with boundary-value analysis we also need to test below the minimum value, so we extend the examples with an error_message column and a new row:
when we create the calendar with date <input_date>
then we must be able to print it as <expected_data>

Examples:
description  | input_date   | expected_data | error_message
"min. date"  | "2024-01-01" | "2024-01-01"  | null
"max. date"  | "2024-01-31" | "2024-01-31"  | null
"below min." | "2024-01-00" | null          | "day below minimum 01"
🥒 Gherkin
The first two tests pass, while the 3rd test fails, until we update the class:
Python
class Calendar:
    def __init__(self, dateString):
        year, month, day = dateString.split("-")
        if (int(day) < 1):
            raise ValueError(f"The day value '{day}' is below minimum value '01'")
        self.dateString = dateString

    def __str__(self):
        return self.dateString
💻 Code
Extend with another negative test
Then boundary-value analysis requires us to test above the max. value:
when we create the calendar with date <input_date>
then we must be able to print it as <expected_data>

Examples:
description  | input_date   | expected_data | error_message
"min. date"  | "2024-01-01" | "2024-01-01"  | null
"max. date"  | "2024-01-31" | "2024-01-31"  | null
"below min." | "2024-01-00" | null          | "day below minimum 01"
"above max." | "2024-01-32" | null          | "day above maximum 31"
🥒 Gherkin
Which again fails and we need to update the class:
Python
class Calendar:
    def __init__(self, dateString):
        year, month, day = dateString.split("-")
        if (int(day) < 1):
            raise ValueError(f"The day value '{day}' is below minimum value '01'")
        elif (int(day) > 31):
            raise ValueError(f"The day value '{day}' is above maximum value '31'")
        self.dateString = dateString

    def __str__(self):
        return self.dateString
💻 Code
Extend until fail
We will skip February to begin with. Then we will apply the boundary-value analysis to March (which will pass) and April (which will fail):
when we create the calendar with date <input_date>
then we must be able to print it as <expected_data>

Examples:
description      | input_date   | expected_data | error_message
"Jan below min." | "2024-01-00" | null          | "day below minimum 01"
"Jan min. date"  | "2024-01-01" | "2024-01-01"  | null
"Jan max. date"  | "2024-01-31" | "2024-01-31"  | null
"Jan above max." | "2024-01-32" | null          | "day above maximum 31"
"Mar max. date"  | "2024-03-31" | "2024-03-31"  | null
"Mar above max." | "2024-03-32" | null          | "day above maximum 31"
"Apr max. date"  | "2024-04-30" | "2024-04-30"  | null
"Apr above max." | "2024-04-31" | null          | "day above maximum 30"
🥒 Gherkin
So, we update our Calendar class. We can also refactor the class with the method: “checkAboveMax()” to make it easier to understand:
Python
class Calendar:
    monthsWith31Days = ["01", "03"]

    def __init__(self, dateString):
        year, month, day = dateString.split("-")
        if (int(day) < 1):
            raise ValueError(f"The day value '{day}' is below minimum value '01'")
        if (month in self.monthsWith31Days):
            self.checkAboveMax(day, 31)
        else:
            self.checkAboveMax(day, 30)
        self.dateString = dateString

    def checkAboveMax(self, day, max):
        if (int(day) > max):
            raise ValueError(f"The day value '{day}' is above maximum value '{max}'")

    def __str__(self):
        return self.dateString
💻 Code
Extending and grouping
Now we can extend this to the other months (except February) and group them into 31-day and 30-day months:
when we create the calendar with date <input_date>
then we must be able to print it as <expected_data>

Examples:
description      | input_date   | expected_data | error_message
"Jan below min." | "2024-01-00" | null          | "day below minimum 01"
"Jan min. date"  | "2024-01-01" | "2024-01-01"  | null

# 31 day months
"Jan max. date"  | "2024-01-31" | "2024-01-31"  | null
"Jan above max." | "2024-01-32" | null          | "day above maximum 31"
"Mar max. date"  | "2024-03-31" | "2024-03-31"  | null
"Mar above max." | "2024-03-32" | null          | "day above maximum 31"
"May max. date"  | "2024-05-31" | "2024-05-31"  | null
"May above max." | "2024-05-32" | null          | "day above maximum 31"
"Jul max. date"  | "2024-07-31" | "2024-07-31"  | null
"Jul above max." | "2024-07-32" | null          | "day above maximum 31"
"Aug max. date"  | "2024-08-31" | "2024-08-31"  | null
"Aug above max." | "2024-08-32" | null          | "day above maximum 31"
"Oct max. date"  | "2024-10-31" | "2024-10-31"  | null
"Oct above max." | "2024-10-32" | null          | "day above maximum 31"
"Dec max. date"  | "2024-12-31" | "2024-12-31"  | null
"Dec above max." | "2024-12-32" | null          | "day above maximum 31"

# 30 day months
"Apr max. date"  | "2024-04-30" | "2024-04-30"  | null
"Apr above max." | "2024-04-31" | null          | "day above maximum 30"
"Jun max. date"  | "2024-06-30" | "2024-06-30"  | null
"Jun above max." | "2024-06-31" | null          | "day above maximum 30"
"Sep max. date"  | "2024-09-30" | "2024-09-30"  | null
"Sep above max." | "2024-09-31" | null          | "day above maximum 30"
"Nov max. date"  | "2024-11-30" | "2024-11-30"  | null
"Nov above max." | "2024-11-31" | null          | "day above maximum 30"
🥒 Gherkin
Where we need to update our class again (just a single line):
Python
class Calendar:
    monthsWith31Days = ["01", "03", "05", "07", "08", "10", "12"]

    def __init__(self, dateString):
        year, month, day = dateString.split("-")
        if (int(day) < 1):
            raise ValueError(f"The day value '{day}' is below minimum value '01'")
        if (month in self.monthsWith31Days):
            self.checkAboveMax(day, 31)
        else:
            self.checkAboveMax(day, 30)
        self.dateString = dateString

    def checkAboveMax(self, day, max):
        if (int(day) > max):
            raise ValueError(f"The day value '{day}' is above maximum value '{max}'")

    def __str__(self):
        return self.dateString
💻 Code
Extend with exceptions
Then we can extend with months that have neither 31 nor 30 days (i.e., February):
Python
class Calendar:
    monthsWith31Days = ["01", "03", "05", "07", "08", "10", "12"]
    monthsWith30Days = ["04", "06", "09", "11"]

    def __init__(self, dateString):
        year, month, day = dateString.split("-")
        if (int(day) < 1):
            raise ValueError(f"The day value '{day}' is below minimum value '01'")
        if (month in self.monthsWith31Days):
            self.checkAboveMax(day, 31)
        elif (month in self.monthsWith30Days):
            self.checkAboveMax(day, 30)
        else:
            self.checkAboveMax(day, 29)
        self.dateString = dateString

    def checkAboveMax(self, day, max):
        if (int(day) > max):
            raise ValueError(f"The day value '{day}' is above maximum value '{max}'")

    def __str__(self):
        return self.dateString
💻 Code
More exceptions:
Then we can extend with leap years and non-leap years:
when we create the calendar with date <input_date>
then we must be able to print it as <expected_data>

Examples:
description                 | input_date   | expected_data | error_message
... # other
"Feb max. date (leap year)" | "2024-02-29" | "2024-02-29"  | null
"Feb above max.(leap year)" | "2024-02-30" | null          | "day above maximum 29"
"Feb max. date (not leap)"  | "2025-02-28" | "2025-02-28"  | null
"Feb above max.(not leap)"  | "2025-02-29" | null          | "day above maximum 28"
🥒 Gherkin
And update the class to deal with the leap year:
Python
class Calendar:
    monthsWith31Days = ["01", "03", "05", "07", "08", "10", "12"]
    monthsWith30Days = ["04", "06", "09", "11"]

    def __init__(self, dateString):
        year, month, day = dateString.split("-")
        if (int(day) < 1):
            raise ValueError(f"The day value '{day}' is below minimum value '01'")
        if (month in self.monthsWith31Days):
            self.checkAboveMax(day, 31)
        elif (month in self.monthsWith30Days):
            self.checkAboveMax(day, 30)
        else:
            if (self.isLeapYear(year)):
                self.checkAboveMax(day, 29)
            else:
                self.checkAboveMax(day, 28)
        self.dateString = dateString

    def isLeapYear(self, year):
        return int(year) % 4 == 0

    def checkAboveMax(self, day, max):
        if (int(day) > max):
            raise ValueError(f"The day value '{day}' is above maximum value '{max}'")

    def __str__(self):
        return self.dateString
💻 Code
Is this critical? Then extend even more!
As we can see, the class has the minimum viable product in it and we can always ask: “what more are we missing?” and “is it critical?”
We could add the centurial leap year exception: when a year ends on “00”, e.g. 2100, 2200, 2300, then what would be a leap year is not a leap year (yes, this is part of the Gregorian calendar).
The question is always: “Is the centurial leap year rule important for our application?”
If our app is not needed after the year 2099, then maybe this is not critical? There is no need to add test cases that will never be needed, or to maintain the code for them.
Let’s say we need them, so we add them to our tests:
when we create the calendar with date <input_date>
then we must be able to print it as <expected_data>

Examples:
description                               | input_date   | expected_data | error_message
... # other
"Feb max. date (leap year)"               | "2024-02-29" | "2024-02-29"  | null
"Feb above max.(leap year)"               | "2024-02-30" | null          | "day above maximum 29"
"Feb max. date (not leap)"                | "2025-02-28" | "2025-02-28"  | null
"Feb above max.(not leap)"                | "2025-02-29" | null          | "day above maximum 28"
"Feb max. date (centurial year, not leap)"| "2100-02-28" | "2100-02-28"  | null
"Feb above max.(centurial year, not leap)"| "2100-02-29" | null          | "day above maximum 28"
🥒 Gherkin
The “centurial year, not leap” test cases will fail again, and we can refactor our isLeapYear method in our Calendar class, to make it pass without breaking any test cases:
Python
class Calendar:
    monthsWith31Days = ["01", "03", "05", "07", "08", "10", "12"]
    monthsWith30Days = ["04", "06", "09", "11"]

    def __init__(self, dateString):
        year, month, day = dateString.split("-")
        if (int(day) < 1):
            raise ValueError(f"The day value '{day}' is below minimum value '01'")
        if (month in self.monthsWith31Days):
            self.checkAboveMax(day, 31)
        elif (month in self.monthsWith30Days):
            self.checkAboveMax(day, 30)
        else:
            if (self.isLeapYear(year)):
                self.checkAboveMax(day, 29)
            else:
                self.checkAboveMax(day, 28)
        self.dateString = dateString

    def isLeapYear(self, year):
        output = False
        if (int(year) % 4 == 0):
            if (int(year) % 100 != 0):
                output = True
        return output

    def checkAboveMax(self, day, max):
        if (int(day) > max):
            raise ValueError(f"The day value '{day}' is above maximum value '{max}'")

    def __str__(self):
        return self.dateString
💻 Code
How deep does the rabbit hole go?
The centurial leap year rule has its own exception: when a year is divisible by 400, like 1200, 1600, 2000, 2400, then the centurial year (which would not be a leap year, because it ends on “00”) is a leap year anyway 😂😂😂.
Again we must ask ourselves the question: “Is this critical for our app?” If our app is not needed around the years 2000 or 2400, then there is no need to implement it.
Let’s accept again that we need the app to work for the year 2000, so we implement the test cases:
when we create the calendar with date <input_date>
then we must be able to print it as <expected_data>

Examples:
description                               | input_date   | expected_data | error_message
... # other
"Feb max. date (leap year)"               | "2024-02-29" | "2024-02-29"  | null
"Feb above max.(leap year)"               | "2024-02-30" | null          | "day above maximum 29"
"Feb max. date (not leap)"                | "2025-02-28" | "2025-02-28"  | null
"Feb above max.(not leap)"                | "2025-02-29" | null          | "day above maximum 28"
"Feb max. date (centurial year, not leap)"| "2100-02-28" | "2100-02-28"  | null
"Feb above max.(centurial year, not leap)"| "2100-02-29" | null          | "day above maximum 28"
"Feb max. date (centurial year, leap)"    | "2000-02-29" | "2000-02-29"  | null
"Feb above max.(centurial year, leap)"    | "2000-02-30" | null          | "day above maximum 29"
🥒 Gherkin
Which again will require refactoring the isLeapYear method:
Python
class Calendar:
    monthsWith31Days = ["01", "03", "05", "07", "08", "10", "12"]
    monthsWith30Days = ["04", "06", "09", "11"]

    def __init__(self, dateString):
        year, month, day = dateString.split("-")
        if (int(day) < 1):
            raise ValueError(f"The day value '{day}' is below minimum value '01'")
        if (month in self.monthsWith31Days):
            self.checkAboveMax(day, 31)
        elif (month in self.monthsWith30Days):
            self.checkAboveMax(day, 30)
        else:
            if (self.isLeapYear(year)):
                self.checkAboveMax(day, 29)
            else:
                self.checkAboveMax(day, 28)
        self.dateString = dateString

    def isLeapYear(self, year):
        output = False
        if (int(year) % 4 == 0):
            if (int(year) % 100 != 0):
                output = True
            else:
                if (int(year) % 400 == 0):
                    output = True
        return output

    def checkAboveMax(self, day, max):
        if (int(day) > max):
            raise ValueError(f"The day value '{day}' is above maximum value '{max}'")

    def __str__(self):
        return self.dateString
💻 Code
More features
Today's date: Method to get today's date.
Date Arithmetic: Methods for plusDays, plusMonths, plusYears, minusDays, minusMonths, and minusYears.
Refactor the date storage from string to ints: Instead of storing “2024-07-12” we would store year: 2024, month: 7, day: 12.
Support for adding/subtracting weeks: plusWeeks and minusWeeks.
Date Comparison: Methods to compare dates, such as isBefore, isAfter, isEqual.
Range check: Method to check if a date is within a certain range: isWithinRange(startDate, endDate).
Day of the Week: Method to get the day of the week for a given date: getDayOfWeek.
Date Formatting: Method to format dates in different styles: formatDate(style).
Flexible Date Parsing: Handle various date formats, such as “2024-7-01”, “2024-07-1”, “24-07-01”, “July 1, 2024”, “1st July 2024”, and “01/07/2024”.
Duration Calculation: Method to calculate the duration between two dates in days, months, years.
etc.
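If we decided that, say, Date Comparison is critical, the TDD loop would simply start again with a failing test. A sketch (isBefore is one of the hypothetical methods from the list above):
Python
# ❌ RED: fails until Calendar gets an isBefore method
def test_isBefore():
    # given two dates
    earlier = Calendar("2024-07-11")
    later = Calendar("2024-07-12")
    # when/then
    assert earlier.isBefore(later)
💻 Code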
There are many possibilities, the question is: “what is critical?”
TDD is a risk-based testing approach!
Rapid prototyping
We all thought we knew what a calendar was, but how many people were able to specify it in the first waterfall/cowboy coding exercise?
Knowledge is not a boolean (either true or false). Knowledge is more a taxonomy, where you can ask yourself the following questions:
Can you recognize it, when you see it?
Can you operate it?
Can you repair it?
Can you build it from scratch?
Most people can recognize a helicopter. Few people can fly it. Fewer can repair it. Even fewer can build one from scratch.
The same goes for coding. Sometimes we have a feeling of how something should be coded. Often we have no idea where to even start.
TDD lets us experience something one test at a time, so it becomes more clear with each test we make.
It will not be perfect, but often “something” is better than “nothing”.
One could call it “Rapid prototyping,” where each new test case is a tiny prototype that needs to be tested. Or as we mentioned earlier: “Minimum Viable Waterfalls” 😂😂😂
Conclusion
Let’s wrap up our journey into Test-Driven Development (TDD), shall we? 🚀
Introduction to Writing Tests Before Code We have learned that writing test cases before writing code requires a different perspective, viewing development as the logic connecting expected inputs with expected outputs.
TDD Cycle We have learned the iterative cycle of TDD: writing a test case (❌ RED), writing code to pass the test (✅ GREEN), and then refactoring the code (♻️ Refactor) to ensure all tests pass.
Comparison with Other Methods We have learned that TDD offers a balanced approach compared to the Waterfall Method and Cowboy Coding, combining the structured planning of Waterfall with the flexibility of Cowboy Coding.
Creating a Minimum Viable Product (MVP) We have learned that TDD focuses on creating a minimum viable product by starting with the most critical test cases and progressively adding more, ensuring the product evolves based on real requirements.
Example: Calendar Application We have learned how to implement a calendar application using TDD, starting with simple date handling and progressively adding more complex requirements like boundary values, negative tests, and special cases like leap years.
Boundary-Value Analysis We have learned to use boundary-value analysis to test minimum and maximum values, ensuring our application handles edge cases correctly.
Negative Testing We have learned to extend our tests with negative cases, such as below minimum and above maximum values, to ensure the application correctly handles invalid inputs.
Extending with Exceptions We have learned to handle special cases, like different month lengths and leap years, by extending our test cases and updating the implementation accordingly.
Critical Considerations We have learned the importance of identifying and prioritizing critical functionalities, and considering the scope and longevity of the application when deciding which test cases to implement.
Refactoring We have learned to refactor our code to make it more understandable and maintainable, ensuring all test cases pass after each change.
Ensuring Comprehensive Coverage We have learned to ask critical questions about the application’s requirements and to extend our tests as necessary, ensuring comprehensive coverage of all important functionalities.
More Features We have explored additional features, like getting today's date, based on critical needs. It's a risk-based testing approach.
Rapid Prototyping We have learned that knowledge in coding is not binary; it’s a spectrum. TDD allows us to experience and refine our understanding one test at a time, making the development process clearer with each test. This approach can be seen as “Rapid Prototyping,” where each new test case is a tiny prototype that needs validation. Or, as humorously mentioned, “Minimum Viable Waterfalls” 😂😂😂.
Introduction to Unit Testing
So, what exactly is Unit Testing?
When we need to develop code, we first determine the inputs our code will receive and the outputs it should produce. Then, we write the required logic to process these inputs into the desired outputs.
A unit test is essentially a description of these inputs and the expected outputs. By writing unit tests, we can verify that the piece of code we are building fulfills these expectations. It’s like setting up mini-experiments to ensure each part of our program behaves correctly.
Unit testing ensures our code is reliable and maintainable, providing a solid foundation for our software projects. It’s like having a safety net that catches errors before they cause trouble, giving us peace of mind and making our coding adventures much smoother!
With Python we can use PyTest or RobotFramework. With Groovy we can use Spock. With JavaScript we can use Mocha and Chai. With C# I have no idea, but ChatGPT suggests NUnit and xUnit.
The test framework is not that important, the most important part is to use it wisely.
Often it is better to use a simple test-framework and add new features others have developed or develop them yourself.
How to learn a test framework?
Testing is the basis for knowledge and one of the best ways to learn something. We simply make small tests (experiments) to see what succeeds, what fails, and what can be learned from it.
In the next steps I will give you some of these tests, so you can learn the test-framework you want in any coding language you want.
We will take it one test case at a time, just like we would develop with Test Driven Development :-)
I will first show the first unit-test in Python with pytest, and then show how AI can translate it to JavaScript with Mocha and Chai, then into Groovy and Spock.
Test 1: to make it pass
The first test we are going to make must pass. First, we need to install the framework:
pip install pytest
💻 Terminal input
and make a test.py file:
Python
import pytest

def test_mustPass():
    assert 1+2 == 3, "this test must pass, but it failed :("
💻 Code
To run it we need to:
pytest test.py
💻 Terminal input
Which will give the following result:
test.py .                                                          [100%]
========================== 1 passed in 0.00s =============================
💻 Terminal result
Important: the method name needs the prefix test_, otherwise pytest will not run this method as a test.
Other languages and test frameworks
With a simple AI prompt:
Can you translate the following code into Javascript with Mocha and Chai?
(please use suite, test, and assert, instead of describe, it, and expect)

"""python
import pytest

def test_mustPass():
    assert 1+2 == 3, "this test must pass, but it failed :("
"""
📝 AI prompt
const assert = require('chai').assert;

suite('Simple Addition Tests', function () {
    test('mustPass', function () {
        assert.equal(1 + 2, 3, "this test must pass, but it failed :(");
    });
});
🤖 AI response
Test 2: to make it fail
The second test needs to fail, so we can see how it fails! So, we add another test case:
Python
import pytest

def test_mustPass():
    assert 1+2 == 3, "this test must pass, but it failed :("

def test_mustFail():
    assert 1+2 == 4, "this test must fail, and it did :)"
💻 Code
Which will give the following result:
====================== short test summary info ===========================
FAILED test.py::test_mustFail - AssertionError: this test must fail, and it did :)
===================== 1 failed, 1 passed in 0.01s ========================
💻 Terminal result
This is of course a manual test.
Test 3: to make it crash (neither pass nor fail)
The 3rd test needs to raise an exception, so it can't complete.
Python
import pytest

def test_mustPass():
    assert 1+2 == 3, "this test must pass, but it failed :("

def test_mustFail():
    assert 1+2 == 4, "this test must fail, and it did :)"

def test_mustCrash():
    raise RuntimeError("CRASH! ;-)")
    assert 1+2 == 3, "this must crash, but it failed :("
💻 Code
Which will give the following result:
====================== short test summary info ===========================
FAILED test.py::test_mustFail - AssertionError: this test must fail, and it did :)
FAILED test.py::test_mustCrash - RuntimeError: CRASH! ;-)
===================== 2 failed, 1 passed in 0.01s ========================
💻 Terminal result
Pytest doesn't show much of a difference between these two failures, but one is an AssertionError (test failed) and the other a RuntimeError (test not completed / crashed / stopped). Some frameworks give these different colors/icons, like:
✔️ passed
❌ failed
⚠️ not completed / crashed / stopped
🏋️ Exercise
Take the programming language of your choice. Select a test framework for it (ask an AI like ChatGPT about it, if you don't know any). Make 3 test cases: one that passes, one that fails, and one that crashes. Be inspired by the examples above.
Test 4: Let’s automate the passed, failed, and crashed test
Testing how something works is a great way to learn. We test how things work, and how they fail, so we can recover better from failures.
To remember or share our knowledge, we can write it down as documentation. A great way is to write the documentation as unit-tests, because we can run them.
When all the unit-tests pass, then the documentation is up-to-date. When a unit-test fails/crashes, then the documentation needs to be updated.
This is called live-documentation, because the documentation is alive and evolving with the system under development.
Let’s try with the 1st test (pass):
In order to automate our test cases, we first need to rename the test_ prefix (to toBeTested_), so pytest won't run the original functions automatically.
Python
import pytest

def toBeTested_mustPass():
    assert 1+2 == 3, "this test must pass, but it failed :("
💻 Code
Then we can add a new unit-test that we want pytest to run, so it must start with test_ prefix:
Python
import pytest

def toBeTested_mustPass():
    assert 1+2 == 3, "this test must pass, but it failed :("

def test_toBeTested_mustPass():
    toBeTested_mustPass()
    assert True
💻 Code
This will of course pass:
test.py .                                                          [100%]
========================== 1 passed in 0.00s =============================
💻 Terminal result
We write assert True at the end, because if toBeTested_mustPass were to fail, the test would stop there.
Let's try to make it fail, by replacing toBeTested_mustPass with toBeTested_mustFail, and experience what happens:
Python
import pytest

def toBeTested_mustFail():
    assert 1+2 == 4, "this test must fail, and it did :)"

def test_toBeTested_mustPass():
    toBeTested_mustFail()  # we changed this line from pass to fail
    assert True
💻 Code
====================== short test summary info ===========================
FAILED test.py::test_toBeTested_mustPass - AssertionError: this test must fail, and it did :)
========================== 1 failed in 0.01s =============================
💻 Terminal result
Let’s try to contain the 2nd test (fail):
We need to use try and except (in other languages it is called try and catch):
Python
import pytest

def toBeTested_mustFail():
    assert 1+2 == 4, "this test must fail, and it did :)"

def test_toBeTested_mustFail():
    # Given
    errorMessage = None
    # When
    try:
        toBeTested_mustFail()
    except AssertionError as e:
        errorMessage = str(e)
    # Then
    assert errorMessage == "this test must fail, and it did :)\nassert (1 + 2) == 4"
💻 Code
test.py .                                                          [100%]
========================== 1 passed in 0.00s =============================
💻 Terminal result
Many test frameworks contain error and exception handlers that can be used instead:
Python
import pytest

def toBeTested_mustFail():
    assert 1+2 == 4, "this test must fail, and it did :)"

def test_toBeTested_mustFail():
    with pytest.raises(AssertionError, match=r"this test must fail, and it did :\)\nassert \(1 \+ 2\) == 4"):
        toBeTested_mustFail()
💻 Code
But I really dislike it, because the readability is horrible.
I often set this rule: when code can't be split into given, when, then parts, the readability suffers. It is like with regular language:
When the function is called and expected to fail with an AssertionError
Then the error message must match the expected message
🥒 Gherkin
The function is called and expected to fail with an AssertionError, and the exception message should match the expected regular expression
🥒 Gherkin
Both are readable, but I will let you decide which one is easier to read.
I know that some programmers will disagree with me and that is fine. We all have preferences and different contexts to work in.
Let's try to contain the 3rd test (crash):
It is almost the same, except that we contain the RuntimeError instead of the AssertionError:
Python
def toBeTested_mustCrash():
    raise RuntimeError("CRASH! ;-)")
    assert 1+2 == 3, "this must crash, but it failed :("

def test_toBeTested_mustCrash():
    # given
    errorMessage = None
    # when
    try:
        toBeTested_mustCrash()
    except RuntimeError as e:
        errorMessage = str(e)
    # then
    assert errorMessage == "CRASH! ;-)"
💻 Code
test.py .                                                          [100%]
========================== 1 passed in 0.00s =============================
💻 Terminal result
It's very straightforward.
🏋️ Exercise
Make a unit test for each of your 3 test cases that passed, failed and crashed.
Data driven testing
Sometimes it is better to have a single test that is data driven, than to have multiple tests.
The balance between them is the readability of the test.
Let's try to make the first 3 tests (pass, fail, crash) data driven.
The original tests looked like this:
Python
import pytest

def test_mustPass():
    assert 1+2 == 3, "this test must pass, but it failed :("

def test_mustFail():
    assert 1+2 == 4, "this test must fail, and it did :)"

def test_mustCrash():
    raise RuntimeError("CRASH! ;-)")
    assert 1+2 == 3, "this must crash, but it failed :("
💻 Code
Which can be made into:
Python
import pytest
from dataclasses import dataclass, field

@dataclass
class TestCase:
    name: str
    input: int
    expected: int
    raisesException: bool

testCases = [
    TestCase(name="1 must_pass", input=1+2, expected=3, raisesException=False),
    TestCase(name="2 must_fail", input=1+2, expected=4, raisesException=False),
    TestCase(name="3 must_crash", input=1+2, expected=4, raisesException=True),
]

ids = [testCase.name for testCase in testCases]
params = [testCase for testCase in testCases]

@pytest.mark.parametrize("testData", params, ids=ids)
def test_cases(testData: TestCase):
    # when
    if testData.raisesException:
        raise RuntimeError("CRASH! ;-)")
    else:
        assert testData.input == testData.expected, "this test must fail, and it did :)"
💻 Code
====================== short test summary info ===========================
FAILED test.py::test_cases[2 must_fail] - AssertionError: this test must fail, and it did :)
FAILED test.py::test_cases[3 must_crash] - RuntimeError: CRASH! ;-)
===================== 2 failed, 1 passed in 0.02s ========================
💻 Terminal result
I like to use something called @dataclass, from which we can build test cases, and which is supported by autocomplete!
Python
...
from dataclasses import dataclass, field

@dataclass
class TestCase:
    name: str
    input: int
    expected: int
    raisesException: bool
...
💻 Code
Then I can define my test cases as a list of TestCase objects:
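Repeating the relevant lines from the full listing above:
Python
testCases = [
    TestCase(name="1 must_pass", input=1+2, expected=3, raisesException=False),
    TestCase(name="2 must_fail", input=1+2, expected=4, raisesException=False),
    TestCase(name="3 must_crash", input=1+2, expected=4, raisesException=True),
]
💻 Code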
Then we transform the testCases into something pytest understands:
Python
...
ids = [testCase.name for testCase in testCases]
params = [testCase for testCase in testCases]

@pytest.mark.parametrize("testData", params, ids=ids)
def test_cases(testData: TestCase):
    if testData.raisesException:
        raise RuntimeError("CRASH! ;-)")
    else:
        assert testData.input == testData.expected, "this test must fail, and it did :)"
💻 Code
🏋️ Exercise
Make your 3 original test cases (that passed, failed, and crashed) data driven, so you can experience how a data-driven test does all 3 things. (Some test frameworks stop at the first fail, which is not good.)
Making unit-tests understandable
It is really important to form the test cases so that they are easy to understand.
A test case that is not understandable is valueless, and we would be better off without it.
If we take the example from the previous chapter:
Python
import pytest
from dataclasses import dataclass, field

@dataclass
class TestCase:
    name: str
    input: int
    expected: int
    raisesException: bool

testCases = [
    TestCase(name="1 must_pass", input=1+2, expected=3, raisesException=False),
    TestCase(name="2 must_fail", input=1+2, expected=4, raisesException=False),
    TestCase(name="3 must_crash", input=1+2, expected=4, raisesException=True),
]

ids = [testCase.name for testCase in testCases]
params = [testCase for testCase in testCases]

@pytest.mark.parametrize("testData", params, ids=ids)
def test_cases(testData: TestCase):
    # when
    if testData.raisesException:
        raise RuntimeError("CRASH! ;-)")
    else:
        assert testData.input == testData.expected, "this test must fail, and it did :)"
💻 Code
Then it is longer than the following example:
Python
import pytest

@pytest.mark.parametrize("i, e, rte", [
        (1+2, 3, False),  # this test must pass
        (1+2, 4, False),  # this test must fail
        (1+2, 3, True)    # this test must crash
    ],
    ids=["must_pass", "must_fail", "must_crash"])
def test_cases(i, e, rte):
    if rte:
        raise RuntimeError("CRASH! ;-)")
    else:
        assert i == e, "this test must fail, and it did :)"
💻 Code
Except, it is much harder to understand and maintain:
What does i, e, and rte mean?
the ids and the comments need to be paired and updated manually.
The parameters can be figured out here, but 7+ parameters, different value lengths, and 10+ test cases would make this hell to maintain!
So, please use:
Readable parameter names like: number1 + number2 == result and not n1+n2==r
Group parameters into inputs and expected outputs, so it is easy to understand what transformation needs to be done (not necessarily how it is done). In Python we can use @dataclass, as shown in the sketch after this list.
Use test ids/descriptions to more easily navigate which tests have failed.
Try to create the context the test case needs to work within. Then it will be easier to understand, why something works the way it works.
Additionally, use a Skipped/Ignored category in case you want a test case skipped (described in Skipping tests below).
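A sketch of how these guidelines can look combined (the names here are illustrative, not a fixed convention):
Python
import pytest
from dataclasses import dataclass

@dataclass
class Inputs:
    number1: int
    number2: int

@dataclass
class Expected:
    result: int

@dataclass
class AdditionCase:
    name: str            # readable test id/description
    inputs: Inputs       # grouped inputs
    expected: Expected   # grouped expectations
    skip: str = None     # optional Skipped/Ignored reason

cases = [
    AdditionCase(name="simple addition", inputs=Inputs(1, 2), expected=Expected(3)),
]

@pytest.mark.parametrize("case", cases, ids=[c.name for c in cases])
def test_addition(case: AdditionCase):
    if case.skip:
        pytest.skip(case.skip)
    assert case.inputs.number1 + case.inputs.number2 == case.expected.result
💻 Code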
🏋️ Exercise
Go through your previous exercises and evaluate if they are understandable – if not, then please improve them.
Skipping tests
Sometimes we can find a bug, and a test will fail.
There is a dangerous question to ask here, one that many have opinions about.
Imagine a test starts to fail, because of a bug. What should we do?
Fix the bug as soon as possible!
Let the test fail, until it is fixed?
Mark the test with @Ignore and a Jira-bug, so we can fix it soon.
I have tried multiple approaches and all of them have a price.
When it is not possible to fix all bugs, then we fix only the most critical ones.
When we are not in production yet, then a bug might be critical, but not urgent.
When we see a red test report, then we can get used to the red color. Non-critical bugs can make it impossible to see the critical ones.
So, an @Ignore or @skip function can be a good thing, as long as we remember to give it a comment or a link to a Jira story/bug.
In Pytest we can skip tests with:
Python
import pytest

def test_mustPass():
    assert 1+2 == 3, "this test must pass, but it failed :("

@pytest.mark.skip(reason="Jira: bug-101")
def test_mustFail():
    assert 1+2 == 4, "this test must fail, and it did :)"

@pytest.mark.skip(reason="Jira: bug-102")
def test_mustCrash():
    raise RuntimeError("CRASH! ;-)")
    assert 1+2 == 3, "this must crash, but it failed :("
💻 Code
test.py .                                                          [100%]
===================== 1 passed, 2 skipped in 0.01s =======================
💻 Terminal result
Then we can always read about the bug and status in the Jira bug.
A data-driven test can be skipped a little differently: add a property to TestCase called skip, assign it to each TestCase, and then add the skip handling to the implementation:
Python
import pytest
from dataclasses import dataclass, field

@dataclass
class TestCase:
    name: str
    input: int
    expected: int
    raisesException: bool
    skip: str

testCases = [
    TestCase(name="1 must_pass", input=1+2, expected=3, raisesException=False, skip=None),
    TestCase(name="2 must_fail", input=1+2, expected=4, raisesException=False, skip="Jira: bug-101"),
    TestCase(name="3 must_crash", input=1+2, expected=4, raisesException=True, skip="Jira: bug-102"),
]

ids = [testCase.name for testCase in testCases]
params = [testCase for testCase in testCases]

@pytest.mark.parametrize("testData", params, ids=ids)
def test_cases(testData: TestCase):
    if (testData.skip):
        pytest.skip(testData.skip)
    if testData.raisesException:
        raise RuntimeError("CRASH! ;-)")
    else:
        assert testData.input == testData.expected, "this test must fail, and it did :)"
💻 Code
test.py .                                                          [100%]
===================== 1 passed, 2 skipped in 0.02s =======================
💻 Terminal result
🏋️ Exercise
Try to make a skip in one of your regular tests and one in the data-driven tests (you may only skip one of the sub-tests in the data-driven test).
Reporting
Sometimes it can be a great idea to make a test report in HTML. It can make it easier to get a quick overview or navigate through the test cases. For more complex tests it can also create a better overview in a visual way.
An HTML report is much more readable, especially when the complexity grows and we add e.g. timezones.
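As an example, with pytest one option is the third-party pytest-html plugin (assuming you are on pytest; other frameworks have similar reporters):

pip install pytest-html
pytest test.py --html=report.html

💻 Terminal input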
I will not go much into reporting in this chapter, but will write a separate one, which will contain all kind of good ideas, incl. testing of report templates.
Conclusion
Let's wrap up our journey into unit testing, shall we? 🚀
Needed tools There are many programming languages and even more test-frameworks. To compare them better, it is recommended to test them out.
How to Learn a Test Framework We have learned that creating small, incremental tests helps us understand the outcomes and build knowledge of the test framework, similar to the principles of Test-Driven Development (TDD).
Test 1: To Make It Pass We have learned how to set up a basic test that is designed to pass, involving the installation of the framework, writing a simple test, and ensuring it runs successfully.
Test 2: To Make It Fail We have learned the importance of including a test case meant to fail, as it helps us understand how the test framework handles and reports failures.
Test 3: To Make It Crash We have learned to create a test that raises an exception to simulate a crash, which helps distinguish between assertion errors and runtime errors in the test results.
Automating Tests We have learned to automate tests by renaming methods to avoid automatic execution, using wrapper tests to verify behavior, and ensuring non-crashing tests pass successfully.
Containing Failures and Crashes We have learned to handle expected errors using try-except blocks (or try-catch in other languages) to manage assertion errors and runtime exceptions effectively.
Data-Driven Testing We have learned to consolidate multiple tests into a single parameterized test using @dataclass to define test cases, which enhances readability and maintainability.
Making Tests Understandable We have learned the importance of clear and maintainable tests by using descriptive parameter names, grouping inputs and expected results, and avoiding cryptic variable names.
Skipping Tests We have learned to mark tests that should not run due to known issues with @skip or @Ignore, and to provide reasons or links to bug tracking systems for reference.
Reporting We have learned the value of HTML reports for better visual representation and navigation of test results, especially useful in more complex scenarios.
Intro
Let’s dive into the world of Test Driven Development (TDD) together, shall we? 🚀
So, what exactly is TDD? Well, it's a way to describe things in a programming language. Imagine it's like telling a story, but instead of using words, we're using code!
Here’s how it works: when we want to describe something using English, we focus on describing one thing at a time.
It’s like building with Lego bricks – we start with one piece and add more as we go along!
But here's the cool part: just like in a science such as physics, we can even test our code to see if it behaves the way we expect it to. It's all about making sure our code is usable and understandable, just like in a science experiment!
So, think of TDD as our own little computer-science experiment. We write down our specifications and then make sure the computer behaves the way we want it to 🤖✨
First we split our code into 3 parts: “given”, “when”, and “then”
Python
# Given (where we set our preconditions)
number1 = 1
number2 = 2

# When (where the action happens)
result = number1 + number2

# Then (where we assert/return our output)
print(result)
💻 Code
With the print at the end, we can verify the result manually:
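3

💻 Terminal result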
The “given”, “when”, and “then” are usually known in TDD as the triple A's: “Arrange”, “Act”, and “Assert”. They do the same, except given, when, then is easier to read, while arrange, act, assert is more technical.
Python
# Given two numbers are defined
# When the numbers are added together into the result
# Then the result must be verified to be the sum

# Arrange: two numbers are defined
# Act: the numbers are added together to get the result
# Assert: verify the result is the sum of the numbers
💻 Code
Assert instead of print
What we can do instead, is to assert it automatically, by replacing print with assert:
Python
# Given
number1 = 1
number2 = 2

# When
result = number1 + number2

# Then
assert(result == 3)
💻 Code
The == means that we ask the computer to verify whether the two sides are actually equal. It responds with either true (they are equal) or false (they are not equal). The assert consumes the true or false, and if it's true, it does nothing, because everything is fine:
Python3 2024-06-03 on Linux x86_64
----
💻 Terminal result
If it's true, then the code passes and everything is fine (even though we can't see it, no news is good news!).
Assert & print
If we want to see the result, we can print it:
Python
# Given
number1 = 1
number2 = 2

# When
result = number1 + number2

# Then
print(f"result: {result}")
assert(result == 3)
💻 Code
The f in the f”…” means it is a text with parameters, where we can take the value of result. It has different names in different programming languages:
Language   | Example                  | Name
-----------|--------------------------|--------------------
Python     | f"text {parameter} text" | f-string
JavaScript | `text ${parameter} text` | template string
C#         | $"text {parameter} text" | interpolated string
Groovy     | "text ${parameter} text" | g-string
Negative test
To see a negative result, we need to replace the 3 with a 4 (because 1+2 is not 4):
Python
# Given
number1 = 1
number2 = 2

# When
result = number1 + number2

# Then
assert(result == 4)
💻 Code
Then we will see an assertion error, because the result is not 4:
Python3 2024-06-03 on Linux x86_64
----
Traceback (most recent call last):
  File "apps/console-python/", line 9, in <module>
    assert(result == 4)
AssertionError
💻 Terminal result
Better error messages
To make the failure easier to understand, we can add an actual error message:
Python
# Given
number1 = 1
number2 = 2

# When
result = number1 + number2

# Then
assert(result == 4), f"(actual) {result} != 4 (expected)"
💻 Code
Which gives:
Python3 2024-06-03 on Linux x86_64
----
Traceback (most recent call last):
  File "apps/console-python/", line 9, in <module>
    assert(result == 4), f"(actual) {result} != 4 (expected)"
AssertionError: (actual) 3 != 4 (expected)
💻 Terminal result
The f-string takes the value of result and puts it inside the text. So, f"{result} != 4" becomes: "3 != 4".
You can see, some languages are more fun than others 😂
Why use an assert, when print makes it much easier?
In a simple program like that, we can use a print.
A more complex program may have 20 kinds of outputs, depending on the inputs. It is much easier, faster and more precise to automate all the outputs, instead of testing them manually:
Python
# Imagine we have already a complex program called buyTicket(numberOfZones, travelerType)
# here would the tests be:
assert (buyTicket(1, "adult") == "2 €")
assert (buyTicket(8, "adult") == "16 €")
assert (buyTicket(1, "child") == "1 €")
assert (buyTicket(8, "child") == "8 €")
# etc...
💻 Code
No need to run buyTicket manually 20 times and check that all the values are correct. A machine can run 1000 test cases per minute, and we humans can't. There is also no need to remember all the results, when the machine can do it for us.
Testing before coding
TDD is not only about testing the code. It's more about writing down the assumptions we have in our head.
It's about getting the idea out of your head and defining where we start and where we need to end.
So, often we need to start with the given’s (where we start) and the then’s (where we need to end):
Python
# Given
# write your code before writing the when code

# When
# write your code after writing the given and then code

# Then
# write your code before writing the when code
💻 Code
The when code is written last, because the given and then code is the specification of the wanted system behavior. The when code is the required logic that connects the given and the then.
And whenever we want to change the specification, we just need to change the given and the then code. Then the existing when code will fail, and will therefore need to be refactored (updated).
Exercise:
Write the needed when code, so all the asserts pass. Please don’t change the code in the given and then.
Python
# Given (written before when)
number1 = 1
number2 = 2
number3 = 3
number4 = 5

# When (written after given and then)
# your code goes here

# Then (written before when)
assert(result1 == 3), f"{result1} != 3"
assert(result2 == 4), f"{result2} != 4"
assert(result3 == 5), f"{result3} != 5"
assert(result4 == 6), f"{result4} != 6"
assert(result5 == 7), f"{result5} != 7"
assert(result6 == 8), f"{result6} != 8"
💻 Code
In the next lessons, we will do some coding exercises.
I would like you to use the assert, so you don’t have to remember the results from all the exercises.
You are welcome to add a print, if you want to. It can be very useful to use a print to get a view inside and see what the code is doing.
Saving and loading
In the web editor (in case you want to use that to begin with), there is no option to save your code, but what you can do is to copy & paste your code into a notepad for future use.
Then you can always copy & paste it back into a web editor, if you want to use it again. It is also possible to have multiple web editors open, without them interfering with each other.
Exercise:
Add an assert after the print. And make sure the whole code passes.
Python
# Given name and a greeting is defined (you can use any name)
name = "Bartek"
greeting = "Hello"

# When the greeting, a space, and the name is combined into the text
text = greeting + " " + name

# Then the text can be printed as output and be verified by us
print(text)
💻 Code
Hint (or check if your result is correct):
assert(text == "Hello Bartek")
Then save your code into your favorite notepad.
Then update the code, to make it fail with an AssertionError.
Extra: if you want more challenge, you can try to add an improved error message.
Splitting:
Sometimes it is not possible to write code in the given, when, then format. This is a sign that we should split the code into smaller parts (like methods/functions or classes).
A piece of code should often only do a single thing, even though the outputs can be different.
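A minimal sketch of such a split, using the addition example from earlier (the function name is illustrative):
Python
# The "when" logic is extracted into its own function,
# so each piece of code does a single thing.
def add(number1, number2):
    return number1 + number2

# Given
number1 = 1
number2 = 2
# When
result = add(number1, number2)
# Then
assert(result == 3)
💻 Code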
Conclusion
Let’s wrap up our journey into Test-Driven Development (TDD), shall we? 🚀
So, here’s the scoop: Test-Driven Development (TDD) offers a structured approach akin to the scientific method, providing a framework for software development that prioritizes clarity, predictability, and reliability.
Picture this: before we even start writing the actual code, we write tests to describe how we want our software to behave. It’s like drawing a map before going on a big adventure – it helps us know where we’re going!
And here’s where things get really cool: we use something called assertions to check if our code works the way we want it to. It’s like having a super fast and precise robot double-checking our work to make sure everything’s OK! 🤖✅
TDD not only makes testing super fast and easy, but it also helps us understand our coding problems better. It’s like putting together a puzzle – each piece fits together perfectly to create something awesome!
So, as we dive deeper into coding and keep exploring TDD, let’s make assertions our best friend. They’ll help us build super strong, super reliable software that can handle anything life throws our way! 💪🌟