Black Box Software Testing

Spring 2005

Study Guide

Copyright (c) Cem Kaner

Here is your 2005 study guide. All questions on my tests and exams come from this study guide.

I invite you to submit candidate questions for the study guide. I will give students 1 (one) bonus point per excellent question, up to 5 points per student.

The typical midterm includes questions that total between 90 and 110 points.

Notes on Studying & Answering Test Questions

Because you have plenty of time to work with these questions, I can expect well-organized, well-focused, thoughtful answers. For additional guidance, I suggest my paper on assessment in the testing course http://www.testingeducation.org/articles/assessment_in_the_software_testing_course_wtst_2003_paper.pdf, or these shorter discussions on answering essay questions:

Here are some additional suggestions:


Short Answers

S.1. What is the primary difference between black box and glass box testing? What kinds of bugs are you more likely to find with black box testing? With glass box?

S.2. Discuss the assertion that a programmer shouldn't test her own code. Replace this with a more reasonable assertion and explain why it is more reasonable.

S.3. What kinds of bugs might you be likely to miss if you use a reference program as an oracle?

S.4. Consider a program with two loops, controlled by index variables. The first variable increments (by 1 each iteration) from -3 to 20. The second variable increments (by 2 each iteration) from 10 to 20. The program can exit from either loop normally at any value of the loop index. (Ignore the possibility of invalid values of the loop index.)
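The loop structure in S.4 can be sketched literally (the loop bodies and the "exit at any value" conditions are unspecified in the question, so only the index ranges appear here). Counting the values each index visits is often the first step in answering path-counting questions like this:

```python
# Sketch of the two loops described in S.4; bodies are placeholders.
count1 = 0
for i in range(-3, 21):       # first index: -3 to 20, incrementing by 1
    count1 += 1               # ... loop body; may exit normally at any i ...

count2 = 0
for j in range(10, 21, 2):    # second index: 10 to 20, incrementing by 2
    count2 += 1               # ... loop body; may exit normally at any j ...
```

The first index takes 24 distinct values and the second takes 6, which is the raw material for any boundary or path analysis of these loops.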

S.5. A program asks you to enter a password, and then asks you to enter it again. The program compares the two entries and either accepts the password (if they match) or rejects it (if they don’t). You can enter letters or digits.

How many valid entries could you test? (Please show and/or explain your calculations.)
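One way to set up the counting for S.5. Note that the question fixes neither the maximum password length nor whether letters are case-sensitive, so both are assumptions here; stating your assumptions is part of a good answer:

```python
# Assumption: case-sensitive letters plus digits = 26 + 26 + 10 characters.
ALPHABET = 26 + 26 + 10   # 62 possible characters per position

def valid_entries(max_len):
    # Count entries of every length from 1 up to the assumed maximum.
    return sum(ALPHABET ** n for n in range(1, max_len + 1))

assert ALPHABET == 62
assert valid_entries(2) == 62 + 62 * 62   # 3906 entries of length 1 or 2
```

The point of the calculation is that the total grows as 62^n, which is why exhaustive testing of even a short password field is infeasible.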

S.6. A program is structured as follows:

Ignore the possibility of invalid values of the index variable or X. How many paths are there through this program? Please show and/or explain your calculations.

S.7. Distinguish between using code coverage to highlight what has not been tested from using code coverage to measure what has been tested. Describe some benefits and some risks of each type of use. (In total, across the two uses, describe three benefits and three risks.)

S.8. Give three different definitions of “software error.” Which do you prefer? Why?

S.9. Use Weinberg's definition of quality. Suppose that the software behaves in a way that you don't consider appropriate. Does it matter whether the behavior conflicts with the specification? Why or why not?

S.10. Distinguish between customer satisfiers and dissatisfiers. Give two examples of each.

S.11. What is the difference between an error and a failure? Give an example of each.

S.12. What is the difference between severity and priority of a bug? Why would a bug tracking system use both?

S.13. Why are late changes to a product more expensive than early changes?

S.14. Compare, contrast, and give some examples of internal failure costs and external failure costs. What is the most important difference between these two types of failure cost?

S.15. Ostrand & Balcer described the category-partition method for designing tests. Their first three steps are:

    1. Analyze
    2. Partition, and
    3. Determine constraints

Describe and explain these steps.

S.16. In the Print Options dialog in Open Office Writer, you can mark (Yes/No) for inclusion on a document:

(a) Would you do a domain analysis on these (Yes/No) variables? Why or why not?

(b) What benefit(s) (if any) would you gain from such an analysis?

S.17. Here is a Page Style dialog from Open Office

S.18. Compare and contrast scenario testing and beta testing.

S.19. Compare and contrast scenario testing and specification-based testing.

S.20. Why would you use scenario testing instead of domain testing? Why would you use domain testing instead of scenario testing?

S.21. List and briefly describe five different dimensions (different “goodnesses”) of “goodness of tests”.

S.22. When would you use function testing and what types of bugs would you expect to find with this style of testing?

S.23. Advocates of GUI-level regression test automation often recommend creating a large set of function tests. What are some benefits and risks of this?

S.24. Describe two benefits and two risks associated with using test matrices to drive your more repetitive tests.

S.25. What kinds of errors are you likely to miss with specification-based testing?

S.26. What risks are we trying to mitigate with black box regression testing?

S.27. What risks are we trying to mitigate with unit-level regression testing?

S.28. What are the differences between risk-oriented and procedural regression testing?

S.29. Describe three factors that influence automated test maintenance cost.

S.30. Describe three risks of capture-replay automation.

S.31. Under what circumstances might capture-replay automation be effective?

S.32. How does extended random regression work? What kinds of bugs is it good for finding?

S.33. How can it be that you don't increase coverage when using extended random regression testing but you still find bugs?

S.34. What is a quick test? Why do we use them? Give two examples of quick tests.

S.35. What are some of the differences between lightweight and heavyweight software development processes?

S.36. Describe three risks of exploratory testing.

S.37. What is a combination chart? Draw one and explain its elements.

S.38. What is strong combination testing? What is the primary strength of this type of testing? What are two of the main problems with doing this type of testing? What would you do to improve it?

S.39. What is weak combination testing? What is the primary strength of this type of testing? What are two of the main problems with doing this type of testing? What would you do to improve it?
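To make the strong/weak contrast in S.38 and S.39 concrete: for three hypothetical configuration factors with three values each, strong combination testing requires the full cross product, while a well-chosen all-pairs set (here a hand-built L9-style array) covers every pair of values across every pair of factors in far fewer tests:

```python
from itertools import combinations, product

LEVELS = [0, 1, 2]                         # three values per factor
full = list(product(LEVELS, repeat=3))     # strong: all combinations

# Weak (all-pairs): 9 tests chosen so every value-pair appears somewhere.
pairwise = [(0, 0, 0), (0, 1, 1), (0, 2, 2),
            (1, 0, 1), (1, 1, 2), (1, 2, 0),
            (2, 0, 2), (2, 1, 0), (2, 2, 1)]

def covers_all_pairs(tests):
    # For each pair of factors, check that all 9 value combinations occur.
    for f1, f2 in combinations(range(3), 2):
        needed = set(product(LEVELS, LEVELS))
        seen = {(t[f1], t[f2]) for t in tests}
        if seen != needed:
            return False
    return True

assert len(full) == 27           # strong combination testing: 27 tests
assert covers_all_pairs(pairwise)  # weak covers every pair in only 9
```

The gap widens quickly: with more factors or more values, the full product explodes while the all-pairs set grows only modestly, which is the core trade-off these two questions ask about.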

S.40. What are the reporters' questions? Why do we call them context-free?

S.41. What is a configuration test matrix? Draw one and explain its elements.

S.42. What is a decision table? Draw one and explain its elements.

S.43. Suppose that your company decided to script several hundred tests. What types of tests would you write scripts for? Why?

S.44. What do you think is a reasonable ratio of time spent documenting tests to time spent executing tests? Why?

S.45. How long should it take to document a test case? What can you get written in that amount of time? How do you know this?

S.46. What factors drive up the cost of maintenance of test documentation?

S.47. Does detailed test documentation discourage exploratory testing? How? Why?

S.48. What does it mean to do maintenance on test documentation? What types of things are needed and why?

S.49. What benefits do you expect from a test plan? Are there circumstances under which these benefits would not justify the investment in developing the plan?

S.50. What do we mean by "diverse half-measures"? Give some examples.


Long Answer

L.1. SoftCo makes a word processing program. The program exhibits an interesting behavior. When you save a document that has exactly 32 footnotes, and the total number of characters across all footnotes is 1024, the program deletes the last character in the 32nd footnote.

L.2. While testing a browser, you find a formatting bug. The browser renders single paragraph blockquotes correctly—it indents them and uses the correct typeface. However, if you include two paragraphs inside the <blockquote>…</blockquote> commands, it leaves both of them formatted as normal paragraphs. You have to mark each paragraph individually as blockquote.

Consider the consistency heuristics that we discussed in class. Which three of these look the most promising for building an argument that this is a defect that should be fixed?

For each of the three that you choose:

L.3. The oracle problem is the problem of finding a method that lets you determine whether a program passed or failed a test.

Suppose that you were doing automated testing of page layout (how the document will look when printed) of a Writer document. Describe three different oracles that you could use or create to determine whether layout-related features were working. For each of these oracles,

L.4. Consider testing a word processing program, such as Open Office Writer. Describe 5 types of coverage that you could measure, and explain a benefit and a potential problem with each. Which one(s) would you actually use and why?

L.5. Some theorists model the defect arrival rate using a Weibull probability distribution. Suppose that a company measures its project progress using such a curve. Describe and explain two of the pressures testers are likely to face early in the testing of the product and two of the pressures they are likely to face near the end of the project.

L.6. Ostrand & Balcer described the category-partition method for designing tests. Their first three steps are:

    1. Analyze
    2. Partition, and
    3. Determine constraints

Apply their method to this function:

I, J, and K are unsigned integers. The program calculates K = I * J. For this question, consider only cases in which you enter integer values into I and J.

Do an equivalence class analysis on the variable K from the point of view of the effects of I and J (jointly) on K. Identify the boundary tests that you would run (the values you would enter into I and J) in your tests of K.

Note: In the exam, I might use K = I / J or K = I + J or
K = IntegerPartOf (SquareRoot (I*J))
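One boundary worth probing in the K = I * J analysis is overflow. Assuming 16-bit unsigned integers (the question does not state a word size, so this is an assumption), a sketch of the wrap-around boundary:

```python
# Hypothetical word size: 16-bit unsigned (the question leaves it open).
UINT16_MAX = 65535

def mul_uint16(i, j):
    # Wraps on overflow, the way an unsigned fixed-width multiply would.
    return (i * j) % (UINT16_MAX + 1)

assert mul_uint16(255, 257) == 65535   # largest K that still fits
assert mul_uint16(256, 256) == 0       # 65536 wraps around to 0
```

Pairs of (I, J) values that straddle this boundary — products just at and just past the maximum representable K — are classic best representatives for the overflow equivalence class.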

L.7. Imagine testing a file name field. For example, in an Open File dialog, you can enter something into the file name field.

Do a domain testing analysis: List a risk, equivalence classes appropriate to that risk, and best representatives of the equivalence classes.

For each test case (use a best representative), briefly explain why this is a best representative. Keep doing this until you have listed 10 best-representative test cases.

L.8. In EndNote, you can create a database of bibliographic references, which is very useful for writing essays. Here are some notes from the manual:

List the variables of interest and do a domain analysis on them.

L.9. In the Windows version of OpenOffice, you can create a spreadsheet in Calc, then insert it into Writer so that when you edit the spreadsheet file, the changes automatically appear in the spreadsheet object when you reopen the Writer document.

L.10. You are testing the group of functions that let you create and format a table in a word processor (your choice of MS Word or Open Office).

List 5 ways that these functions could fail. For each potential type of failure, describe a good test for it, and explain why that is a good test for that type of failure. (NOTE: When you explain why a test is a good test, make reference to some attribute(s) of good tests, and explain why you think it has those attributes. For example, if you think the test is powerful, say so. But don't stop there, explain what about the test justifies your assertion that the test is powerful.)

L.11. You are testing the group of functions that let you create and format a table in a word processor (your choice of MS Word or Open Office).

Think in terms of data that you enter into the table. What data is (or could be) associated with tables? List five types of failures that could involve that data. For each type of failure, describe a good test for it and explain why that is a good test for that type of failure. (NOTE: When you explain why a test is a good test, make reference to some attribute(s) of good tests, and explain why you think it has those attributes. For example, if you think the test is powerful, say so. But don't stop there, explain what about the test justifies your assertion that the test is powerful.)

L.12. You are testing the group of functions that let you create and format a table in a word processor (your choice of MS Word or Open Office).

Think in terms of persistent data. What persistent data is (or could be) associated with tables? List three types. For each type, list 2 types of failures that could involve that data. For each type of failure, describe a good test for it and explain why that is a good test for that type of failure. (There are 6 failures, and 6 tests, in total). (NOTE: When you explain why a test is a good test, make reference to some attribute(s) of good tests, and explain why you think it has those attributes. For example, if you think the test is powerful, say so. But don't stop there, explain what about the test justifies your assertion that the test is powerful.)

L.13. You are testing the group of functions that let you create and format a table in a word processor (your choice of MS Word or Open Office).

Think in terms of compatibility with external software. What compatibility features or issues are (or could be) associated with tables? List three types. For each type, list 2 types of failures that could involve compatibility. For each type of failure, describe a good test for it and explain why that is a good test for that type of failure. (There are 6 failures, and 6 tests, in total). (NOTE: When you explain why a test is a good test, make reference to some attribute(s) of good tests, and explain why you think it has those attributes. For example, if you think the test is powerful, say so. But don't stop there, explain what about the test justifies your assertion that the test is powerful.)

L.14. You are testing the group of functions that let you create and format a table in a word processor (your choice of MS Word or Open Office).

Suppose that a critical requirement for this release is scalability of the product. What scalability issues might be present in the table? List three. For each issue, list 2 types of failures that could involve scalability. For each type of failure, describe a good test for it and explain why that is a good test for that type of failure. (There are 6 failures, and 6 tests, in total). (NOTE: When you explain why a test is a good test, make reference to some attribute(s) of good tests, and explain why you think it has those attributes. For example, if you think the test is powerful, say so. But don't stop there, explain what about the test justifies your assertion that the test is powerful.)

L.15. Imagine testing spell checking in Open Office Writer

Describe four examples of each of the following types of attacks that you could make on this feature, and for each one, explain why your example is a good attack of that kind.

(Refer specifically to Whittaker, How to Break Software and use the types of attacks defined in that book. Don’t give me two examples of what is essentially the same attack. In the exam, I will not ask for all 16 examples, but I might ask for 4 examples of one type or two examples of two types, etc.)

L.16. Define a scenario test and describe the characteristics of a good scenario test.

Imagine developing a set of scenario tests for AutoCorrect in OpenOffice Writer.

L.17. Imagine that you were testing how OpenOffice Writer does outline numbering.

L.18. Imagine that you were testing how OpenOffice Writer does outline numbering.

L.19. Suppose that scenario testing is your primary approach to testing. What controls would you put into place to ensure good coverage? Describe at least three and explain why each is useful.

L.20. You are testing the group of functions that let you create and format a table in a word processor (your choice of MS Word or Open Office). Think about the different types of users of word processors. Why would they want to create tables? Describe three different types of users, and two types of tables that each one would want to create. (In total, there are 3 users, 6 tables). Describe a scenario test for one of these tables and explain why it is a good scenario test.

L.21. Suppose that a test group's mission is to achieve its primary information objective. Consider (and list) three different objectives. For each one, how would you focus your testing? How would your testing differ from objective to objective?

L.22. The course notes describe a test technique as a recipe for performing the following tasks:

How does scenario testing guide us in performing each of these tasks?

L.23. The course notes describe a test technique as a recipe for performing the following tasks:

How does risk-based testing guide us in performing each of these tasks?

L.24. The course notes describe a test technique as a recipe for performing the following tasks:

How does domain testing guide us in performing each of these tasks?

L.25. Consider domain testing and specification-based testing. What kinds of bugs are you more likely to find with domain testing than with specification-based testing? What kinds of bugs are you more likely to find with specification-based testing than with domain testing?

L.26. Consider scenario testing and function testing. What kinds of bugs are you more likely to find with scenario testing than with function testing? What kinds of bugs are you more likely to find with function testing than with scenario testing?

L.27. Describe a traceability matrix.

L.28. What is regression testing? What are some benefits and some risks associated with regression testing? Under what circumstances would you use regression tests?

L.29. In lecture, I used a minefield analogy to argue that variable tests are better than repeated tests. Provide five counter-examples, contexts in which we are at least as well off reusing the same old tests.

L.30. Why is it important to design maintainability into automated regression tests? Describe some design (of the test code) choices that will usually make automated regression tests more maintainable.

L.31. A client retains you as a consultant to help them introduce GUI-level test automation into their processes. What questions would you ask them (up to 7) and how would the answers help you formulate recommendations?

L.32. A client retains you as a consultant to help them use a new GUI-level test automation tool that they have bought. They have no programmers in the test group and don't want to hire any. They want to know from you what are the most effective ways that they can use the tool. Make and justify three recommendations (other than "hire programmers to write your automation code" and "don't use this tool"). In your justification, list some of the questions you would have asked to develop those recommendations and the type of answers that would have led you to those recommendations.

L.33. Why do we say that GUI-level regression testing is computer-assisted testing, rather than full test automation? What would you have to add to GUI-level regression to achieve (or almost achieve) full automated testing? How much of this could a company actually achieve? How?

L.34. Contrast developing a GUI-level regression strategy for a computer game that will ship in one release (there won't be a 2.0 version) versus an in-house financial application that is expected to be enhanced many times over a ten-year period.

L.35. Doug Hoffman's description of the square root bug in the MASPAR computer provides a classic example of function equivalence testing. What did he do in this testing, why did he do it, and what strengths and challenges does it highlight about function equivalence testing?
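A minimal sketch of function equivalence testing in the spirit of Hoffman's MASPAR work. Here Newton's method stands in as a hypothetical implementation under test and math.sqrt serves as the reference oracle; Hoffman's actual test compared the MASPAR machine's square-root function against a reference across the full 32-bit input space:

```python
import math

def newton_sqrt(x, iterations=40):
    # Hypothetical implementation under test (Newton's method stand-in).
    if x == 0:
        return 0.0
    guess = float(x)
    for _ in range(iterations):
        guess = 0.5 * (guess + x / guess)
    return guess

def equivalent(x, tol=1e-9):
    # math.sqrt plays the role of the trusted reference oracle.
    reference = math.sqrt(x)
    return abs(newton_sqrt(x) - reference) <= tol * max(1.0, reference)

# Hoffman swept the entire input space; a sketch can only sample it.
assert all(equivalent(x) for x in range(10000))
```

The sketch highlights both the strength (a cheap, fully automated pass/fail verdict for millions of inputs) and the challenge (you need a trusted reference, and tolerance choices for floating-point comparison are themselves a judgment call).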

L.36. Compare exploratory and scripted testing. What advantages (name three) does exploration have over creating and following scripts? What advantages (name three) does creating and following scripts have over exploration?

L.37. Describe three different potential missions of a software testing effort. For each one, explain how and why exploratory testing would or would not support that mission.

L.38. A company with a large IT department retains you as a consultant. After watching the testers work and talking with the rest of the development staff, you recommend that the testers work in pairs. An executive challenges you, saying that this looks like you're setting two people to do one person's work. How do you respond? What are some of the benefits of paired testing? Are there problems in realizing those benefits? What?

L.39. We are going to do some configuration testing on the Mozilla Firefox browser. We want to test it on

Note: In the exam, I might change the number of operating systems, printers, modem types, or displays.

L.40. Compare and contrast all-pairs testing and scenario testing. Why would you use one over the other?

L.41. List and explain four claimed strengths of manual scripted tests and four claimed weaknesses.

L.42. Your company decides to outsource test execution. Your senior engineers will write detailed test scripts and the outside test lab's staff will follow the instructions. How well do you expect this to work? Why?

L.43. Suppose that your company decides to write test scripts in order to foster repeatability of the test across testers. Is repeatability worth investing in? Why or why not?

L.44. Imagine that you are an external test lab, and Sun came to you to discuss testing of Open Office Calc. They are considering paying for some testing, but before making a commitment, they need to know what they'll get and how much it will cost.

How will you decide what test documentation to give them?

(Suppose that when you ask them what test documentation they want, they say that they want something appropriate but they are relying on your expertise.)

To decide what to give them, what questions would you ask (up to 7 questions) and for each answer, how would the answer to that question guide you?

L.45. Consider the Open Office word processor and its ability to read and write files of various formats.

L.46. Suppose that Boeing developed a type of fighter jet and a simulator to train pilots to fly it. Suppose that Electronic Arts is developing a simulator game that lets players "fly" this jet. Compare and contrast the test documentation requirements you would consider appropriate for developers of the two different simulators.

L.47. In the slides, we give the advice, "Over time, the objectives of testing should change. Test sympathetically, then aggressively, then increase complexity, then test meticulously." Explain this advice. Why is it (usually) good advice? Give a few examples, applying it to the testing of Open Office's presentation program. Are there any circumstances under which this would be poor advice?

L.48. Suppose that you find a reproducible failure that doesn’t look very serious.

L.49. Why are late changes to a product more expensive than early changes? How could we make them (late changes) cheaper?


Copyright (c) Cem Kaner 2004

This work is licensed under the Creative Commons Attribution-ShareAlike License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/2.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.

These notes are partially based on research that was supported by NSF Grant EIA-0113539 ITR/SY+PE: "Improving the Education of Software Testers." Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.