Professional Precis Collection

Software Engineering at Google

Preface - Programming Over Time

As the preface, this segment is light on content. It outlines the authors’ goal of presenting the difference between software engineering and programming, namely that software engineering plays out over a longer period of time. It does this partly by outlining the key ideas (Time and Change, Scale and Growth, Trade-Offs and Costs) and partly through its description of software engineering as “programming integrated over time”.

This proposes that for a team starting out, it can be good to build a base that anticipates and prepares for a changing and developing project from the beginning. This is part of making a project sustainable, and therefore useful for as long as possible.

What Is Software Engineering?

This chapter goes in depth not only on the differences between software engineering and programming proposed in the preface, but also on the principles of effective software engineering. These principles focus on the actions that the engineers, and the project at large, need to take. That focus is part of the distinction itself: software engineering is concerned with longevity, and therefore with the maintenance and upkeep a long-lived project requires. Some principles amount to implementing standard practices for checks, actions, and decisions; others warn against letting problems build up and advise staying aware of them.

The chapter goes in much greater depth on all these points, of course. From this I think it would be wise to take the key understandings to heart. In particular, I feel that policy creation and making systems that inform how to complete tasks is a particularly useful idea for our current, and presumably next, projects as a team.

How to Work Well In Teams

This chapter begins by highlighting the insecurity inherent to creation: the desire to hide and create alone until your project is perfect, in order to minimize (or perhaps avoid entirely) the judgement of others. It then points out the flaws in this approach, as it reduces the opportunities for detecting flaws and for pooling many different types of expertise to resolve them and to improve on the initial design. Many hands also make light work, which is another way teamwork aids project completion. Knowing that isolation doesn’t work is only half the battle, so the chapter then details how to encourage proper teamwork, starting with oneself. It particularly highlights three values: humility, respect, and trust. Through the cultivation of these three traits, a good team-centered perspective and attitude can be created.

While I’ve been in a variety of teams and taken leadership courses, it never hurts to review the basics. It gives me an opportunity to review what has been done so far and where I can improve in maintaining the ideals and strategies of working with other people.

Engineering for Equity

This chapter mostly details Google’s failings in accounting for diversity in its products. It then uses these to outline potential solutions, and to note that such solutions only work if they are actually implemented. A recurring reminder is to begin a project with everyone in mind, rather than waiting until it is nearly complete to factor everything in.

This is less applicable to the class’ current project, but still something worth keeping in mind when approaching the workforce. Of course, many of us in the class are the diversity spoken of in this section; it mostly seems to be a letter to an audience that is not our class. Still, knowing more is better than knowing less, and all of us can use the key points outlined in the Values Versus Outcomes section.

How to Lead a Team

This chapter discusses how Google splits team leadership into two separate roles, the Engineering Manager and the Tech Lead, although both can be held by one person on a smaller scale. It also goes into why people would be reluctant to be managers, how to overcome that, and how some strategies, such as the idea of Servant Leadership, can help someone learn to be a good manager. From here it extends into an assortment of patterns identified in management strategies, from the bad to the good, and how to improve on or correct the negative patterns. This mostly comes back to the points of being respectful and humble and trusting your teammates.

While none of this is new or groundbreaking information for me, it is a good note for the team as a whole to take into account. In my experience, becoming a better leader also makes a person a better follower, so this can serve everyone, even if they don’t intend to pursue or land in leadership roles.

Leading at Scale

This chapter outlines some principles for when a leader moves further from the engineering aspect and more into the managing and leading aspects. It presents three ‘always’ (as in, things these leaders should always be doing): Always Be Deciding, Always Be Leaving, and Always Be Scaling. The first, Always Be Deciding, refers to the fact that some problems are continuous and/or don’t have clear solutions, so decisions about handling them must be made continuously. Always Be Leaving refers to the idea that a leader needs to create a team that can function without them. A subset of this advice is that teams should be assigned to problems (constant and evolving) instead of solutions (temporary), so that they can continue to develop better solutions. The third, Always Be Scaling, is about managing the ever-increasing amount of work that will be placed on the leader.

The tips for scaling seem to be the most applicable to those not in a leadership position, but overall there is enough good information here for it to be worth a look over even for those not yet in these positions where it is needed. In fact, to go into a position with this information will provide a good base, and a way to be proactive instead of reactive.

Style Guides and Rules

This chapter opens by giving the definitions used at Google to distinguish between rules and guides. For them, rules are laws, and as such must be enforced, while guides are recommended best practices. Rules and guides are both established to help those using them stay as on track as possible in pursuit of the goal or ideal of the organization. Part of this is also ensuring that the ruleset is not too large or complex to access and understand easily, since that would only hinder progress. For Google, to achieve their goal of readability, some of their rules make coding more cumbersome for their engineers, but the trade-off is considered worthwhile because their code is read more often than it is written. This is also part of striving for consistency, which makes lateral movement between projects or segments easier. Part of this is considering external practices, and how they might influence internal best practices as well. However, it is also made clear that one cannot rely exclusively on consistency. Sometimes, being able to adapt is more important, such as when interacting with code that is not local and therefore does not adhere to Google’s guides. Plus, as with many things, languages and functionalities change with time, and so too must the style rules and guides.

Consistency and commenting have been points of discussion throughout the team’s work on Chasten. Some still do not comply with the formatting for commit messages, which I believe is due to both a lack of reinforcement and a lack of clarity regarding that format, showing that even in this more surface-level area we lack consistency. Our main binding factor lies in the formatting linters and the Black checks we run, supporting some points made in this chapter regarding automation as an effective solution.

Code Review

Code review is, simply put, someone other than the code author(s) looking over the code. This is, in particular, done before that code is added to a larger project, although it can happen at other stages of development as well. In fact, regular reviews have the potential to contribute to a better overall product than trying to go through it all at once. The review can be split into three checks: that the code is correct and comprehensible, that it is appropriate for the codebase in question, and that it is readable. How many reviewers are necessary depends on who is best suited for those checks; if one person is well enough suited to complete all three, then only that one reviewer is necessary. By having a reviewer, a team can better ensure consistency and readability, as established by the three elements of a review. There are also different forms of review at Google for the different types of changes that can be made.

For our projects, code review is already a built-in process. There are the linters automating formatting, but there is also the requirement to have three others review code before it gets merged into the main body. This has its downsides, mostly in turnaround time, but provides a good foundation for the kind of code review proposed here. Perhaps implementing a better standard as to how and when to perform a review could improve the processes currently in place.

Documentation

This chapter goes over the struggles of documentation, both the lack of it and the writing of it. Writing documentation tends to be seen as tedious additional work, since it lacks immediate benefits and only comes into use further along. When it is neglected, finding and parsing documentation becomes a chore that increases the effort a user or future developer must put in to work with the software in question. While it is understood that documentation improves readability and comprehension, there is otherwise little incentive to write it when engineers may feel as though their time is better spent making progress on another project. However, this deprives the original project of the insights they could have provided, which would have made future work easier. Documentation can also benefit the original engineer(s), if they return to the project after some time. Another aspect of the struggle is that engineers may feel as though they lack the necessary writing and language skills, even though in theory, if they can explain their work verbally, they can explain it in documentation. In particular, documentation can be treated like an extension of the comments in the code, and managed like the code itself. This also helps with treating documentation as part of the coding process, instead of simply an optional burden. In addition, one must consider the audience, both to pitch the writing at the right level and to gather feedback through which the documentation can be improved. For example, two keys to writing for a wider audience are consistency and clarity. A large part of this comes from not assuming that one step obviously follows from another, and splitting steps up to ensure they are as clear as possible.

This is a continuation of previous ideas presented in this book regarding knowledge sharing, and ties into the initially proposed idea of software engineering being the scaled-up version of programming. This is a case of higher upfront resource costs for the sake of long-term benefits, and indeed a point of concern for Chasten, by my estimate, as its documentation does appear a bit limited.

Testing Overview

This chapter covers how robust testing practices can help create a more malleable project, as changes can be made more quickly. Testing is also a way to share the insights of one engineer with the team, as it provides an opportunity to monitor one’s own and others’ code. One way that the engineers at Google try to ensure that their tests are as efficient as possible is by categorizing them as small, medium, or large, based on their scope and scale. There are also points made for keeping tests as clear and contained as possible.

Overall, this chapter details how testing works and has developed at Google. While few of the notes are new and/or applicable information for our projects, it still is something that can be good to know in industry. I have little experience with creating tests as part of something larger, but I do in general enjoy writing little tests, when I have the resources to do so.

Unit Testing

In this chapter, the author first explains that unit testing, in Google’s definition, consists of tests with a relatively narrow scope, where scope refers to how much code is validated by the test. Since one of the goals of testing is to improve productivity, Google observed several benefits of unit tests: they are fast and deterministic, easy to write (particularly alongside code), and easy to understand, they increase test coverage, and they can serve as documentation. Thus, a focus at Google is ensuring that these unit tests are maintainable. In order to do that, the tests must be both sturdy and clear; they shouldn’t break when new code is added without bugs, and they need to be properly annotated so their function is apparent to future engineers. Ideally, they don’t need to be changed at all. With such goals in mind, this chapter then goes into methods to meet them. One such way is testing via the public API, since that is how users will interact with the code. Since the failures users see are the most important failures, this helps ensure that user-facing behavior is what gets tested, and that its bugs are resolved before the product is shipped. Another method is to test the state, rather than the interactions. This testing is more concerned with the what than the how, both because this makes a test less brittle and because the what is ultimately what matters for the user experience.

When the tests do inevitably fail, their clarity comes into focus. In order to diagnose the failure, and thus the solution, relatively quickly, the test needs to be clearly defined as to its function and purpose. The two factors for ensuring this clarity are completeness and conciseness. A method to achieve this is testing the behaviors of the code, rather than making one test to encompass an entire method. This helps keep tests small and straightforward, as each one then tests a specific behavior. It is also recommended that one avoid introducing logic into tests, as that can make the workings of the test less clear. Then, by also making the failure message contain as much information as possible regarding the point of failure, one can create a clear, concise, and complete test, though steps must be taken to ensure that descriptiveness and clarity are not sacrificed for conciseness.
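
To make the ideas of testing state and testing one behavior per test more concrete, here is a minimal sketch using Python’s unittest module. The Account class and its withdraw method are hypothetical, invented only for this illustration; they do not come from the book.

```python
import unittest


class Account:
    """Hypothetical class under test, invented for this example."""

    def __init__(self, balance=0):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class AccountTest(unittest.TestCase):
    # One test per behavior, named after the behavior rather than the method.

    def test_withdraw_reduces_balance(self):
        account = Account(balance=100)
        account.withdraw(30)
        # Assert on the resulting state, not on how withdraw() got there.
        self.assertEqual(account.balance, 70,
                         "balance after withdrawing 30 from 100")

    def test_withdraw_more_than_balance_is_rejected(self):
        account = Account(balance=10)
        with self.assertRaises(ValueError):
            account.withdraw(50)
        self.assertEqual(account.balance, 10,
                         "a rejected withdrawal must not change the balance")
```

Both tests go through the public interface only and contain no logic of their own, so each failure message points at a single behavior.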

This chapter is quite helpful in outlining the specifications for writing these unit tests in order to make more robust and clear tests overall. It also helps give a framework with which to look at the code one needs to test as to how to approach the test construction.

Fuzzing Book

Introduction to Software Testing

This chapter uses a square root function as an example of what to test and how to go about testing functions. It covers the assert keyword, as well as the basic mathematical properties that go into developing the tests, and how to translate those into a test function or test case. The chapter goes on to discuss the Timer module in relation to optimizing tests as well. From there, further optimization is discussed in the form of automatic run-time checks, which check the function each time it is run, with any input. This segues into a discussion of how effective this is as an optimization, bringing up the time and system expense of running such checks, and other limits involved in running tests and attempting to ensure correctness.
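
As a rough sketch of the kind of test described here, the following checks a square root function against the property that squaring the result should give back the input. The my_sqrt implementation below is my own Newton’s method version with an iteration cap, and time.perf_counter stands in for the book’s Timer module; both are assumptions for illustration.

```python
import math
import time


def my_sqrt(x, tolerance=1e-12):
    """Approximate the square root of a positive x with Newton's method."""
    guess = x / 2 if x > 1 else 1.0
    for _ in range(100):  # iteration cap so the loop always ends
        better = (guess + x / guess) / 2
        if abs(better - guess) <= tolerance * better:
            return better
        guess = better
    return guess


def test_my_sqrt():
    # Property: the result squared should be (almost) the input.
    for x in range(1, 1001):
        result = my_sqrt(x)
        assert math.isclose(result * result, x, rel_tol=1e-9), x


start = time.perf_counter()
test_my_sqrt()
elapsed = time.perf_counter() - start
print(f"1000 checks took {elapsed:.4f} seconds")
```

The timing line only measures how long the checks take; it says nothing about correctness.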

While none of this information is new to me personally, it is valuable to reflect on before starting a new project, or even at any point in the process of one. Testing and test cases are what ensure that a program or code segment will work and serve its purpose, as well as allowing experimentation with optimizing the code with the tests to fall back on for checks. A reminder of the tools that aid in doing that is also helpful, to make sure one does not take the long way around in creating tests, and testing the tests.

Code Coverage

This chapter goes into detail about the coverage of tests. It opens by comparing black-box testing and white-box testing: black-box testing derives tests from the specification of what the code should do, while white-box testing derives them from the implementation. This means that while black-box tests can be prepared before implementing, they might miss errors or code that only occur in the implementation. From there it goes on to discuss the option in Python to install a trace function with sys.settrace(f) in order to trace code as it runs, as a form of dynamic analysis. From this call the book breaks down the steps to using it to assess coverage as well. The next section outlines how to make this process more efficient using the with keyword. After covering this topic at length, it also discusses other programming languages, going into automated test coverage for C. It then details that proper coverage does not guarantee error-free code, but using oracle tests and/or fuzzing can help get closer to that ideal.
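
A minimal sketch of the sys.settrace approach wrapped in a with block might look like the following. The Coverage class here is my own simplified version rather than the book’s exact code, and classify is a hypothetical function used only to show two inputs covering different lines.

```python
import sys


class Coverage:
    """Record (function name, line number) pairs executed in a with block."""

    def __init__(self):
        self.trace = []

    def _tracer(self, frame, event, arg):
        if event == "line":
            self.trace.append((frame.f_code.co_name, frame.f_lineno))
        return self._tracer

    def __enter__(self):
        self._previous = sys.gettrace()
        sys.settrace(self._tracer)
        return self

    def __exit__(self, exc_type, exc_value, tb):
        sys.settrace(self._previous)  # restore whatever was tracing before
        return False

    def coverage(self):
        return set(self.trace)


def classify(n):
    """Hypothetical function under test."""
    if n < 0:
        return "negative"
    return "non-negative"


def lines_of(cov, name):
    # Keep only the lines that belong to the function of interest.
    return {line for fn, line in cov.coverage() if fn == name}


with Coverage() as cov_pos:
    classify(5)
with Coverage() as cov_neg:
    classify(-5)

positive = lines_of(cov_pos, "classify")
negative = lines_of(cov_neg, "classify")
print("lines covered by only one of the two inputs:", positive ^ negative)
```

Neither input alone covers every line of classify, which is the kind of gap a coverage report is meant to reveal.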

From this I think it is important to take the understanding of not only how to automate the test creation and coverage checking, but also the distinctions between tests. Knowing which test is most effective, and when something cannot (or should not) be automated, is important as well. Mostly, though, I think that it serves as a reinforcement both of the knowledge that you cannot guarantee perfect code even with tests and of all the ways in which you can try, as well as how to use them.

Fuzzing: Breaking Things With Random Inputs

This chapter goes in depth on the history of fuzzing tests, their creation, and their applications both on their own and in conjunction with other forms of testing. A fuzzer generates a large number of random inputs to see how well the function or program holds up against them. It’s a larger-scale version of intentionally putting in what one knows are risky inputs to try to ‘break’ the program, to test its robustness. With this established, the chapter goes on to show the creation of increasingly complex implementations of fuzzers. Part of this discussion is going over some common and/or major bugs and design flaws that can create an assortment of both security and general issues. This leads into an explanation of how other types of checks interact with fuzzers to create more robust test cases. This also serves to acknowledge areas in which other checks serve better or as more of a first line of defense. With all of this established, there is then a discussion of a Fuzzing Architecture, showing the practical application of the concepts discussed.
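
A simple random fuzzer along these lines can be sketched in a few lines of Python. The parameter names, the run_fuzz harness, and the deliberately buggy first_and_last target are my own assumptions for illustration, not code taken from the chapter.

```python
import random


def fuzzer(max_length=100, char_start=32, char_range=32, rng=random):
    """Return a string of up to max_length random characters."""
    length = rng.randrange(0, max_length + 1)
    return "".join(
        chr(rng.randrange(char_start, char_start + char_range))
        for _ in range(length)
    )


def first_and_last(text):
    """Hypothetical buggy target: crashes on the empty string."""
    return text[0], text[-1]


def run_fuzz(target, trials=1000, seed=0):
    """Feed random inputs to target and collect the ones that crash it."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    failures = []
    for _ in range(trials):
        data = fuzzer(rng=rng)
        try:
            target(data)
        except Exception as exc:
            failures.append((data, exc))
    return failures


failures = run_fuzz(first_and_last)
print(f"{len(failures)} crashing inputs found")
```

Recording the input alongside the exception matters, since a crash is only useful if it can be reproduced.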

The application of this for projects is quite clear. By implementing a robust test structure, including fuzzers to help identify as many weak points as possible, one can create more reliable code both for running and for security. The earlier this is in place, the better, as it may catch issues with underlying structures that need to be altered to better suit the goal. This will help remove an assortment of issues that otherwise plague programmers and their software.

Mutation Analysis

This chapter discusses the testing technique of mutation, which creates faults that are valid within the syntax but incorrect for the desired outcome, providing a way to test the effectiveness of a test. The mutations are meant to be probable faults, the kind a programmer could plausibly introduce, which allows for better overall testing. This came about as a more effective and targeted form of fault injection. The chapter then goes into depth on the technical creation of a mutation-inserting program and how that would work. The high-level overview is that it automates making certain fault injections (mutations) and that, when run, it checks how many of these faults are detected by the tests used. Using the value generated from that, the programmer can analyze how successful their tests were. One difficulty with this method is the existence of equivalent mutants, which are mutants that are not functionally different enough to be an effective testing mechanism. The recommended way of handling this is to estimate the probability of these equivalents existing.
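
The idea of scoring tests by how many mutants they detect can be sketched very roughly as follows. This version mutates source text with plain string replacement instead of working on the syntax tree as the chapter does, and triangle_area, the mutation list, and both tests are hypothetical examples of my own.

```python
SOURCE = '''
def triangle_area(base, height):
    return base * height / 2
'''

# Each pair is (original text, faulty replacement); an assumed, tiny set.
MUTATIONS = [("*", "+"), ("/", "*"), ("2", "3")]


def make_mutants(source, mutations):
    """Create one mutant per applicable mutation, changing one spot each."""
    return [source.replace(old, new, 1)
            for old, new in mutations if old in source]


def load(source, name):
    """Execute source text and return the function called name."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]


def weak_test(fn):
    assert fn(0, 5) == 0


def strong_test(fn):
    assert fn(0, 5) == 0
    assert fn(4, 3) == 6


def mutation_score(source, name, test, mutations):
    """Fraction of mutants that the test detects (kills)."""
    mutants = make_mutants(source, mutations)
    killed = 0
    for mutant in mutants:
        try:
            test(load(mutant, name))
        except Exception:
            killed += 1  # the test noticed the injected fault
    return killed / len(mutants)


weak_score = mutation_score(SOURCE, "triangle_area", weak_test, MUTATIONS)
strong_score = mutation_score(SOURCE, "triangle_area", strong_test, MUTATIONS)
print(weak_score, strong_score)
```

Both tests pass on the original function, but only the stronger one notices every injected fault, which is the difference the score is meant to expose.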

This chapter provides an excellent counter to the problems of only looking at structural coverage, even if it has its own flaws. If there is a program such as those displayed in the chapter built into the testing infrastructure, it could be a huge boon in helping reduce the unknowns involved in writing new code, and in improving the test cases for it.

Mutation-Based Fuzzing

This chapter of the Fuzzing Book expands excellently on the information presented in previous chapters: it combines mutation, fuzzing, and coverage in both theory and practice, and in doing so offers guides on how to implement them. Specifically, this chapter details how to create functions that produce a variety of mutations based on a format or seed input, then how to implement them all through one class, and then adds coverage checking to that. This method is more precise than regular fuzzing, whose purely random inputs would rarely adhere to the necessary format and so would not be an effective form of testing. Inputs that are plausibly valid are much more effective for finding the weak points of a function or code segment.
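
The mutation functions described here can be sketched as small string operations applied to a seed input. The function names and the seed URL below are illustrative assumptions; the chapter’s class that ties mutation to coverage is not reproduced.

```python
import random


def delete_random_character(s, rng):
    """Remove one character at a random position."""
    if not s:
        return s
    pos = rng.randrange(len(s))
    return s[:pos] + s[pos + 1:]


def insert_random_character(s, rng):
    """Insert one printable ASCII character at a random position."""
    pos = rng.randrange(len(s) + 1)
    return s[:pos] + chr(rng.randrange(32, 127)) + s[pos:]


def flip_random_character(s, rng):
    """Flip one of the low seven bits of a random character."""
    if not s:
        return s
    pos = rng.randrange(len(s))
    bit = 1 << rng.randrange(7)
    return s[:pos] + chr(ord(s[pos]) ^ bit) + s[pos + 1:]


def mutate(s, rng):
    """Apply one randomly chosen mutation to s."""
    mutator = rng.choice([delete_random_character,
                          insert_random_character,
                          flip_random_character])
    return mutator(s, rng)


seed_input = "http://www.example.com/search?q=fuzzing"
rng = random.Random(0)
for _ in range(5):
    print(repr(mutate(seed_input, rng)))
```

Because each mutant differs from the seed by a single small change, most of them still look enough like a URL to get past initial format checks.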

This would be a great thing to implement at the start of a project, or at least early on. For our current class project, it would likely be an inefficient use of resources to implement, but going into our next project, this may be something to assign a group to early on to improve all of our testing.

Fuzzing with Grammars

This chapter begins with a breakdown of how formal languages and universal grammars function in relation to being inputs, in order to introduce grammars, which sit in the middle of the spectrum from the former to the latter. These grammars are part of how one builds a programming language as well, but here are used for the sake of creating inputs. By defining a grammar, a function can be created to produce random inputs within that set structure. This is an excellent way to ensure the test inputs have valid syntax, thus allowing more of the code to be tested. This has its limitations, and thus in order to maximize effectiveness the generated inputs can be mutated, for further examination of the constraints and functionality of the code.
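
As an illustration of producing inputs from a grammar, here is a small sketch in which the grammar is a dictionary mapping nonterminals to lists of alternatives. The grammar itself, the expand function, and the depth limit are my own simplifications and not the chapter’s fuzzer.

```python
import random
import re

# A tiny arithmetic grammar, assumed for illustration.
EXPR_GRAMMAR = {
    "<start>": ["<expr>"],
    "<expr>": ["<term>", "<term> + <expr>", "<term> * <expr>"],
    "<term>": ["<digit>", "<digit><term>"],
    "<digit>": list("0123456789"),
}

NONTERMINAL = re.compile(r"<[^<> ]+>")


def expand(symbol, grammar, rng, depth=0, max_depth=8):
    """Recursively expand symbol into a string of terminals."""
    alternatives = grammar[symbol]
    if depth >= max_depth:
        # Past the depth budget, take the alternative with the fewest
        # nonterminals so that the expansion is guaranteed to finish.
        alternatives = [min(alternatives,
                            key=lambda alt: len(NONTERMINAL.findall(alt)))]
    chosen = rng.choice(alternatives)
    return NONTERMINAL.sub(
        lambda m: expand(m.group(0), grammar, rng, depth + 1, max_depth),
        chosen,
    )


rng = random.Random(0)
for _ in range(5):
    print(expand("<start>", EXPR_GRAMMAR, rng))
```

Every generated string follows the grammar, so a parser under test spends its time on deeper logic instead of rejecting malformed input right away.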

This is an interesting use of grammars that I had not initially considered when I first learned about them. The further into the Fuzzing Book I read, the more it feels like gradually collecting puzzle pieces for creating a test suite that is as robust as possible. Perhaps moving into our new project we can fit some of these pieces together.

Efficient Grammar Fuzzing

This chapter goes over the flaws of the initial grammar fuzzer introduced previously, and then how to fix them. It starts by bringing up derivation trees, which represent the grammatical structure of a string through a tree, where the root is the start symbol, standing for the sum of all the parts, and each level breaks it down further until it reaches the smallest possible elements. The chapter then details the implementation of such trees for generating fuzz inputs, using data structures such as tuples and lists to contain the parts of the grammar and of the tree. Generation is complete when the tree can no longer be expanded, with limits set by a specified number of nodes. The chapter sets this up using functions that evaluate the cost of an expansion. This is then supplemented by evaluating the maximum cost as well, so either one or both can be used depending on the goal, through parameters that correspond to each. Overall, the class made in this chapter creates smaller outputs but is more robust and applicable than the simple grammar fuzzer made previously.
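
The nested tuple and list representation can be sketched as follows, under the assumption that a node is a (symbol, children) pair, that an empty list marks a terminal, and that None marks a nonterminal still waiting to be expanded. The helper names are my own.

```python
# A partially expanded tree for "2 + <expr>".
partial_tree = ("<start>", [
    ("<expr>", [
        ("<term>", [("<digit>", [("2", [])])]),
        (" + ", []),
        ("<expr>", None),  # not expanded yet
    ]),
])

# The same tree after the open node has been expanded to "3".
complete_tree = ("<start>", [
    ("<expr>", [
        ("<term>", [("<digit>", [("2", [])])]),
        (" + ", []),
        ("<expr>", [("<term>", [("<digit>", [("3", [])])])]),
    ]),
])


def tree_to_string(tree):
    """Concatenate the leaves of the tree from left to right."""
    symbol, children = tree
    if not children:  # None (unexpanded) or [] (terminal)
        return symbol
    return "".join(tree_to_string(child) for child in children)


def count_unexpanded(tree):
    """Count the nonterminals that still need to be expanded."""
    symbol, children = tree
    if children is None:
        return 1
    return sum(count_unexpanded(child) for child in children)


print(tree_to_string(partial_tree), count_unexpanded(partial_tree))
print(tree_to_string(complete_tree), count_unexpanded(complete_tree))
```

Counting the open nodes is what lets a fuzzer decide when to keep growing the tree and when to start closing it off.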

It makes sense that a grammar is then turned into a tree for fuzzing, much like it is for constructing a programming language. I had not previously thought much about implementing this in the code itself, so it is certainly intriguing to see the nested tuple and list setup within the code as a representation.

Parsing Inputs

This chapter details the reverse of what was presented in Efficient Grammar Fuzzing, in that it takes valid inputs and breaks them into derivation trees. This representation can then be used to create an assortment of mutations. This is achieved by using a parser. By initializing a parser with a grammar, and then giving it the input to parse, one can get the derivation trees with which to build other inputs. The chapter also takes a segment to detail attempting to create an ad hoc parser and its pitfalls before going into using grammars in parsing. Specifically, the chapter uses a tree pruner approach to parsing, which is a more powerful parser and does not distinguish between lexing and parsing. From here, the chapter provides insights into parsers such as Parsing Expression Grammars and the Earley Parser, detailing their strengths and weaknesses.

This in-depth examination of assorted parsers builds on a foundation I received when learning about the construction of programming languages. I’m not sure if there will be many opportunities to use this functionality specifically in our projects, but I do think this knowledge of how parsers work can provide a good base for approaching some other complex concepts and code.

Reducing Failure-Inducing Inputs

This chapter covers how reducing failure-inducing inputs can help with pinpointing the source of an issue. Manual input reduction, which means manually halving the input and entering the halves separately, quickly grows inefficient. One of the proposed solutions is delta debugging, which implements a form of binary search to automate the process used for manual input reduction. This is done by implementing a Reducer class, and then using that to run the reductions. The benefits are that the reduced input lowers the cognitive load on the programmer, is easier to communicate, and helps with identifying duplicates. However, the worst-case complexity can be as high as O(n^2), when the reduction has to go down to the individual characters. Another option is grammar-based reduction, also called GRABR, which relies on grammars to perform the reductions. One of the features of this approach is that inputs may not necessarily be syntactically valid, which may require reducing the tree first. This is done by replacing subtrees and alternative expansions.
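
To help picture delta debugging, here is a compact sketch written as a plain function instead of the chapter’s Reducer class. The fails predicate is a hypothetical stand-in for running the program under test; here it treats any input containing both an opening and a closing parenthesis as failure-inducing.

```python
def ddmin(inp, fails):
    """Shrink inp while fails(inp) stays True; returns a reduced input."""
    assert fails(inp), "the starting input must trigger the failure"
    n = 2  # number of chunks the input is currently split into
    while len(inp) >= 2:
        start = 0.0
        subset_length = len(inp) / n
        some_complement_is_failing = False
        while start < len(inp):
            # Try the input with one chunk removed.
            complement = inp[:int(start)] + inp[int(start + subset_length):]
            if fails(complement):
                inp = complement
                n = max(n - 1, 2)
                some_complement_is_failing = True
                break
            start += subset_length
        if not some_complement_is_failing:
            if n == len(inp):
                break  # already tried removing every single character
            n = min(n * 2, len(inp))  # split into finer chunks
    return inp


def fails(text):
    """Hypothetical failure condition, standing in for a real crash."""
    return "(" in text and ")" in text


print(ddmin("1 + (2 * 3) - 4", fails))
```

The finer the chunks have to get, the more test runs are needed, which is where the quadratic worst case comes from.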

I find the contents of this chapter fascinating, and certainly a bit more difficult for me to begin wrapping my head around than some of the base/previous concepts. As such, I’m having trouble thinking of potential use cases in Chasten, especially as we have quite a limited time frame for completion at this point.

Debugging Book

Introduction to Debugging

This chapter goes over both techniques to avoid and techniques to apply in debugging. For example, three inefficient methods are adding print statements into the code, changing things somewhat randomly until it works, and simply adding a case to pass the test explicitly rather than fulfilling the goal of the test. From these, it can be taken that the steps to properly debug are first to understand the code, second to fix the problem rather than the symptom, and third to proceed systematically. Part of this is breaking the program or function down into states, to observe at which state the failure begins. The chapter then goes into how one can apply the scientific method to debugging. This can be done by observing the tests in which the failure occurs, creating a hypothesis from that, and developing and testing from that idea, with re-evaluations at each development. The fix can only be implemented with an understanding of causality (the why and how of the failure’s existence) and incorrectness (the why and how of the code being incorrect). This understanding allows one to proceed with creating the necessary fix. After the fix is made, one must then check for further defects, check that existing tests cover defects that may have been found separately from them or after their creation, add an assertion as another check, then commit and close. By following this process, and keeping a record of it, one can ensure they understand the scope of the problem and have a reference both for future similar issues and for how the program or function has been improved. Keeping a log can also help avoid the pitfalls of randomly changing code until it works, as there is a record of what has been done and therefore less redundancy.

Much of this process I’ve had to learn through intuition, osmosis, and trial and error already, but it is nice to see it written out and clarified. This is less useful this late in the project than it would have been in the beginning, but it certainly doesn’t hurt to go over at any point in the work.

Tracing Execution

Since Python is an interpreted language, tracing is relatively simple, especially with the function sys.settrace, into which a tracing function can be passed. The chapter then goes on to outline creating a function and/or class for tracing, with the ability to customize tracing, to view the line being traced, and to track when a variable is altered. Since the log produced can end up being complicated and hard to read, the chapter also outlines making tracing conditional to ensure that only what is necessary is logged, and how to set up logging to occur based on certain events.
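
A small sketch of conditional tracing that logs variable changes could look like this. The ChangeTracer class and the total function are hypothetical and much simpler than the chapter’s classes; the condition here is simply that only one named function is traced.

```python
import sys

_MISSING = object()  # marks "no previous value seen"


class ChangeTracer:
    """Log local-variable changes, but only inside the function `target`."""

    def __init__(self, target):
        self.target = target
        self.changes = []  # (line number, variable name, new value)
        self._last = {}

    def _trace(self, frame, event, arg):
        if frame.f_code.co_name != self.target:
            return None  # the condition: skip every other function
        if event in ("line", "return"):
            for name, value in frame.f_locals.items():
                if self._last.get(name, _MISSING) != value:
                    self.changes.append((frame.f_lineno, name, value))
                    self._last[name] = value
        return self._trace

    def __enter__(self):
        self._previous = sys.gettrace()
        sys.settrace(self._trace)
        return self

    def __exit__(self, exc_type, exc_value, tb):
        sys.settrace(self._previous)
        return False


def total(values):
    """Hypothetical function to trace."""
    result = 0
    for v in values:
        result += v
    return result


with ChangeTracer("total") as tracer:
    total([1, 2])

for line, name, value in tracer.changes:
    print(f"line {line}: {name} = {value!r}")
```

Restricting the trace to one function keeps the log short enough to read, which is the main point of making tracing conditional.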

Tracing, and even the sys.settrace function, have already been gone over previously, although building a class for it has not. This chapter seems to be more about designing that class than providing any new information. It’s not a construct likely to be useful for our current project, though it may help later on, either in implementation or in theory.

Asserting Expectations

This chapter covers using assertions. Built-in assertions offer better information as to which condition failed and on which line the failure occurred. They also have the benefit of being optional, so they can be turned off to save computation time and power. The chapter also covers uses of assertions outside of test cases, and makes the claim that this is where assertions “show their true power”. This is because assertions can be used to check preconditions, which ensure that the states and conditions are as they should be for running the function they guard. Tests mostly check postconditions, which can also be checked within the function itself. By adding assertions to the functions themselves, the checks are built in to every run (unless, of course, one turns them off). Including assertions for both preconditions and postconditions would make a complete check, but some functions are resistant to one or the other, in which case the assertions can only cover a partial check. Another benefit of including assertions in functions is that they create a clear bit of documentation regarding what the function is doing. Additionally, they provide the pinpointing of failures remarked on at the beginning of the chapter. Assertions in functions can also check data invariants, or invariants, which are properties that given data must always satisfy. These are especially useful with larger and more complex data structures, as they can help check that each piece is working as expected. An important element of this is that production code should not depend on assertions, due to their nature as optional functionality. An assertion can perform a check, but shouldn’t have an effect outside of that check.
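
As a brief sketch of preconditions and postconditions placed inside a function, consider the following. The checked_sqrt function and its tolerance parameter are hypothetical names chosen for this example.

```python
import math


def checked_sqrt(x, tolerance=1e-9):
    """Square root with its own built-in checks."""
    # Precondition: the caller must pass a non-negative number.
    assert x >= 0, "checked_sqrt() requires a non-negative argument"

    result = math.sqrt(x)

    # Postcondition: squaring the result should give back the input.
    assert math.isclose(result * result, x,
                        rel_tol=tolerance, abs_tol=tolerance), \
        "result squared should equal the argument"
    return result


print(checked_sqrt(9))
```

Running Python with the -O option skips assert statements entirely, which is why the function must not rely on them for anything beyond checking.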

This chapter was a compelling and enlightening one for me, since I had never actually considered using assertions outside of test cases. I am now curious as to how this relates to the testing antipattern of including print statements, as I imagine similar or related possibilities for issues there, especially with some of the notes about the resources assertions consume.

Statistical Debugging

This chapter covers considerations of what causes a failure and how that relates to where the error message or the failure takes place. To do so, it outlines a collector class, which is used to record occurrences throughout the running of a program. This is then used in the StatisticalDebugger class, which compares successful and failing runs. This allows the programmer or user to view where the failure occurs, to better diagnose both the problem and the solution. This is done in part by applying a color-coding system to the collected data, so that it is more immediately visible where to start in investigating the source of the failure. There are also notes on other models and theories pertaining to debugging.
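
The core comparison between passing and failing runs can be sketched without the chapter’s classes or color coding. In this simplified version a run counts as failing if it raises an exception, and the average function with its planted defect is a hypothetical example; line numbers are reported as offsets from the def line.

```python
import sys


def lines_executed(function, *args):
    """Run function(*args); return (executed line offsets, passed?)."""
    lines = set()
    code = function.__code__

    def tracer(frame, event, arg):
        if frame.f_code is code and event == "line":
            lines.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        function(*args)
        passed = True
    except Exception:
        passed = False
    finally:
        sys.settrace(previous)
    return lines, passed


def average(values):
    if not values:
        count = 0  # planted defect: an empty list still reaches the division
    else:
        count = len(values)
    return sum(values) / count


passing_lines, failing_lines = set(), set()
for test_input in ([1, 2, 3], [10], []):
    lines, passed = lines_executed(average, test_input)
    (passing_lines if passed else failing_lines).update(lines)

# Lines seen only in failing runs are the first places to look.
suspicious = failing_lines - passing_lines
print("suspicious line offsets:", sorted(suspicious))
```

A line that only ever runs in failing executions is not proof of a defect, but it is a reasonable place to start the investigation.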

On a large codebase, this style of debugging has a lot of merits, especially if implemented early on. As such, it may be useful in future projects.