6 posts on Product

A framework for User-Centered Decisions

24 min read
TBD: Lacks a conclusion, illustrations, and examples.

I was recently asked to describe the guiding principles I would use to weigh three different solutions to a specific user pain point, with an emphasis on the user/customer. There are typically other factors to consider, such as engineering effort, business goals, etc., but these were out of scope in that particular case.

Since most prioritization frameworks include a user-centered / impact component, the framework discussed here can complement them nicely, by simply replacing some of the factors with its outcome. For example, if using RICE, you can use this framework to calculate R×I, then proceed to multiply by C/E as usual (being mindful of units).
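
For instance, a minimal sketch in TypeScript (the numbers and the frameworkScore name are made up for illustration):

// Plain RICE: score = Reach × Impact × Confidence / Effort.
// Here, the outcome of this framework stands in for the R × I product.
const frameworkScore = 12;   // outcome of the user-centered framework (replaces R × I)
const confidence = 0.8;      // 80% confidence
const effort = 3;            // person-months

const riceScore = (frameworkScore * confidence) / effort;
console.log(riceScore);      // 3.2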

The Three Axes

Utility and Usability (which Nielsen groups under Usefulness) are considered the core pillars of Product-led Growth. However, both Utility and Usability are short-term metrics that do not consider the bigger picture, so using them alone as a compass can result in short-sightedness. I think there is also a third axis, which I call Evolution; it refers to how well a feature fits into the bigger picture, examining how it relates to the product’s past and (especially) future.

  1. Utility (aka Impact): How many use cases and user pain points does it address, how well, and how prominent are they?
  2. Usability: How easy is it to use? Evaluating usability at the idea stage is tricky, as overall usability will vary depending on how each idea is implemented, and there is often a lot of wiggle room within the same idea. At this stage, we are only concerned with aspects of usability inherent in the idea itself.
  3. Evolution: How does it relate to features that we have shipped in the past and features we may ship in the future? Being mindful of this prevents feature creep and ensures the mental models exposed via the UI remain coherent.

These are not entirely independent; there are complex interplays between them:

  • Evolution affects Usability: Features that fit poorly into the product’s past and future will create usability issues later. However, treating it as a separate factor helps us catch these issues much earlier, and at the right conceptual level.
  • Utility and Usability can often be at odds: the more powerful a feature is, the more challenging it is to make it usable.

Now let’s discuss each axis in more detail.

Utility

Utility measures the value proposition of a feature for users. It can be further broken down into:

  • Raising the ceiling: What becomes possible? Does it enable any use cases for which there is no workaround?
  • Lowering the floor: What becomes easier? Does it provide a better way to do something for which there is already a workaround? How big is the delta?
  • Widening the walls: Does it serve an ignored audience or market? Does it broaden the set of use cases served by the product?
  • Use Case Significance: How important are the use cases addressed?

While this applies more broadly, it is particularly relevant and top priority for creative tools.

In evaluating the overall Utility of an idea, it can often be helpful to list primary and secondary use cases separately, and evaluate Significance for them separately.

Primary & Secondary use cases

Primary use cases are those for which a solution is optimal (or close), and have often been the driving use cases behind its design. This is to contrast with secondary use cases, for which a solution is a workaround. Another way to frame this is friction: How much friction does the solution involve for each use case? For primary use cases, that should be close to 0, whereas for secondary use cases it will be higher.

A good design will ideally have a healthy amount of both. Lack of secondary use cases could hint that the feature may be overly tailored to specific use cases (overfitting).

The north star goal should obviously be to address all use cases head-on, with zero friction. But since resources are finite, enabling workarounds buys us time. There is far less pressure to solve a use case for which there is a workaround, than one that is not possible at all. The latter contributes to churn far more directly.

It is not unheard of to ship a feature with a low number of primary use cases, simply because it has a high number of secondary use cases, and will buy us time to work on better solutions for them. In these cases, Evolution is even more important: when we later have addressed all these use cases head-on, does this feature still serve a purpose?

Use Case Significance

This is a rough measure of how important the use cases addressed are. This needs to be evaluated holistically: an incremental improvement for a common interaction is far more impactful than a substantial improvement for a niche use case.

Some ways to reason about it may be:

  • Frequency: How frequently do these use cases come up in a single user journey?
  • Reach: What percentage of users do they affect?
  • Criticality: How much do they matter to users? Are they a nice-to-have or a dealbreaker?
  • Vision: How do the use cases relate to the software’s high level goals?

Vision may at first seem more related to the business than the user. However, when software design loses sight of its vision, the result can be a confusing, cluttered user experience that doesn’t cater to any use case well.

Usability

There are many ways to break usability down into independent, quantifiable dimensions. I generally go with a tweaked version of the one I first learned in MIT’s UI Design & Implementation course, which I took in 2016 (and then taught in 2018 and replaced in 2020 😅), bringing it one step closer to Nielsen’s original dimensions by re-adding Satisfaction:

  1. Learnability: How easy is it for users to understand?
  2. Efficiency: Once learned, is it fast to use?
  3. Safety (aka Errors): Are errors few and recoverable?
  4. Satisfaction: How pleasant is it to use?

Some examples of usability considerations and how they relate to these dimensions:

Learnability

  • Compatibility: Does it re-use existing concepts or introduce new ones?
  • Internal Consistency: How consistent is it with the way the rest of the product works?
  • External Consistency: How consistent is it with the environment (other products, related domains, etc.)?
  • Memorability: When users return to the design after a period of not using it, how easily can they reestablish proficiency?

Efficiency

  • Speed: How many steps does it take to accomplish a task and how long does each step take?
  • Cognitive Load: How much mental effort does it require?
  • Physical Load: How much physical effort does it require?

Safety

  • Error-proneness: How hard is it for users to make mistakes?
  • Error severity: How severe are the consequences of mistakes?
  • Recoverability: How easy is it to recover from mistakes?

Satisfaction

  • Aesthetics: How visually pleasing is it?
  • Ergonomics: How comfortable is it to use?
  • Enjoyment: How fun is it to use?

Satisfaction is a bit of an oddball. First, it has limited applicability to certain types of UIs, e.g. non-interactive text-based UIs (programming languages, APIs, etc.). Even where it applies, it can be harder to quantify. But most importantly, when deciding between ideas there is rarely enough signal to gauge satisfaction. If it’s not helpful for your use case, just leave it out.

One idea will rarely have universally better or worse usability than another. More commonly, it will be better in some dimensions and worse in others. To evaluate these tradeoffs, we need to understand the situation and the user.

The situation

“Situation” here refers to the use case plus its context.

The more repetitive or common the task, the higher the importance of Efficiency. For example, text entry is an area where efficiency needs to be optimized down to individual keystrokes or minute pointing movements. On the other end of the spectrum, for highly infrequent tasks (e.g. tax software, visa applications), users don’t have time to develop transferable knowledge across uses, and thus Learnability is very important. Lastly, the more there is at stake, the more important Safety becomes. Some examples where Safety is top priority would be missile launches, airplane navigation, and healthcare software on a macro scale, or privacy, data integrity, and finances on a micro scale.

There is granularity here as well. For example, a visa application is used infrequently enough that learnability matters far more than efficiency for the product in general. However, if it includes a question that expects the user to enter their last 20 international trips, efficiency for trip entry is important.

Sometimes, two factors may genuinely be equally important. Consider a stock trading program used on a daily basis by expert traders. Lost seconds translate to lost dollars, but mistakes also translate to lost dollars. Is Efficiency or Safety more important?

Note that there are also interplays between different dimensions: the more effort a task involves (efficiency), the more high stakes a mistake is perceived to be (safety). You have likely experienced this: a lengthy form losing your answers feels a lot more frustrating than having to re-enter your email in a login form.

The user

As a general rule of thumb, novices need learnability whereas for experts other dimensions of usability are more important. But who is an expert? Expert in what?

Application expertise is orthogonal to domain expertise. Tax software for accountants needs good learnability in terms of application features, but can assume familiarity with tax concepts (but not necessarily recall). Conversely, tax software for regular taxpayers needs both: as software that is typically only used once a year, learnability in terms of application features is top priority. But abstracting and simplifying tax concepts is also important, as most users are not very proficient in them.

Generally speaking, the more we can rely on training, the less important learnability becomes. This is why airplane cockpits are so complex: pilots have spent years of training learning to use these UIs, so efficiency and safety are prioritized instead (or at least should be — sadly that is not always the case).

That said, there is often an opportunity for disruption here, by taking a product that has the potential to bring value to many but currently requires lengthy training, and creating one that requires little to none. Creator tools are prime candidates for this, with no-code/low-code tools being a flagship example right now. However, almost every mainstream technology went through this kind of democratization at some point: computers, cameras, photo editing, video production, etc.

This distinction does not only apply to the product as a whole, but also to individual product areas. For example, an onboarding flow needs to prioritize learnability regardless of the priorities of the rest of the product.

Evolution

Evolution is a bigger-picture measure of how well a proposed feature fits into the product’s past, present, and future, with an emphasis on the latter, since the relationship to the past and present is also covered by the Internal Consistency component of Learnability.

When evaluating compatibility with potential future evolution, it’s important to not hold back. Ten years down the line, when today’s implementation constraints, technology limitations, or resource limits are no more, what would we ship and how does this feature relate to it? Does it have a place in that future, is it entirely unnecessary, or — worse — does it actively conflict with it?

This is to avoid feature creep by ensuring that features are not designed ad hoc, but contribute towards a coherent conceptual design.

The most common way for a feature to connect to the product’s past, present, and future is by being a milestone across a certain axis of progress:

  • Level of abstraction (See Layering):
    • Is it a shortcut to a present or future lower level primitive?
    • Is it a lower level primitive that explains existing functionality?
  • Power: Is it a less powerful version of a future feature?
  • Granularity: Is it a less granular version of a future feature?

If we have a north star UI, part of this is to consider whether a proposed feature is compatible with it or actively diverges from it.

A feature could also be entirely orthogonal to all existing features and still be a net win wrt Evolution. For example, when it helps us streamline UX by allowing us to later remove another feature that has been problematic.

Weighing tradeoffs

While all three are very important, they are not equally important. In broad strokes, usually, Utility > Usability > Evolution. Here’s why:

  • Utility > Usability: If a product does not provide value, people leave, even if it provides a fantastic user experience for the few and/or niche use cases it actually serves.
  • Usability > Evolution, since Evolution is a long-term / more speculative concern, whereas Usability is a more immediate / higher-confidence one.

Depending on the product and the environment however, this trend could be reversed:

  • Competition: If a product is competing in a space where use cases are already covered very well, but by products with poor usability, Usability becomes more important. In fact, many successful products were actually usability innovations: The Web, Dropbox, the iPhone, Zoom, and many others.
  • Mutability: Change is always hard, but for some products it’s a lot harder, making a solid Evolution path more important. Web technologies are an extreme example: it is almost impossible to remove or change anything, ever, as there are billions of uses in the wild, no way to migrate them, and no control over them. Instead, changes have to be designed as new technologies or additions to existing ones.
  • Complexity: The more complexity increases, the more important it becomes to keep further increase at bay, so Evolution becomes more important.

Ok, now make the darn decision already!

So far we’ve discussed various tradeoffs, so it may be unclear how to use this as a framework to make actual decisions.

Decision-making itself also involves tradeoffs: adding structure makes it easier to decide, but consumes more time. To balance this, I tend to favor an iterative approach, adding more precision and structure if the previous step failed to provide enough clarity. For simple, uncontroversial decisions, just discussing the three axes can be sufficient, and the cost-benefit of more structure is not worth it. But for more complex higher stakes decisions, a more structured approach can pay off.

Let’s consider the goals for any scoring framework:

  1. Compare and contrast: Make an informed decision between alternatives without being lost in the complexity of their tradeoffs.
  2. Drive consensus: It is often easier for a team to agree on a rating or weight for an individual factor, than the much bigger decision of which option to go with.
  3. Communicate: Provide a way to communicate the decision to stakeholders, so they can understand the rationale behind it.

Calculating things precisely (e.g. use case coverage, significance, reach etc.) is rarely necessary for any of them, and thus not a good use of time. Remember that the only purpose of scores is to help us compare alternatives. They have no meaning outside of that context. In the spirit of an iterative approach, start with a simple 1-5 score for each factor, and only add more granularity and/or precision if that does not suffice for converging to a decision.

We can use three tables, one for each factor, with a row for each idea. Then the columns are:

Utility
  • Primary use cases
  • Secondary use cases
  • Utility Score (1-5)
Usability
  • Learnability
  • Efficiency
  • Safety
  • Usability Score (1-5)
Evolution
  • Past & Present
  • Future
  • Evolution Score (1-5)

We fill in the freeform columns first, which should then give us a pretty clear picture of the score for each factor.

Finally, using a 3:2:1 weight ratio that reflects the Utility > Usability > Evolution ordering discussed above, the overall score would be:

Overall_Score = (3 × Utility_Score + 2 × Usability_Score + 1 × Evolution_Score) / (3 + 2 + 1)
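
In code, this is just a weighted average (a minimal TypeScript sketch; the example scores are made up):

// Weighted average of the three factor scores, using the 3:2:1 weights above
function overallScore(utility: number, usability: number, evolution: number): number {
	return (3 * utility + 2 * usability + 1 * evolution) / (3 + 2 + 1);
}

console.log(overallScore(4, 3, 2)); // (12 + 6 + 2) / 6 ≈ 3.33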

Template: User-Centered Decision Worksheet

I have set up a Coda template for this, which you can copy and fill in with your own data.

Why Coda instead of something like Google Docs or Google Sheets?

  • I don’t have to repeat each idea in the multiple tables, I can set them up as views and they update automatically
  • Rich text (lists, etc.) within table cells makes it easier to brainstorm
  • One-click rating widgets for scores (great when iterating)
  • I can output the overall score for each feature with a formula, and it updates automatically. No need to clumsily copy-paste it across cells either, I can just define it once for the whole column. I can even use controls for the weights that are outside the table entirely.
  • This may be subjective, but I find Coda docs better designed than any alternative I’ve tried.

Screenshot of Coda tooltip

As a bonus, I can then even @-mention each feature in the rest of the doc, and hovering over it shows a tooltip with all its metadata!
Product, Product Design, User Centered Design, Decision Making, Prioritization

Tradeoff scorecard: The ultimate decision-making framework

5 min read

Every decision we make involves weighing tradeoffs, whether that is done consciously or not: from whether an extra bedroom is worth an extra $800 in rent, to whether being able to sleep lying down during a long flight is worth the $500 upgrade cost, to whether you should take a pay cut for that dream job.

For complex high-stakes decisions involving many competing tradeoffs, trying to decide with your gut can be paralyzing. The complex tradeoffs that come up when designing products [1] fall in that category so frequently that analytical decision-making skills are considered one of the most important skills a product manager can have. I would argue it’s a bit broader: analytical decision-making is one of the most useful skills a human can have.

Structured decision-making is a passion of mine (perhaps as a coping mechanism for my proneness to analysis paralysis). In fact, one of the very first demos of Mavo (the novice programming language I designed at MIT) was a decision-making tool for weighted pros & cons. It was even one of the two apps our usability study participants were asked to build during the first Mavo study. I do not only use the techniques described here for work-related decisions, but for any decision that involves complex tradeoffs (often to the amusement of my friends and spouse).

Screenshot of the Decisions Mavo app

The Decisions Mavo app, one of the first Mavo demos, is a simple decision-making tool for weighted pros & cons.

Before going any further, it is important to note a big caveat. Decision-making itself also involves tradeoffs: adding structure makes decisions easier, but consumes more time. To balance this, I tend to favor an iterative approach, adding more precision and structure only if the previous step failed to provide clarity. Each step progressively builds on the previous one, minimizing superfluous effort.

For very simple, uncontroversial decisions, just discussing or thinking about pros and cons can be sufficient, and the cost-benefit of more structure is not worth it. Explicitly listing pros and cons is probably the most common method, and works well when consensus is within reach and the decision is of moderate complexity. However, since not all pros and cons are equivalent, this delegates the weighing to your gut. For more complex or controversial decisions, there is value in spending the time to also make the weighing more structured.

The tradeoff scorecard

What is a decision matrix?

A decision matrix, also known as a scorecard, is a table with options as rows and criteria as columns, with a final column that calculates a score for each option based on the criteria. These are useful both for selection and for prioritization, where the score is used for ranking options. In selection use cases, the columns can be specific to the problem at hand, or predefined based on certain principles or factors, or a mix of both. Prioritization tends to use predefined columns to ensure consistency. There are a number of frameworks out there about what these columns should be and how to calculate the score, with RICE (Reach × Impact × Confidence / Effort) likely being the most popular.

The Tradeoff Scorecard is not a prioritization framework, but a decision-making framework for choosing among several options.

Qualitative vs. quantitative tradeoffs

Typically, tradeoffs fall in two categories:

  • Qualitative: Each option either includes the tradeoff or it doesn’t. Think of them as tags that you can add to or remove from each option.
  • Quantitative: The tradeoff is associated with a value (e.g. price, effort, number of clicks, etc.)

Not all tradeoffs are equal. Even for qualitative tradeoffs, some are more important than others, and the differences can be quite vast. Some strengths may be huge advantages, while others are minor nice-to-haves. Similarly, some weaknesses can be dealbreakers, while others are minor issues.

We can model this by assigning a weight to each tradeoff (typically a 1-5 or 1-10 integer). But if qualitative tradeoffs have a weight, doesn’t that make them quantitative? The difference is that the weight applies to the tradeoff itself, and is applied the same way to each option, whereas the value of a quantitative tradeoff quantifies the relationship between tradeoff and option, and is thus different for each. Note that quantitative tradeoffs also have a weight, since they don’t all matter the same either.

In diagrammatic form, it looks a bit like this:

Simplified UML-like diagram showing that each tradeoff has a weight, but the relationship between option and quantitative tradeoff also has a value

These categories are not set in stone. It is quite common for qualitative tradeoffs to become quantitative down the line as we realize we need more granularity. For example, you may start with “Poor discoverability” as a qualitative tradeoff, then realize that there is enough variance across options that you instead need a quantitative “Discoverability” factor with a 1-5 rating. The opposite is more rare, but it’s not unheard of to realize that a certain factor does not have enough variance to be worth a quantitative tradeoff and instead should be modeled as 1-2 qualitative tradeoffs.

The overall score of each option is the sum of the scores of each individual tradeoff for that option. The score of each tradeoff is often simply its weight multiplied by its value, using 1/-1 as the value of qualitative tradeoffs (pro = 1, con = -1).

While qualitative tradeoffs are either pros or cons, quantitative tradeoffs may not be universally positive or negative. For example, consider price: a low price is a strength, but a high price is a weakness. Similarly, effort is a strength when low, but a weakness when high. Calculating a score for these types of tradeoffs can be a bit more involved:

  • For ratings, we can subtract the midpoint and use that as the value. E.g. by subtracting 3 from a 1-5 rating we get a value from -2 to 2. Adjust accordingly if you don’t want the switch to happen in the middle.
  • For less constrained values, such as prices, we can use the value’s percentile instead of the raw number.
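
Here is a rough sketch of these scoring rules in TypeScript. The types and names are mine, purely illustrative; you may need to flip signs depending on whether higher values are good or bad for a given tradeoff:

// Hypothetical model of the scoring rules described above
type Tradeoff =
	| { kind: "qualitative"; weight: number; pro: boolean }                // pro = +1, con = -1
	| { kind: "rating"; weight: number; rating: number; midpoint: number } // e.g. 1-5 scale with midpoint 3
	| { kind: "unbounded"; weight: number; percentile: number };           // e.g. price, as a 0-1 percentile

function tradeoffScore(t: Tradeoff): number {
	switch (t.kind) {
		case "qualitative": return t.weight * (t.pro ? 1 : -1);
		case "rating":      return t.weight * (t.rating - t.midpoint);
		case "unbounded":   return t.weight * (0.5 - t.percentile); // below-median values (e.g. cheap prices) score positive
	}
}

// The overall score of an option is the sum of its individual tradeoff scores
const optionScore = (tradeoffs: Tradeoff[]): number =>
	tradeoffs.reduce((sum, t) => sum + tradeoffScore(t), 0);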

Explicit vs implicit tradeoffs

When listing pros and cons across many options, have you noticed that there is a lot of repetition? First, several options share the same pros and cons, which is expected, since they are alternative solutions to the same problem. But there is also repetition because pros and cons come in pairs: each strength has a complementary weakness (which is the absence of that strength), and vice versa.

For example, if one UI option involves a jarring UI shift (a bad thing), the presence of this is a weakness, but its absence is a strength! In other words, each qualitative tradeoff is present on all options, either as a strength or as a weakness. The decision of whether to pick the strength or the weakness as the primary framing for each tradeoff is often based on storytelling and/or minimizing effort (which one is more common?). A good rule of thumb is to avoid negatives (e.g. instead of listing “no jarring UI shift” as a pro, list “jarring UI shift” as a con).

It may seem strange to view it this way, but imagine you were trying to compare and contrast five different ideas, three of which involved a jarring UI shift. You would probably list “no jarring UI shifts” as a pro for the other two, right?

This realization helps cut the amount of work needed in half: we simply assume that for any tradeoff not explicitly listed, its opposite is implicitly listed.
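
In code, deriving the implicit tradeoffs is a simple set difference. A minimal TypeScript sketch, with hypothetical tradeoff names:

// Any qualitative tradeoff an option does not list explicitly applies as its opposite
type Option = { name: string; explicit: Set<string> };

function implicitTradeoffs(option: Option, allTradeoffs: string[]): string[] {
	return allTradeoffs
		.filter((t) => !option.explicit.has(t))
		.map((t) => `no ${t}`); // e.g. "jarring UI shift" → "no jarring UI shift"
}

const all = ["jarring UI shift", "extra click per use"];
const optionA: Option = { name: "A", explicit: new Set(["jarring UI shift"]) };
console.log(implicitTradeoffs(optionA, all)); // ["no extra click per use"]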

Putting it all together

Your choice of tool can make a big difference to how easy this process is. In theory, we could model all tradeoffs as a classic decision matrix, with a column for each tradeoff. Quantitative tradeoffs would correspond to numeric columns, while qualitative tradeoffs would correspond to boolean columns (e.g. checkboxes).

Indeed, if all we have is a grid-based tool (e.g. spreadsheets), we may be stuck doing exactly that. It does have the advantage that it makes it trivial to convert a qualitative tradeoff to a quantitative one, but it can be very unwieldy to work with.

If our tool of choice supports lists within cells, we can do better. These boolean columns can be combined into one column as a list of all relevant tradeoffs. Then, a separate table can be used to define weights for each tradeoff (and any other metadata, e.g. optional notes).

I currently use Coda for these tradeoff scorecards. While not perfect, it does support lists in cells, and has a few other features that make working with tradeoff scorecards easier:

  • Thanks to its relation concept, the list of tradeoffs can actually be linked to their definitions. This means that hovering over each tradeoff displays a popup with its metadata, and that I can add tradeoffs by selecting them from a popup menu.
  • Conditional formatting allows me to color-code tradeoffs based on their type (strength/weakness or pro/con) and weight (lighter color for smaller impact).
  • Its formula language allows me to show and list the implicit tradeoffs for each option (though there is no way to have them be color-coded too).

There are also limitations, however:

  • While I can apply conditional formatting to color-code the opposite of each tradeoff, I cannot display implicit tradeoffs as nice color-coded chips, in the same way as explicit tradeoffs, since relations can only display the primary column.
  • Weights for quantitative tradeoffs have to be baked into the formula (there are some ways to make them editable, but …)

  1. I use product in the general sense of a functional entity designed to fulfill a specific set of needs or solve particular problems for its users. This does not only include commercial software, but also things like web technologies, open source libraries, and even physical products. ↩︎


Evolution: The missing component of Product-Led Growth

6 min read
TBD: Lacks a conclusion, illustrations, and examples.

What is Product-Led Growth?

In the last few years, Product-Led Growth has seen a meteoric rise in popularity. The idea is simple: instead of relying on sales and marketing to acquire users, you build a product that sells itself. As a usability advocate, this makes me giddy: Prioritizing user experience is now a business strategy, with senior leadership buy-in!

NN/G considers Utility and Usability the core components of Product-led Growth, which Nielsen groups under a single term: Usefulness. Utility refers to how many use cases are addressed, how well, and how significant these use cases are. If you thought that sounds very much like the R×I in RICE, you’d be right: they are indeed roughly the same concept, seen from a different perspective. Usability, as you probably know, refers to how easy the product is to use, and can be further broken down into individual components, such as Learnability, Efficiency, Safety, and Satisfaction.

Indeed, maximizing Utility and Usability is crucial for creating products that add value. However, both suffer from the same flaw: they are short-term metrics and do not consider the bigger picture over time. It’s like playing chess while only thinking about the next move: you could be making excellent choices on each turn and still lose the game. Great Utility and Usability alone do not prevent feature creep. We can still end up with a convoluted user experience that lacks a coherent conceptual model; all it takes is enough time.

Therefore, I think there is also a third component, which I call Evolution. Evolution refers to how well a feature fits into the bigger picture of a product, by examining how it relates to its past, present, and future (or, more accurately, its various possible futures). By prioritizing features higher when they are part of a trajectory or greater plan, and deprioritizing those that are designed ad hoc, we can limit complexity, avoid feature creep, and ensure we are moving towards a coherent conceptual design.

Introducing entirely new concepts is not an antipattern by any means, that’s how products evolve! However, it should be done with caution, and the bar to justify such features should be much higher.

The three axes are not entirely independent. Evolution will eventually affect Usability; the whole point of treating Evolution as a separate axis is that it allows us to catch these issues early and prevent them in the making. By the time conceptual design issues create usability problems, it’s often too late: the changes required to fix the underlying design are a lot more substantial and costly.

The weight of Evolution

The importance of Evolution was really drilled into me while designing web technologies, i.e. the technologies implemented in browsers that web developers use to develop websites and web applications. We do not have a name for it, but the consideration is very high priority when designing any feature for the Web.

In general, Utility and Usability matter more than Evolution. Just like in chess, the next move is far more important than any subsequent move. The argument this post is making is that we should look further than the current roadmap, not that we should stop looking at what’s right in front of us. However, there are some cases where Evolution may become equally important as the other two, or even more.

Low mutability is one such case. Change is always hard, but for some products it’s a lot harder. Web technologies are an extreme example, where you can never remove or change anything. There are billions of uses in the wild, that you have no control over, and no way to migrate users. You cannot risk breaking the Web. Instead, changes must be designed as either additions to existing technologies, or (if substantial enough) as entirely new technologies. The best you can hope for is that if you deprecate the old technology, and you heavily promote the new one, over many years usage of the old technology will drop below the usage threshold that allows considering removal (< 0.02%!). I have often said that web standards work is “product work on hard mode”, and this is one of the reasons. If you do product work, pause for a moment and consider this: How much harder would shipping be if you knew you could never remove or change anything?

Another case is high complexity. Many things that are complex today began as simple things. The cost of adding features without validating their Evolution story is increasing complexity. To some degree, complexity is the fate of every successful product, but being deliberate about adding features can curb the rate of increase. Evolution tends to become higher priority as a product matures. This is artificial: keeping complexity at bay is just as important in the beginning, if not more. However, it is often easier to see in retrospect, after we’ve already felt the pain of increasing complexity.

The value of a North Star UI

In evaluating Evolution for a feature, it’s useful to have alignment on what our “North Star UI(s)” might be.

A North Star UI is the ideal UI for addressing a set of use cases and pain points in a perfect world where we have infinite resources and no practical constraints (implementation difficulty, performance, backwards compatibility, etc.). Sure, many problems are genuinely so hard that even without constraints, the ideal solution is still unknown. However, there are many cases where we know exactly what the perfect solution would be, but it’s simply not feasible, so we need to keep looking.

In these cases, it’s useful to document this “North Star UI” and ensure there is consensus around it. You can even do usability testing (using wireframes or prototypes) to validate it.

Why would we do this for something that’s not feasible? First, it can still be useful as a guide to steer us in the right direction. Even if you can’t get all the way there, maybe you can get close enough that the remaining distance won’t matter. And in the process, you may find that the closer you get, the more feasible it becomes.

Second, it ensures team alignment, which is essential when trying to decide what compromises to make. How can we reach consensus on the right tradeoffs if we are not even aligned on what the solution would be if we didn’t have to make any compromises?

Third, it builds team momentum. Doing usability testing on a prototype can do wonders for getting people on board who may have previously been skeptical. I would strongly advise to include engineers in this process, as engineering momentum can literally make the difference between what is possible and what is not.

Last, I have often seen “unimplementable” solutions become implementable later on, due to changes in internal or external factors, or simply because a brilliant engineer had a brilliant idea that made the impossible, possible. In my 11 years of designing web technologies, I have seen this happen so many times, I now interpret “cannot be done” as “really hard — right now”.

Mini Case study 1: CSS Nesting Syntax

My favorite example, and something I’m proud to have personally helped drive, is the current CSS Nesting syntax, now shipped in every browser. We had plenty of signal for what the optimal syntax was for users (our North Star UI), but it had been vetoed by engineering across all major browsers due to prohibitive performance, so we had to design around certain parsing constraints. The original design was quite verbose, actively conflicted with the NSUI syntax, and had poor compatibility with another related feature (@scope). Instead of completely diverging, I proposed a syntax that was a subset of our NSUI, just more explicit in some (common) cases. Originally discussed as “Lea’s proposal”, it was later named “Non-letter start proposal”, but became known as Option 3 from its position among the five options considered. After some intense weighing of tradeoffs and several user polls and surveys, the WG resolved to adopt that syntax.

Once we got consensus on that, I started trying to get people on board to explore ways (and brainstorm potential algorithms) to bridge the gap. A few other WG members joined me, with my co-TAG member Peter Linss perhaps being the most vocal. We initially faced a lot of resistance from browser engineers, until eventually a couple of Chrome engineers closed in on a way to implement the north star syntax 🎉, and as they say, the rest is history.

It was not easy to get there, and required weighing Evolution as a factor. There were diverging proposals that in some ways had better syntax than that intermediate milestone. If we only looked at the next move, if we had only used Utility and Usability to guide us, we would have made a suboptimal long-term decision.

Evaluating Evolution

To evaluate Utility, we can look at the use cases a feature addresses, and how significant they are. Evaluating Usability is also a matter of evaluating its individual components, such as Learnability, Efficiency, Safety, and Satisfaction. This can be done via usability testing, or heuristic evaluation, and ideally both. But how do we evaluate Evolution for a proposed feature?

How well it fits with the product’s past and present overlaps with Usability (through Internal Consistency, a component of Learnability), but is also important to consider.

When evaluating how well a feature fits into the product’s future, we can use the north star UI if we have one, as well as other related features that could plausibly be shipped in the future (e.g. have already been discussed, or are natural evolutions of existing features).

Does this feature connect to the product’s past, present, and future across a certain axis of progress? For example:

  • Level of abstraction (See Layering):
    • Is it a shortcut to a present or future lower level primitive?
    • Is it a lower level primitive that explains existing functionality?
  • Power: Is it a less powerful version of a future feature?
  • Granularity: Is it a less granular version of a future feature?

Other considerations:

  • Opportunity cost: What does introducing this feature prevent us from doing in the future?
  • Simplification: What does it allow us to remove?
TBD: Lacks a conclusion, illustrations, and examples.

What is a North Star UI and how can it help product design?

9 min read

“Oh we can’t possibly do that, it would be way too much work to implement!”

Raise your hand if you’ve ever heard (or worse, said) this during a product design brainstorming session. In my experience, this argument is where a lot of good product design goes to die.

No, I’m not suggesting you should drain engineering resources chasing the perfect UI! Yes, in the end it will all boil down to Impact/Effort [1]. But bringing ephemeral constraints such as implementation difficulty up too early is counterproductive.

By ephemeral I’m referring to constraints that are not intrinsic to the problem at hand (these are requirements!), but those that have to do with the current state of technology, the team, the company, or the broader environment.

For example, “This interaction will be very common and needs to be as frictionless and efficient as possible” is a requirement, as it is intrinsic to the use case. Ephemeral constraints are things like:

  • Engineering resources
  • Performance
  • Technical feasibility (to some extent)
  • Backwards compatibility

A tool I keep coming back to is what I call a “North Star UI”. A North Star UI is the ideal UI for addressing a set of pain points and use cases in a perfect world with no ephemeral constraints. It answers the question: “What would we ship if both we and our users had infinite resources, and all our users were new?”

I have often mentioned this concept in discussions, and it seemed to be generally understood. However, a quick google search revealed that outside of this blog, there are only a couple mentions of the term across the entire Web, and the only actual definition seems to be a callout in my Eigensolutions essay. That needed to be fixed — if I’ve found this concept so useful, it’s highly likely that others would too!

For the uninitiated, designing a solution that ignores essential constraints seems like a pointless academic exercise. “Why would we spend precious resources on something that’s not feasible? We should be pragmatic, not chase pipe dreams!” I often hear. And yet, perhaps counterintuitively, a solid NSUI can provide so many benefits that over time it actually reduces the amount of resources spent on product design.

What benefits? Let’s dive in.

1. Simplifies problem solving

A common problem-solving strategy in every domain is to break down a complex problem into smaller, more manageable components and solve them separately. Product design is no different. The concept of a North Star UI breaks down product design tasks into three more manageable components:

  1. North Star UI (NSUI): What is the ideal solution?
  2. Ephemeral constraints: What prevents us from getting there?
  3. Compromises: How close can we reasonably get given the timeframe we have?

These subproblems do not always have the same level of difficulty. Sometimes the NSUI is obvious, and the big product design challenge is navigating the compromises. Other times, the big challenge is figuring out what the NSUI should be and once this is clear, it turns out it’s also perfectly feasible. And most of the time, both are hard, so solving them separately can be a huge help.

2. Facilitates team alignment and helps build consensus

This is a pattern I’ve seen very frequently, across many different teams: disagreements about the NSUI will often masquerade as disagreements about practical constraints, so people will waste time and energy debating the wrong issue.

Here is a story that may sound familiar: Bob will say “X is way too much work, it’s not worth doing”, but what he is actually thinking is “X is a bad idea, so any nontrivial amount of work towards it is a waste”, while Alice thinks that X is an elegant solution that would create an incredible user experience and is worth a somewhat higher implementation cost. Instead of spending time figuring out whether X is a good idea in itself, they spend their time debating how much work it is and how that could be simplified, but fail to reach consensus because that is not actually the root issue. (Any similarity to real persons, living or dead, is purely coincidental.) 😅

The thing is, when the NSUI is not documented, it still exists, but everyone has their own version. It is important to answer these questions in order, and reach consensus on what the North Star UI is before moving on. We need to be aware of what is an actual design decision and what is a compromise driven by practical constraints.

By articulating these separately, they can also be debated separately, so that when we are at the stage of evaluating compromises, we are all on the same page about what we are trading off and how much it’s worth. How can you do a cost-benefit analysis, without knowing both the cost and the benefit?

NSUIs can even be user tested, using wireframes or prototypes, which can be particularly useful when there are vastly different perspectives within a team about what the NSUI is, or when the problem is so novel that every potential solution is on shaky ground. Even the best product intuition can be wrong, and there is no point in evaluating compromises if it turns out that even the “perfect” solution is not actually a good one.

3. Paves the way for getting there (someday)

Just like the mythical North Star, a NSUI can serve as a guide to steer us in the right direction. Simply articulating what the NSUI is can in itself make it more feasible. No, it’s not magic, just human psychology.

First, once we have a NSUI, we can use it to evaluate proposed solutions: How do they relate to a future where the NSUI is implemented? Are they a milestone along that path, or do they actively prevent us from ever getting there?

Prioritizing solutions that are milestones on the path to the NSUI can be a powerful tool in building momentum. Once we’re partway there, it naturally begs the question: how much closer can we get? It is much easier to convince people to move a little further along the path they are already on, than to move to a completely different path. Even if we can’t get all the way there, maybe we can get close enough that the remaining distance won’t matter. And often, the closer you get, the more feasible it becomes. In some cases, simply reframing the NSUI as a sequence of milestones rather than a binary goal can be all that is needed to make it feasible. (Perhaps I should call this …)

4. Today’s constraints are not tomorrow’s constraints

NSUIs make our design process more resilient and adaptable. I have often seen “unimplementable” solutions become implementable down the line, due to changes in internal or external factors, or simply because someone had a brilliant idea that made the impossible, possible. I have seen this happen so many times that I have learned to interpret “cannot be done” as “really hard — right now”.

When this happens, it’s important to have a solid foundation to fall back on, rather than having to go back to the drawing board because design and constraints were so intertwined we didn’t know where our actual design choices ended and the practical compromises began. With a solid NSUI in place, when constraints are lifted we only need to re-evaluate the compromises.

Change in Engineering Momentum: Sentiment Chips

Here is a little secret that applies to nearly all software engineers: neither feasibility nor effort are fixed for a given task.

Engineers are not automatons that will blindly implement whatever they are told to. Product managers are often content to get engineering to reluctantly agree to implement, but then you’re getting very poor ROI out of your engineering team.

Often all that is needed to make the infeasible, feasible is engineering momentum. Investing the extra time and energy to get engineers excited can really pay off. When good engineers are excited, they become miracle workers. The difference is not small, it is orders of magnitude. Things that were impossible or insurmountable become feasible, and things that would normally take weeks or even months can be prototyped in days.

One way to get engineers excited is to convince them about the value and utility of what they are building. It helps a lot to have them observe usability testing sessions and to be able to back product decisions up with data.

As I discovered last year by accident, there is also another, more …Machiavellian way to build engineering momentum: The NSUI is too hard? Propose a much easier solution that you know engineers would hate, such as one that turns a structured interaction into unstructured data. As much as I wish I could be that strategic 😅, this was not something I had planned in advance, but it was very effective in retrospect: I got an entire backend to work with that I had thought was entirely out of the question!

Change in the Environment: CSS Conic Gradients

background: conic-gradient(in hsl,
	red, orange, yellow, lime,
	cyan, blue, magenta, red);
Conical gradients are often used to render hue wheels.

Sometimes, the environment changes and a previously infeasible or high effort feature becomes feasible or even trivial. An example that comes to mind is CSS conic gradients. Conic gradients are the type of gradient that is created by (conceptually) rotating a ray around a center point.

I originally proposed adding conic gradients to CSS in 2011, and they first shipped in 2018 (in Chrome 69)! Someone observing this timeline without context may just conclude “pffft, standards just take forever to ship”. But there is always a reason, either technical, human, or both. In this case, the reason was technical. Browsers do not implement things like shadows and gradients from scratch, they use graphics libraries such as Skia, Cairo, or Core Graphics, which in turn are also abstractions over the OS-provided graphics APIs.

At the time these libraries did not support any primitive that could be used to render conic gradients (e.g. sweep gradients, mesh gradients, etc.). In the years that followed, one after the other added support for some kind of gradient primitive that could be used to easily render conic gradients, which took the proposal from high to low effort. I also created a polyfill which stimulated developer demand, increasing Impact. These two things together took the Impact/Effort ratio from “not worth it” to “let’s do this, stat” and in 2 years the feature was implemented in every major browser.

Someone has a Lightbulb Moment: Relaxed CSS Nesting Syntax

Sometimes high effort things just take a lot of hard work and there is no way around it. Other times they are one good idea away.

One of my favorite examples, and something I’m proud to have helped drive, is the relaxed CSS Nesting syntax, now shipped in every browser. It is such an amazing case study on the importance of having a North Star UI that I even did an entire talk about it at Web Unleashed, with a lot more technical detail than I have included here.

In a nutshell, CSS nesting is a syntax that allows CSS developers to reduce repetition and better organize their code by nesting rules inside other rules.

table.browser-support {
	border-collapse: collapse;
}
table.browser-support th,
table.browser-support td {
	border: 1px solid silver;
}
@media (width < 600px) {
	table.browser-support,
	table.browser-support tr,
	table.browser-support th,
	table.browser-support td {
		display: block;
	}
}
table.browser-support th {
	border: 0;
}
table.browser-support td {
	background: yellowgreen;
}
table.browser-support td:empty {
	background: red;
}
table.browser-support td > a {
	color: inherit;
}
table.browser-support {
	border-collapse: collapse;

	@media (width < 600px) {
		&, tr, th, td { display: block; }
	}

	th, td { border: 1px solid silver; }
	th { border: 0; }

	td {
		background: yellowgreen;

		&:empty { background: red; }

		> a { color: inherit; }
	}
}

Example of CSS code, with (right) and without (left) nesting. Which one is easier to read?

This is one of the few cases where the NSUI was well known in advance, since the syntax was well established in developer tooling (CSS preprocessors). Instead, the big challenge was navigating the practical constraints, since CSS implemented in browsers has different performance characteristics, so a syntax that is easily feasible for a preprocessor may be out of reach for a browser. In this case, the NSUI syntax had been ruled out by browser engineers due to prohibitive parsing performance [2], so we had to design a different, more explicit syntax that could be parsed more efficiently.

Initial attempts at a syntax that satisfied these constraints introduced a lot of noise, in the form of an awkward @nest token that needed to be placed at the beginning of many nested rules.

At this point, it is important to note that CSS Nesting is a feature that, once available, is used all over a stylesheet, not just a couple of times here and there. For such widely used features, every character counts. Conciseness and readability of syntax are paramount, especially when conciseness is the sole purpose of the feature in the first place!

Worse yet, these attempts were actively incompatible with the NSUI syntax, as well as other parts of CSS (namely, the @scope rule). This meant that even if the NSUI became feasible later, CSS would need to forever support syntax that would then have no purpose, it would exist just as a wart from the past, just like HTML doctypes.

This proposal sat dormant for a while, since implementors were not exactly in a hurry to ship it. This all changed when State of CSS 2022 showed Nesting as the top missing CSS feature, making Google suddenly very keen to ship it.

A small subset of the CSS Working Group, led by Elika Etemad and yours truly, organized a number of breakouts to explore alternatives, an effort that produced not one, not two, but four competing proposals. The one the group voted to adopt [3] was the one I designed with the NSUI in mind, by asking the question: if the NSUI is out of the question right now, how close can we get while staying compatible with it, in case it becomes feasible later on?

Once we got consensus on this intermediate syntax, I started exploring whether we could get any closer to the NSUI, even attempting to propose an algorithm that would reduce the number of cases that required the slower parsing to essentially an edge case. A few other WG members joined me, with my co-TAG member Peter Linss being most vocal. This is a big advantage of NSUI-compatible designs: it is much easier to convince people to move a little further along on the path they are already on, than to move to a completely different path. With a bit of luck, you may even find yourself implementing an “infeasible” NSUI without even realizing it, one step at a time.

We initially faced a lot of resistance from browser engineers, until eventually Anders Ruud and his team experimented with variations of my proposed algorithm and actually closed in on a way to implement the NSUI syntax in Chrome. The rest, as they say, is history.


  1. More elaborate prioritization schemes such as RICE are merely about breaking down either the Impact or the Effort or both into more granular components or introducing uncertainty into the equation. But ultimately, it’s all about the Impact/Effort ratio. ↩︎

  2. For any compilers geeks out there who want all the deets: it required potentially unbounded lookahead, since there is no fixed number of tokens a parser can read and still be able to tell the difference between a selector and a declaration. ↩︎

  3. Originally dubbed “Lea’s proposal”, and later “Non-letter start proposal”, but became known as Option 3 from its position among the five options considered (including the original syntax). ↩︎

Product, Product Design, Product Management, User Centered Design, Product-Led Growth, North Star UI, Collaboration, Case Studies

Context Chips in Survey Design: “Okay, but how does it feel?”

16 min read

The story of how a weird little UI to collect sentiment alongside survey responses defied constraints and triumphed over skepticism through usability testing.

Minimalistic skeleton diagram showing the concept presented in this article

One would think that we’ve more or less figured survey UI out by now. Multiple choice questions, checkbox questions, matrix questions, dropdown questions, freeform textfields, numerical scales, what more could one possibly need?!

And yet, every time Google sponsored me to lead one of the State Of … surveys, and especially the inaugural State of HTML 2023 Survey, I kept hitting the same wall: I kept feeling that the established options for answering UIs were woefully inadequate for balancing the collection of good insights with minimal friction for end-users.

The State Of surveys used a completely custom survey infrastructure, so I could often (but not always) convince engineering to implement new question UIs. After joining Font Awesome, I somehow found myself leading yet another survey, despite swearing never to do this again. 🥲 Alas, building a custom survey UI was simply not an option in this case; I had to make do with the existing options out there [1], so I felt this kind of pain to my core once again.

So what are these cases where the existing answering UIs are inadequate, and how could better ones help? I’m hoping this case study will be Part 1 of a series on how survey UI innovations can help balance the tradeoffs between user experience and data quality. It is definitely the one I’m most proud of: it was such a bumpy ride, but it was all worth it in the end.


  1. Unlike Devographics, surveys are not FA’s core business, so the Impact/Effort tradeoff simply wasn’t there for a custom UI, at least at this point in time. I ended up going with Tally, mainly due to the flexibility of its conditional logic and its support for code injection (which among other things, allowed me to use FA icons — a whopping 120 different ones!). ↩︎

Continue reading

Survey Design, Product, Product Design, Design Thinking, Case Studies, UX, Usability, North Star UI

Eigensolutions: composability as the antidote to overfit

14 min read

tl;dr: Overfitting happens when solutions don’t generalize sufficiently and is a hallmark of poor design. Eigensolutions are the opposite: solutions that generalize so much they expose links between seemingly unrelated use cases. Designing eigensolutions takes a mindset shift from linear design to composability.

Continue reading

Product, Design Thinking, Creator Tools, Product Management, North Star UI