6 posts on Product

A framework for User-Centered Decisions

24 min read
TBD: Lacks a conclusion, illustrations, and examples.

I was recently asked to describe the guiding principles I would use to weigh three different solutions to a specific user pain point, with an emphasis on the user/customer. There are typically other factors to consider, such as engineering effort, business goals, etc., but these were out of scope in that particular case.

Since most prioritization frameworks include a user-centered / impact component, the framework discussed here can complement them nicely, by simply replacing some of the factors with its outcome. For example, if using RICE, you can use this framework to calculate R×I, then proceed to multiply by C/E as usual (being mindful of units).
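
For instance, a minimal sketch in TypeScript (the numbers and the frameworkScore name are made up for illustration):

// Plain RICE: score = Reach × Impact × Confidence / Effort.
// Here, the outcome of this framework stands in for the R × I product.
const frameworkScore = 12;   // outcome of the user-centered framework (replaces R × I)
const confidence = 0.8;      // 80% confidence
const effort = 3;            // person-months

const riceScore = (frameworkScore * confidence) / effort;
console.log(riceScore);      // 3.2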

The Three Axes

Utility and Usability (which Nielsen groups under Usefulness) are considered the core pillars of Product-led Growth. However, both Utility and Usability are short-term metrics that do not consider the bigger picture, so using them alone as a compass can result in short-sightedness. I think there is also a third axis, which I call Evolution; it refers to how well a feature fits into the bigger picture, examining how it relates to the product’s past and (especially) future.

  1. Utility (aka Impact): How many use cases and user pain points does it address, how well, and how prominent are they?
  2. Usability: How easy is it to use? Evaluating usability at the idea stage is tricky, as overall usability will vary depending on how each idea is implemented, and there is often a lot of wiggle room within the same idea. At this stage, we are only concerned with aspects of usability inherent in the idea itself.
  3. Evolution: How does it relate to features that we have shipped in the past and features we may ship in the future? Being mindful of this prevents feature creep and ensures the mental models exposed via the UI remain coherent.

These are not entirely independent; there are complex interplays between them:

  • Evolution affects Usability: Features that fit poorly into the product’s past and future will create usability issues later. However, treating it as a separate factor helps us catch these issues much earlier, and at the right conceptual level.
  • Utility and Usability can often be at odds: the more powerful a feature is, the more challenging it is to make it usable.

Now let’s discuss each axis in more detail.

Utility

Utility measures the value proposition of a feature for users. It can be further broken down into:

  • Raising the ceiling: What becomes possible? Does it enable any use cases for which there is no workaround?
  • Lowering the floor: What becomes easier? Does it provide a better way to do something for which there is already a workaround? How big is the delta?
  • Widening the walls: Does it serve an ignored audience or market? Does it broaden the set of use cases served by the product?
  • Use Case Significance: How important are the use cases addressed?

While this applies more broadly, it is particularly relevant and top priority for creative tools.

In evaluating the overall Utility of an idea, it can often be helpful to list primary and secondary use cases separately, and evaluate Significance for them separately.

Primary & Secondary use cases

Primary use cases are those for which a solution is optimal (or close), and have often been the driving use cases behind its design. This is to contrast with secondary use cases, for which a solution is a workaround. Another way to frame this is friction: How much friction does the solution involve for each use case? For primary use cases, that should be close to 0, whereas for secondary use cases it will be higher.

A good design will ideally have a healthy amount of both. Lack of secondary use cases could hint that the feature may be overly tailored to specific use cases (overfitting).

The north star goal should obviously be to address all use cases head-on, with zero friction. But since resources are finite, enabling workarounds buys us time. There is far less pressure to solve a use case for which there is a workaround, than one that is not possible at all. The latter contributes to churn far more directly.

It is not unheard of to ship a feature with a low number of primary use cases, simply because it has a high number of secondary use cases, and will buy us time to work on better solutions for them. In these cases, Evolution is even more important: when we later have addressed all these use cases head-on, does this feature still serve a purpose?

Use Case Significance

This is a rough measure of how important the use cases addressed are. This needs to be evaluated holistically: an incremental improvement for a common interaction is far more impactful than a substantial improvement for a niche use case.

Some ways to reason about it may be:

  • Frequency: How frequently do these use cases come up in a single user journey?
  • Reach: What percentage of users do they affect?
  • Criticality: How much do they matter to users? Are they a nice-to-have or a dealbreaker?
  • Vision: How do the use cases relate to the software’s high level goals?

Vision may at first seem more related to the business than the user. However, when software design loses sight of its vision, the result can be a confusing, cluttered user experience that doesn’t cater to any use case well.

Usability

There are many ways to break usability down into independent, quantifiable dimensions. I generally go with a tweaked version of the one I first learned in MIT’s UI Design & Implementation course, which I took in 2016 (and then taught in 2018 and replaced in 2020 😅), bringing it one step closer to Nielsen’s original dimensions by re-adding Satisfaction:

  1. Learnability: How easy is it for users to understand?
  2. Efficiency: Once learned, is it fast to use?
  3. Safety (aka Errors): Are errors few and recoverable?
  4. Satisfaction: How pleasant is it to use?

Some examples of usability considerations and how they relate to these dimensions:

Learnability

  • Compatibility: Does it re-use existing concepts or introduce new ones?
  • Internal Consistency: How consistent is it with the way the rest of the product works?
  • External Consistency: How consistent is it with the environment (other products, related domains, etc.)?
  • Memorability: When users return to the design after a period of not using it, how easily can they reestablish proficiency?

Efficiency

  • Speed: How many steps does it take to accomplish a task and how long does each step take?
  • Cognitive Load: How much mental effort does it require?
  • Physical Load: How much physical effort does it require?

Safety

  • Error-proneness: How hard is it for users to make mistakes?
  • Error severity: How severe are the consequences of mistakes?
  • Recoverability: How easy is it to recover from mistakes?

Satisfaction

  • Aesthetics: How visually pleasing is it?
  • Ergonomics: How comfortable is it to use?
  • Enjoyment: How fun is it to use?

Satisfaction is a bit of an oddball. First, it has limited applicability to certain types of UIs, e.g. non-interactive text-based UIs (programming languages, APIs, etc.). Even where it applies, it can be harder to quantify. But most importantly, when deciding between ideas there is rarely enough signal to gauge satisfaction. If it’s not helpful for your use case, just leave it out.

One idea will rarely have universally better or worse usability than another. More commonly, it will be better in some dimensions and worse in others. To evaluate these tradeoffs, we need to understand the situation and the user.

The situation

“Situation” here refers to the use case plus its context.

The more repetitive or common the task, the higher the importance of Efficiency. For example, text entry is an area where efficiency needs to be optimized down to individual keystrokes or minute pointing movements. On the other end of the spectrum, for highly infrequent tasks (e.g. tax software, visa applications), users don’t have time to develop transferable knowledge across uses, and thus Learnability is very important. Lastly, the more there is at stake, the more important Safety becomes. Some examples where Safety is top priority would be missile launches, airplane navigation, and healthcare software on a macro scale, or privacy, data integrity, and finances on a micro scale.

There is granularity here as well. For example, a visa application is used infrequently enough that learnability matters far more than efficiency for the product in general. However, if it includes a question that expects the user to enter their last 20 international trips, efficiency for trip entry is important.

Sometimes, two factors may genuinely be equally important. Consider a stock trading program used on a daily basis by expert traders. Lost seconds translate to lost dollars, but mistakes also translate to lost dollars. Is Efficiency or Safety more important?

Note that there are also interplays between different dimensions: the more effort a task involves (efficiency), the more high stakes a mistake is perceived to be (safety). You have likely experienced this: a lengthy form losing your answers feels a lot more frustrating than having to re-enter your email in a login form.

The user

As a general rule of thumb, novices need learnability whereas for experts other dimensions of usability are more important. But who is an expert? Expert in what?

Application expertise is orthogonal to domain expertise. Tax software for accountants needs good learnability in terms of application features, but can assume familiarity with tax concepts (but not necessarily recall). Conversely, tax software for regular taxpayers needs both: as software that is typically only used once a year, learnability in terms of application features is top priority. But abstracting and simplifying tax concepts is also important, as most users are not very proficient in them.

Generally speaking, the more we can rely on training, the less important learnability becomes. This is why airplane cockpits are so complex: pilots have spent years of training learning to use these UIs, so efficiency and safety are prioritized instead (or at least should be — sadly that is not always the case).

That said, there is often an opportunity for disruption here, by taking a product that has the potential to bring value to many but currently requires lengthy training, and creating one that requires little to none. Creator tools are prime candidates for this, with no-code/low-code tools being a flagship example right now. However, almost every mainstream technology went through this kind of democratization at some point: computers, cameras, photo editing, video production, etc.

This distinction does not only apply to the product as a whole, but also to individual product areas. For example, an onboarding flow needs to prioritize learnability regardless of the priorities of the rest of the product.

Evolution

Evolution is a bigger-picture measure of how well a proposed feature fits into the product’s past, present, and future, with an emphasis on the latter, since the relationship to the past and present is also covered by the Internal Consistency component of Learnability.

When evaluating compatibility with potential future evolution, it’s important to not hold back. Ten years down the line, when today’s implementation constraints, technology limitations, or resource limits are no more, what would we ship and how does this feature relate to it? Does it have a place in that future, is it entirely unnecessary, or — worse — does it actively conflict with it?

This is to avoid feature creep by ensuring that features are not designed ad hoc, but contribute towards a coherent conceptual design.

The most common way for a feature to connect to the product’s past, present, and future is by being a milestone across a certain axis of progress:

  • Level of abstraction (See Layering):
    • Is it a shortcut to a present or future lower level primitive?
    • Is it a lower level primitive that explains existing functionality?
  • Power: Is it a less powerful version of a future feature?
  • Granularity: Is it a less granular version of a future feature?

If we have a north star UI, part of this is to consider whether a proposed feature is compatible with it or actively diverges from it.

A feature could also be entirely orthogonal to all existing features and still be a net win wrt Evolution. For example, when it helps us streamline UX by allowing us to later remove another feature that has been problematic.

Weighing tradeoffs

While all three are very important, they are not equally important. In broad strokes, usually, Utility > Usability > Evolution. Here’s why:

  • Utility > Usability: If a product does not provide value, people leave, even if it provides a fantastic user experience for the few and/or niche use cases it actually serves.
  • Usability > Evolution, since Evolution is a long-term / more speculative concern, whereas Usability is a more immediate / higher-confidence one.

Depending on the product and the environment however, this trend could be reversed:

  • Competition: If a product is competing in a space where use cases are already covered very well, but by products with poor usability, Usability becomes more important. In fact, many successful products were actually usability innovations: The Web, Dropbox, the iPhone, Zoom, and many others.
  • Mutability: Change is always hard, but for some products it’s a lot harder, making a solid Evolution path more important. Web technologies are an extreme example: it is almost impossible to remove or change anything, ever, as there are billions of uses in the wild, no way to migrate them, and no control over them. Instead, changes have to be designed as new technologies or additions to existing ones.
  • Complexity: The more complexity increases, the more important it becomes to keep further increase at bay, so Evolution becomes more important.

Ok, now make the darn decision already!

So far we’ve discussed various tradeoffs, so it may be unclear how to use this as a framework to make actual decisions.

Decision-making itself also involves tradeoffs: adding structure makes it easier to decide, but consumes more time. To balance this, I tend to favor an iterative approach, adding more precision and structure if the previous step failed to provide enough clarity. For simple, uncontroversial decisions, just discussing the three axes can be sufficient, and the cost-benefit of more structure is not worth it. But for more complex higher stakes decisions, a more structured approach can pay off.

Let’s consider the goals for any scoring framework:

  1. Compare and contrast: Make an informed decision between alternatives without being lost in the complexity of their tradeoffs.
  2. Drive consensus: It is often easier for a team to agree on a rating or weight for an individual factor, than the much bigger decision of which option to go with.
  3. Communicate: Provide a way to communicate the decision to stakeholders, so they can understand the rationale behind it.

Calculating things precisely (e.g. use case coverage, significance, reach etc.) is rarely necessary for any of them, and thus not a good use of time. Remember that the only purpose of scores is to help us compare alternatives. They have no meaning outside of that context. In the spirit of an iterative approach, start with a simple 1-5 score for each factor, and only add more granularity and/or precision if that does not suffice for converging to a decision.

We can use three tables, one for each factor, with a row for each idea. Then the columns are:

Utility
  • Primary use cases
  • Secondary use cases
  • Utility Score (1-5)
Usability
  • Learnability
  • Efficiency
  • Safety
  • Usability Score (1-5)
Evolution
  • Past & Present
  • Future
  • Evolution Score (1-5)

We fill in the freeform columns first, which should then give us a pretty clear picture of the score for each factor.

Finally, using a 3:2:1 weight ratio that reflects the Utility > Usability > Evolution ordering discussed above, the overall score would be:

Overall_Score = (3 × Utility_Score + 2 × Usability_Score + 1 × Evolution_Score) / (3 + 2 + 1)
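
In code, this is just a weighted average (a minimal TypeScript sketch; the example scores are made up):

// Weighted average of the three factor scores, using the 3:2:1 weights above
function overallScore(utility: number, usability: number, evolution: number): number {
	return (3 * utility + 2 * usability + 1 * evolution) / (3 + 2 + 1);
}

console.log(overallScore(4, 3, 2)); // (12 + 6 + 2) / 6 ≈ 3.33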

Template: User-Centered Decision Worksheet

I have set up a Coda template for this, which you can copy and fill in with your own data.

Why Coda instead of something like Google Docs or Google Sheets?

  • I don’t have to repeat each idea in the multiple tables, I can set them up as views and they update automatically
  • Rich text (lists, etc.) within table cells makes it easier to brainstorm
  • One-click rating widgets for scores (great when iterating)
  • I can output the overall score for each feature with a formula, and it updates automatically. No need to clumsily copy-paste it across cells either, I can just define it once for the whole column. I can even use controls for the weights that are outside the table entirely.
  • This may be subjective, but I find Coda docs better designed than any alternative I’ve tried.

Screenshot of Coda tooltip

As a bonus, I can then even @-mention each feature in the rest of the doc, and hovering over it shows a tooltip with all its metadata!
Product, Product Design, User Centered Design, Decision Making, Prioritization

Tradeoff scorecard: The ultimate decision-making framework

5 min read

Every decision we make involves weighing tradeoffs, whether that is done consciously or not: from whether an extra bedroom is worth an extra $800 in rent, to whether being able to sleep lying down during a long flight is worth the $500 upgrade cost, to whether you should take a pay cut for that dream job.

For complex high-stakes decisions involving many competing tradeoffs, trying to decide with your gut can be paralyzing. The complex tradeoffs that come up when designing products [1] fall in that category so frequently that analytical decision-making skills are considered one of the most important skills a product manager can have. I would argue it’s a bit broader: analytical decision-making is one of the most useful skills a human can have.

Structured decision-making is a passion of mine (perhaps as a coping mechanism for my proneness to analysis paralysis). In fact, one of the very first demos of Mavo (the novice programming language I designed at MIT) was a decision-making tool for weighted pros & cons. It was even one of the two apps our usability study participants were asked to build during the first Mavo study. I do not only use the techniques described here for work-related decisions, but for any decision that involves complex tradeoffs (often to the amusement of my friends and spouse).

Screenshot of the Decisions Mavo app

The Decisions Mavo app, one of the first Mavo demos, is a simple decision-making tool for weighted pros & cons.

Before going any further, it is important to note a big caveat. Decision-making itself also involves tradeoffs: adding structure makes decisions easier, but consumes more time. To balance this, I tend to favor an iterative approach, adding more precision and structure only if the previous step failed to provide clarity. Each step progressively builds on the previous one, minimizing superfluous effort.

For very simple, uncontroversial decisions, just discussing or thinking about pros and cons can be sufficient, and the cost-benefit of more structure is not worth it. Explicitly listing pros and cons is probably the most common method, and works well when consensus is within reach and the decision is of moderate complexity. However, since not all pros and cons are equivalent, this delegates the weighing to your gut. For more complex or controversial decisions, there is value in spending the time to also make the weighing more structured.

The tradeoff scorecard

What is a decision matrix?

A decision matrix, also known as a scorecard, is a table with options as rows and criteria as columns, with a final column that calculates a score for each option based on the criteria. These are useful both for selection and for prioritization, where the score is used for ranking options. In selection use cases, the columns can be specific to the problem at hand, or predefined based on certain principles or factors, or a mix of both. Prioritization tends to use predefined columns to ensure consistency. There are a number of frameworks out there about what these columns should be and how to calculate the score, with RICE (Reach × Impact × Confidence / Effort) likely being the most popular.

The Tradeoff Scorecard is not a prioritization framework, but a decision-making framework for choosing among several options.

Qualitative vs. quantitative tradeoffs

Typically, tradeoffs fall in two categories:

  • Qualitative: Each option either includes the tradeoff or it doesn’t. Think of them as tags that you can add to or remove from each option.
  • Quantitative: The tradeoff is associated with a value (e.g. price, effort, number of clicks, etc.)

Not all tradeoffs are equal. Even for qualitative tradeoffs, some are more important than others, and the differences can be quite vast. Some strengths may be huge advantages, while others are minor nice-to-haves. Similarly, some weaknesses can be dealbreakers, while others are minor issues.

We can model this by assigning a weight to each tradeoff (typically a 1-5 or 1-10 integer). But if qualitative tradeoffs have a weight, doesn’t that make them quantitative? The difference is that the weight applies to the tradeoff itself, and is applied the same way to each option, whereas the value of a quantitative tradeoff quantifies the relationship between tradeoff and option, and is thus different for each. Note that quantitative tradeoffs also have a weight, since they don’t all matter the same either.

In diagrammatic form, it looks a bit like this:

Simplified UML-like diagram showing that each tradeoff has a weight, but the relationship between option and quantitative tradeoff also has a value

These categories are not set in stone. It is quite common for qualitative tradeoffs to become quantitative down the line as we realize we need more granularity. For example, you may start with “Poor discoverability” as a qualitative tradeoff, then realize that there is enough variance across options that you instead need a quantitative “Discoverability” factor with a 1-5 rating. The opposite is more rare, but it’s not unheard of to realize that a certain factor does not have enough variance to be worth a quantitative tradeoff and instead should be modeled as 1-2 qualitative tradeoffs.

The overall score of each option is the sum of the scores of each individual tradeoff for that option. The score of each tradeoff is often simply its weight multiplied by its value, using 1/-1 as the value of qualitative tradeoffs (pro = 1, con = -1).

While qualitative tradeoffs are either pros or cons, quantitative tradeoffs may not be universally positive or negative. For example, consider price: a low price is a strength, but a high price is a weakness. Similarly, effort is a strength when low, but a weakness when high. Calculating a score for these types of tradeoffs can be a bit more involved:

  • For ratings, we can subtract the midpoint and use that as the value. E.g. by subtracting 3 from a 1-5 rating we get a value from -2 to 2. Adjust accordingly if you don’t want the switch to happen in the middle.
  • For less constrained values, such as prices, we can use the value’s percentile instead of the raw number.
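
Here is a rough sketch of these scoring rules in TypeScript. The types and names are mine, purely illustrative; you may need to flip signs depending on whether higher values are good or bad for a given tradeoff:

// Hypothetical model of the scoring rules described above
type Tradeoff =
	| { kind: "qualitative"; weight: number; pro: boolean }                // pro = +1, con = -1
	| { kind: "rating"; weight: number; rating: number; midpoint: number } // e.g. 1-5 scale with midpoint 3
	| { kind: "unbounded"; weight: number; percentile: number };           // e.g. price, as a 0-1 percentile

function tradeoffScore(t: Tradeoff): number {
	switch (t.kind) {
		case "qualitative": return t.weight * (t.pro ? 1 : -1);
		case "rating":      return t.weight * (t.rating - t.midpoint);
		case "unbounded":   return t.weight * (0.5 - t.percentile); // below-median values (e.g. cheap prices) score positive
	}
}

// The overall score of an option is the sum of its individual tradeoff scores
const optionScore = (tradeoffs: Tradeoff[]): number =>
	tradeoffs.reduce((sum, t) => sum + tradeoffScore(t), 0);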

Explicit vs implicit tradeoffs

When listing pros and cons across many options, have you noticed that there is a lot of repetition? First, several options share the same pros and cons, which is expected, since they are alternative solutions to the same problem. But there is also repetition because pros and cons come in pairs: each strength has a complementary weakness (which is the absence of that strength), and vice versa.

For example, if one UI option involves a jarring UI shift (a bad thing), the presence of this is a weakness, but its absence is a strength! In other words, each qualitative tradeoff is present on all options, either as a strength or as a weakness. The decision of whether to pick the strength or the weakness as the primary framing for each tradeoff is often based on storytelling and/or minimizing effort (which one is more common?). A good rule of thumb is to avoid negatives (e.g. instead of listing “no jarring UI shift” as a pro, list “jarring UI shift” as a con).

It may seem strange to view it this way, but imagine you were trying to compare and contrast five different ideas, three of which involved a jarring UI shift. You would probably list “no jarring UI shifts” as a pro for the other two, right?

This realization helps cut the amount of work needed in half: we simply assume that for any tradeoff not explicitly listed, its opposite is implicitly listed.
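
In code, deriving the implicit tradeoffs is a simple set difference. A minimal TypeScript sketch, with hypothetical tradeoff names:

// Any qualitative tradeoff an option does not list explicitly applies as its opposite
type Option = { name: string; explicit: Set<string> };

function implicitTradeoffs(option: Option, allTradeoffs: string[]): string[] {
	return allTradeoffs
		.filter((t) => !option.explicit.has(t))
		.map((t) => `no ${t}`); // e.g. "jarring UI shift" → "no jarring UI shift"
}

const all = ["jarring UI shift", "extra click per use"];
const optionA: Option = { name: "A", explicit: new Set(["jarring UI shift"]) };
console.log(implicitTradeoffs(optionA, all)); // ["no extra click per use"]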

Putting it all together

Your choice of tool can make a big difference to how easy this process is. In theory, we could model all tradeoffs as a classic decision matrix, with a column for each tradeoff. Quantitative tradeoffs would correspond to numeric columns, while qualitative tradeoffs would correspond to boolean columns (e.g. checkboxes).

Indeed, if all we have is a grid-based tool (e.g. spreadsheets), we may be stuck doing exactly that. It does have the advantage that it makes it trivial to convert a qualitative tradeoff to a quantitative one, but it can be very unwieldy to work with.

If our tool of choice supports lists within cells, we can do better. These boolean columns can be combined into one column as a list of all relevant tradeoffs. Then, a separate table can be used to define weights for each tradeoff (and any other metadata, e.g. optional notes).

I currently use Coda for these tradeoff scorecards. While not perfect, it does support lists in cells, and has a few other features that make working with tradeoff scorecards easier:

  • Thanks to its relation concept, the list of tradeoffs can actually be linked to their definitions. This means that hovering over each tradeoff displays a popup with its metadata, and that I can add tradeoffs by selecting them from a popup menu.
  • Conditional formatting allows me to color-code tradeoffs based on their type (strength/weakness or pro/con) and weight (lighter color for smaller impact).
  • Its formula language allows me to show and list the implicit tradeoffs for each option (though there is no way to have them be color-coded too).

There are also limitations, however:

  • While I can apply conditional formatting to color-code the opposite of each tradeoff, I cannot display implicit tradeoffs as nice color-coded chips, in the same way as explicit tradeoffs, since relations can only display the primary column.
  • Weights for quantitative tradeoffs have to be baked into the formula (there are some ways to make them editable, but …)

  1. I use product in the general sense of a functional entity designed to fulfill a specific set of needs or solve particular problems for its users. This does not only include commercial software, but also things like web technologies, open source libraries, and even physical products. ↩︎


Evolution: The missing component of Product-Led Growth

6 min read
TBD: Lacks a conclusion, illustrations, and examples.

What is Product-Led Growth?

In the last few years, Product-Led Growth has seen a meteoric rise in popularity. The idea is simple: instead of relying on sales and marketing to acquire users, you build a product that sells itself. As a usability advocate, this makes me giddy: Prioritizing user experience is now a business strategy, with senior leadership buy-in!

NN/G considers Utility and Usability the core components of Product-led Growth, which Nielsen groups under a single term: Usefulness. Utility refers to how many use cases are addressed, how well, and how significant these use cases are. If you thought that sounds very much like the R×I in RICE, you’d be right: they are indeed roughly the same concept, seen from a different perspective. Usability, as you probably know, refers to how easy the product is to use, and can be further broken down into individual components, such as Learnability, Efficiency, Safety, and Satisfaction.

Indeed, maximizing Utility and Usability is crucial for creating products that add value. However, both suffer from the same flaw: they are short-term metrics and do not consider the bigger picture over time. It’s like playing chess while only thinking about the next move: you could be making excellent choices on each turn and still lose the game. Great Utility and Usability alone do not prevent feature creep. We can still end up with a convoluted user experience that lacks a coherent conceptual model; all it takes is enough time.

Therefore, I think there is also a third component, which I call Evolution. Evolution refers to how well a feature fits into the bigger picture of a product, by examining how it relates to its past, present, and future (or, more accurately, its various possible futures). By prioritizing features higher when they are part of a trajectory or greater plan, and deprioritizing those that are designed ad hoc, we can limit complexity, avoid feature creep, and ensure we are moving towards a coherent conceptual design.

Introducing entirely new concepts is not an antipattern by any means, that’s how products evolve! However, it should be done with caution, and the bar to justify such features should be much higher.

The three axes are not entirely independent. Evolution will eventually affect Usability; the whole point of treating Evolution as a separate axis is that it allows us to catch these issues early and prevent them in the making. By the time conceptual design issues create usability problems, it’s often too late: the changes required to fix the underlying design are a lot more substantial and costly.

The weight of Evolution

The importance of Evolution was really drilled into me while designing web technologies, i.e. the technologies implemented in browsers that web developers use to develop websites and web applications. We do not have a name for it, but the consideration is very high priority when designing any feature for the Web.

In general, Utility and Usability matter more than Evolution. Just like in chess, the next move is far more important than any subsequent move. The argument this post is making is that we should look further than the current roadmap, not that we should stop looking at what’s right in front of us. However, there are some cases where Evolution may become equally important as the other two, or even more.

Low mutability is one such case. Change is always hard, but for some products it’s a lot harder. Web technologies are an extreme example, where you can never remove or change anything. There are billions of uses in the wild, that you have no control over, and no way to migrate users. You cannot risk breaking the Web. Instead, changes must be designed as either additions to existing technologies, or (if substantial enough) as entirely new technologies. The best you can hope for is that if you deprecate the old technology, and you heavily promote the new one, over many years usage of the old technology will drop below the usage threshold that allows considering removal (< 0.02%!). I have often said that web standards work is “product work on hard mode”, and this is one of the reasons. If you do product work, pause for a moment and consider this: How much harder would shipping be if you knew you could never remove or change anything?

Another case is high complexity. Many things that are complex today began as simple things. The cost of adding features without validating their Evolution story is increasing complexity. To some degree, complexity is the fate of every successful product, but being deliberate about adding features can curb the rate of increase. Evolution tends to become higher priority as a product matures. This is artificial: keeping complexity at bay is just as important in the beginning, if not more. However, it is often easier to see in retrospect, after we’ve already felt the pain of increasing complexity.

The value of a North Star UI

In evaluating Evolution for a feature, it’s useful to have alignment on what our “North Star UI(s)” might be.

A North Star UI is the ideal UI for addressing a set of use cases and pain points in a perfect world where we have infinite resources and no practical constraints (implementation difficulty, performance, backwards compatibility, etc.). Sure, many problems are genuinely so hard that even without constraints, the ideal solution is still unknown. However, there are many cases where we know exactly what the perfect solution would be, but it’s simply not feasible, so we need to keep looking.

In these cases, it’s useful to document this “North Star UI” and ensure there is consensus around it. You can even do usability testing (using wireframes or prototypes) to validate it.

Why would we do this for something that’s not feasible? First, it can still be useful as a guide to steer us in the right direction. Even if you can’t get all the way there, maybe you can get close enough that the remaining distance won’t matter. And in the process, you may find that the closer you get, the more feasible it becomes.

Second, it ensures team alignment, which is essential when trying to decide what compromises to make. How can we reach consensus on the right tradeoffs if we are not even aligned on what the solution would be if we didn’t have to make any compromises?

Third, it builds team momentum. Doing usability testing on a prototype can do wonders for getting people on board who may have previously been skeptical. I would strongly advise to include engineers in this process, as engineering momentum can literally make the difference between what is possible and what is not.

Last, I have often seen “unimplementable” solutions become implementable later on, due to changes in internal or external factors, or simply because a brilliant engineer had a brilliant idea that made the impossible, possible. In my 11 years of designing web technologies, I have seen this happen so many times, I now interpret “cannot be done” as “really hard — right now”.

Mini Case study 1: CSS Nesting Syntax

My favorite example, and something I’m proud to have personally helped drive, is the current CSS Nesting syntax, now shipped in every browser. We had plenty of signal for what the optimal syntax was for users (our North Star UI), but it had been vetoed by engineering across all major browsers due to prohibitive performance, so we had to design around certain parsing constraints. The original design was quite verbose, actively conflicted with the NSUI syntax, and had poor compatibility with another related feature (@scope). Instead of completely diverging, I proposed a syntax that was a subset of our NSUI, just more explicit in some (common) cases. Originally discussed as “Lea’s proposal”, it was later named “Non-letter start proposal”, but became known as Option 3 from its position among the five options considered. After some intense weighing of tradeoffs and several user polls and surveys, the WG resolved to adopt that syntax.

Once we got consensus on that, I started trying to get people on board to explore ways (and brainstorm potential algorithms) to bridge the gap. A few other WG members joined me, with my co-TAG member Peter Linss perhaps being the most vocal. We initially faced a lot of resistance from browser engineers, until eventually a couple of Chrome engineers closed in on a way to implement the north star syntax 🎉, and as they say, the rest is history.

It was not easy to get there, and required weighing Evolution as a factor. There were diverging proposals that in some ways had better syntax than that intermediate milestone. If we only looked at the next move, if we had only used Utility and Usability to guide us, we would have made a suboptimal long-term decision.

Evaluating Evolution

To evaluate Utility, we can look at the use cases a feature addresses, and how significant they are. Evaluating Usability is also a matter of evaluating its individual components, such as Learnability, Efficiency, Safety, and Satisfaction. This can be done via usability testing, or heuristic evaluation, and ideally both. But how do we evaluate Evolution for a proposed feature?

How well it fits with the product’s past and present overlaps with Usability (through Internal Consistency, a component of Learnability), but is also important to consider.

When evaluating how well a feature fits into the product’s future, we can use the north star UI if we have one, as well as other related features that could plausibly be shipped in the future (e.g. have already been discussed, or are natural evolutions of existing features).

Does this feature connect to the product’s past, present, and future across a certain axis of progress? For example:

  • Level of abstraction (See Layering):
    • Is it a shortcut to a present or future lower level primitive?
    • Is it a lower level primitive that explains existing functionality?
  • Power: Is it a less powerful version of a future feature?
  • Granularity: Is it a less granular version of a future feature?

Other considerations:

  • Opportunity cost: What does introducing this feature prevent us from doing in the future?
  • Simplification: What does it allow us to remove?
TBD: Lacks a conclusion, illustrations, and examples.

What is a North Star UI and how can it help product design?

9 min read

“Oh we can’t possibly do that, it would be way too much work to implement!”

Raise your hand if you’ve ever heard (or worse, said) this during a product design brainstorming session. In my experience, this argument is where a lot of good product design goes to die.

No, I’m not suggesting you should drain engineering resources chasing the perfect UI! Yes, in the end it will all boil down to Impact/Effort [1]. But bringing ephemeral constraints such as implementation difficulty up too early is counterproductive.

By ephemeral I’m referring to constraints that are not intrinsic to the problem at hand (these are requirements!), but those that have to do with the current state of technology, the team, the company, or the broader environment.

For example, “This interaction will be very common and needs to be as frictionless and efficient as possible” is a requirement, as it is intrinsic to the use case. Ephemeral constraints are things like:

  • Engineering resources
  • Performance
  • Technical feasibility (to some extent)
  • Backwards compatibility

A tool I keep coming back to is what I call a “North Star UI”. A North Star UI is the ideal UI for addressing a set of pain points and use cases in a perfect world with no ephemeral constraints. It answers the question: “What would we ship if both we and our users had infinite resources, and all our users were new?”

I have often mentioned this concept in discussions, and it seemed to be generally understood. However, a quick google search revealed that outside of this blog, there are only a couple mentions of the term across the entire Web, and the only actual definition seems to be a callout in my Eigensolutions essay. That needed to be fixed — if I’ve found this concept so useful, it’s highly likely that others would too!

For the uninitiated, designing a solution that ignores essential constraints seems like a pointless academic exercise. “Why would we spend precious resources on something that’s not feasible? We should be pragmatic, not chase pipe dreams!” I often hear. And yet, perhaps counterintuitively, a solid NSUI can provide so many benefits that over time it actually reduces the amount of resources spent on product design.

What benefits? Let’s dive in.

1. Simplifies problem solving

A common problem-solving strategy in every domain is to break down a complex problem into smaller, more manageable components and solve them separately. Product design is no different. The concept of a North Star UI breaks down product design tasks into three more manageable components:

  1. North Star UI (NSUI): What is the ideal solution?
  2. Ephemeral constraints: What prevents us from getting there?
  3. Compromises: How close can we reasonably get given the timeframe we have?

These subproblems do not always have the same level of difficulty. Sometimes the NSUI is obvious, and the big product design challenge is navigating the compromises. Other times, the big challenge is figuring out what the NSUI should be and once this is clear, it turns out it’s also perfectly feasible. And most of the time, both are hard, so solving them separately can be a huge help.

2. Facilitates team alignment and helps build consensus

This is a pattern I’ve seen very frequently, across many different teams: disagreements about the NSUI will often masquerade as disagreements about practical constraints, so people will waste time and energy debating the wrong issue.

Here is a story that may sound familiar: Bob will say “X is way too much work, it’s not worth doing”, but what he is actually thinking is “X is a bad idea, so any nontrivial amount of work towards it is a waste”, while Alice thinks that X is an elegant solution that would create an incredible user experience and is worth a somewhat higher implementation cost. Instead of spending time figuring out whether X is a good idea in itself, they spend their time debating how much work it is and how that could be simplified, but fail to reach consensus because that is not actually the root issue. (Any similarity to real persons, living or dead, is purely coincidental.) 😅

The thing is, when the NSUI is not documented, it still exists, but everyone has their own version. It is important to answer these questions in order, and reach consensus on what the North Star UI is before moving on. We need to be aware of what is an actual design decision and what is a compromise driven by practical constraints.

By articulating these separately, they can also be debated separately, so that when we are at the stage of evaluating compromises, we are all on the same page about what we are trading off and how much it’s worth. How can you do a cost-benefit analysis, without knowing both the cost and the benefit?

NSUIs can even be user tested, using wireframes or prototypes, which can be particularly useful when there are vastly different perspectives within a team about what the NSUI is, or when the problem is so novel that every potential solution is on shaky ground. Even the best product intuition can be wrong, and there is no point in evaluating compromises if it turns out that even the “perfect” solution is not actually a good one.

3. Paves the way for getting there (someday)

Just like the mythical North Star, a NSUI can serve as a guide to steer us in the right direction. Simply articulating what the NSUI is can in itself make it more feasible. No, it’s not magic, just human psychology.

First, once we have a NSUI, we can use it to evaluate proposed solutions: How do they relate to a future where the NSUI is implemented? Are they a milestone along that path, or do they actively prevent us from ever getting there?

Prioritizing solutions that are milestones on the path to the NSUI can be a powerful tool in building momentum. Once we’re partway there, it naturally begs the question: how much closer can we get? It is much easier to convince people to move a little further along the path they are already on, than to move to a completely different path. Even if we can’t get all the way there, maybe we can get close enough that the remaining distance won’t matter. And often, the closer you get, the more feasible it becomes. In some cases, simply reframing the NSUI as a sequence of milestones rather than a binary goal can be all that is needed to make it feasible. (Perhaps I should call this …)

4. Today’s constraints are not tomorrow’s constraints

NSUIs make our design process more resilient and adaptable. I have often seen “unimplementable” solutions become implementable down the line, due to changes in internal or external factors, or simply because someone had a brilliant idea that made the impossible, possible. I have seen this happen so many times that I have learned to interpret “cannot be done” as “really hard — right now”.

When this happens, it’s important to have a solid foundation to fall back on, rather than having to go back to the drawing board because design and constraints were so intertwined we didn’t know where our actual design choices ended and the practical compromises began. With a solid NSUI in place, when constraints are lifted we only need to re-evaluate the compromises.

Change in Engineering Momentum: Sentiment Chips

Here is a little secret that applies to nearly all software engineers: neither feasibility nor effort are fixed for a given task.

Engineers are not automatons that will blindly implement whatever they are told to. Product managers are often content to get engineering to reluctantly agree to implement, but then you’re getting very poor ROI out of your engineering team.

Often all that is needed to make the infeasible, feasible is engineering momentum. Investing the extra time and energy to get engineers excited can really pay off. When good engineers are excited, they become miracle workers. The difference is not small, it is orders of magnitude. Things that were impossible or insurmountable become feasible, and things that would normally take weeks or even months can be prototyped in days.

One way to get engineers excited is to convince them about the value and utility of what they are building. It helps a lot to have them observe usability testing sessions and to be able to back product decisions up with data.

As I discovered last year by accident, there is also another, more …Machiavellian way to build engineering momentum: The NSUI is too hard? Propose a much easier solution that you know engineers would hate, such as one that turns a structured interaction into unstructured data. As much as I wish I could be that strategic 😅, this was not something I had planned in advance, but it was very effective in retrospect: I got an entire backend to work with that I had thought was entirely out of the question!

Change in the Environment: CSS Conic Gradients

background: conic-gradient(in hsl,
	red, orange, yellow, lime,
	cyan, blue, magenta, red);
Conical gradients are often used to render hue wheels.

Sometimes, the environment changes and a previously infeasible or high effort feature becomes feasible or even trivial. An example that comes to mind is CSS conic gradients. Conic gradients are the type of gradient that is created by (conceptually) rotating a ray around a center point.

I originally proposed adding conic gradients to CSS in 2011, and they first shipped in 2018 (in Chrome 69)! Someone observing this timeline without context may just conclude “pffft, standards just take forever to ship”. But there is always a reason, either technical, human, or both. In this case, the reason was technical. Browsers do not implement things like shadows and gradients from scratch, they use graphics libraries such as Skia, Cairo, or Core Graphics, which in turn are also abstractions over the OS-provided graphics APIs.

At the time these libraries did not support any primitive that could be used to render conic gradients (e.g. sweep gradients, mesh gradients, etc.). In the years that followed, one after the other added support for some kind of gradient primitive that could be used to easily render conic gradients, which took the proposal from high to low effort. I also created a polyfill which stimulated developer demand, increasing Impact. These two things together took the Impact/Effort ratio from “not worth it” to “let’s do this, stat” and in 2 years the feature was implemented in every major browser.

Someone has a Lightbulb Moment: Relaxed CSS Nesting Syntax

Sometimes high effort things just take a lot of hard work and there is no way around it. Other times they are one good idea away.

One of my favorite examples, and something I’m proud to have helped drive, is the relaxed CSS Nesting syntax, now shipped in every browser. It is such an amazing case study on the importance of having a North Star UI that I even did an entire talk about it at Web Unleashed, with a lot more technical detail than I have included here.

In a nutshell, CSS nesting is a syntax that allows CSS developers to reduce repetition and better organize their code by nesting rules inside other rules.

table.browser-support {
	border-collapse: collapse;
}
table.browser-support th,
table.browser-support td {
	border: 1px solid silver;
}
@media (width < 600px) {
	table.browser-support,
	table.browser-support tr,
	table.browser-support th,
	table.browser-support td {
		display: block;
	}
}
table.browser-support th {
	border: 0;
}
table.browser-support td {
	background: yellowgreen;
}
table.browser-support td:empty {
	background: red;
}
table.browser-support td > a {
	color: inherit;
}
table.browser-support {
	border-collapse: collapse;

	@media (width < 600px) {
		&, tr, th, td { display: block; }
	}

	th, td { border: 1px solid silver; }
	th { border: 0; }

	td {
		background: yellowgreen;

		&:empty { background: red; }

		> a { color: inherit; }
	}
}

Example of CSS code, with (right) and without (left) nesting. Which one is easier to read?

This is one of the few cases where the NSUI was well known in advance, since the syntax was well established in developer tooling (CSS preprocessors). Instead, the big challenge was navigating the practical constraints, since CSS implemented in browsers has different performance characteristics, so a syntax that is easily feasible for a preprocessor may be out of reach for a browser. In this case, the NSUI syntax had been ruled out by browser engineers due to prohibitive parsing performance [2], so we had to design a different, more explicit syntax that could be parsed more efficiently.

Initial attempts at a syntax that satisfied these constraints introduced a lot of noise, in the form of an awkward @nest token that needed to be placed at the beginning of many nested rules.

At this point, it is important to note that CSS Nesting is a feature that, once available, is used all over a stylesheet, not just a couple of times here and there. For such widely used features, every character counts. Conciseness and readability of syntax are paramount, especially when conciseness is the sole purpose of the feature in the first place!

Worse yet, these attempts were actively incompatible with the NSUI syntax, as well as other parts of CSS (namely, the @scope rule). This meant that even if the NSUI became feasible later, CSS would need to forever support syntax that would then have no purpose, it would exist just as a wart from the past, just like HTML doctypes.

This proposal sat dormant for a while, since implementors were not exactly in a hurry to ship it. This all changed when State of CSS 2022 showed Nesting as the top missing CSS feature, making Google suddenly very keen to ship it.

A small subset of the CSS Working Group, led by Elika Etemad and yours truly, organized a number of breakouts to explore alternatives, an effort that produced not one, not two, but four competing proposals. The one the group voted to adopt [3] was the one I designed with the NSUI in mind, by asking the question: if the NSUI is out of the question right now, how close can we get while staying compatible with it, in case it becomes feasible later on?

Once we got consensus on this intermediate syntax, I started exploring whether we could get any closer to the NSUI, even attempting to propose an algorithm that would reduce the number of cases that required the slower parsing to essentially an edge case. A few other WG members joined me, with my co-TAG member Peter Linss being most vocal. This is a big advantage of NSUI-compatible designs: it is much easier to convince people to move a little further along on the path they are already on, than to move to a completely different path. With a bit of luck, you may even find yourself implementing an “infeasible” NSUI without even realizing it, one step at a time.

We initially faced a lot of resistance from browser engineers, until eventually Anders Ruud and his team experimented with variations of my proposed algorithm and actually closed in on a way to implement the NSUI syntax in Chrome. The rest, as they say, is history.


  1. More elaborate prioritization schemes such as RICE are merely about breaking down either the Impact or the Effort or both into more granular components or introducing uncertainty into the equation. But ultimately, it’s all about the Impact/Effort ratio. ↩︎

  2. For any compilers geeks out there who want all the deets: it required potentially unbounded lookahead, since there is no fixed number of tokens a parser can read and still be able to tell the difference between a selector and a declaration. ↩︎

  3. Originally dubbed “Lea’s proposal”, and later “Non-letter start proposal”, but became known as Option 3 from its position among the five options considered (including the original syntax). ↩︎

Product, Product Design, Product Management, User Centered Design, Product-Led Growth, North Star UI, Collaboration, Case Studies

Context Chips in Survey Design: “Okay, but how does it feel?”

16 min read

The story of how a weird little UI to collect sentiment alongside survey responses defied constraints and triumphed over skepticism through usability testing.

Minimalistic skeleton diagram showing the concept presented in this article

One would think that we’ve more or less figured survey UI out by now. Multiple choice questions, checkbox questions, matrix questions, dropdown questions, freeform textfields, numerical scales, what more could one possibly need?!

And yet, every time Google sponsored me to lead one of the State Of … surveys, and especially the inaugural State of HTML 2023 Survey, I kept hitting the same wall: I kept feeling that the established options for answering UIs were woefully inadequate for balancing the collection of good insights with minimal friction for end-users.

The State Of surveys used a completely custom survey infrastructure, so I could often (but not always) convince engineering to implement new question UIs. After joining Font Awesome, I somehow found myself leading yet another survey, despite swearing never to do this again. 🥲 Alas, building a custom survey UI was simply not an option in this case; I had to make do with the existing options out there [1], so I felt this kind of pain to my core once again.

So what are these cases where the existing answering UIs are inadequate, and how could better ones help? I’m hoping this case study will be Part 1 of a series on how survey UI innovations can help balance the tradeoffs between user experience and data quality. It is definitely the one I’m most proud of: it was such a bumpy ride, but it was all worth it in the end.


  1. Unlike Devographics, surveys are not FA’s core business, so the Impact/Effort tradeoff simply wasn’t there for a custom UI, at least at this point in time. I ended up going with Tally, mainly due to the flexibility of its conditional logic and its support for code injection (which among other things, allowed me to use FA icons — a whopping 120 different ones!). ↩︎

Continue reading

Survey Design, Product, Product Design, Design Thinking, Case Studies, UX, Usability, North Star UI

Eigensolutions: composability as the antidote to overfit

14 min read

tl;dr: Overfitting happens when solutions don’t generalize sufficiently and is a hallmark of poor design. Eigensolutions are the opposite: solutions that generalize so much they expose links between seemingly unrelated use cases. Designing eigensolutions takes a mindset shift from linear design to composability.

Continue reading

Product, Design Thinking, Creator Tools, Product Management, North Star UI