Experimentation School
July 24, 2024

Chetan Sharma On Institutional Knowledge

Che Sharma
Eppo's Founder and CEO, former early data scientist who built experimentation tools and cultures at Airbnb and Webflow

In this video, explore the importance of institutional knowledge and how to create a system that keeps your team's learnings accessible and useful for years to come, featuring insights from Airbnb’s Knowledge Repo.

Transcript

What we're really talking about here is all the things you think you know about your customers, all the things you think you know about your business, what's led to success and what has not. How do you make sure all of that knowledge travels across an organization and across time? Across an organization, many companies are thousands of people. How do you make sure that if someone wasn't on the email chain or wasn't in the presentation, they can still hear about what you're learning? And across time, how do future generations of the company go back and see what people have tried and what worked? What's the story of this company? If you've figured out institutional knowledge and knowledge-base indexing, then you have a great answer for that.

The problem that happens if you don't do a good job of codifying what you know and what you've accomplished is you just repeat yourself a lot. You look at a thing and wonder, "Oh, this onboarding flow clearly needs some work. I have some obvious ideas for how to improve it." Little do you know that some PM three years ago tried the exact same things. And while there's definitely some value in repeating, you should at least look at those results, because one of the things you figure out with experimentation is that as you try out these product changes, some things matter a lot more than you think, and some things matter a lot less than you think. If you can speed ahead with that type of knowledge, it helps you run better experiments. It helps you have a better shot at reaching impact earlier, to not waste time on things that have been discredited, and to have more confidence that, "Oh, wow, no one has ever tried this idea. I actually think it could be huge."

Suppose you're a PM joining Airbnb and you want to make your mark quickly, 'cause everyone knows you have to get some wins early. An obvious first step would be to say, "Let's find out what has worked in the past," not just the experiments themselves, but how they were executed, what metrics they looked at, how people understood they were successful. And if you did that, you would quickly realize that experiments at Airbnb typically succeeded in one of three ways. First, they would decrease friction in some way. This would be things like autofilling default answers in drop-downs and text boxes to speed things up. Second, they would retain focus, making sure you keep people in the funnel and driving down the funnel, not trying to widen the aperture of what to look at, but actually to narrow it and get to a decision. The third thing you'd realize is that you want to continuously keep people excited.

Say, "If you signed up for Airbnb as a host, you're going to make this much money," and remind them about that all the time. The way you'd know it is by looking at the experiments and see most impactful experiment, opening a listing in a new tab. That's a way to keep your focus as you look down a listing and retain your search context. Another one was removing all this large of text content that was part of looking at neighborhoods. That was another thing that makes you lose focus, where instead of diving deeper into listings and booking them, suddenly you're reading this giant block of content about some neighborhood. Then a session that might've taken 10 minutes suddenly takes 30. You'd also realize that one of the best ways to get more listings on the marketplace, which is one of the biggest drivers of growth, is to continuously remind hosts of how much money they make if you sign up for Airbnb.

And so you just come up with these themes of success if you look through the catalog of the most impactful experiments that have been run before, what's been tried and what hasn't. That's the thing you get once you have a great system of cataloging knowledge. Usually, there's not really any centralized way to find all the experiments that have been run. You arrive at some organization and they've been running experiments for four or five years, but you have no way of actually knowing what's been run before except what's in people's heads, and it's worth asking why that is. Surely it's easy to just create a little Notion hub or a Google Drive folder and put stuff in there. But it's a lot of work to write reports. Every PM knows this; it's a pain they have to take on. If you were to try to write up an experiment, or really any product change, for a broader audience, you're going to be pulling in all sorts of data about why you did it and what you saw.

You're going to have to get all these creatives, these screenshots, together because it's hard to just voice over what exactly you did. You have to show people what you did. And then you get to the end and you align on a decision, and you just know there's going to be a lot of discussion around it, and you want to take those discussions to tools that are a little more fluid, things like Slack, things where conversations naturally happen. So there's this big pile of onerous work, and because it's so much, you end up going with the tools of least resistance, which is a Slack conversation, an email thread, maybe a Google Doc that you can comment on. But the problem is that once you do any of those things, it's just gone forever. The email thread obviously goes into the archives. The Slack conversation is buried under everything else everyone talks about in Slack. And so the work of curating all these reports never leads to this idealized Notion hub or Google Drive folder that would let you actually understand what's been working before and what hasn't.

Of course, the PM who ran the experiment two years ago is probably not thinking of you two years later when they're actually deciding what to do. So what is it about knowledge bases and experiments, and how can you make them work? If you're thinking of a good experiment report, you need the justification. You need the statistical considerations like the power analysis: how long did you run? What metrics are you going to look at? What deeper dives are you going to do? And then you need to document your decision and why. For example, if you decided to ship something, is it because it was a huge success? Or is it because, "Hey, we had to make a decision and we didn't feel like ripping the code out"? Documenting the whole process is important for understanding whether this experiment was truly a success or not. If you can make that report creation really fast, then you have a workflow with gravity, one everyone will actually follow, and then it's just a matter of putting it all in one place and making it searchable.
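As a rough illustration of what such a report might capture, here is a minimal sketch of a structured record with those fields. The schema and field names are illustrative assumptions, not Airbnb's or Eppo's actual format.

```python
# Illustrative sketch only: one way to standardize the fields a good
# experiment report needs, so reports end up consistent and indexable.
from dataclasses import dataclass, field

@dataclass
class ExperimentReport:
    name: str
    hypothesis: str                      # the justification: why the change was made
    primary_metric: str
    guardrail_metrics: list[str] = field(default_factory=list)
    planned_runtime_days: int = 0        # from the power analysis
    actual_runtime_days: int = 0
    deeper_dives: list[str] = field(default_factory=list)   # segments, funnels examined
    decision: str = ""                   # "ship", "roll back", "iterate"
    decision_rationale: str = ""         # "clear win" vs. "didn't want to rip the code out"
    tags: list[str] = field(default_factory=list)            # for later search and indexing
```

Even a lightweight structure like this makes it much easier to drop every report into one searchable pile.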

That one place could be a Google Drive or it could be a dedicated SaaS app, but if you're able to make a workflow so low friction that writing reports that are widely consumable and viral just happens, then it's much easier to curate and index them for the future. To give everyone context, we built this product called the Knowledge Repo at Airbnb, and the idea was to take all the strategic analyses that analysts were doing, these deep dives that were happening in Jupyter Notebooks and R Markdowns, and turn them into something widely consumable and indexable. A lot of this was spurred by problems that every organization faces. One, Airbnb was doubling. It was a crazy growth year, and there were all these new people who needed to get oriented: how do I do work? What tables do I pull? How do we analyze these things, and who do I talk to to get skilled in an area?

So you have this large pile of people who just needed to be onboarded and accelerated in their careers. And then separately, there was a question of, how do we create trustworthy results that can be built on top of? Trustworthy meaning, how do I know the analyst didn't make a mistake? How can I tell there's been any QA process whatsoever? The things that peer review gets you, how can you build those into the process? And then extendable, in that suppose someone reads your report and they actually want to dive a click deeper. They want to say, "Okay, but what if I split it by this new segment? How does it look specifically on mobile versus desktop, or specifically in East Asia versus everywhere else?" If you want to build on previous work, you need a very rich representation of what exactly was done and what choices and methodology were taken.

Experimentation has the exact same problems we saw with the Knowledge Repo. You're probably hiring new people all the time, and those people need to understand what's been run, what hasn't been run, and how exactly people attacked it. That's problem number one: how can you get new people ramped up and accelerated on the path to impact? The second problem is, suppose they want to look at some experiment result and figure out whether there was anything that wasn't fully understood there. For example, I'm now hired as the business travel product manager at Airbnb. I need to go make a ton of impact on business travel. I introduce a completely novel segment that we're going to call business travel, and I now have to go into the past and see what did or didn't drive success for that segment.

If you have the knowledge base, then you can add a new segment, extend that analysis, and find out: sure, there was a pile of successful experiments, but which ones were successful for business travel specifically? Being able to extend experiment results is a huge part of this process. So I'd say the problems we solved with the Airbnb Knowledge Repo are the exact same problems every experimentation team has to solve. First, how can you come up with winning ideas as fast as possible, which means going through what's been tried and what hasn't? Second, how can you trust the results, so you don't run into the problem where the product idea was good but the experiment methodology was bad? And third, how can you extend results when you want to dive a little deeper? Maybe another team made a hasty, unresearched choice and you want to go back, do some segmentations, build some funnels, and figure out if there's some new idea that went undetected.

One of the questions is around these practices of writing up reports and creating this very tidy, consistent knowledge base: is that a technical problem or a cultural problem? I'm going to offer a bit of a hot take here and say that I actually think it's completely a technological problem. People think there's a cultural gap because people aren't doing it, and it's like, why don't people value it enough? It's usually the friction in doing so. If you think of an organization, leaders do want to know what products were shipped and which ones worked, so there's already organically a pull for this type of communication. Now, they might ask in a way where the communication ends up in a big Slack thread, or in some Google Doc presentation, or in some completely bespoke format. And what that does is solve the immediate pressing concern, that a leader wants to know what just happened.

The technological problem is that that immediate work never translates into anything long term, anything durable. And the question purely becomes: how do you make the easiest path for communicating to your executive one that automatically feeds into a knowledge base? Why should it be a separate process to say, "I'm writing for my CPO, and I'm also writing for institutional knowledge"? The right technology solution just makes both happen in the fastest way possible. I think it's not hard to convince people that it's important, but it's hard to get people on board because it's a change of behavior and it seems like additive work, right? It's just really onerous to make a report that would be consumable by someone not directly on the team. The problem with the world today is, one, it takes up a lot of their time to do it, and two, the work of creating those reports isn't going anywhere that's communicable to the future. There's no way to take those reports and make them discoverable again.

So really, it's this small, almost seemingly trivial technological change: take the work PMs already have to do, make it 10X faster, and make it indexable automatically, so it just already shows up in your knowledge base. To me, that's really the problem of knowledge-base software: how do you create a workflow that's magnetic in how low friction and fast it is, and in the quality of the reports you get at the end? Institutional knowledge ends up being a big part of any experimentation program, because what gets people excited about experiments are the ideas, the things you want to test. This experiment worked really well for this other team, for this other company; I bet it would work really well here. And if I'm right, I get this rigorous measurement to show off to the rest of the org that we had a ton of impact. But again, you can see here are some trends around what worked and what didn't.

Here are some failure modes I can avoid. Here are some opportunities that have never been mined. A nice benefit as well is that once you have all this stuff collected, it raises the profile of the experimentation program altogether. It makes it seem accessible. If some team can go and see, "Here are five other teams that ran experiments that were really successful," then I know those people. I can message them. I can ask questions about what made them excited about it and what didn't. It makes experimentation seem like a very everyday thing that everyone does, once it's out there in the open and it's highly visible and highly viral around the org. So I think getting everything highly visible and disseminated has a bunch of benefits beyond just the problem-solving one of knowing what's worked and what hasn't: it creates a cultural norm where everyone's expected to be running these tests. Everyone's expected to try out ideas and see what works.

The key first step if you're trying to build a knowledge base is that there should be a workflow for actually writing this stuff up. Sometimes people look at these experiment results or reach a conclusion, but it's just a bunch of Slack comments and some email threads here and there. It's never consolidated into any sort of report. And if there's nothing to even index, no piece of knowledge to store, then you're going to get nowhere, right? So the first step is to say, "How can we, in the lowest-friction way possible, come up with a system where people are going to document what they did, what decisions they made, and why?" The emphasis is on low friction. Obviously, it's very hard to get someone to justify extra work in their calendar for something they won't see an immediate benefit from.

But usually there are some moments when people have to socialize results, especially in a larger organization. Usually, some product team has to do a report for the rest of the product team. So the starting point is to look at that workflow. And then, once you're doing that workflow, how can you do it in a way that can be cataloged for the future? For example, if you create some sort of Google Doc, one of the frustrating things is that Google Drive tends not to be good at indexing knowledge. That's one of the ironies of the world, that the great indexer of the internet is actually bad at indexing Google Drive files. So can you instead turn it into something that has some sort of web app layer on top, some searchability, some tagging, the ability to actually parse out all this content as it comes through? Then, once you have the ability to index and search, how can you create more exploratory views? How can I see the last five great experiments that happened?
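To make the "web app layer" idea concrete, here is a toy sketch of indexing a folder of write-ups by tags and making them keyword-searchable. The directory layout and the "Tags:" line convention are assumptions for illustration, not a description of any particular product.

```python
# Toy sketch: crawl a folder of Markdown write-ups, pull out a "Tags:" line
# if one exists, and support simple keyword search over text and tags.
from pathlib import Path

def build_index(report_dir: str) -> list[dict]:
    index = []
    for path in sorted(Path(report_dir).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        tags = []
        for line in text.splitlines():
            if line.lower().startswith("tags:"):
                tags = [t.strip().lower() for t in line.split(":", 1)[1].split(",")]
        index.append({"file": path.name, "tags": tags, "text": text.lower()})
    return index

def search(index: list[dict], query: str) -> list[str]:
    q = query.lower()
    return [doc["file"] for doc in index if q in doc["text"] or q in doc["tags"]]

# Example: find every past write-up that mentions onboarding.
# index = build_index("experiment_reports/")
# print(search(index, "onboarding"))
```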

The foundation is to start by making reports at all. The next layer is to create the ability to index those reports and search them. And then a nice-to-have is to ask how you can also raise the profile of building knowledge by creating some really compelling knowledge-base views, metadata views, meta-analysis views. I think a great easy step is to just try to get people consolidated on one workflow for documenting reports. That one workflow is probably going to be a Google Doc or a Notion doc or a presentation. Just say, at the very least, we're all aligned on a very repeatable template that everyone can engage with, and it all ends up in one big pile. That has a bunch of benefits. One, it builds the muscle of actually writing these reports. It creates a consistent pattern for these reports. One of the things about a low-context stakeholder like a CPO is that they're going to read so many of these things.

And if they have to reorient themselves around every new report structure, it gets very disorienting. If you can create a standardized way to write these reports and a place to put them all in one pile, that at least gives you a lot to work with. You still need to figure out how to index them, how to organize them, how to disseminate them, but that's a very logical starting point. The key thing to mention is that it just has to be low friction. Everyone engaged with shipping these products wants to show a lot of impact. They want to brag about it, but everyone is very time-scarce. They need to move on to the next thing. This is one of the great challenges of building knowledge bases and learning from what you did: it's not going to be appealing enough to make people spend a lot of time on work they would not do otherwise. How can you take the work they're already doing, which is disseminating to their immediate stakeholders, and carry it over into something that will work for future generations?

Maybe the best way to explain institutional knowledge is with a concrete example. When I was at Webflow, there was a project called Prebuilt Layouts, and it was a pretty cool project. I thought it was well researched. The qualitative data that justified the project was pretty sound. I was previously a Webflow user, I thought it was well thought out, and there was a lot of excitement around this project. They ran the experiment a few months before I joined, and it didn't show any impact at all, and everyone ended up disappointed. People had moved on from the project. The team was, I wouldn't say disbanded, but refocused elsewhere, and it had just completely lost momentum. So when I joined, I looked at the actual experiment results and found a bunch of things. One, I came to understand why the project was built, what the research was, and how well it was justified. But two, when I actually looked at the experiment results, you could tell the experiment was not run well.

A great example: if you actually did a power analysis to figure out how long to run the test, you'd find this experiment needed, I don't know, two months or something to get signal, and they ran it for a few days. So there was nowhere close to enough data to actually find a result. And as a PM joining Webflow, you might've arrived and thought there was no point in investing in this area, because someone already tried it and it had bad juju around the org. But if you had the proper knowledge base, you could see that the qualitative data justifying the experiment was pretty good and the experiment itself was shoddy. It needed to be rerun. That's the type of thing you get when you do knowledge-base indexing, where institutional knowledge can be interrogated. You can understand what needs to be retested and what does not. When we talk about knowledge bases, we're really talking about a very painful workflow for product managers.
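For readers who haven't run one, here is a rough sketch of that kind of power analysis using statsmodels. The baseline rate, minimum detectable effect, and traffic numbers are made up for illustration, not Webflow's actual figures.

```python
# Rough sketch: estimate how long an A/B test needs to run to detect a given lift.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_rate = 0.10                 # assumed current conversion rate
minimum_detectable_rate = 0.105      # smallest lift worth detecting (assumed)
daily_users_per_variant = 3_000      # assumed traffic per arm

effect_size = proportion_effectsize(minimum_detectable_rate, baseline_rate)
users_needed = NormalIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.8, alternative="two-sided"
)

print(f"~{users_needed:,.0f} users per variant, "
      f"~{users_needed / daily_users_per_variant:.0f} days at current traffic")
```

Comparing the required runtime against how long the test actually ran is exactly the kind of check a knowledge base makes possible after the fact.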

Curating these reports and trying to parse through them is so time-consuming that most PMs are left wondering, "Why can't I ask basic questions about what has been tried and what has not?" They don't usually connect that annoying moment, where you can't understand history, with the similarly annoying moment at the end, when they have to block off multiple days on the calendar to write some giant report, create the screenshots, make sure the language is tight, and produce something they can present at a product team meeting. These are actually the same problem: the pain of creating reports is the same as the pain of understanding what's happened, and both are woefully underserved, right? Everyone always starts from scratch when writing things up. If they want to pull in experiment metrics or any deeper dives, they have to go through some data analyst, wait weeks for them to turn around the result, and then screenshot it back into the document.

And then the data analyst says, "Oh, I made a mistake," and I have to go back and remember to get the fresh screenshot into my report. The amount of cat herding required to actually curate a report is so annoying and onerous that by the time you get to doing it, you're not going to think about, "How can I do this in a way that solves the future PM's problem of understanding what happened and what didn't?" So this is all just one big block of frustration for PMs: "How can I avoid spending hours or days of my time writing this stuff up, while also being empowered to benefit from other people writing great reports?" When you think of experimentation, I think the most frustrating thing for PMs and product teams is how error-prone it is. This is a problem of not having real-time metrics, of not being able to self-serve basic analytical and diagnostic questions.

It's a problem of not having good QA processes; sometimes with an experiment, the issue doesn't show up until you've launched it to a very wide audience. So one major block of pain is just, "How can I make the execution of an experiment go smoother?" And this is really where purposeful technology helps. The other piece of pain is, once you've run the experiment, how do you know what happened? Think of the amount of questions and scrutiny a PM faces when they go to a product team and say, "I did this thing, and here's whether it worked or not." They face so many questions: how did it compare for power users versus casual users, new versus existing, people from organic channels versus non-organic channels? "Oh, this metric went down. I know it's not your main metric, but that seems concerning. Can you dive into that and figure out if there's some explanation for this weird drop in the funnel?" Just making decisions invites so much scrutiny, and that eventually gets bottlenecked by analytics resources.

The PM is waiting on some analyst who is increasingly annoyed at having this laundry list of questions to answer. So this is as much about making decisions and writing the report about them: how can whoever has to do that work be completely unblocked to answer every single deeper-dive question that might come up? How can the process of deep diving be naturally tethered and connected to the report, so that it reads: here's my overall result, here's what I looked into, and here's this large appendix of deeper dives I did, if anyone wants to look at it? So there are two big pains of experimentation. One, how do you make setup really easy and error-free, a clean operation? And two, when I want to understand what happened, how can that be fast, rich, and well communicated? There's an obvious AI question here, which is, how is product development going to change when you have these LLM bots that can parse language really well?

And this is a place where knowledge bases are just going to get increasingly important. It used to be the case that a knowledge base was just a large pile of historical reports that you could run a search engine on top of. We're now entering a world where you're writing these reports not just for your immediate audience, but also for your institutional-knowledge AI bot, which is going to go over your reports and be able to answer novel questions about them. So now a new PM can actually ask an AI, "Hey, has anyone tried anything in new user onboarding with a focus on internationalization?" And the AI bot can actually figure out, "Here are the experiments on internationalization. Here are the ones that worked, and here are the ones that didn't." It's going to be interesting to see how institutional knowledge goes from a pile of documents to something you can actually talk with, almost like the smartest, most tenured employee in the entire company.
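As a sketch of how such a bot might work under the hood, the retrieval step could reuse the toy index from earlier: pull the past reports that match the question and hand them to a language model as context. The helper names come from that earlier sketch, and the model call itself is left out because it depends on whichever LLM API you use.

```python
# Sketch of the retrieval step for an "institutional knowledge bot":
# find matching reports and assemble them into a prompt for an LLM.
def build_prompt(question: str, index: list[dict], max_reports: int = 5) -> str:
    hits = set(search(index, question)[:max_reports])   # reuses the toy search() above
    context = "\n\n---\n\n".join(
        doc["text"] for doc in index if doc["file"] in hits
    )
    return (
        "Answer using only these past experiment reports:\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )
```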

