
Effective and Scalable Machine Learning, with Opendoor Co-Founder Ian Wong

Ep 7

Apr 05, 2023 • 46 min
ABOUT THIS EPISODE
Ian Wong, co-founder of Opendoor, shares what he’s learned about machine learning and data science in his work as CTO of Opendoor and helping detect fraud at Square. We talk about the differences between descriptive and predictive ML, approaches to human-in-the-loop prediction, setting up a data science org to deliver real business impact, why so many internal tooling projects fail, and how leaders should be dividing their attention between top-level strategy and the details that really matter.
TRANSCRIPT

Allen: Welcome to It Shipped That Way, where we talk to product leaders about the lessons they’ve learned helping build great products and teams. I’m Allen Pike. Joining us today is Ian Wong. Ian co-founded Opendoor and served as the CTO for over eight years from its inception through its IPO. Before that, he was a data scientist at Square working on the thorny problem of fraud detection. Welcome, Ian.

Ian: Thanks Allen. Really glad to be here.

Allen: Yeah, thanks for making the time. I’m really excited to dig into some of the topics that we’ve got.

Ian: Let’s do it.

Allen: So today we want to talk about machine learning, data science, what it means to build high impact internal tools and some of the engineering leadership practices that help make all those things actually happen in a growing org. We can have grand ideas, but then actually implementing them with our team. But first, I want to give folks some context on your journey so far in the industry. As I understand it, you were initially considering a career in academia back in the origin story.

Ian: Yeah. So my journey started… So I went to school at Stanford and I did my undergrad and grad school there, and I did electrical engineering undergrad. And I still remember waking up one day in junior year or senior year realizing, “You know what? I don’t like circuits. I don’t like to code in analog. What the heck am I doing?” And I had this soul-searching moment. And then I realized, actually the part I really liked about engineering was the math behind it. And fortunately, the math behind Double E translated very naturally to machine learning. And so I decided to pursue a PhD in the statistics department, and I was doing that for a couple of years. And then I had another realization in grad school where I woke up one day and I was like, “Oh my goodness, no one’s ever going to read any of the papers I write. What the heck am I doing here? It’s all mathematical statistics.” And then I also realized, maybe from my time in Double E, that there are times and really eras where the industry’s leading academia in certain areas. Double E in the ’80s, you want to be doing Double E in industry, like at Intel. In the same way, when I interned at Facebook, I was like, “Holy crap, why am I studying all these theorems about convergence or how large distributions would move over time when I could just query how people are feeling today?” I can literally just-

Allen: The distribution’s there.

Ian: Yeah, I can just count the number of smiley faces that are on Facebook today and that was incredible. So I decided to leave grad school and one thing led to another and I started at Square as their first data scientist.

Allen: Nice. Well, I realized that you were doing data science at Square early on, but I didn’t realize that you were their first data scientist. That’s pretty remarkable given how much, as I understand it, Square ended up needing pretty substantial data science in order to make the business work. Fraud is a problem for any company that touches money, but that seems like a pretty important function.

Ian: Yeah, it’s funny because at that time, Square was maybe a year and a half old, so still a very young company. And I remember I met Keith Rabois, who was the COO of Square, and he was like, “Ian, you got to come in and we need a statistician to help us solve fraud.” I was like, “Sure, I’m game.” I really loved the team, I loved chatting with Jack and Keith and others. Came in, showed up on day one, and I was like, “Show me the data. I’m ready to build some models.”

Allen: Let’s go.

Ian: I’m hot off my half a PhD, wrangled some R, let’s build the model. And I remember looking at the data and I was like, “What the hell’s going on? You don’t have any fraud.” Because the company was so small that they had barely any transactions, and so they had 50 cases of suspected fraud. It was hardly meaningful.

Allen: You’re imagining Facebook scale something.

Ian: Yeah, I was thinking this is going to be super complicated, we need to build all these algorithms that are bespoke to the domain… No, no, what these guys needed was just a way to build internal tools, a way to just be able to process cases efficiently. And so in hindsight, it was a huge blessing in disguise because up to that point, I never really did software engineering. And so going to a startup like Square when it was 40 or 50 people, I had to build internal tools, then all the data warehouse tech, and then eventually work my way up to the ML stack. But it was really helpful because it gave me a really deep grounding in the domain, which I think is a really core skill of a data scientist. You have to be able to walk in the shoes of your customer or the phenomenon that you’re trying to model, and eventually build that ladder of abstraction all the way to the point where I can predict fraud or I can understand how much credit I can assign a given merchant. It was a blessing in disguise, but it was definitely a bit of a shock when I first showed up at the door.

Allen: And you touched on something we’ll probably talk about a little bit later once we loop through your path so far, but there’s a big difference between, okay, yeah, we’re doing a lot of data science, look at all the data science that’s happening, here’s a really complicated spreadsheet, and then actually bringing some impact to the business that lets people solve problems and make money and retain customers and all those things.

Ian: Yep, absolutely. Yeah, for sure.

Allen: So you went through this time at Square and then ended up founding Opendoor, which probably presented maybe a similar path, where on day one of Opendoor, you don’t have millions of transactions to parse.

Ian: Opendoor was a very different experience. So Keith reached back out after I had left Square, and he had gotten together with Eric Wu and JD Ross, my other co-founders, and said, “Look, residential real estate is one of the largest markets in the United States. It’s the single largest asset that most people own, and it’s a very meaningful one, both for emotional and personal reasons and financial reasons. Yet if you think about the experience of buying and selling a home, it’s still super backwards.” You and I have a million different ways to transact or transfer money to each other or buy and sell electronics or cars or what have you, but when it comes to housing, it’s an emotional rollercoaster, is actually how most people describe it.

Allen: It’s objectively bad.

Ian: Objectively bad. And so the question is, how do you really streamline it? And it was just a really audacious idea. How can you make buying and selling a home just a few clicks? That would be really cool, if people didn’t think of a home as a liability, but truly as an asset that can empower them to pursue whatever they want to do in life. So anyway, we started off at Opendoor, this is back in 2014, with just an idea and a dream and a small team, and slowly built that up. So day one is different because at Square on day one, there were literally zero transactions, so you have to build up the transactions, then have that data set and then train models. But at Opendoor, we were trying to help people sell their home instantly to us. So we’ll literally buy the home from them.

Allen: So you need to have some model before you buy even the first home?

Ian: Yeah, exactly. But fortunately, there’s the MLS and there’s a lot of data available on transactions, so you can actually train a model off that, but you still have to think a lot about human in the loop systems, how do you build the tools that integrate with the algorithms? I’m sure we’ll get into that. And also the system. A lot of times, people think in terms of, hey, there’s an algorithm that will do everything. But you have to realize how the machines interact with the humans, and how do you actually have that governance layer that sits on top of everything to make sure the system works properly? It’s interesting because you can draw all these analogies in software, you have CI/CD, and that’s part of the pipeline, and ultimately, you have usage stats for software and see if your users are using your software and if there are bugs. Deploying ML software at scale has a different set of challenges because it’s not as black and white as, does this button work or not? It’s like, “Oh, we misvalued this house,” or, “Oh, we let this fraudster get away with 20, 30K of transactions.” That’s pretty bad, but it’s not black and white a lot of times.

Allen: Sometimes it’s even we may have let this fraudster get away with things or we blocked someone who might have been a fraudster or we’re not sure. I think almost everybody listening has probably been blocked by an algorithm that thought they were doing fraud, but it was actually they just were on a different computer or it was a new card or something like that. So then that’s exactly to your point, a gray area. Is the algorithm working?

Ian: Yeah, exactly. In the moment, it’s very hard to tell.

Allen: So let’s dig into that a little bit. So to your point that you’re making there, there’s a big variation in different approaches to data science, and you can do data science without necessarily delivering a lot of value. I talk to some leaders that staff up and really swear by the incredible business impact that their data science and machine learning function is delivering in their org, and then talk to other leaders and they wring their hands and they’re like, “Yeah, we have some really smart data science people and they produce some spreadsheets and I’m really proud of the work that they’ve done. But maybe we’re not fully confident about the input data, or we’re not really fully confident about turning that into business decisions or tools that make as big of an impact as maybe they could.” I guess my question is, in your mind, what are the big inputs that make the difference between a really effective data science and machine learning function that’s driving impact and maybe a less effective one?

Ian: I think that’s a really interesting question with so many dimensions to it, but maybe let me be super reductive about how to approach that question because it’s so broad.

Allen: It’s a good place to start.

Ian: Yeah, it’s a good place to start. To be super reductive, the word data science is very vague and so let’s try to maybe put a bit of structure behind it.

Allen: Sure.

Ian: At a very high level, I think you can broadly categorize it into two buckets. One is more of a predictive function and the other is more of a descriptive function. So when you think of predictive, and they’re not mutually exclusive and all that, but just to have some structure around this conversation, let’s just break it into those pieces. So the predictive piece, you can imagine to be recommender systems like your Facebook feed or your Pinterest feed or whatnot, fraud detection, or valuing houses. These are predictive systems where the output is a prediction or an inference and it interacts with the real world, and errors are bad. So you have an objective function where there’s a loss associated with an error, and that error, hopefully, is meaningful to your business such that you want to minimize that error. So that’s the predictive part. Now the descriptive part, I think, is very native to most companies. It’s the analytics function. It’s being able to understand descriptively where your business is going, how your users are interacting with your systems, all the way to experimentation, trying to figure out how to do A/B testing at scale and whatnot. So the first one, I think, is more around driving inferences so that you can accelerate your business in some dimension. The other is generally about retrieving insights so that you can accelerate your business, but differently, by better understanding how your business is operating or how your product’s getting adopted and so forth.
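
To make the two buckets concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the transactions, the 0.8 blocking threshold, and the A/B numbers are not from Square, Opendoor, or Facebook. The point is only that the predictive bucket outputs an inference whose errors carry a dollar loss, while the descriptive bucket outputs a read on how the business is behaving.

```python
# Toy illustration of the predictive vs. descriptive split (all values made up).

# --- Predictive: the output is an inference, and errors carry a business loss ---
transactions = [
    {"amount": 120.0, "score": 0.92, "was_fraud": True},
    {"amount": 45.0,  "score": 0.10, "was_fraud": False},
    {"amount": 800.0, "score": 0.35, "was_fraud": True},   # a miss is a real dollar loss
]
BLOCK_THRESHOLD = 0.8  # a knob someone in the business has to own

fraud_loss = sum(
    t["amount"] for t in transactions
    if t["was_fraud"] and t["score"] < BLOCK_THRESHOLD     # fraud we let through
)
print(f"fraud loss this cohort: ${fraud_loss:.2f}")

# --- Descriptive: the output is an insight about how the business is behaving ---
signups = {
    "variant_a": {"visitors": 1000, "converted": 112},
    "variant_b": {"visitors": 1000, "converted": 134},
}
for name, cell in signups.items():
    print(name, "conversion:", cell["converted"] / cell["visitors"])
```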

Allen: And I love that distinction and I’ve never heard it. Maybe that’s a common thing and I’m just not deep enough into data science to have heard it, but I love that distinction and that’s really useful for thinking. And so from your background, your experience, a lot of your work has been on the predictive side, and that’s also, I guess, where a lot of the developments that are happening, we’re talking about large language models and things like that, are very much on that side, we’re predicting the next token or whatever.

Ian: Exactly.

Allen: And so that’s the kind of stuff that is more likely to be impacting the ability for you to even provide a product. Opendoor couldn’t even exist without the predictive stuff, ChatGPT couldn’t even exist, as opposed to everybody’s going to do data analytics on their usage, and that’s just baseline.

Ian: But I want to make that distinction because I think your question was, how does a company realize value from data science?

Allen: Mm-hmm.

Ian: I think sometimes, I find that a lot of companies are confused about data science, what is it supposed to be? And in fact, the data scientists are confused as to which mode they’re in. Am I doing more of the analytics or am I doing more of the inference work? And it’s again, not mutually exclusive, but I think it’s good to have true alignment up and down the company as to what you want, because I find actually there’s a trap where an ambitious data scientist thinks the predictive stuff is all the sexy stuff, but actually, the company wants them to do descriptive work. And so you have this impedance mismatch where you have, again, a data scientist that wants to do A, but a company wants them to do B, and then people just burn out. There’s a grinding of the gears. So that’s actually the first thing that I think an organization needs to realize. What do I need out of a data scientific function? Is it that I really want to be able to glean great insights from my complicated data sets and accelerate the rate of experimentation, for instance? Or is it, “Hey, I’ve got this business critical predictive function that I need people to shore up and make really excellent.” So that’s the first distinction. And then from there, it’s about finding both the systems architecture and the people to make that happen. And I’m happy to go into each, however you’d like to take it.

Allen: Yeah, well I’d love to go into each. Maybe some interesting context for us would be, obviously you started Opendoor and it was just you, and then you grew to the point where it’s a 2,500 person company or something like that. How did the data science and machine learning function look organizationally at its peak? That’s maybe interesting context.

Ian: At the peak, our engineering and data science teams were, call it, 150 to 200, in that ballpark. Opendoor, the way to think about the company is, I always have this acronym where you have four different main capabilities that we are trying to build at the company. And so as a CTO, I’m constantly thinking about capabilities, what are the sets of functionalities that can enable the business to move forward? And so they’re generally aligned in four areas. One is consumer experiences, second is operations… Not DevOps, but literally we’ve got hundreds of people, boots on the ground, folks that are visiting houses or folks that are servicing our customers over a phone call. How do you enable those operators to be truly effective? The third is pricing, and the fourth is infrastructure. So the engineers and data scientists are spread across these four different pillars. Some of them are more descriptive and others are more predictive. So for instance, in consumer, a lot of times we are trying to run experiments and understand how customers are interacting with our software. So in those areas, you might see more data scientists that are skilled in experimentation and understanding causality and understanding what the data’s saying. Whereas on the pricing side, we started off building relatively classical models like your linear regressions and your random forests. But over time, we hopped on the deep learning train and built pretty custom deep learning models for our use case. So there, you’ll find more of the “ML experts” who are building predictive software. So even within the company, depending on the pillars of capabilities that you’re building, you’ll see a different mix of data scientists within each.

Allen: And I imagine that impacts how you’re writing job descriptions and when you’re putting out an ad, you’re opening a position and you’re titling it and all that kind of stuff, then that’s going to impact who’s going to be applying, how you’re going to evaluate those people. Maybe even having different competency matrices and stuff like that.

Ian: Yeah. Or there’s always this classical problem of a matrix organization. So you want to have it be, and I think Andy Grove wrote the definitive chapter on this in High Output Management, what are the pros and cons of matrix management? Ideally, everything’s aligned to your business unit, actually. Ideally, if you have a boss who knows all the disciplines very well, you want all your functions to report up to that boss because you can move super quickly, you’ll be aligned to the customer.

Allen: Yeah, ideally.

Ian: Ideally.

Allen: If you can find a boss that actually understands the work that you’re doing, that also can run the business unit.

Ian: Yeah, that’s perfect. But realistically, that’s really hard to achieve. So realistically, you have matrix management, meaning you have functional experts, like a data science expert, you have a VP of DS, you have a VP of Eng, who have that skillset or that craft of engineering or data science and can mentor their reports on that craft. So realistically, the way we organized it was more of a matrix function, where we had business unit leaders that collaborated very actively with myself as the CTO and more of a functional lead, and I had functional leaders underneath myself, and that’s how we structured the team.

Allen: Okay, yeah, that makes a lot of sense. Going one step back to your point that there are two different big inputs into doing all this stuff well. One is the organizational side, down to the code, and then the other is the people side of it. Let’s touch a little bit more on the organizational, actually-applying-it stuff, and then we’ll talk about people, because people and leadership is one of the key topics on this show, so we want to make sure we don’t leave that out.

Ian: So in terms of how to actually apply these systems at scale. On the predictive side, there are a couple things I’ve learned from building things both at Square and Opendoor, which is it’s really important, like I alluded to earlier, to start from the end and work your way backwards from it. So the end is, I need to be able to predict fraud or control fraud. So my KPI at Square was fraud loss. It’s a business KPI, it’s not some engineering KPI. I always work backwards from, actually, what is the business equation? Meaning Square is successful if our customer acquisition cost is low and we have high profitability from each transaction, which means that we need to minimize fraud loss. At Opendoor, we need to make sure we turn a profit on a unit level, which means we can’t buy high and sell low, right?

Allen: Yes.

Ian: Or we should minimize that, but we don’t want to go the other way too much because look, customers are savvy. This is the single largest transaction that people do, and if you give them a low price, they will just not accept. So our goal is to be as close to accurate as we can in as many cases as possible. So anyway, those are the business KPIs, and that then translates very naturally to an engineering objective: minimize fraud loss, or minimize the error of a prediction of the value of the house. Now, one doesn’t simply just deploy an algorithm that is trained with those objectives because frankly, an algorithm might have flaws, an algorithm might whiff on things that are obvious to a human. So that’s a very obvious case where you want to have a tool that maybe allows the humans to inspect the outputs of an algorithm before you let it go into the wild, so to speak.
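
One hedged way to picture that translation from business equation to engineering objective is an asymmetric loss, sketched below. This is not Opendoor’s actual objective; the weights and the cohort numbers are made up. The only point is that “overpaying hits unit economics, lowballing costs you the conversion” can be written down as a loss and measured over a cohort.

```python
# Toy asymmetric loss for home valuation (all weights and prices are invented).

OVERPRICE_WEIGHT = 2.0   # hypothetical: paying too much hurts unit economics directly
UNDERPRICE_WEIGHT = 1.0  # hypothetical: offering low mostly costs you the conversion

def valuation_loss(predicted_price: float, resale_price: float) -> float:
    """Penalize over-prediction (we would have bought high) more than under-prediction."""
    error = predicted_price - resale_price
    if error > 0:
        return OVERPRICE_WEIGHT * error
    return UNDERPRICE_WEIGHT * (-error)

# Evaluate against a cohort of past (prediction, eventual resale) pairs,
# not a single deal, which is closer to how a business KPI would be tracked.
cohort = [(310_000, 300_000), (295_000, 305_000), (402_000, 400_000)]
avg_loss = sum(valuation_loss(p, r) for p, r in cohort) / len(cohort)
print(f"average asymmetric loss per home: ${avg_loss:,.0f}")
```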

Allen: That makes a lot of sense, but I think a lot of users assume that doesn’t happen. A lot of users assume that “the algorithm”, which is just a black box, just outputs a yes or no on the fraud or outputs a price, and then people only look at it if there’s a complaint.

Ian: There’s a scale to it. So it depends on the volume too. For instance, personalizing your newsfeed at Facebook, there’s no way a human can be really in that loop because that round trip’s probably less than 50 milliseconds or whatever. But at Square, we were looking at next day settlement. And so that gave us some time. Now, we couldn’t review every single transaction that was coming through. And at Opendoor, we were doing again, next day offers. So again, that allowed us some time, but for a variety of reasons, you actually don’t want humans to do all of it.

Allen: Yeah, it’s expensive.

Ian: Number one, right, it’s expensive, and in Square’s case, it was just not feasible. Number two, actually more perniciously, there are a lot of human biases. And so there comes a point when actually, algorithms can handle things better than a human can. And so you actually go through this whole evolution process where initially, you might have a human that is really armed with great tools, and then that tool becomes more of a case review tool, or that tool becomes actually much more integrated with your algorithm. So for instance, a tool can help you review a case of suspected fraud or it can help you value a house. And that tool should be self-sustaining. But then over time, you can have the tool be integrated with the algorithm in the sense of, “Hey, the algo is saying that this is a likely fraudster because of these reasons.” Or the tool can be like, “This is the value of the home, we think, because…” And the human’s eyeballing it and saying, “Well, do I agree? Do I not agree? Why do I not agree?” And if I don’t agree, then that becomes really interesting data to then feed back into the algorithm, and you iterate on that. And then at some point, your algorithm actually becomes a shadow agent. Your algorithm’s making a decision in the same way a human’s making a decision, and you can actually evaluate it just like you would evaluate any other human operator. So that becomes a shadow agent that can actually review every single case or value every single home. And what’s interesting is that you don’t need the algorithm to be perfect, it just needs to be at least as good as a human being.
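
As a rough sketch of that evolution, here is what the review-and-feedback loop could look like in code. The function names, the Case shape, and the toy example are all invented for illustration, not Square’s or Opendoor’s actual tooling: the algorithm proposes a decision with reasons, the human decides, disagreements are kept as training data, and the algorithm is scored in shadow mode the same way an operator would be.

```python
# Minimal human-in-the-loop sketch: algorithm suggests, human decides,
# disagreements become training data, and the algorithm is scored as a "shadow agent".
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Case:
    case_id: str
    algo_decision: str              # e.g. "fraud" / "not_fraud"
    algo_reasons: List[str]         # the "because of these reasons" shown to the reviewer
    human_decision: Optional[str] = None

def review(case: Case, human_decision: str, feedback_log: List[Case]) -> str:
    """Human reviews the algorithm's suggestion; disagreements are logged for retraining."""
    case.human_decision = human_decision
    if human_decision != case.algo_decision:
        feedback_log.append(case)   # the interesting examples to iterate on
    return human_decision           # the human is still the decision-maker at this stage

def shadow_agreement(cases: List[Case]) -> float:
    """Score the algorithm as if it were just another operator working every case."""
    reviewed = [c for c in cases if c.human_decision is not None]
    agree = sum(c.algo_decision == c.human_decision for c in reviewed)
    return agree / len(reviewed) if reviewed else 0.0

# Toy usage: one case where the human overrides the algorithm.
log: List[Case] = []
case = Case("txn-001", "fraud", ["new device", "rapid refunds"])
review(case, "not_fraud", log)
print(shadow_agreement([case]), len(log))   # 0.0 agreement, 1 example logged
```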

Allen: Yeah. For a given business case, there is a minimum bar of how good it needs to be in order for… Because if neither humans nor algorithms can catch the fraud, then Square still goes out of business.

Ian: Yeah, exactly.

Allen: But as soon as it’s better than a human, or actually, probably roughly equal to a human because it’s cheaper than the human probably. Then that’s when you get into humans probably, I assume, doing more verification, spot checking unusual cases, new developments rather than it being just routine, a certain percentage of transactions.

Ian: Exactly. So in Square’s case, I believe the vast majority of cases are now auto-reviewed, leaving only the really gnarly ones for human operators to look at. And it’s the same at Opendoor, though it’s a little different there because it’s challenging to value real estate. It’s very hard. It’s a much harder problem than fraud, frankly.

Allen: There are more vibes in real estate-

Ian: Yeah, exactly. What’s curb appeal? I don’t know. Depends on who you ask. And so there’s a lot of input data verification. In Opendoor’s case, it becomes actually a much more challenging system to build because you actually have to have a human in the loop at different stages of the pipeline. It’s not just one decision. Actually, is this home hard to value? That actually becomes almost like a model in itself. Not quite, but almost, right?

Allen: Sure.

Ian: And then you have to ask, well, why is this home hard to value? What data is it missing? And so at that point, there’s almost a decision branch where you’re asking the humans to give you data points so that you can continue to the next phase of inference. So there’s more going on. So I’ll put a pin in that, and I just want to touch briefly on the governance piece because I think that’s really important.

Allen: Yeah.

Ian: What I mean by that is, in many companies, there are business reviews, there’s a monthly business review or sometimes a weekly business review. Think of the metrics that are reviewed there; these models and these processes need to be operated the same way, because they are part of a business process. And so ultimately every week, we need to be held accountable to how much fraud is actually happening relative to the false positives, which is how many merchants’ accounts we unfairly froze, because that’s a terrible customer experience. In the same way at Opendoor, what was our profit and loss statement on the cohorts of houses that we just acquired or just resold, relative to what’s our conversion? How many customers are actually saying yes to our offers? So there’s actually a governance piece that looks at the performance at a portfolio level or at a cohort level over time. And then you need to understand, are you dialing things correctly? And sometimes, you need to make some macro calls like, how much false positive am I willing to allow given how strong the model is today? Similarly, how much money are we willing to put at risk at Opendoor, given how the model’s behaving today and given the uncertainty of the future? So there’s actually a decision layer that is embedded above the entire system. There are these knobs that you have to tune.
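
As a toy picture of what shows up in that kind of weekly review, here is a sketch with invented field names and numbers: the two opposing metrics (fraud we let through versus accounts we froze unfairly) computed over a cohort, and a risk-appetite knob that is set by the business, not output by the model.

```python
# Toy weekly-review numbers for the "governance layer" (all data is made up).
weekly_cases = [
    {"frozen": True,  "actually_fraud": False, "amount": 200},  # false positive: bad customer experience
    {"frozen": False, "actually_fraud": True,  "amount": 950},  # missed fraud: direct dollar loss
    {"frozen": True,  "actually_fraud": True,  "amount": 400},
    {"frozen": False, "actually_fraud": False, "amount": 60},
]

fraud_loss = sum(c["amount"] for c in weekly_cases if c["actually_fraud"] and not c["frozen"])
false_positives = sum(1 for c in weekly_cases if c["frozen"] and not c["actually_fraud"])
legit_cases = sum(1 for c in weekly_cases if not c["actually_fraud"])

print(f"fraud loss this week: ${fraud_loss}")
print(f"false positive rate:  {false_positives / legit_cases:.1%}")

# The macro call: how much false positive are we willing to tolerate, given how
# strong the model is right now? This knob is a business decision revisited at the review.
MAX_FALSE_POSITIVE_RATE = 0.05  # hypothetical risk-appetite setting
```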

Allen: And I imagine, sometimes you’ll have your weekly or monthly review and say, “Well, this metric has gotten substantially worse,” and then ensues a discussion about whether it’s worse because the algorithm got worse or because the fraudsters got better this week? Which is probably the problem three quarters of the time, if not more, right?

Ian: Yeah. Yes. Housing and fraud, neither of them are static systems; they’re always dynamic. Unfortunately in fraud’s case, there are these fraud rings that can crop up, find an exploit, and try to come after you. And housing is less adversarial in the sense that there aren’t people actively trying to steal money, but more like, man, the market is just volatile.

Allen: Well, I assume you get at least some adverse selection from a tool like Opendoor where if you just offer… There’s a whole bunch of industries and examples where this dynamic happens, where if you make offers with a random error in them, then people will take the ones that you accidentally overvalued, right?

Ian: Yup.

Allen: And those will be more likely to be accepted, which is bad. And so you think, “Oh, well our error is normally distributed and so our loss should be normally distributed.” It’s like, “Nope.” Because they’re going to take the times where you’re like, “Yeah, I’ll give you a $100,000 more than your house is worth.” And they’re going to be like, “Okay, accept.” And then you’ll get a disproportionate amount of the bad outcomes.

Ian: So that’s the lemons problem and that’s very real. And so you have to keep what we call spread, very low. We have to have some notion of spread that accounts for uncertainty and risk, but you have to minimize that otherwise you can actually exacerbate the adverse selection. And just to throw another term in here, we also look at, in credit cards, there’s something called reject inference, which is if I apply for a credit card and I get rejected, actually, the credit card company tries to track where I eventually end up.

Allen: Oh, interesting.

Ian: Because there’s a good chance that I’m actually not a terrible cardholder, and so you actually need to track the cases where you made a prediction to say negative and ask, was that truly negative? And so that helps you calibrate the models a little bit better, but it’s still very hard.
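
The adverse-selection point, and why spread matters, can be seen in a toy simulation like the one below. The numbers (home value, error size, the seller’s “hassle discount,” the spread levels) are all invented, not Opendoor’s pricing; it just shows that an unbiased but noisy valuation still loses money on the deals that actually get accepted, and that spread trades conversion for margin.

```python
# Toy adverse-selection simulation (all numbers invented; not Opendoor's pricing).
import random

random.seed(0)

def simulate(spread: float, n: int = 100_000):
    """Offer = noisy estimate minus spread; seller accepts if it beats their walk-away price."""
    accepted, total_margin = 0, 0.0
    for _ in range(n):
        true_value = 300_000
        estimate = true_value + random.gauss(0, 15_000)    # unbiased but noisy model
        offer = estimate - spread
        hassle_discount = random.uniform(0, 20_000)         # what this seller gives up for convenience
        if offer >= true_value - hassle_discount:            # sellers accept the offers that favor them
            accepted += 1
            total_margin += true_value - offer                # our margin if we resell at true value
    avg_margin = total_margin / accepted if accepted else 0.0
    return accepted / n, avg_margin

for spread in (0, 10_000, 25_000):
    conversion, margin = simulate(spread)
    print(f"spread ${spread:>6,}: conversion {conversion:.1%}, avg margin ${margin:,.0f}")
```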

Allen: On the fraud piece, because you mentioned one of your metrics, obviously if you’re running the business well, is whether you’re rejecting transactions that really should go through. It’s a really bad customer experience, it’s bad for the vendors. I’m a little bit curious though, how tractable is that? Because the number one thing that you’ll see, for anyone who’s worked with any system that tried to reject either fraud or spam or anything, is that the people who are doing the fraud are really good at pretending they’re a legitimate person. They’ll be like, “No, you froze my account and I was doing nothing out of the ordinary and the algorithm’s being mean to me and my wife died,” and whatever the story is. And then they post it. Sometimes you see a post on Hacker News or something where there’s some rant about how unfair Google or somewhere was, and probably three quarters of the time, it really is an unfair thing that’s been flagged. But then sometimes, people dig in and they’re like, “Wait a minute, you said this, and your story doesn’t add up.” And they’re like, “Well, I had this SEO scheme.” And then it’s like, “Hmm, your story’s getting fishy.”

Ian: No, it’s hard. You don’t have perfect data, so it’s hard to do perfect reject inference. So you have to have a really good operational team. You have to make sure they’re well-trained and they can actually suss out what’s going on. I’ve actually shadowed a ton of operators, both at Square and at Opendoor, and it’s always a trip, especially with the fraudsters. Sometimes we made a mistake, right? Okay, we quickly unfroze the account and resolved the issue. But a lot of times, there’s something going on and you dig in and you’re entering this crazy web of imagination and machinations that these fraudsters are coming up with on the spot, and it’s like, “Whoa, this person’s clearly lying.”

Allen: And they’re professional liars.

Ian: Yeah, it’s bizarre the stories that they cook up, but you got to have a well-trained team that can help you suss it out. Again, it’s a system, right?

Allen: Yeah.

Ian: And then that team gives you that data. In the same way with Opendoor, there’s no replacement for walking the houses yourself. God knows how many houses I’ve walked in the United States, just walking with our home estimators and hearing from them what things they’re thinking about. Because with a lot of the predictive side of machine learning, what you’re trying to do at the end of the day, and this is at least the classical take, all this LLM stuff I’m no expert in and it seems to do magical things, but in classical ML, a lot of times what we’re trying to do is distill human intuition into features, and you put these features into a model and you want to see which features stand the test of generalization, meaning stand the test of your loss function and can sustain themselves in the model and be predictive. And so that could mean, how do you distill curb appeal, the vibes of a house, from photos? How do you do that? Yes, there’s computer vision stuff that you can apply, but a lot of times, to your point, it’s the feeling. So how do you maybe use crowdsourcing and other methods to at least get some structured data out of very high dimensional data like a bunch of pixels?
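
One way to picture “does the intuition survive generalization” is the sketch below: a synthetic housing dataset, a crowd-sourced curb-appeal score as the candidate feature, and held-out error measured with and without it. The data, the feature names, and the use of a plain linear model are all assumptions for the example, not Opendoor’s pipeline; the idea is just that a distilled-intuition feature earns its place by improving error on data the model hasn’t seen.

```python
# Toy "does this feature survive generalization?" check on synthetic data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 2_000
sqft = rng.uniform(800, 3_500, n)
beds = rng.integers(1, 6, n)
curb_appeal = rng.uniform(1, 5, n)        # imagine: averaged crowd ratings of listing photos
price = 150 * sqft + 12_000 * beds + 9_000 * curb_appeal + rng.normal(0, 25_000, n)

base_features = np.column_stack([sqft, beds])
with_curb_appeal = np.column_stack([sqft, beds, curb_appeal])

for name, X in [("without curb appeal", base_features), ("with curb appeal", with_curb_appeal)]:
    mae = -cross_val_score(LinearRegression(), X, price,
                           scoring="neg_mean_absolute_error", cv=5).mean()
    print(f"{name:>20}: held-out MAE ${mae:,.0f}")
# If the held-out error doesn't improve, the intuition didn't distill into a useful feature.
```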

Allen: And you could imagine that there are certain individuals who might rate the home having a high vibe score, but because they’re not the people who would buy that home, then that’s actually negatively correlated with the people who would buy that because it’s like, “Oh, this is a family home.” And so if young couples who don’t have any kids are like, “Yeah, the vibes are great on this home.” But then families are like, “Oh, everything’s white and then it’s going to get destroyed,” or whatever.

Ian: Yeah, exactly. So your features have to start to break that down a lot more. These are all fun challenges around how do you actually make ML happen? And a lot of it is having that intuition and being able to distill that into structured data that then feed the model and so on and so forth.

Allen: Yeah. That makes a lot of sense and I love that. One of the things on my list I wanted to talk to you about was internal tools, but we’ve already talked a fair amount about that, because a lot of these tools and systems are internal. People think of an internal tool sometimes as just an intranet site you go to, but the tool, really, what you’re talking about is internal systems that go from the data science, to the data that’s getting put into it, to the operators that are helping do effectively realtime QA on the system, to, implicitly, you. You didn’t say that you had set any rules around doing this, but you multiple times gave examples of going in and shadowing an operator, shadowing the people that were using the systems and putting data into the systems. I don’t know if you had any heuristics for how often you tried to do that, or if it’s just something you got to do once in a while, but that’s part of the system too.

Ian: Yeah, I think it’s really important. At Opendoor, at least in pricing for quite a while, we actually required every data scientist to start off by valuing homes manually-

Allen: Nice. I love it.

Ian: With our local team or in-house team of experts, because that’s how you develop intuition. I don’t care if you have a PhD from some fancy school, it doesn’t matter. That’s how you build that intuition. The one thing I do want to say about internal tools real quick, though, is that I’ve built a ton of internal tools in my time at the different companies, and I’ve seen tools that really succeed and tools that fail, and there are a lot of failure modes that a company can get into, and I just want to touch upon that for a quick minute.

Allen: Yeah, I’d love to, because I’ve seen it many times. At all the companies I’ve worked with and worked for and ran stuff at, internal tools can easily become an albatross where it’s like, “No, no, no, we’ll just do this thing and then everyone will use it and it’s going to solve our problem. We’re going to put a whole bunch of resources into it.” And then years later, there’s some regret.

Ian: Yeah, and it’s interesting because in some ways, internal tools should be the easiest thing to build because your customers-

Allen: Because your users are right there.

Ian: Exactly. You’re sitting side by side with your users, so why is it that so many internal tools projects fail? And it’s always really puzzling when that happens. And I’ve had great experiences, again, I’ve built great internal tools that are part of that human-in-the-loop system at Square and at Opendoor. And there are times when I’ve built systems that everyone hates, and it’s like, “What is going on?” I remember there was a time at Opendoor when we built an Atlanta team that built tools, because we had a pretty big operational team in Atlanta, so we thought it’d be a good idea to have engineering and product design sit side by side with our operators.

Allen: I love it, in theory.

Ian: In theory, and it is great. Those folks have really strong empathy for the use case. Except I remember one week, an engineer there almost quit, was super frustrated. We hopped on a call and I was like, “Hey, what’s going on?” And I decided to fly to Atlanta, this is pre-Covid, and look at what was going on on the ground. Well, it turns out the engineer was working off a spec from the PM that was different from the EM’s, that was different from the operator’s, that was different from the process that central operations was prescribing.

Allen: Oh, no.

Ian: Normally in an engineering company, you have this matrix or this tension between product and design, or product and engineering. But in internal tools, a lot of times, you have a triple matrix, because you have product, engineering, and often the operator or the business unit that you’re working with. So you have ops. And in some cases, those ops are further matrixed, like you’ve got central ops and regional ops and whatnot. And so I find that when internal tools succeed, they almost always succeed in small teams. And yes, there’s this greenfield effect where you can feel very productive, but I think more of the reason why internal tools succeed when the teams are small is because it’s easier to gain alignment on the process.

Allen: Yes, you can get buy-in from a small team, get them all in a room: is this the thing that we need? Rather than, people talk about change management and all the resistance to change as your organization grows. You have hundreds of people, and then even just uncovering, “Oh, there’s actually already yearly OKRs for this management team to do X, and if you remove X from the process, then that undermines all the work that they’ve been doing,” or whatever. There can be a hundred things that are just little weevils in the system when you’re trying to bring about change.

Ian: Yeah. So I think the issue is actually that the process itself is not legible and aligned on. I actually think that’s a root cause of a lot of issues in internal tools, that sometimes even the stakeholders don’t know what the process should be. I’ve worked with enough heads of operational unit A or B to know that actually a lot of times, these folks want to bring in entire systems like a CRM or an external tool. My tagline’s always build proprietary, buy commodity. I’m all for buying commodity stuff. I don’t want to spend any of my precious engineering cycles on anything that I can just buy off the shelf. But I realize actually a lot of times, that ends up being more work down the line, because these folks might want to buy a piece of software not because of something that software does, but because of the process that the software prescribes.

Allen: Yes, but they may or may not have buy-in on that process across the org.

Ian: Exactly. So again, it comes back to the processes being too tacit and not being explicit enough. And I think that sometimes can be a deal breaker for internal tools.

Allen: That’s a really great observation, and I think I’ve seen really successful internal tool projects across the large orgs often have a component where they first figure out, okay, what is the process now? Then they figure out where do we want the process to be? Then they figure out, what changes need to happen to the manual process first before we bring in the internal tool? And then they build the tool that will… And sometimes they learn things from that. If they’re doing it well, they almost always learn something where it’s like, “Oh, okay, we’re going to always do X before Y and then let’s have them change the manual process first,” which seems like unnecessary, extra work because it’s like, “Well, but this process is all going away. Why would you change the manual process?” It’s because A, we’ll get buy-in, and B, we might find out, if you do X before Y, it actually is impossible because of this reason we didn’t realize.

Ian: Yeah. And that’s where the engineering mindset’s super helpful. And I think engineers sometimes work too far downstream versus owning the process.

Allen: I love that. Being mindful of time, one last thing I wanted to touch on is this leadership and people piece, which we talked a little bit about, and how to make sure people are aligned and things like that. I know that a CTO role can vary dramatically between orgs, but almost always, there’s this strategic component of the big picture stuff and board meetings or whatever, this is the big thing we’re trying to do. But then however many layers down, there’s this need for actual tactical making-the-things-happen, which depending on the org and depending on the person, sometimes that can be the challenge. Okay, we have this great strategic vision. How do we bring it about in the real world rapidly enough that the strategy doesn’t change before we bring this thing in? So what are some of the things that you’ve learned about the balance of the high level strategic thing versus going down to the bottom layer into the details and doing that kind of work?

Ian: That area has some of the toughest moments for me growing as an engineering leader over time, especially because when I started Opendoor, I was coding a lot. I was still very much an IC, and then towards the end, I didn’t have the capacity or time to code, and I was still reviewing some projects on an architectural basis, but definitely, unfortunately, not at the code level. And I actually think this is where I wish I had followed a different set of advice, because a lot of advice, at least in the Valley, the common wisdom, is delegate as much as you can.

Allen: Yeah, especially for rapidly growing companies, you hear people just say, delegate as much as you can and then delegate even more, and they’ll push almost a you-can-never-delegate-too-much kind of idea.

Ian: There’s delegate a lot, and if you find yourself overworking, that means you haven’t found the right people and you need to delegate, which means you haven’t delegated enough. But I find that advice can quickly turn into abdication, meaning you’re actually not doing anything other than people management. And the issue with that is you can lose sight of the strategy, because the strategy needs to connect with what’s on the ground. An important part of the strategy as a CTO has to be developer productivity. But how do you know if your developers are not productive? You can ask them and stuff like that, but you have to get a really intuitive feel for what’s happening on the ground. And this is where I wish I had remained much more hands-on over time and created a culture where it’s acceptable for your managers to be in the weeds, and your managers’ managers, your skip levels, to be in the weeds.

Allen: Which is not the case. A lot of orgs will explicitly say, after this level, if you’re coding, you’re just holding your team back and things like that. You hear that.

Ian: And not necessarily coding, because I think that would be holding the team back, but at least being quite involved on a technical strategy basis. Because when you think of strategy, there are also different parts of it. There’s the organizational strategy, but there’s the technical strategy and there’s the business strategy. So everything comes from the business strategy. What does the business need, not just today, but tomorrow and the month after? And then as an engineering leader, you’re trying to map the current system, both the actual software system but also the people system, to meet those goals. So I think it’s really important to actually stay super hands-on, both on the people side and the software side and the architectural side, and bust some of what I think of as bad common wisdom around, “Hey, you’re micromanaging.” No, what I want to do is actually create alignment up and down the decision stack. What I mean by that is I need to be aligned with my team on, here’s where the company’s going, here’s the KPI for a team, here are the big projects and here’s how we’re working on the product, how we’re actually going to make that happen. There are different layers of the strategy stack, and you can’t just operate at a 10,000 foot level.

Allen: You get these architecture astronauts that they joke about.

Ian: Yeah, because again, you just become this paper pusher, and in fact, you’re actually destroying value in the company. The value comes from actually knowing what to build and creating. Not just seeing ahead, but being able to go up and down the stack to move the company forward in that direction. And it sets expectations both ways, versus the manager just delegating, firing and forgetting, or just trusting people.

Allen: Trust but verify.

Ian: But not verifying.

Allen: Right.

Ian: Yeah. So I think that’s really important. So how do you actually create that expectation, both up and down across all levels, to talk about not just the business strategy or the product strategy, but also the technical strategy, and get in the weeds and create space and acceptance at all levels that, “Hey, this is good.” You want your leaders to stay very technical. So I think that’s something that I’ve learned, is that alignment isn’t just at the top. Yes, as a CTO, at every single engineering all-hands, I want to make sure I repeat some of the really high level objectives for the organization. You’ve got to talk about it 10, 20, 30 times for it to sink in. But also, what are the processes at the lower level altitudes where you’re reviewing not just the outputs, but also the inputs? Not just, “Hey, these are the objectives that we accomplished,” but also, how did we accomplish those objectives?

Allen: I think that’s something that’s appealing to a lot of people who are moving up in their career, maybe they go from IC to manager. Certainly once they get into director levels or whatever, a lot of folks instinctually wish they were more involved, or stayed involved in those decisions, but then are pushed by, like you say, this common wisdom of delegate, delegate, delegate, you can’t be micromanaging. Which is true, I think there’s general agreement. Everyone defines micromanagement differently, but everyone agrees that if you’re calling it micromanaging, it’s bad, so it’s bad by definition. But maybe as a last question, what do you see as the distinction between micromanaging, which by its name means you’re over-constraining and burdening people who maybe understand the details better than you do with your nitpicking, versus being a good support for people at the lower levels, where you’re helping make sure that they’re all pulling in the same direction and getting the benefits of that strategy alignment from the top all the way down? Without doing the micromanaging, like, “Hey, this semicolon and this thing isn’t the way I want it to be.”

Ian: Yeah. So you go on a PR and you nit all the formatting issues.

Allen: Yeah, exactly. A very junior engineer, and you’re sending them 10 comments per commit.

Ian: So that’s clearly nitpicking, but it’s hard because I think it’s relative to what the company needs and relative to expectations. And in some areas that are super strategic to the company, unfortunately, you have to dig in maybe 2, 3, 4 levels deep, whereas there are times when things are not as much on fire and you can have it more dialed back. It’s fine to actually be more hands-off there. So there is a lot of “it depends” in this, but I’d say that one key thing I’ll come back to is, again, setting the right expectations, because people feel micromanaged when the expectations are set such that you’re not intervening at all or you’re not inspecting at all. Can you imagine, as a CTO, if I didn’t get inspected by the CEO?

Allen: That’d probably be bad.

Ian: Right? That’d be corporate malpractice. Every business unit leader gets inspected through, again, these business reviews and whatnot. And I think there needs to be a tone that’s set around, my work’s going to get inspected and it’s going to make it better.

Allen: You’re going to get useful feedback as opposed to just semicolon nitpicks.

Ian: Right, yeah. Engineering is by nature very much a democracy. There are these tones of peer review feedback, it’s very collaborative. That’s a very agile and great culture. And also, I think it’s important to recognize that we operate within an organizational structure where the managers actually also need to be involved somehow, but they’re not going to be your peers who are reviewing your PRs necessarily. And so the expectation that needs to happen is almost like a review of the system or the architecture that’s at a bit of a higher level, and that needs to be part of the organizational processes. Again, it could be a review, there could be other processes. It doesn’t have to be super formal, and every company will come up with their own set of practices. But again, I think it comes back to setting the expectation that, “Hey, I want to get my work reviewed and inspected because A, it’s part of my job, and B, I think it’s going to make the output better.”

Allen: Yeah, I like that. And I think that aligns well with some of the things you’ve been hearing and reading about in the last few years, about how the latest generation of folks that are younger than us disproportionately want feedback, and maybe come into organizations where everything is delegated and their response is like, “Yeah, I’m doing this work. I think that I’m doing the right thing, I think, but no one’s giving me any feedback and I feel unmoored.” And so it seems like maybe there’s some alignment there between helping develop people into stronger and stronger coworkers and giving them that feedback, and then also maybe being a little more connected from the top to the bottom, rather than just architecture astronauts at the top.

Ian: Yeah, exactly. That’s definitely an anti-pattern, and unfortunately, too common.

Allen: Cool. Well, thanks Ian. I really appreciate the time. Where can people go to learn more about you and your work?

Ian: I’m just ihat, I-H-A-T, on Twitter. It’s a nod to physics and EE, and that’s where people can find me.

Allen: Excellent. Well, we’ll link that up in the show notes. It Shipped That Way is brought to you by Steamclock software. If you’re a growing business and your customers need a really nice mobile app, get in touch with Steamclock. That’s it for today. You can give us feedback or rate the show by going to itshipped.fm or @shippedfm on Twitter. Until next time, keep shipping.
