Making a Better World: One P-value at a Time
Wayne Lee discusses his experience as a Taiwanese-American and explains why he left Silicon Valley to teach statistics.
Statistics may seem pretty cut and dried, but Wayne T. Lee is on a mission to prove otherwise.
“I know it sounds crazy,” he said. “But statistics can make the world a better place.”
Like historians and journalists, statisticians dig up data, interpret them, and try to weave them into a convincing story. Data aren’t absolute, but rather open to multiple interpretations, Lee said, and that’s why an education in statistical thinking can help the world find common ground.
“Statistics gives you a framework to think about uncertainty and how it affects our daily lives,” he said. “I might decide to run down the stairs of the 116th Street subway station after watching someone else do that. We’re constantly making up stories based on limited and imperfect data.”
Lee worked in the corporate world before coming to Columbia, including at the Climate Corp., a company founded by two ex-Googlers to help farmers increase yields with the help of big data. His role there focused on data quality, or ensuring that the information used by farmers, scientists, and analysts could be trusted. “I had to know that the AI people want image pixels,” he said. “Growers want field-level summaries, and researchers only want to know whether their chosen hybrid is winning the race.”
“It sounds like you’re the garbage man cleaning up people’s mess,” he added. “No one wants to do it. But it’s so fun!”
In a recent interview with Columbia News, Lee made a case for how statistics can be fun and why everyone should strive for statistical literacy.
What’s the biggest misconception about your field?
That it’s all bad hypothesis testing—running a routine calculation to get a p-value to know if a result is publishable. Something a calculator does. Soul sucking, zero creativity. Statistics is much bigger! It’s the scientific method with data. You observe the world, define the data, test it. It sounds boring, but you have to decide what’s an observation and how do you record it? How do you take an abstract idea and come up with a way to measure it? And does that metric capture what you’re trying to show?
That sounds like data science! How is statistics different?
Data scientists should be intensely focused on business outcomes. Their mission is to discover new opportunities for the data, to generate value from it. More like a product manager with a quantitative mindset. You need to think like an MBA. ‘What’s the market size? How can we 10x the solution?’ A statistician is closer to a researcher.
What’s one statistical concept everyone should know?
I’m going to cheat and give you three. One, not all problems can be answered with data. Humility is necessary. Two, be creative in imagining how the same set of data can have different explanations. There might be several reasons to support the statement, ‘the government is terrible.’ But there may be other explanations, too, including a distrust of government. Rom-coms do this all the time. Someone stumbles, and the other character reaches for the common explanation. But the viewer knows there are different reasons for this one outcome.
Three, know yourself well enough to know what data can change your mind. I’ve come to realize that people don’t know what they need to hear to change their opinion. Instead, most people opt for the ‘I'll know it when I see it’ strategy, which leads to a lot of second guessing and disappointment.
3 statistical concepts to save the world:
- not all problems are data problems
- imagining different ways the same data can be generated
- articulating the data required to convince you in any debate
https://t.co/KYFR8BqBSn
Wayne Tai Lee (@wtailee0), November 23, 2019
What made you leave industry for Columbia?
I asked myself where my job was leading. I realized that if everyone learned statistics, the world would be a better place. Statistics is unique in its rigorous explanation of how similar things can be quite different and its tolerance for uncertainty.
Over time, I've found my statistical training has helped me articulate the different needs of farmers, people working on machine learning models, and those making business decisions. In a world of specialists, statisticians are able to easily connect the dots across domains of knowledge.
I also believe statistics can learn from data science. We can be more constructive, outcome focused, and collaborative. Statisticians often complain that they’re only invited in at the end of a project. But that’s because statistics has a PR problem. I'm excited to try and change the culture.
Culture? What do you mean?
Statistics has a culture where you win points for tearing down someone else's argument, and the focus is on problems with clean data. Businesses need analytics that appear trivial, but they’re not once you consider the interactions between them. People need statisticians to help think through tough, messy problems with crappy data. We should try our best rather than constantly telling people they need better data.
You’ve said you want to create a scalable education for statistics. Why?
My courses benefit Columbia students, whereas online educational courses attract mostly motivated self-learners. How can we push statistical literacy further? How can we do for statistical literacy what Microsoft Office did for computer literacy? We need to have a good, fairly priced product with distribution and growth factored in at the start.
In Dance Lessons for Data Scientists, you have this great line, “No one wants to dance with a one-trick pony.” What do you mean by that?
In NYC, I’ve gotten into ContactImprov, which involves dancing with people of different body types, ages, and so on. It forces you to listen, understand your biases, and keep the dance going, which I think carries over to working with datasets and domain experts. Tech people preach about innovation, but it’s artists who truly understand how to reinvent themselves. They’re constantly exposing themselves to new people and ideas. Statisticians and data scientists can learn from artists and have multiple ‘tricks’ up their sleeves.
You were raised in Taiwan and returned to the U.S. for high school. What was that like?
My vocabulary was poor, like other ESL students in my high school. The Americans thought I wasn't very bright and the Taiwanese thought I didn't understand Chinese. This in-between state was tough but it pushed me to find my own community.
What has work been like as a Taiwanese-American?
At LinkedIn, the best team I worked on had no native English speakers. It forced us to be concise, ask for clarification, and be patient with one another. I believe I was passed over for promotion a few times when people who shared the same title as me thought I should pay the bill on business trips, believing I was more senior. Later I learned that my experience wasn't unique. There is a belief that East Asians don't make good managers, so we often have to do the job without the title first.
Advice for students majoring in data science or statistics?
Take fewer classes and connect with more people to fully optimize your time. There’s so much to experience in college. Don’t limit yourself by overloading on courses. Classes are important but connecting with people will sharpen your sense of self. Focus on finding out what you like rather than finding one all-consuming passion. You'll realize that opportunities are everywhere.
One of your friends on LinkedIn writes, "Wayne is the exception to the rule, he's the Statistician who's the life of the party!" What's your secret?
Dancing at a party is more about how much fun you look like you're having than how good you are at dancing. Thinking otherwise is optimizing for the wrong objective. Have fun!