covid-19 modeling, Youyang Gu, machine learning, data science


“It has become clear that we are not going to reach herd immunity in 2021, at least certainly not nationwide,” he said. “And I think it’s important, especially if you’re trying to instill confidence, that we lay out sensible paths to when we can return to normal. We should not tie that to an unrealistic goal such as herd immunity. I remain cautiously optimistic that my original forecast from February, for a return to normal this summer, will hold.”

At the beginning of March, he wound the project down entirely; he felt he had made what contribution he could. “I wanted to take a step back and let the other modelers and experts do their jobs,” he says. “I don’t want to muddle the space.”

He still keeps an eye on the data, doing research and analysis on the variants, the vaccine rollout, and the fourth wave. “If I see something particularly troubling or worrying that I think people aren’t talking about, I’ll definitely post it,” he says. But for now, he is focusing on other projects, such as “YOLO Stocks,” a stock market analysis platform. His main pandemic-related work is as a member of the World Health Organization’s technical advisory group on covid-19 mortality assessment, where he shares his outsider’s expertise.

“I have certainly learned a lot this year,” says Gu. “It was very revealing.”

Lesson 1: Focus on the basics

“From a data science perspective, my models have shown the importance of simplicity, which is often underestimated,” says Gu. His death-forecasting model was simple not only in design (an SEIR component with a machine learning layer) but also in its very clean, “bottom-up” approach to input data. Bottom-up means “start with the bare minimum and add complexity as needed,” he says. “My model only uses past deaths to predict future deaths. It does not use any other real data source.”
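To make the "deaths in, deaths out" idea concrete, here is a minimal sketch of how an SEIR simulator can be calibrated on past deaths alone. It is not Gu's code: the function names, the default parameters (population, incubation and infectious periods, a 1% infection fatality rate), and the plain grid search standing in for the machine learning layer are all assumptions made for illustration.

```python
def simulate_daily_deaths(r0, days, population=1_000_000.0, seed_exposed=100.0,
                          incubation_days=5.0, infectious_days=5.0, ifr=0.01):
    """Discrete-time SEIR model (one-day steps) returning expected daily deaths.

    All default values are illustrative assumptions, not fitted estimates.
    """
    beta = r0 / infectious_days      # transmission rate per day
    sigma = 1.0 / incubation_days    # rate of moving from exposed to infectious
    gamma = 1.0 / infectious_days    # rate of leaving the infectious compartment
    s, e, i = population - seed_exposed, seed_exposed, 0.0
    deaths = []
    for _ in range(days):
        new_exposed = beta * s * i / population
        new_infectious = sigma * e
        new_removed = gamma * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        # Simplification: a fixed fraction (the IFR) of people leaving the
        # infectious compartment die that day, with no reporting lag.
        deaths.append(ifr * new_removed)
    return deaths


def fit_r0(observed_deaths, candidates):
    """Pick the candidate R0 whose simulated deaths best match past deaths."""
    def squared_error(r0):
        simulated = simulate_daily_deaths(r0, len(observed_deaths))
        return sum((sim - obs) ** 2 for sim, obs in zip(simulated, observed_deaths))
    return min(candidates, key=squared_error)


def forecast_deaths(observed_deaths, horizon_days, candidates):
    """Fit on past deaths only, then run the simulator forward."""
    r0 = fit_r0(observed_deaths, candidates)
    full = simulate_daily_deaths(r0, len(observed_deaths) + horizon_days)
    return r0, full[len(observed_deaths):]


# Usage with synthetic "past deaths" (generated here, not real data).
past = simulate_daily_deaths(2.2, 60)
grid = [k / 10 for k in range(10, 41)]  # candidate R0 values 1.0 to 4.0
fitted_r0, future = forecast_deaths(past, 14, grid)
print(fitted_r0, round(sum(future), 1))
```

The only input to the fit is the death series itself. Everything else is either a fixed constant or a parameter the search learns from that series, which is the bottom-up design the paragraph above describes.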

Gu noted that other models relied on an eclectic variety of data on cases, hospitalizations, testing, mobility, mask use, comorbidities, age distribution, demographics, pneumonia seasonality, annual pneumonia death rate, population density, air pollution, altitude, smoking data, self-reported contacts, air passenger traffic, point of care, smart thermometers, Facebook posts, Google searches, and more.

“There is a belief that if you add more data to the model or make it more sophisticated, the model will perform better,” he says. “But in real-life situations like the pandemic, where the data is so noisy, you want to keep things as simple as possible.”

“I decided early on that past deaths are the best indicator of future deaths. It’s very simple: input, output. Adding more data sources will simply make it harder to extract the signal from the noise.”

Lesson 2: Minimize assumptions

Gu considers that he had an advantage in approaching the problem with a blank slate. “My goal was to just follow the data on covid to learn more about covid,” he says. “That is one of the main advantages of an outsider’s perspective.”

But not being an epidemiologist, Gu also had to make sure he wasn’t making incorrect or inaccurate assumptions. “My role is to design the model so that it can learn the assumptions for me,” he says.

“When new data goes against our beliefs, we sometimes tend to overlook or ignore it, which can have repercussions down the road,” he notes. “I certainly found myself a victim of this, and I know many others have as well.”

“So being aware of and acknowledging the potential bias we have, and being able to adjust our assumptions, to adjust our beliefs if new data disproves them, is really important, especially in a fast-paced environment like the one we’ve seen with covid.”

Lesson 3: Test the hypothesis

“What I’ve seen over the past few months is that anyone can make claims or manipulate data to fit the narrative they want to believe,” Gu says. This highlights the importance of simply making testable assumptions.

“For me, this is the whole basis of my projections and forecasts. I have a set of assumptions, and if those assumptions are true, then this is what we predict will happen in the future,” he says. “And if the assumptions end up being wrong, then of course we have to admit that the assumptions we are making are not true and adjust accordingly. If you don’t make testable assumptions, there is no way to show whether you are really right or wrong.”

Lesson 4: Learn from mistakes

“Not all of the projections I made were correct,” Gu said. In May 2020, he predicted 180,000 deaths in the United States by August. “It’s much higher than what we’ve seen,” he recalls. His testable hypothesis turned out to be incorrect – “and it forced me to adjust my hypotheses.”

At the time, Gu used a fixed infection fatality rate of around 1% as a constant in the SEIR simulator. When, in the summer, he lowered the infection fatality rate to around 0.4% (and later revised it to around 0.7%), his projections returned to a more realistic range.
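One way to see why this constant matters: in a simulator calibrated on deaths, the infection fatality rate is the factor that converts between deaths and infections. The sketch below is only an illustration of that arithmetic. The function name and the death count of 1,000 are hypothetical, and the three rates are the approximate values mentioned above, not outputs of Gu's model.

```python
def implied_infections(deaths, ifr):
    """Number of infections implied by a death count under a fixed IFR."""
    if not 0 < ifr <= 1:
        raise ValueError("ifr must be a fraction in (0, 1]")
    return deaths / ifr


hypothetical_deaths = 1_000  # illustrative figure, not real data
for ifr in (0.01, 0.004, 0.007):
    infections = implied_infections(hypothetical_deaths, ifr)
    print(f"IFR {ifr:.1%}: about {infections:,.0f} infections implied")
```

Under these assumptions, the same death count implies 2.5 times as many infections at 0.4% as at 1%, so the choice of constant changes how much of the population the simulator believes has already been infected.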

Lesson 5: Engage the critics

“Not everyone will agree with my ideas, and I welcome that,” says Gu, who has used Twitter to post his projections and analysis. “I try to respond to people as much as possible, to defend my position and to debate with people. It forces you to think about your assumptions and why you think they are correct.”

“It comes down to confirmation bias,” he says. “If I’m not able to defend my position properly, is it really the right claim, and should I be making these claims? Engaging with other people helps me think through these issues. When other people present evidence that goes against my positions, I need to be able to recognize when I may be wrong in some of my assumptions. And that has actually helped me a lot in improving my model.”

Lesson 6: Show healthy skepticism

“I’m a lot more skeptical of science now, and that’s not a bad thing,” Gu says. “I think it’s important to always question results, but in a healthy way. It’s a fine line. Because a lot of people categorically reject science, and that isn’t the right way to go either.

“But I think it’s also important not to blindly trust science,” he continues. “Scientists are not perfect.” If something seems off, he says, ask questions and look for explanations. “It’s important to have different perspectives. If there’s anything we’ve learned over the past year, it’s that no one is 100% right all the time.

“I can’t speak for all scientists, but my job is to remove all the noise and find out the truth,” he says. “I’m not saying I’ve been perfect this past year. I was wrong several times. But I think we can all learn to approach science as a method of finding the truth, rather than the truth itself.”
