
Those computer model snow forecasts for Philly were off — by 5 to 10 feet

Believe it or not, meteorologists say the U.S. Global Forecast System Model and other forecast models are getting better.

It was quite a snowy winter around here, at least virtually. Computer models kept heaping it on us, and it kept not happening. That wasn’t the case in 2016, when this photograph was taken. Michael Bryant / Staff photographer

With the region about to reach a climate benchmark, it is safe to conclude that this was one brutal winter — at least for the extended-range snow forecasts generated by computer models.

Using the highest amount the U.S. Global Forecast System Model foresaw among its computer runs within eight days of a predicted event, the model, or GFS, heaped 129.8 inches of snow upon Philly, according to an analysis by retired National Weather Service meteorologist Tony Gigi.

Evidently, hundreds of thousands of backs in the region were spared: The actual seasonal total measured at Philadelphia International Airport was 8.1 inches.

With 99.9% of the votes counted — April 27 marking the latest date that measurable snow was ever recorded in Philly — that’s a miss of about 10 feet.

The European forecast model, or Euro, which meteorologists generally have considered superior to the U.S. model, was better, but still weighed in at a hefty 64.8 inches, applying the same criteria.
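For readers who want to check the arithmetic, here is a minimal sketch that converts the two models' seasonal overshoots from inches to feet. The totals come from Gigi's analysis as quoted above; the variable and function names are illustrative only and are not part of his analysis.

```python
INCHES_PER_FOOT = 12

# Observed seasonal total at Philadelphia International Airport (inches).
actual_in = 8.1

# Highest-run seasonal totals attributed to each model in the analysis (inches).
model_totals_in = {"GFS": 129.8, "Euro": 64.8}


def miss_feet(forecast_in, observed_in):
    """Return how far the forecast overshot the observation, in feet."""
    return (forecast_in - observed_in) / INCHES_PER_FOOT


misses = {name: miss_feet(total, actual_in) for name, total in model_totals_in.items()}

for name, feet in misses.items():
    print(f"{name}: over by {feet:.1f} feet")
```

By this tally the GFS overshot by a little more than 10 feet and the Euro by a little under 5, which matches the headline's range.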

Both models fumbled on the Super Bowl snow forecast for Feb. 9, with the GFS at one point weighing in at 6.7 inches and the Euro at 2.9. Actual amount: “trace.”

Both were in la-la land for a Feb. 19-20 fantasy event, with the Euro projecting 17.3 inches for Philly and the GFS 13.3. Actual amount: 0.1 inches.

The GFS is scheduled for a “major upgrade” next year, said Alicia Bentley, a scientist at the National Oceanic and Atmospheric Administration’s Environmental Modeling Center.

And meteorologists say the models overall are improving, even if they lag well behind the public’s expectations. Just how much cuts by President Donald Trump’s administration will affect potential upgrades of the GFS is about as uncertain as the weather on day 10 of a 10-day outlook.

The errors in the extended ranges may explain all those click-luring social media snow maps that far outdid the snow.

So what explains the virtual hallucinations?

How computer models work

Using observations across the planet to establish the current state of the atmosphere at every level, supercomputers then attempt to solve physics equations to forecast how the weather will change over time.

This all began 75 years ago, with the first numerical predictions generated by the ENIAC computer, developed at the Moore School of Electrical Engineering, now part of the University of Pennsylvania.

A renegade snowstorm in the first week of November 1953, which took an erratic path similar to that of Hurricane Sandy almost 60 years later and resulted in Philly’s heftiest early-season snow on record, spurred a major U.S. computer-forecasting initiative.

“Numerical weather prediction is one of the most complex and challenging technologies developed by our species,” University of Washington scientist Clifford Mass, a frequent critic of the U.S. model’s performance, has said.

Long-range snow forecasts are especially problematic, Bentley said, because the models have to be right about the circulation and whether most of the atmosphere would be below freezing.

That temperature profile plays havoc with forecasting precipitation types, said Zack Taylor, branch chief of NOAA’s Weather Prediction Center.

Additional wrinkles include short-term features such as “banding,” which can wring out heavy snows in narrow corridors, Taylor said.

The forecast supercomputers are immensely expensive to operate, out of the reach of private enterprise.

The GFS and the Euro are the world’s most commonly used models.

The differences

The models use a “similar number” of observations, said the modeling center’s Daryl Kleist. They have multiple differences in how they process what they ingest, but one of the most important ones is the spatial resolution of their “grids.”

The U.S. model breaks down the forecast territory into a grid of points 18 miles apart. The Euro's grid spacing is one-third of that, providing better resolution. Think of pixels in an image on a screen.
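The pixel comparison can be made concrete with a rough sketch. It assumes "three times smaller" means one-third the spacing (6 miles against 18), and it uses a hypothetical square region 180 miles on a side, chosen only because it divides evenly. Neither the region nor the function name comes from the modeling centers.

```python
def cells_in_square(side_miles, spacing_miles):
    """Count grid cells covering a square region at a given grid spacing."""
    per_side = round(side_miles / spacing_miles)
    return per_side * per_side


REGION_SIDE_MILES = 180      # hypothetical region, for illustration only
GFS_SPACING_MILES = 18       # spacing cited in the article
EURO_SPACING_MILES = 18 / 3  # assumption: "three times smaller" = one-third

gfs_cells = cells_in_square(REGION_SIDE_MILES, GFS_SPACING_MILES)
euro_cells = cells_in_square(REGION_SIDE_MILES, EURO_SPACING_MILES)

print(f"GFS cells:  {gfs_cells}")
print(f"Euro cells: {euro_cells}")
print(f"Ratio:      {euro_cells / gfs_cells:.0f}x")
```

Under those assumptions, cutting the spacing to one-third puts nine times as many cells over the same area, which is part of why finer grids cost more to run.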

The GFS is run by the U.S. government and is free to the world. Every six hours it forecasts out to 10 days for several variables — including temperature, air pressure, and precipitation. (Gigi’s analysis was limited to the two most reliable daily runs.)

The European is produced by a consortium of 35 nations, with its headquarters in Reading, England. It is costlier to operate, and users have to pay fees. It updates its extended outlooks every 12 hours.

Are they really getting better?

Yes, say the meteorologists who use them.

The GFS has had its moments.

It outdid the Euro in forecasting devastating Hurricane Dorian in 2019, foreseeing its formation five days before it happened.

But the Euro far outperformed the GFS in predicting the path of Sandy, and an analysis by meteorological researcher and former NOAA official Ryan Maue showed that since 2008 the Euro’s five-day outlooks have consistently been better than the U.S. model’s.

The GFS “shows slight improvement in the last few years,” said Greg Diamond, a meteorologist with Fox Weather, “but it hasn’t been a substantial improvement.”

Diamond said that the Euro “does a better job” of capturing the current state of the atmosphere, but that the models likely never will be perfect because the initial conditions are so elusive.

The world has any number of observation holes, especially over the oceans that cover 70% of the planet.

Satellites are useful but not in a league with ground-truth observation — which is why there was such a hubbub over the possible discontinuation of weather-balloon launchings in the Northeast as part of the Trump cuts.

One method of accounting for observational flaws is “ensemble forecasting,” in which each model solution is tweaked multiple times with subtle variations of the initial conditions and other factors.

If a nor’easter is heading toward the Philly region, and the projected paths look similar after all that tweaking, meteorologists can feel more confident in the forecast.
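The ensemble idea can be illustrated with a toy sketch. This is not how the GFS or the Euro works: it swaps in the three-variable Lorenz system, a classic textbook stand-in for a chaotic atmosphere, and every name and setting below (member count, noise size, step count) is an assumption picked for illustration. Each "member" starts from a slightly different initial condition, and the spread among members is tracked over time.

```python
import random


def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz system one time step with fourth-order Runge-Kutta."""
    def deriv(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def run_ensemble(n_members=20, n_steps=2500, noise=1e-3, seed=0):
    """Run perturbed members and return the spread in x after every step."""
    rng = random.Random(seed)
    # Each member gets a tiny random nudge to its starting x value.
    members = [(1.0 + rng.gauss(0, noise), 1.0, 1.0) for _ in range(n_members)]
    spread = []
    for _ in range(n_steps):
        members = [lorenz_step(m) for m in members]
        xs = [m[0] for m in members]
        spread.append(max(xs) - min(xs))
    return spread


spread = run_ensemble()
print(f"Spread early (step 100):  {spread[99]:.4f}")
print(f"Spread late  (step 2500): {spread[-1]:.4f}")
```

While the members stay bunched together, a forecaster has reason for confidence. Once they fan out, as they do late in this toy run, the forecast says little more than that the outcome is uncertain.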

The outlook for the GFS

To get better, among other things the GFS would need better observations and tighter resolution, Mass has said, along with “statistical techniques like machine learning to fill in gaps in physical knowledge.”

“A key problem is that too many U.S. groups are working on their own modeling systems,” he said, citing NASA, the Navy, and the Air Force. He said such “redundant work” divides limited resources, but the GFS’s problems are fixable.

“The United States was once the leader in numerical weather prediction and could be again,” he said.

But don’t expect the models to outperform expectations that have been raised by the likes of apps that offer computer-generated hourly forecasts five and six days in advance, said Fox Weather’s Diamond.

“The science is not there to make an extremely accurate forecast to that level, that far away,” he said. You might trust them two or three days out if a snowstorm or major rain event is in play, he said.

The atmosphere is a “chaotic system,” said the modeling center’s acting director, Rich Bandy, and sooner or later, “you start running into fundamental predictability limits.”