Electricity Consumption vs. Weather
Skill Set Development - Data science, machine learning techniques, R programming, Tableau, Microsoft Excel, energy market expertise
Project Role - Associate Research Supervisor
The Big Idea
What if you could accurately predict real-time electricity consumption from weather forecast data?
The Facts
HVAC systems can account for over half of all energy consumption in both the residential and commercial sectors. Put simply, these systems control the indoor environment. “Heating Days” and “Cooling Days”, as the industry refers to them, are periods during which electrical energy is used to hold an indoor space at a given temperature, counteracting the outdoor conditions.
Weather’s (non-granular) cyclical nature, the abundance of historical local/regional data, and the state of today’s forecasting technology allow accurate, easily accessible relationships to be built between environmental conditions and HVAC systems’ power demand.
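The heating and cooling periods described above are commonly quantified as heating and cooling degree days. The following is a minimal sketch of that calculation, not the project's own code. It assumes a 65 °F base temperature (a common US convention that this write-up does not confirm), and the function name and sample temperatures are hypothetical.

```python
# Assumption: 65 °F balance point. The project write-up does not state
# which base temperature, if any, was used.
BASE_TEMP_F = 65.0


def degree_days(daily_mean_temps_f, base=BASE_TEMP_F):
    """Return total (heating_degree_days, cooling_degree_days).

    Each day contributes the number of degrees its mean temperature
    falls below the base (heating) or rises above it (cooling).
    """
    hdd = sum(max(0.0, base - t) for t in daily_mean_temps_f)
    cdd = sum(max(0.0, t - base) for t in daily_mean_temps_f)
    return hdd, cdd


# Hypothetical daily mean temperatures for one week, in °F.
week = [40.0, 55.0, 65.0, 70.0, 85.0, 90.0, 60.0]
hdd, cdd = degree_days(week)
print(hdd, cdd)  # 40.0 50.0
```

A day at exactly the base temperature contributes to neither total, which is why degree days track HVAC load more closely than raw temperature does.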
The Development
In late October of 2015, a team of researchers, led by Dr. Andrew Lang, met with THG, an energy solutions consulting firm based in Austin, TX. THG proposed building more accurate models of the link between energy consumption and weather patterns, as the firm was using only linear regression at the time.
Compilation
A large dining/lecture hall on the campus of a local university was selected as the best candidate site for advanced analysis, as energy usage data in 15-minute increments was readily available.
Twenty-four variables related to local weather patterns were acquired from US government data portals.
Curation
The university data was augmented with day of week, day of year, and similar fields so that it would merge cleanly with the weather variables. Additionally, the 15-minute energy consumption readings were truncated to match the hourly readings in the government weather data.
The weather dataset was curated to match typically forecasted variables (e.g. “downward solar radiation” was dropped, while “wind direction” was retained). Additionally, day of week, day of year, and time of day were encoded with the same character labels used in the university data.
A variable importance analysis was run to reduce the dataset to only those variables with significant relevance to energy consumption, removing extra “noise” from the merged dataset.
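The alignment of 15-minute energy readings with hourly weather data could look roughly like the sketch below. It is an illustration only: the write-up does not say whether readings within an hour were summed, averaged, or sampled, so summing is an assumption here, and the field names (kwh, temp_f, rel_humidity) are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def to_hourly(readings):
    """Sum (timestamp, kwh) readings into one total per clock hour.

    Assumption: summing. The original project may have averaged or
    sampled the 15-minute values instead.
    """
    hourly = defaultdict(float)
    for ts, kwh in readings:
        hourly[ts.replace(minute=0, second=0, microsecond=0)] += kwh
    return dict(hourly)


def merge_with_weather(hourly_kwh, weather_by_hour):
    """Join energy and weather on the hour and add calendar fields."""
    rows = []
    for hour in sorted(hourly_kwh):
        if hour not in weather_by_hour:
            continue  # keep only hours present in both datasets
        row = {
            "hour": hour,
            "kwh": hourly_kwh[hour],
            "day_of_week": hour.weekday(),  # 0 = Monday
            "day_of_year": hour.timetuple().tm_yday,
            "hour_of_day": hour.hour,
        }
        row.update(weather_by_hour[hour])
        rows.append(row)
    return rows


# Hypothetical sample data: five 15-minute readings, one weather record.
readings = [
    (datetime(2015, 10, 26, 13, 0), 1.0),
    (datetime(2015, 10, 26, 13, 15), 2.0),
    (datetime(2015, 10, 26, 13, 30), 3.0),
    (datetime(2015, 10, 26, 13, 45), 4.0),
    (datetime(2015, 10, 26, 14, 0), 5.0),
]
weather = {datetime(2015, 10, 26, 13, 0): {"temp_f": 70.0, "rel_humidity": 50.0}}
merged = merge_with_weather(to_hourly(readings), weather)
print(merged[0]["kwh"], merged[0]["day_of_year"])  # 10.0 299
```

The 14:00 reading is dropped from the merged output because no weather record exists for that hour, which mirrors an inner join on the timestamp.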
Implementation
To validate the decision-tree-based modeling method, a smaller, older portion of the time series was used to build and train the model, which was then used to predict a later period for which actual readings already existed. The predicted time series was compared against the actual historical series. After a few iterations, the resulting coefficients of determination were high enough to validate the randomForest package. The model was then trained on the entire 5-year dataset, and the forecast horizon was extended to a full year in order to predict electrical consumption from January to December of the upcoming year.
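The validation step described above depends on splitting the data in time order, so that the model is only ever tested on readings that come after its training period. A minimal sketch of such a split follows. The function name, the 80/20 ratio, and the row layout are all assumptions for illustration; the write-up does not state the proportions the team used.

```python
def chronological_split(rows, time_key="t", train_fraction=0.8):
    """Split rows into an earlier training set and a later test set.

    Rows are sorted by time first, so no future observation can leak
    into the training data.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    ordered = sorted(rows, key=lambda r: r[time_key])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical, deliberately unordered rows: "t" is an hour index.
rows = [{"t": t, "kwh": 100.0 + t} for t in (3, 0, 7, 9, 1, 5, 2, 8, 4, 6)]
train, test = chronological_split(rows)
print(len(train), len(test))  # 8 2
```

A random shuffle would be the usual choice for non-temporal data, but here it would let the model see hours adjacent to the ones it is asked to predict and so overstate its accuracy.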
The Results
Variable importance was determined through a preliminary randomForest analysis.
The team’s machine learning model predicted energy consumption with an R^2 value of 0.94, compared with an R^2 value of 0.37 for the industry-standard linear regression analysis.
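For readers unfamiliar with the metric, the coefficient of determination compares a model's squared prediction errors with the variance of the actual values. The sketch below uses the standard definition, R^2 = 1 - SS_res / SS_tot. The write-up does not say exactly how the team computed its figures, and the sample numbers here are hypothetical, not project data.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when actual values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly consumption (kWh) and one model's predictions.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
print(round(r_squared(actual, predicted), 3))  # 0.948
```

A value of 1 means the predictions match the actual readings exactly, while a model that always predicts the mean of the actual values scores 0.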
Median daily energy consumption, the randomForest model’s output, and the linear regression’s results are compared top to bottom, respectively. The second visualization includes relative humidity as a stacked variable.
These results demonstrate how effective advanced analytics can be in the energy solutions space.
This mathematical model was applied in order to:
Generate more accurate budget forecasting
Normalize weather-based performance comparisons from year to year
Improve Measurement and Verification (M&V) savings calculations and cost avoidance
Optimize energy demand based on local electricity provider’s peak cost matrix
Generate more accurate P&L reporting