tl;dr - Should I use ComStock?

Motivation: Why do we Need a Commercial Building Stock Energy Model?

Efforts by concerned citizens to create clean energy, or decarbonization, initiatives in cities, counties, and states across the United States continue to grow. The goals set by these initiatives are often aspirational, targeting a 100% clean, renewable energy supply by a specific date for a specific geographic area. When considering the task of decarbonizing an electric grid, supply-side solar PV and wind generation are often the first technologies that come to mind. However, a grid’s demand side also offers significant decarbonization opportunities through various technology pathways. In the United States, for example, nonelectric (fossil) fuel sources used primarily for space and water heating account for 51% of on-site energy usage in commercial buildings (LLNL 2019). Even if a grid is converted to 100% “green” energy through renewables and batteries, this nonelectric share, more than half of on-site energy use, remains to be decarbonized. A major effort is required to achieve clean energy goals on the demand side, and it falls on public sector staff, the engineering and policy consulting communities, and research organizations to ensure that these goals are realistic, equitable, and achievable.

Understanding how the commercial building stock uses and saves energy is a first step towards meeting these goals. Currently available energy analysis tools, including energy audits and building energy models, focus on individual buildings and a static power grid mix. However, the power grid mix continues to change as it incorporates wind, solar PV, and batteries. Advanced building controls and demand response programs make possible grid-interactive efficient buildings (GEB) that can achieve greater savings by responding to real-time changes in the power grid mix. Time becomes a vitally important factor when considering the changing energy supply and demand environment. The time of day or night when building energy efficiency measures provide energy savings needs to be identified and then correlated with the power grid mix. Do these savings occur at night when wind is on the grid or during daytime PV production? The answers also depend on the climate zone in which a building is located.

To meet clean energy goals and support effective building stock energy integration with a changing power grid mix, a comprehensive analysis technique is required that can simultaneously analyze where, when, and how groups of buildings consume and can save energy. The ComStock™ analysis tool was developed by NREL with funding from the U.S. Department of Energy (DOE) to assist the professionals and researchers tasked with implementing these initiatives.


The commercial building sector stock model, or ComStock, is a highly granular, bottom-up model that uses multiple data sources, statistical sampling methods, and advanced building energy simulations to estimate the annual sub-hourly energy consumption of the commercial building stock across the United States.

ComStock™ asks and answers two questions: how is energy used in the U.S. building stock, and what is the impact of energy saving technologies? Specifically, ComStock™ identifies where energy is being consumed geographically, in what building types and end uses, and at what times of day. Simultaneously, it identifies the impact of efficiency measures: how much energy efficiency measures save; where, or in what use cases, measures save energy; when, or at what time of day, savings occur; and which building stock segments have the biggest savings potential.

This type of analysis can be conducted using simple and fast or complex and slow modeling methods. Each methodology has benefits and trade-offs. The National Energy Modeling System (NEMS) used by the Energy Information Administration (EIA) is an example of a simple representation and fast execution method. It models the entire U.S. energy system on the level of census regions and has a very low granularity of results for the building stock. Modeling each individual building within the building stock is an example of the complex, slow method. It offers a high granularity of results, more detail than is needed, and is highly impractical.

The ComStock methodology is positioned somewhere in between the two. It strikes the right balance by targeting just enough information to answer the two questions it poses. ComStock provides highly granular building stock data to capture the diversity within the building stock, hourly or sub-hourly level detail because time is a critical factor, modeling of controls, demand response, and measure interactions to achieve deep retrofits, and the ability to slice and dice the data to extract as many insights as possible out of the simulations.

Professionals and researchers have two pathways for using the ComStock analysis tool. They can interact with the results data set through a web-based visualization platform accessed through their local computer or, if they want to go deeper, they can interact with the raw simulation results data set, which requires a big-data skill set and a high-performance computing asset. Both options are discussed in greater detail below in the Using ComStock section.

How ComStock Works

Every aspect of a commercial building influences its energy consumption—often in complex and interrelated ways—thus necessitating robust modeling engines seeded with accurate building characteristics. ComStock’s fundamental goal is to accurately represent the U.S. building stock. It uses 350,000 modeling archetypes made up of distributions of building characteristics that collectively represent today’s building stock. Each modeling archetype represents a unique collection of buildings in a specific geographic location.

It is not practical to identify the building characteristics of each individual commercial building in the U.S. because in many cases, the relevant information is not available on a national basis for all buildings. It is practical, however, to identify the distribution of U.S. commercial building characteristics from available building characteristic data sets. For example, CBECS, the U.S. Energy Information Administration’s (EIA 2016a) Commercial Buildings Energy Consumption Survey, divides commercial building total floor space (million square feet) into ranges and lists the number of commercial buildings in each range. It identifies the percentage or distribution of buildings in each range for the ‘total floor space’ building stock characteristic.

Distribution of the U.S. commercial building stock by square foot

ComStock uses the available building characteristic data sets to identify the distribution of the building stock characteristics in each of its 350,000 modeling archetypes. In this way, ComStock models the building stock, not individual buildings. Consider the puppy pictures in Fig. 2. The puppy on the left represents the fidelity achieved in modeling an individual building. The puppy on the right represents the fidelity of modeling the building stock, where every pixel is a modeling archetype. There is enough fidelity to see that the picture represents a puppy, or just enough information to answer the two questions that ComStock poses: how is energy used in the U.S. building stock, and what is the impact of energy saving technologies? The goal in developing the set of 350,000 modeling archetypes is to be able to find a model that is similar to any commercial building that exists today in the U.S. and to know that the proportion of this type of model matches the proportion of this type of building in today’s stock.

ComStock models the building stock, not the building
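
The idea of building archetypes from distributions, rather than from individual buildings, can be illustrated with a minimal sketch. It assumes a single characteristic (total floor space) and uses made-up bin labels and shares; the names `FLOOR_AREA_SHARES` and `sample_archetypes` are hypothetical and are not part of ComStock. The sketch only shows that a large enough sample reproduces the proportions of the source distribution.

```python
import random
from collections import Counter

# Hypothetical floor-space bins (square feet) and the share of buildings in each.
# These shares are illustrative placeholders, not CBECS values.
FLOOR_AREA_SHARES = {
    "1,001-5,000": 0.50,
    "5,001-10,000": 0.25,
    "10,001-25,000": 0.15,
    "25,001+": 0.10,
}


def sample_archetypes(shares, n_samples, seed=0):
    """Draw n_samples bins with probability proportional to each bin's share."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    bins = list(shares)
    weights = [shares[b] for b in bins]
    return rng.choices(bins, weights=weights, k=n_samples)


samples = sample_archetypes(FLOOR_AREA_SHARES, n_samples=10_000)
counts = Counter(samples)

# Compare the sampled proportions with the target distribution.
for label, target in FLOOR_AREA_SHARES.items():
    sampled = counts[label] / len(samples)
    print(f"{label:>14}: target {target:.2f}, sampled {sampled:.3f}")
```

In a real stock model many characteristics are sampled together and depend on one another; this sketch ignores those dependencies.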

ComStock accomplishes its goal of accurately representing the U.S. building stock through a three-part workflow process. It creates modeling archetypes based on data in the building stock characteristics database. It translates the modeling archetypes into physics-based models so the baseline case as well as future energy efficiency measures (EEMs) can be defined. And it runs energy simulations using high performance computing (HPC) to evaluate each model. To validate accuracy, the models are calibrated, and the resulting data is made available to a wide range of stakeholders.

Building stock characteristics database

Accurate high-resolution data on the type, size, location, and characteristics of existing commercial buildings is required to populate the modeling archetypes and simulate end-use energy in U.S. commercial buildings. The building characteristics database combines several data sets, including CBECS; CoStar (2017), a commercial real estate inventory with specific building data; and other smaller data sets. Some of the data sets are proprietary, and public access to that data in the ComStock tool is restricted. The primary building characteristics used in ComStock were chosen based on previous modeling experience, available data, and engineering judgement.

Physics-based computer modeling

Each modeling archetype in the building stock is converted to an EnergyPlus model, just as each ComStock pixel in the pixelated puppy picture represents an energy model that can be run. The entire pixelated puppy picture represents the 350,000 energy models that make up ComStock’s representation of the entire U.S. commercial building stock. For purposes of the ComStock analysis, it is assumed that buildings are built to code. While there is some flexibility in assuming non-code compliance or beyond-code compliance, to date, no data set of national as-built information exists that could provide robust input data.

The number of modeling archetypes that need to be run depends on the specific use cases and/or the type of analysis. For example, calculating end-use consumption will require running far fewer model archetypes than the iterative process of evaluating potential upgrades to the building stock. To increase fidelity when conducting a city- or state-scale analysis, a higher sampling rate may be required, resulting in running more modeling archetypes.

High performance computing

Running the hundreds of thousands of simulations required to perform a national scale building stock analysis, for example, requires HPC. Simulations on this order must be run on a supercomputer, if available, or on a cloud computing service. It should be noted that running this many simulations, even if HPC is available, is a non-trivial undertaking requiring a big-data skill set. Slicing, dicing, and aggregating the data to extract meaningful insights is another simulation-intensive iterative process that requires big-data skills. However, the ComStock team has developed the Visualization Platform, a web-based data viewer that lowers the barrier to entry for users who lack the necessary big-data skill set. It is discussed in detail below.

Calibration process

The calibration process improves the accuracy of energy models. ComStock models are calibrated by comparing model outputs to real-world, ground-truth data. When discrepancies are identified, domain knowledge is used to locate new input data so all affected modeling archetypes can be refined. Discrepancies can occur on the building stock characteristics side, like the percent of primary schools that use electric heating, or on the modeling side, like plug load schedules for office buildings or performance curves for heat pumps. Updates are only made when real-world ground-truth data is available to justify making changes to the model inputs. Because publicly available data sets are lacking, ten commercial, sub-metered, end-use data sets were procured for end-use calibration. ComStock developers are constantly on the lookout for new data sets that can be used to calibrate the energy models.

Calibration approach

ComStock uses a three-step calibration process. First, model outputs are checked for discrepancies by comparing them to widely available data such as EIA’s annual U.S. utility electricity sales data and a group of aggregate customer class load profiles provided by a dozen specific utilities across the U.S. Second, the transferability of non-weather-dependent end uses is studied to understand how important it is to capture usage pattern differences between, for example, commercial plug loads in different regions of the country. Finally, an enhanced level of calibration is performed for five regions of the country where more granular data is available, including utility AMI data from customers’ smart meters and metadata about building square footage, type, and vintage.
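
The first step, checking model outputs against reference data, amounts to a discrepancy check. The following is a minimal sketch under stated assumptions: the regional totals are invented, the 5% tolerance is an arbitrary illustrative threshold, and `percent_difference` is a hypothetical helper rather than anything from the ComStock code base.

```python
def percent_difference(modeled, reference):
    """Signed difference of the modeled value relative to the reference, in percent."""
    return 100.0 * (modeled - reference) / reference


# Hypothetical annual electricity totals (TWh) by region; not real EIA or ComStock data.
reference_twh = {"Region A": 120.0, "Region B": 80.0, "Region C": 50.0}
modeled_twh = {"Region A": 124.0, "Region B": 72.0, "Region C": 50.5}

TOLERANCE_PCT = 5.0  # assumed threshold for flagging a region for review

flagged = {}
for region, reference in reference_twh.items():
    diff = percent_difference(modeled_twh[region], reference)
    if abs(diff) > TOLERANCE_PCT:
        flagged[region] = round(diff, 1)

print(flagged)  # regions whose modeled total differs from the reference by more than 5%
```

A flagged region does not by itself say which input is wrong; as described above, domain knowledge and new ground-truth data are still needed before any model input is changed.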

Using ComStock

When one considers using a modeling tool, the first thought is often, “Cool, I can load this on my computer and start running simulations, modifying measures, and generating results.” That is not the case with ComStock. Given the complexity of the software workflow and the big-data skill set/computing hardware required, the pathway for professionals and researchers to use ComStock successfully is to interact with the pre-created results rather than running ComStock.

Visualization Platform

The ComStock team has created a web-based data visualization platform composed of the raw simulation results generated from running all 350,000 model archetypes using the entire EEM collection and various OpenStudio Standards implementations. Using the visualization platform, this pre-created set of results can be sliced, diced, and reanalyzed to answer questions concerning how energy is used in the U.S. building stock and the impacts of energy saving technologies.

The visualization platform allows the results to be sliced and diced in different ways to filter down for specific results. For example: “I want to see results for a certain type of building in a specific climate zone, with a specific size and type of HVAC system.” The request can be filtered down with a graphical interface and the timeseries end use load shapes can be generated and viewed. The end use load shapes can be used as a starting point for other analyses. A key characteristic of this viewer is the ability to filter things down according to preference and then download the resulting data set. That data set can be further filtered for additional analyses and the resulting data set downloaded.
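
The same filter-then-aggregate pattern applies to any downloaded result set. The sketch below assumes a tiny in-memory table whose field names (`building_type`, `climate_zone`, `hvac`, `hour`, `end_use`, `kwh`) and values are hypothetical; the actual column names in a ComStock download may differ.

```python
from collections import defaultdict

# Hypothetical result rows; field names and values are illustrative only.
FIELDS = ("building_type", "climate_zone", "hvac", "hour", "end_use", "kwh")
ROWS = [dict(zip(FIELDS, values)) for values in [
    ("SmallOffice", "5B", "PSZ-AC", 0, "cooling", 2.0),
    ("SmallOffice", "5B", "PSZ-AC", 0, "lighting", 1.0),
    ("SmallOffice", "5B", "PSZ-AC", 0, "cooling", 3.0),
    ("SmallOffice", "2A", "PSZ-AC", 0, "cooling", 9.0),
    ("Warehouse", "5B", "Unit heater", 0, "lighting", 4.0),
]]


def filter_rows(rows, **criteria):
    """Keep only the rows whose fields match every given criterion."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]


def load_shape(rows):
    """Sum energy by (hour, end use) to form an end-use load shape."""
    totals = defaultdict(float)
    for r in rows:
        totals[(r["hour"], r["end_use"])] += r["kwh"]
    return dict(totals)


subset = filter_rows(ROWS, building_type="SmallOffice", climate_zone="5B")
shape = load_shape(subset)
print(shape)
```

The filtered subset can itself be filtered again, which mirrors the repeated filter-and-download workflow described above.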

Raw Simulation Results

For those who want to take a deeper dive, it is possible to interact with the raw simulation results data set. However, this requires the use of a high-performance computing asset and a proven big-data skill set.

The raw simulation results are a data set of 15-minute time-series data by end use (cooling, heating, lighting, etc.) for every building that is being modeled. The raw simulation results data set is about 50 terabytes in size. It is available through the Open Energy Data Initiative (OEDI); for information on how to query the results directly, please refer to this documentation. There are many ways that the raw results can be down-sampled or recombined to meet a specific use. When looking only at the buildings in a specific city or buildings of a certain type, it is possible to down-select for specific items of interest and then extract that data from the data set.
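
As one illustration of down-sampling, a 15-minute series can be reduced to hourly values by combining each group of four intervals. This sketch assumes the values are energy per interval (for example kWh), so summing is appropriate; if the values were average power, each group would be averaged instead. The function name `to_hourly` is hypothetical.

```python
def to_hourly(energy_15min):
    """Sum consecutive groups of four 15-minute energy values into hourly totals."""
    if len(energy_15min) % 4 != 0:
        raise ValueError("expected a whole number of hours (a multiple of 4 intervals)")
    return [sum(energy_15min[i:i + 4]) for i in range(0, len(energy_15min), 4)]


# Two hours of made-up 15-minute cooling energy values (kWh per interval).
cooling_15min = [0.5, 0.5, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
cooling_hourly = to_hourly(cooling_15min)
print(cooling_hourly)
```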

Mechanics – ComStock Behind the Scenes

ComStock is a series of complex software programs, or workflows, that run individually and are chained together. Each program needs to be run correctly and the series of programs must be performed in sequential order for ComStock to produce accurate results. It is not a seamless program like Microsoft Excel where the program is opened, information and equations are typed in, and results are produced. Understanding the mechanics of ComStock will help potential users determine the level of effort that is required to successfully perform a ComStock analysis.

Commercial Building Stock Definition

The Commercial Building Stock Definition program contains the distributions of building characteristics and the code for sampling those distributions. The sampling step is necessary because several of the data sets that ComStock uses are proprietary or access-controlled. For example, CoStar, a commercial real estate data company, supplies a proprietary data set that is not publicly available elsewhere. The Department of Homeland Security maintains data sets from hospitals and schools that are access-controlled to prevent revealing their locations. ComStock cannot publish the distributions of building characteristics from these restricted data sets. It can and does, however, publish a file representing the commercial building stock that describes the aggregate samples for the whole U.S. based on those restricted distributions. This can be thought of as publishing the pixelated version of the puppy picture rather than the original photo. The pre-sampled file can be used to run simulations, but the distribution of building characteristics is not modifiable.


OpenStudio Standards

The OpenStudio Standards program builds the energy models. It contains the software code needed to add all building systems for each vintage, primarily based on building energy codes and the DOE prototype buildings. It can be modified with better data through the calibration process. The OpenStudio Standards were not originally developed for ComStock and are used for many other purposes. The standards represent the collaborative work of many researchers at Lawrence Berkeley National Laboratory, Pacific Northwest National Laboratory, and Oak Ridge National Laboratory.

EEM Collection

The EEM collection is a physics-based representation of common EEMs. An important aspect of each EEM in the collection is that it contains an explicit definition of the criteria that need to be met for the EEM to be applicable to a given modeling archetype. When a modeling archetype that does not meet the EEM criteria definition is run, the program returns a note indicating that the EEM is not applicable. For example, a variable air volume rooftop unit EEM will apply to modeling archetypes containing constant air volume rooftop units but a note will be returned to indicate that the EEM doesn’t apply in modeling archetypes that have variable air volume rooftop units.
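
The applicability check can be pictured as a guard that runs before a measure modifies a model. The sketch below is a simplified illustration, not the ComStock implementation: the `Archetype` and `Measure` classes, the `run_measure` function, and the system labels "CAV RTU" and "VAV RTU" are all hypothetical stand-ins.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Archetype:
    """Hypothetical, heavily simplified stand-in for a modeling archetype."""
    building_type: str
    hvac_system: str


@dataclass(frozen=True)
class Measure:
    """An EEM with an explicit applicability criterion (hypothetical structure)."""
    name: str
    is_applicable: Callable[[Archetype], bool]
    apply: Callable[[Archetype], Archetype]


def run_measure(measure: Measure, archetype: Archetype) -> Tuple[Archetype, Optional[str]]:
    """Apply the measure if its criteria are met; otherwise return a note."""
    if not measure.is_applicable(archetype):
        note = f"{measure.name}: not applicable to {archetype.hvac_system}"
        return archetype, note
    return measure.apply(archetype), None


vav_rtu = Measure(
    name="VAV rooftop unit",
    is_applicable=lambda a: a.hvac_system == "CAV RTU",
    apply=lambda a: replace(a, hvac_system="VAV RTU"),
)

upgraded, note_a = run_measure(vav_rtu, Archetype("Retail", "CAV RTU"))
unchanged, note_b = run_measure(vav_rtu, Archetype("Retail", "VAV RTU"))
print(upgraded, note_a)
print(unchanged, note_b)
```

Keeping the criterion next to the measure means the same measure can be run against every archetype in a project, with inapplicable cases reported rather than silently modified.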

ComStock Project

The ComStock Project is not a program but a grouping that effectively defines the question being asked. It takes building stock definitions, an OpenStudio Standards implementation, and a selection of EEMs and combines them into a project that gets handed off to the BuildStockBatch program. For example, a ComStock project to analyze the building envelope savings potential for a specific state, say Colorado, would include: all buildings in Colorado, OpenStudio Standards ver 1.23, and several efficiency levels for walls, roofs, and windows. In an iterative process, this ComStock project would identify the benefits, if any, of bringing Colorado commercial building envelopes up to code and/or above code.
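
A project of this kind can be thought of as a small record that bundles the three ingredients. The sketch below is purely illustrative: the `ComStockProject` class, its field names, the measure labels, and the archetype count of 5,000 are assumptions made for the example and do not reflect the real project file format. It also assumes one baseline run plus one run per measure for each archetype.

```python
from dataclasses import dataclass, field


@dataclass
class ComStockProject:
    """Hypothetical grouping of the inputs that define one analysis question."""
    stock_filter: dict        # which part of the building stock to include
    standards_version: str    # OpenStudio Standards implementation to use
    measures: list = field(default_factory=list)  # EEMs to evaluate

    def simulation_count(self, n_archetypes: int) -> int:
        # Assumption: one baseline run plus one run per measure for each archetype.
        return n_archetypes * (1 + len(self.measures))


colorado_envelope = ComStockProject(
    stock_filter={"state": "CO"},
    standards_version="1.23",
    measures=["wall efficiency level", "roof efficiency level", "window efficiency level"],
)

# The archetype count is a made-up number used only to show the arithmetic.
print(colorado_envelope.simulation_count(n_archetypes=5_000))  # 20000
```

Even this toy arithmetic shows why adding measures or raising the sampling rate quickly multiplies the number of simulations to be run.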


BuildStockBatch

The BuildStockBatch program uses batch queueing and processing systems. This software program takes the ComStock Project groupings and sends them off to be run on the computing resource. Because of the scale and number of simulations, BuildStockBatch is run on a supercomputer or cloud computing service.

Making the Right Choice

ComStock, used either through the website or by directly querying the results, provides ways to ask and answer a huge range of questions. In either case, we ask all potential users of ComStock to keep some caveats in mind:

  1. The ComStock baseline data has been calibrated through the End Use Load Profiles project, which focused primarily on electricity consumption and the end-use level. The End Use Load Profiles for the U.S. Building Stock dataset has been calibrated and verified at a high level, and a report detailing the comparisons of this dataset to measured data will be published in December 2021. In the first version of this dataset, the natural gas consumption is significantly lower than shown by other data sources, and the team is actively working to address this. Revised datasets will be published as ComStock is improved.
  2. ComStock's underlying data sources are far less complete and reliable in rural areas and places without active real estate markets. We believe the building counts and square footage represented in these areas are substantially lower than the real values. This is an area of improvement the ComStock team is actively pursuing.

Using either the ComStock website or the database can be difficult if you are not already familiar with commercial buildings, how they use energy, and what sort of upgrades can be made to reduce or change their energy usage. Being a building scientist certainly isn't necessary, but knowing the difference between PTACs and RTUs makes using this data quite a bit simpler. If you have more of a policy or general sustainability focus, it may make sense to ask a friend or consultant to help you frame your questions and review the answers. If you don't have experience coding or working with databases and need large data dumps, it may make sense to hire a subcontractor who is used to working with cloud-hosted databases. NREL and BTO are very interested in making the website more useful to non-buildings people, however. If you have thoughts or suggestions, or would be willing to be interviewed in that process, please let us know!

Using the ComStock website requires almost no upfront investment, and results can be downloaded in CSV files for additional offline analysis or the web pages exported as reports for sharing with others. Some questions that are simple to ask and answer through the website:

  • What percent of energy in a state is spent on cooling versus interior lighting?
  • What's the technical potential of high-efficiency roofing in Warehouses in a region?
  • Does a measure save more energy during 4-6 PM than 11 AM to 1 PM?
  • What is the annual 15-minute load shape for HVAC energy use in small offices in a state?
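
For readers working with a downloaded CSV instead of the website, the time-of-day question in the list above reduces to summing savings over two windows. The sketch below assumes a single day of 15-minute savings values (96 intervals, kWh per interval) with invented numbers; `window_total` is a hypothetical helper.

```python
INTERVALS_PER_HOUR = 4  # 15-minute data


def window_total(values_15min, start_hour, end_hour):
    """Sum values from start_hour (inclusive) to end_hour (exclusive) within one day."""
    start = start_hour * INTERVALS_PER_HOUR
    end = end_hour * INTERVALS_PER_HOUR
    return sum(values_15min[start:end])


# Made-up daily savings profile: 0.25 kWh per interval, rising to 0.5 kWh
# per interval between 2 PM and 7 PM.
savings_kwh = [0.25] * 96
for i in range(14 * INTERVALS_PER_HOUR, 19 * INTERVALS_PER_HOUR):
    savings_kwh[i] = 0.5

midday = window_total(savings_kwh, 11, 13)     # 11 AM to 1 PM
late_day = window_total(savings_kwh, 16, 18)   # 4 PM to 6 PM
print(f"11 AM-1 PM: {midday} kWh, 4-6 PM: {late_day} kWh")
print("More savings in the 4-6 PM window:", late_day > midday)
```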


Lawrence Livermore National Laboratory. 2019. “Estimated U.S. Energy Consumption in 2019: 100.2 Quads.”
