NCCOR’s Youth Energy Expenditure (YEE) workgroup used three data sets—a pooled data set from five research groups, data from an extensive literature review, and data from a special supplement on youth physical activities—and an imputation process to develop the new Youth Compendium of Physical Activity. Additional information on the early development of the Youth Compendium can be found in the Background and History section of this website.

Based on preliminary analyses of the pooled data set, an *a priori* decision was made to divide the values presented in the Youth Compendium into four age groups: 6 to 9, 10 to 12, 13 to 15, and 16 to 18 years. This division was the clearest and most feasible way to account for the age dependence of basal metabolic rate (BMR), as well as other age-specific changes in MET values that are likely related to biomechanical efficiency and other factors. Values for children younger than 6 years could not be included because of the paucity of activity and metabolic data. The workgroup also decided to present the energy expenditure (EE) information using a youth MET (MET_{y}) metric, calculated from the BMR equations published by Schofield (1985). Predicted rather than measured BMR was used because measured basal EE values in the pooled data set were inconsistent and values were missing from other published studies. The workgroup recognized that METs are usually calculated from resting metabolic rate rather than BMR, but if the two are measured in a postabsorptive state, the difference is negligible (Schutz and Jequier 1997). Using BMR provides more consistency in calculating the MET value within each age group.
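As a rough illustration of the MET_{y} metric described above, the sketch below divides an activity's energy expenditure by a predicted BMR. The function name `met_y` and the example numbers are hypothetical, and the Schofield equations themselves are not reproduced here; the predicted BMR is simply passed in as an input.

```python
def met_y(activity_ee: float, predicted_bmr: float) -> float:
    """Youth MET: activity energy expenditure divided by predicted BMR.

    Both arguments must use the same units (for example kcal/min).
    predicted_bmr is assumed to come from a BMR prediction equation
    such as Schofield (1985); it is not computed here.
    """
    if predicted_bmr <= 0:
        raise ValueError("predicted_bmr must be positive")
    return activity_ee / predicted_bmr


# Hypothetical values for illustration only (not Compendium data).
example_value = met_y(activity_ee=3.0, predicted_bmr=0.75)
```

Because the denominator is a predicted rather than measured BMR, the same activity EE yields a different MET_{y} for children whose predicted BMR differs, which is the age dependence the four age groups are meant to capture.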

In total, EE values for the physical activities in the new Youth Compendium were extracted from 137 pediatric studies representing more than 37,000 observations on children and adolescents up to age 18 years, the pooled dataset, and the JPAH supplement (Butte 2017). The EE data covered 196 activities, including walking, running, and cycling at specific speeds. Activities were classified into 16 major categories, taking into consideration body position (sitting, standing, lying down), upper or lower body movement, locomotion, weight or non-weight bearing, and intensity of effort. Age group-specific mean MET_{y} values were calculated for each specific activity, and tables with all the values were then developed.

Missing MET_{y} values for specific age groups and activities were imputed. For the imputations, the mean value for each age group within each activity was treated as one observation. The first step was to fit linear and quadratic regression models to determine the form of the relationship between age group and MET_{y} for each activity that had a sufficient number of observations (PROC GLM, SAS). Adjusted R^{2} was used to compare the linear and quadratic models and determine which should be used to impute MET_{y} in age groups without measured observations. All of the specific activities were found to have a linear component and could therefore be used in the subsequent imputation modeling.
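The model comparison in this first step can be sketched in a few lines. The workgroup used PROC GLM in SAS; the pure-Python version below is only an illustration of comparing a linear and a quadratic fit by adjusted R^{2}. The helper names, the age midpoints, and the MET_{y} values are all hypothetical.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def polyfit(x, y, degree):
    """Least-squares coefficients [c0, c1, ...] from the normal equations."""
    k = degree + 1
    xtx = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum((xi ** i) * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


def adjusted_r2(x, y, degree):
    """Adjusted R^2 of a polynomial fit; needs len(y) > degree + 1."""
    coefs = polyfit(x, y, degree)
    fitted = [sum(c * xi ** p for p, c in enumerate(coefs)) for xi in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    n = len(y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - degree - 1)


def preferred_model(x, y):
    """Return the name of the model with the higher adjusted R^2."""
    scores = {"linear": adjusted_r2(x, y, 1), "quadratic": adjusted_r2(x, y, 2)}
    return max(scores, key=scores.get)


# Hypothetical age-group midpoints and mean MET_y values for one activity.
ages = [7.5, 11.0, 14.0, 17.0]
mets = [4.0, 4.6, 4.9, 5.6]
choice = preferred_model(ages, mets)
```

With only four age groups per activity, the quadratic fit leaves a single residual degree of freedom, so adjusted R^{2} penalizes it heavily unless the curvature is pronounced.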

The next step imputed missing values for age groups that had no observations for a given activity. To take advantage of similar types of movement within each of the 16 major categories, a mixed model was used to account for the clustering of the data by specific activity within each activity category. Because the model drew on data from similar activities within a major category, activities with fewer observations could also be imputed. A macro was used to perform the imputation (Mistler 2013). A separate imputation model was fitted for each activity category; each predicted MET_{y} values from a linear age term and included random intercepts for the different activities in the category. Each missing value was imputed 20 times, using the midpoint of each age group to calculate the imputed values. The imputed values were reviewed, and any values below a lower bound of 1 or above an upper bound of 3 standard deviations above the mean were replaced. Both bound adjustments took place after all imputation was performed, so the Youth Compendium has no skipped imputations. From this completed dataset, a table of observed and imputed values for 196 specific activities for youth ages 6 to 18 years was generated for the Youth Compendium, showing the average MET_{y} values of each activity for each age group.
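One way to write down the imputation model described above, for a single activity category, is the random-intercept form below. The notation is introduced here for illustration and is not taken from the source: i indexes specific activities within the category, j indexes age groups, and age_j is the age-group midpoint. The normal distributions are the conventional mixed-model assumption rather than something the text states.

```latex
\mathrm{MET}_{y,ij} = \beta_0 + \beta_1\,\mathrm{age}_{j} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

The shared slope \beta_1 is what lets activities with few observations borrow information from similar activities in the same category, while the activity-specific intercept u_i keeps each activity at its own overall level.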

Finally, smoothed model-based MET_{y} values were calculated from the regression of the observed and imputed MET_{y} values on age. The activity- and age group-specific MET_{y} values were predicted using the fixed and random coefficients for the intercept and slope at the midpoint for each of the age groups. Smoothing the estimates served to reconcile the natural variability in the observed MET_{y} values within the major activity categories and across age groups within each activity. The default look-up values are based on the smoothed estimates.
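The prediction step can be sketched as follows: for each activity, the fixed and random coefficients are summed and the resulting line is evaluated at each age-group midpoint. The midpoint values, function name, and coefficients below are assumptions for illustration; the source does not list the exact midpoints or the fitted coefficients.

```python
# Assumed midpoints for the four age groups (hypothetical; the source
# states only that the midpoint of each age group was used).
AGE_GROUP_MIDPOINTS = {"6-9": 7.5, "10-12": 11.0, "13-15": 14.0, "16-18": 17.0}


def smoothed_met_y(fixed_intercept, fixed_slope,
                   random_intercept, random_slope,
                   midpoints=AGE_GROUP_MIDPOINTS):
    """Predict smoothed MET_y per age group for one activity.

    The activity-specific line combines the category-level (fixed)
    coefficients with the activity-level (random) deviations.
    """
    intercept = fixed_intercept + random_intercept
    slope = fixed_slope + random_slope
    return {group: intercept + slope * age for group, age in midpoints.items()}


# Hypothetical coefficients for illustration only (not Compendium data).
smoothed = smoothed_met_y(fixed_intercept=3.0, fixed_slope=0.1,
                          random_intercept=0.5, random_slope=0.0)
```

Evaluating a single fitted line per activity is what makes the smoothed values change monotonically across age groups, in contrast to the observed means, which can fluctuate from one age group to the next.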

## References

Butte NF, Watson KB, Ridley K, et al. A youth compendium of physical activities: activity codes and metabolic intensities. Med Sci Sports Exerc. 2017; doi: 10.1249/MSS.0000000000001430. Epub 2017 Sep 21.

Mistler SA. A SAS macro for applying multiple imputation to multilevel data. Phoenix, AZ: Arizona State University; 2013.

Schofield WN. Predicting basal metabolic rate, new standards and review of previous work. Hum Nutr Clin Nutr. 1985;39 Suppl 1:5-41.

Schutz Y, Jequier E. Handbook of obesity. 1st ed. New York: Marcel Dekker; 1997. Resting energy expenditure, thermic effect of food, and total energy expenditure; p. 443-55.