Abstract

The relationship between the demographic characteristics of building occupants and their perception of indoor comfort is increasingly being studied. However, the added value of accounting for such characteristics when modeling and predicting occupants’ perceptions remains unclear. An incremental machine learning (ML) modeling and analysis approach is proposed to quantify the influence of four demographic factors (gender, age, nationality, and time lived in the environment) on occupants’ perceptions of their indoor environmental conditions. A three-step methodology is presented: (1) data collection through sensors and a questionnaire administered to 206 occupants of academic and office buildings in Abu Dhabi, UAE; (2) development of ML models (i.e., support vector machine, random forest, and gradient boosting) to predict occupants’ perceptions under different scenarios of demographic representation (i.e., from no representation to all demographic parameters included); and (3) analysis of the impact of including demographic parameters on the performance of the ML models in terms of predictive accuracy, F1-scores, and computing time. Results confirm that including demographic variables could increase prediction accuracy and F1-scores by approximately 19% and 56%, respectively. However, in some instances, the inclusion of these variables reduced model performance while increasing computing time by as much as 50%. A detailed discussion is presented on the comparative performance of the tested ML algorithms and on the need to strike a balance between model complexity and computational cost.
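To make the scenario design in step (2) concrete, the following minimal Python sketch enumerates feature sets ranging from no demographic representation to all four factors. It is an illustration under stated assumptions, not the study's implementation: the sensor feature names are hypothetical, the choice to enumerate every combination of the four factors is an assumption (the abstract only gives the two end points), and `relative_change` is a hypothetical helper for comparing a metric against the no-demographics baseline.

```python
from itertools import combinations

# The four demographic factors named in the abstract.
DEMOGRAPHIC_FACTORS = ["gender", "age", "nationality", "time_lived"]

# Hypothetical sensor features; the abstract does not list the measured variables.
SENSOR_FEATURES = ["air_temperature", "relative_humidity"]


def build_scenarios(base_features, demographic_factors):
    """Return one feature list per scenario, from no demographics to all of them.

    Assumption: every combination of factors is treated as a scenario.
    """
    scenarios = []
    for k in range(len(demographic_factors) + 1):
        for subset in combinations(demographic_factors, k):
            scenarios.append(list(base_features) + list(subset))
    return scenarios


def relative_change(baseline, value):
    """Percent change of a metric relative to the no-demographics baseline."""
    return 100.0 * (value - baseline) / baseline


scenarios = build_scenarios(SENSOR_FEATURES, DEMOGRAPHIC_FACTORS)
for features in scenarios:
    print(features)
```

Each feature list would then be used to train and evaluate the same set of models, so that differences in accuracy, F1-score, and computing time can be attributed to the demographic variables added in that scenario.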