Scoring Bayesian networks of mixed variables
Andrews B, Ramsey J, Cooper GF. Scoring Bayesian networks of mixed variables. International Journal of Data Science and Analytics (2018) 1-16. First Online 11 January 2018.
In this paper we outline two novel scoring methods for learning Bayesian networks in the presence of both continuous and
discrete variables, that is, mixed variables. While much work has been done in the domain of automated Bayesian network
learning, few studies have investigated this task in the presence of both continuous and discrete variables while focusing on
scalability. Our goal is to provide two novel and scalable scoring functions capable of handling mixed variables. The first
method, the Conditional Gaussian (CG) score, provides a highly efficient option. The second method, the Mixed Variable
Polynomial (MVP) score, allows for a wider range of modeled relationships, including nonlinearity, but it is slower than CG.
Both methods calculate log-likelihood and degrees-of-freedom terms, which are incorporated into a Bayesian Information
Criterion (BIC) score. Additionally, we introduce a structure prior for efficient learning of large networks and a simplification
in scoring the discrete case that performs well empirically. While the core of this work focuses on applications in the search
and score paradigm, we also show how the introduced scoring functions may be readily adapted as conditional independence
tests for constraint-based Bayesian network learning algorithms. Lastly, we describe ways to simulate networks of mixed
variable types and evaluate our proposed methods on such simulations.
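
To illustrate how a log-likelihood term and a degrees-of-freedom term can be combined into a BIC score, here is a minimal Python sketch. It assumes the common "higher is better" form, 2 ln L - k ln n; the function name, the sign convention, and the example numbers are assumptions made for illustration and are not taken from the paper. The structure prior mentioned above is not included.

```python
import math


def bic_score(log_likelihood: float, dof: int, n_samples: int) -> float:
    """Combine a log likelihood and a degrees-of-freedom count into a BIC score.

    Assumed convention (hypothetical, for illustration): higher is better,
    BIC = 2 * lnL - dof * ln(n).
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return 2.0 * log_likelihood - dof * math.log(n_samples)


# Two hypothetical candidate parent sets for the same node and data set:
# the richer model fits slightly better but pays a larger complexity penalty.
simple = bic_score(log_likelihood=-520.0, dof=3, n_samples=1000)
rich = bic_score(log_likelihood=-518.5, dof=9, n_samples=1000)
prefer_simple = simple > rich
```

Under this convention, a search procedure would keep whichever candidate has the larger score, so a small gain in likelihood does not justify many extra parameters.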