Page 11 of the results: 2413 digital items found in 0.020 seconds

## Bayesian optimization for materials design

Frazier, Peter I.; Wang, Jialei
Type: Scientific Journal Article
Search Relevance: 37.431528%
We introduce Bayesian optimization, a technique developed for optimizing time-consuming engineering simulations and for fitting machine learning models on large datasets. Bayesian optimization guides the choice of experiments during materials design and discovery to find good material designs in as few experiments as possible. We focus on the case when materials designs are parameterized by a low-dimensional vector. Bayesian optimization is built on a statistical technique called Gaussian process regression, which allows predicting the performance of a new design based on previously tested designs. After providing a detailed introduction to Gaussian process regression, we introduce two Bayesian optimization methods: expected improvement, for design problems with noise-free evaluations; and the knowledge-gradient method, which generalizes expected improvement and may be used in design problems with noisy evaluations. Both methods are derived using a value-of-information analysis, and enjoy one-step Bayes-optimality.
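
The expected-improvement criterion mentioned above has a simple closed form under a Gaussian process posterior. The sketch below is a generic illustration (not the authors' code, and the candidate values are made up): it scores candidate designs by their posterior mean `mu` and standard deviation `sigma` and picks the next design to evaluate.

```python
from math import erf, exp, pi, sqrt

def normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_pdf(z):
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def expected_improvement(mu, sigma, best):
    """EI for maximization: E[max(f - best, 0)] under N(mu, sigma^2)."""
    ei = []
    for m, s in zip(mu, sigma):
        s = max(s, 1e-12)                # guard against zero posterior variance
        z = (m - best) / s
        ei.append((m - best) * normal_cdf(z) + s * normal_pdf(z))
    return ei

# Hypothetical GP posterior over four candidate designs; the best observed
# value so far is 1.0. High uncertainty can beat a slightly higher mean.
mu = [0.2, 0.9, 1.0, 0.5]
sigma = [0.3, 0.1, 0.05, 0.8]
scores = expected_improvement(mu, sigma, best=1.0)
next_design = max(range(len(scores)), key=scores.__getitem__)   # -> candidate 3
```

Note how the last candidate wins despite its low mean: EI trades off exploitation against exploration through the posterior variance.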

## A Bayesian Network View on Acoustic Model-Based Techniques for Robust Speech Recognition

Maas, Roland; Huemmer, Christian; Sehr, Armin; Kellermann, Walter
Type: Scientific Journal Article
Language: Portuguese
Search Relevance: 37.431528%
This article provides a unifying Bayesian network view on various approaches for acoustic model adaptation, missing feature, and uncertainty decoding that are well-known in the literature of robust automatic speech recognition. The representatives of these classes can often be deduced from a Bayesian network that extends the conventional hidden Markov models used in speech recognition. These extensions, in turn, can in many cases be motivated from an underlying observation model that relates clean and distorted feature vectors. By converting the observation models into a Bayesian network representation, we formulate the corresponding compensation rules leading to a unified view on known derivations as well as to new formulations for certain approaches. The generic Bayesian perspective provided in this contribution thus highlights structural differences and similarities between the analyzed approaches.

## Being Bayesian about Network Structure

Friedman, Nir; Koller, Daphne
Type: Scientific Journal Article
Search Relevance: 37.431528%
In many domains, we are interested in analyzing the structure of the underlying distribution, e.g., whether one variable is a direct parent of another. Bayesian model selection attempts to find the MAP model and use its structure to answer these questions. However, when the amount of available data is modest, there may be many models with non-negligible posterior. Thus, we want to compute the Bayesian posterior of a feature, i.e., the total posterior probability of all models that contain it. In this paper, we propose a new approach for this task. We first show how to efficiently compute a sum over the exponential number of networks that are consistent with a fixed ordering over network variables. This allows us to compute, for a given ordering, both the marginal probability of the data and the posterior of a feature. We then use this result as the basis for an algorithm that approximates the Bayesian posterior of a feature. Our approach uses a Markov chain Monte Carlo (MCMC) method, but over orderings rather than over network structures. The space of orderings is much smaller and more regular than the space of structures, and has a smoother posterior 'landscape'. We present empirical results on synthetic and real-life datasets that compare our approach to full model averaging (when possible)...

## Accuracy of Latent-Variable Estimation in Bayesian Semi-Supervised Learning

Yamazaki, Keisuke
Type: Scientific Journal Article
Language: Portuguese
Search Relevance: 37.408918%
Hierarchical probabilistic models, such as Gaussian mixture models, are widely used for unsupervised learning tasks. These models consist of observable and latent variables, which represent the observable data and the underlying data-generation process, respectively. Unsupervised learning tasks, such as cluster analysis, are regarded as estimations of latent variables based on the observable ones. The estimation of latent variables in semi-supervised learning, where some labels are observed, will be more precise than in unsupervised learning, and one concern is to clarify the effect of the labeled data. However, there has not been sufficient theoretical analysis of the accuracy of latent-variable estimation. In a previous study, a distribution-based error function was formulated, and its asymptotic form was calculated for unsupervised learning with generative models. It has been shown that, for the estimation of latent variables, the Bayes method is more accurate than the maximum-likelihood method. The present paper reveals the asymptotic forms of the error function in Bayesian semi-supervised learning for both discriminative and generative models. The results show that the generative model, which uses all of the given data...

## Inference-less Density Estimation using Copula Bayesian Networks

Elidan, Gal
Type: Scientific Journal Article
Search Relevance: 37.408918%
We consider learning continuous probabilistic graphical models in the face of missing data. For non-Gaussian models, learning the parameters and structure of such models depends on our ability to perform efficient inference, and can be prohibitive even for relatively modest domains. Recently, we introduced the Copula Bayesian Network (CBN) density model, a flexible framework that captures complex high-dimensional dependency structures while offering direct control over the univariate marginals, leading to improved generalization. In this work we show that the CBN model also offers significant computational advantages when training data is partially observed. Concretely, we leverage the specialized form of the model to derive a computationally amenable learning objective that is a lower bound on the log-likelihood function. Importantly, our energy-like bound circumvents the need for costly inference of an auxiliary distribution, thus facilitating practical learning of high-dimensional densities. We demonstrate the effectiveness of our approach for learning the structure and parameters of a CBN model for two real-life continuous domains.; Comment: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

## Fast Learning of Relational Dependency Networks

Schulte, Oliver; Qian, Zhensong; Kirkpatrick, Arthur E.; Yin, Xiaoqian; Sun, Yan
Type: Scientific Journal Article
Language: Portuguese
Search Relevance: 37.431511%
A Relational Dependency Network (RDN) is a directed graphical model widely used for multi-relational data. These networks allow cyclic dependencies, necessary to represent relational autocorrelations. We describe an approach for learning both the RDN's structure and its parameters, given an input relational database: first learn a Bayesian network (BN), then transform the Bayesian network to an RDN. Thus fast Bayes net learning can provide fast RDN learning. The BN-to-RDN transform comprises a simple, local adjustment of the Bayes net structure and a closed-form transform of the Bayes net parameters. This method can learn an RDN for a dataset with a million tuples in minutes. We empirically compare our approach to state-of-the-art RDN learning methods that use functional gradient boosting, on five benchmark datasets. Learning RDNs via BNs scales much better to large datasets than learning RDNs with boosting, and provides competitive accuracy in predictions.; Comment: 17 pages, 2 figures, 3 tables. Accepted as a long paper by ILP 2014, September 14-16, Nancy, France. Added the Appendix: Proof of Consistency Characterization

## Local Structure Discovery in Bayesian Networks

Niinimaki, Teppo; Parviainen, Pekka
Type: Scientific Journal Article
Search Relevance: 37.408918%
Learning a Bayesian network structure from data is an NP-hard problem, and thus exact algorithms are feasible only for small data sets. Therefore, network structures for larger networks are usually learned with various heuristics. Another approach to scaling up structure learning is local learning. In local learning, the modeler has one or more target variables of special interest; they want to learn the structure near the target variables and are not interested in the rest of the variables. In this paper, we present a score-based local learning algorithm called SLL. We conjecture that our algorithm is theoretically sound in the sense that it is optimal in the limit of large sample size. Empirical results suggest that SLL is competitive when compared to the constraint-based HITON algorithm. We also study the prospects of constructing the network structure for the whole node set based on local results by presenting two algorithms and comparing them to several heuristics.; Comment: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

## Optimally-Weighted Herding is Bayesian Quadrature

Huszar, Ferenc; Duvenaud, David
Type: Scientific Journal Article
Search Relevance: 37.431528%
Herding and kernel herding are deterministic methods of choosing samples which summarise a probability distribution. A related task is choosing samples for estimating integrals using Bayesian quadrature. We show that the criterion minimised when selecting samples in kernel herding is equivalent to the posterior variance in Bayesian quadrature. We then show that sequential Bayesian quadrature can be viewed as a weighted version of kernel herding which achieves performance superior to any other weighted herding method. We demonstrate empirically a rate of convergence faster than O(1/N). Our results also imply an upper bound on the empirical error of the Bayesian quadrature estimate.; Comment: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)
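
The optimally-weighted view can be made concrete in a few lines. The sketch below is an illustration under assumed choices not taken from the paper: a squared-exponential kernel with length scale 0.5, a standard normal target measure, and a fixed grid of sample locations. It computes the Bayesian quadrature weights w = K^-1 z, where the kernel means z have a closed form for this kernel/measure pair.

```python
import numpy as np

# Estimate E_p[f(x)] for p = N(0,1) with kernel k(x,y) = exp(-(x-y)^2 / (2 l^2)).
l = 0.5
X = np.linspace(-3.0, 3.0, 25)                 # sample locations (a fixed grid)

# Gram matrix with a small jitter for numerical conditioning.
K = np.exp(-(X[:, None] - X[None, :]) ** 2 / (2 * l ** 2))
K += 1e-8 * np.eye(len(X))

# Kernel mean z_i = integral of k(X_i, y) N(y; 0, 1) dy, in closed form.
z = np.sqrt(l ** 2 / (l ** 2 + 1)) * np.exp(-X ** 2 / (2 * (l ** 2 + 1)))

# Optimal quadrature weights, in contrast to kernel herding's uniform 1/N.
w = np.linalg.solve(K, z)

estimate = float(w @ (X ** 2))   # E[x^2] under N(0,1) is exactly 1
```

Unlike uniform herding weights, these weights minimize the Bayesian quadrature posterior variance for the given sample locations.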

## Scalable Bayesian Inference via Particle Mirror Descent

Dai, Bo; He, Niao; Dai, Hanjun; Song, Le
Type: Scientific Journal Article
Search Relevance: 37.431528%
Bayesian methods are appealing in their flexibility in modeling complex data and their ability to capture uncertainty in parameters. However, when Bayes' rule does not yield a closed form, most approximate Bayesian inference algorithms lack either scalability or rigorous guarantees. To tackle this challenge, we propose a scalable yet simple algorithm, Particle Mirror Descent (PMD), to iteratively approximate the posterior density. PMD is inspired by stochastic functional mirror descent, where one descends in the density space using a small batch of data points at each iteration, and by particle filtering, where one uses samples to approximate a function. We prove a result of the first kind: after $T$ iterations, PMD provides a posterior density estimator that converges in KL-divergence to the true posterior at rate $O(1/\sqrt{T})$. We show that PMD is competitive with several scalable Bayesian algorithms in mixture models, Bayesian logistic regression, sparse Gaussian processes and latent Dirichlet allocation.; Comment: 32 pages, 26 figures

## Revisiting k-means: New Algorithms via Bayesian Nonparametrics

Kulis, Brian; Jordan, Michael I.
Type: Scientific Journal Article
Language: Portuguese
Search Relevance: 37.431528%
Bayesian models offer great flexibility for clustering applications---Bayesian nonparametrics can be used for modeling infinite mixtures, and hierarchical Bayesian models can be utilized for sharing clusters across multiple data sets. For the most part, such flexibility is lacking in classical clustering methods such as k-means. In this paper, we revisit the k-means clustering algorithm from a Bayesian nonparametric viewpoint. Inspired by the asymptotic connection between k-means and mixtures of Gaussians, we show that a Gibbs sampling algorithm for the Dirichlet process mixture approaches a hard clustering algorithm in the limit, and further that the resulting algorithm monotonically minimizes an elegant underlying k-means-like clustering objective that includes a penalty for the number of clusters. We generalize this analysis to the case of clustering multiple data sets through a similar asymptotic argument with the hierarchical Dirichlet process. We also discuss further extensions that highlight the benefits of our analysis: i) a spectral relaxation involving thresholded eigenvectors, and ii) a normalized cut graph clustering algorithm that does not fix the number of clusters in the graph.; Comment: 14 pages. Updated based on the corresponding ICML paper
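
The hard-clustering limit described above is often called DP-means. A minimal sketch follows (a generic illustration with made-up 2-D data, not the authors' code): a point spawns a new cluster whenever its squared distance to every existing centre exceeds the penalty `lam`, which replaces the fixed k of k-means.

```python
import numpy as np

def dp_means(X, lam, n_iter=10):
    """k-means-like updates that open a new cluster when no centre is within
    squared distance lam; monotonically minimizes a k-means objective plus
    lam times the number of clusters. Assumes no cluster empties (true here).
    """
    centers = [X[0].copy()]
    assign = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        for i, x in enumerate(X):
            d2 = [float(np.sum((x - c) ** 2)) for c in centers]
            j = int(np.argmin(d2))
            if d2[j] > lam:                    # too far from every centre
                centers.append(x.copy())
                j = len(centers) - 1
            assign[i] = j
        centers = [X[assign == j].mean(axis=0) for j in range(len(centers))]
    return centers, assign

# Two well-separated toy blobs; lam = 4.0 lets within-blob points join
# an existing cluster but forces the far blob to open a second one.
X = np.array([[0.0, 0.0], [0.3, 0.2], [-0.2, 0.4], [0.1, -0.3],
              [10.0, 10.0], [10.2, 9.8], [9.7, 10.1], [10.1, 10.3]])
centers, assign = dp_means(X, lam=4.0)
```

On this data the algorithm settles on exactly two clusters without k ever being specified; lam plays the role of the cluster-count penalty in the underlying objective.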

## Infinite Shift-invariant Grouped Multi-task Learning for Gaussian Processes

Wang, Yuyang; Khardon, Roni; Protopapas, Pavlos
Type: Scientific Journal Article
Language: Portuguese
Search Relevance: 37.408918%
Multi-task learning leverages shared information among data sets to improve the learning performance of individual tasks. This paper applies the framework to data where each task is a phase-shifted periodic time series. In particular, we develop a novel Bayesian nonparametric model capturing a mixture of Gaussian processes, where each task is the sum of a group-specific function and a component capturing individual variation, in addition to each task being phase shifted. We develop an efficient \textsc{em} algorithm to learn the parameters of the model. As a special case we obtain the Gaussian mixture model and \textsc{em} algorithm for phase-shifted periodic time series. Furthermore, we extend the proposed model with a Dirichlet process prior, obtaining an infinite mixture model that is capable of automatic model selection. A variational Bayesian approach is developed for inference in this model. Experiments in regression, classification and class discovery demonstrate the performance of the proposed models using both synthetic data and real-world time series data from astrophysics. Our methods are particularly useful when the time series are sparsely and non-synchronously sampled.; Comment: This is an extended version of our ECML 2010 paper entitled "Shift-invariant Grouped Multi-task Learning for Gaussian Processes"; ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part III

## Bayesian Nonparametric Hidden Semi-Markov Models

Johnson, Matthew J.; Willsky, Alan S.
Type: Scientific Journal Article
Language: Portuguese
Search Relevance: 37.431528%
There is much interest in the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) as a natural Bayesian nonparametric extension of the ubiquitous Hidden Markov Model for learning from sequential and time-series data. However, in many settings the HDP-HMM's strict Markovian constraints are undesirable, particularly if we wish to learn or encode non-geometric state durations. We can extend the HDP-HMM to capture such structure by drawing upon explicit-duration semi-Markovianity, which has been developed mainly in the parametric frequentist setting, to allow construction of highly interpretable models that admit natural prior information on state durations. In this paper we introduce the explicit-duration Hierarchical Dirichlet Process Hidden semi-Markov Model (HDP-HSMM) and develop sampling algorithms for efficient posterior inference. The methods we introduce also yield new approaches to sampling inference in the finite Bayesian HSMM. Our modular Gibbs sampling methods can be embedded in samplers for larger hierarchical Bayesian models, adding semi-Markov chain modeling as another tool in the Bayesian inference toolbox. We demonstrate the utility of the HDP-HSMM and our inference methods on both synthetic and real experiments.
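
The modeling gap targeted above is easy to see generatively. The sketch below (an illustration with made-up parameters, not the authors' inference code) samples from a two-state explicit-duration HSMM whose dwell times are Poisson-distributed, a duration shape that a plain HMM's geometric dwell times cannot express.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two alternating states; dwell times are 1 + Poisson(rate), so the
# duration distribution is peaked around its mean rather than geometric.
dur_rate = [3.0, 8.0]      # mean dwell times of roughly 4 and 9 steps
emit_mean = [0.0, 5.0]     # Gaussian emission mean per state

states, obs = [], []
s = 0
while len(obs) < 100:
    d = 1 + int(rng.poisson(dur_rate[s]))    # explicit duration, always >= 1
    states.extend([s] * d)
    obs.extend(rng.normal(emit_mean[s], 1.0, size=d).tolist())
    s = 1 - s                                # deterministic alternation
states, obs = states[:100], obs[:100]
```

An ordinary HMM would need duplicated states to even approximate these peaked dwell times; the explicit-duration construction encodes them directly.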

## A PAC-Bayesian bound for Lifelong Learning

Pentina, Anastasia; Lampert, Christoph H.
Type: Scientific Journal Article
Language: Portuguese
Search Relevance: 37.449673%
Transfer learning has received a lot of attention in the machine learning community in recent years, and several effective algorithms have been developed. However, relatively little is known about their theoretical properties, especially in the setting of lifelong learning, where the goal is to transfer information to tasks for which no data have been observed so far. In this work we study lifelong learning from a theoretical perspective. Our main result is a PAC-Bayesian generalization bound that offers a unified view of existing paradigms for transfer learning, such as the transfer of parameters or the transfer of low-dimensional representations. We also use the bound to derive two principled lifelong learning algorithms, and we show that these yield results comparable with existing methods.; Comment: to appear at ICML 2014

## Compound Poisson Processes, Latent Shrinkage Priors and Bayesian Nonconvex Penalization

Zhang, Zhihua; Li, Jin
Type: Scientific Journal Article
Language: Portuguese
Search Relevance: 37.431528%
In this paper we discuss Bayesian nonconvex penalization for sparse learning problems. We explore a nonparametric formulation for latent shrinkage parameters using subordinators, which are one-dimensional Lévy processes. We particularly study a family of continuous compound Poisson subordinators and a family of discrete compound Poisson subordinators, exemplifying four specific subordinators: Gamma, Poisson, negative binomial and squared Bessel. The Laplace exponents of the subordinators are Bernstein functions, so they can be used as sparsity-inducing nonconvex penalty functions. We exploit these subordinators in regression problems, yielding a hierarchical model with multiple regularization parameters. We devise ECME (Expectation/Conditional Maximization Either) algorithms to simultaneously estimate regression coefficients and regularization parameters. Empirical evaluation on simulated data shows that our approach is feasible and effective in high-dimensional data analysis.; Comment: Published at http://dx.doi.org/10.1214/14-BA892 in Bayesian Analysis (http://projecteuclid.org/euclid.ba) by the International Society of Bayesian Analysis (http://bayesian.org/)
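
To make the penalty construction concrete: the Laplace exponent of a Gamma subordinator is psi(u) = a * log(1 + u/b), a Bernstein function, so psi(|w|) is a concave (hence nonconvex) sparsity penalty. A minimal check follows, with illustrative values of a and b not taken from the paper.

```python
from math import log

def gamma_penalty(w, a=1.0, b=0.5):
    """psi(|w|), where psi is the Laplace exponent of a Gamma subordinator:
    psi(u) = a * log(1 + u / b), a Bernstein function."""
    return a * log(1.0 + abs(w) / b)

# Midpoint concavity on [0, inf): the penalty grows ever more slowly,
# so large coefficients are shrunk less aggressively than under l1.
u, v = 0.5, 4.0
mid = gamma_penalty((u + v) / 2.0)
chord = 0.5 * (gamma_penalty(u) + gamma_penalty(v))
```

Here `mid > chord`, witnessing concavity of the penalty on the positive half-line; an l1 penalty would give equality.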

## Constrained Bayesian Inference for Low Rank Multitask Learning

Koyejo, Oluwasanmi; Ghosh, Joydeep
Type: Scientific Journal Article
Search Relevance: 37.459934%
We present a novel approach for constrained Bayesian inference. Unlike current methods, our approach does not require convexity of the constraint set. We reduce the constrained variational inference to a parametric optimization over the feasible set of densities and propose a general recipe for such problems. We apply the proposed constrained Bayesian inference approach to multitask learning subject to rank constraints on the weight matrix. Further, constrained parameter estimation is applied to recover the sparse conditional independence structure encoded by prior precision matrices. Our approach is motivated by reverse inference for high-dimensional functional neuroimaging, a domain where the high dimensionality and small number of examples require the use of constraints to ensure meaningful and effective models. For this application, we propose a model that jointly learns a weight matrix and the prior inverse covariance structure between different tasks. We present experimental validation showing that the proposed approach outperforms strong baseline models in terms of predictive performance and structure recovery.; Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

## Phase transitions in optimal unsupervised learning

Buhot, Arnaud; Gordon, Mirta B.
Type: Scientific Journal Article
Language: Portuguese
Search Relevance: 37.459934%
We determine the optimal performance of learning the orientation of the symmetry axis of a set of P = alpha N points that are uniformly distributed in all directions but one on the N-dimensional sphere. The components along the symmetry-breaking direction, given by the unit vector B, are sampled from a mixture of two Gaussians of variable separation and width. The typical optimal performance is measured through the overlap R_opt = B.J*, where J* is the optimal guess of the symmetry-breaking direction. Within this general scenario, the learning curves R_opt(alpha) may present first-order transitions if the clusters are narrow enough. Close to these transitions, high-performance states can be obtained through the minimization of the corresponding optimal potential, although these solutions are metastable, and therefore not learnable, within the usual Bayesian scenario.; Comment: 9 pages, 8 figures, submitted to PRE. This new version of the paper contains one new section, Bayesian versus optimal solutions, where we explain in detail the results supporting our claim that Bayesian learning may not be optimal. Figure 4 of the first submission was difficult to understand; we replaced it by two new figures (Figs. 4 and 5 in this new version) containing more details

## Hamiltonian ABC

Meeds, Edward; Leenders, Robert; Welling, Max
Type: Scientific Journal Article
Search Relevance: 37.459934%
Approximate Bayesian computation (ABC) is a powerful and elegant framework for performing inference in simulation-based models. However, due to the difficulty of scaling likelihood estimates, ABC remains useful only for relatively low-dimensional problems. We introduce Hamiltonian ABC (HABC), a set of likelihood-free algorithms that apply recent advances in scaling Bayesian learning using Hamiltonian Monte Carlo (HMC) and stochastic gradients. We find that a small number of forward simulations can effectively approximate the ABC gradient, allowing Hamiltonian dynamics to efficiently traverse parameter spaces. We also describe a new, simple yet general approach for incorporating random seeds into the state of the Markov chain, further reducing the random-walk behavior of HABC. We demonstrate HABC on several typical ABC problems, and show that HABC samples comparably to regular Bayesian inference using true gradients on a high-dimensional problem from machine learning.; Comment: Submission to UAI 2015
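
For contrast with HABC, the baseline it accelerates is plain rejection ABC, which needs no gradients at all. A minimal sketch on a toy Gaussian-mean problem (all numbers made up, not from the paper):

```python
import random

random.seed(0)

def simulate(theta, n=50):
    """Toy simulator: the summary statistic is the mean of n draws
    from N(theta, 1)."""
    return sum(random.gauss(theta, 1.0) for _ in range(n)) / n

observed = 2.0            # summary statistic computed from the "real" data
eps = 0.2                 # tolerance on the summary-statistic mismatch

accepted = []
while len(accepted) < 200:
    theta = random.uniform(-5.0, 5.0)          # draw a parameter from the prior
    if abs(simulate(theta) - observed) < eps:  # keep it if the simulation matches
        accepted.append(theta)

posterior_mean = sum(accepted) / len(accepted)
```

The accepted `theta` values approximate the posterior; the cost of many full simulations per accepted sample is exactly what motivates gradient-guided schemes such as HABC.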

## Unstable Consumer Learning Models: Structural Estimation and Experimental Examination

Lovett, Mitchell James
Type: Dissertation. Format: 792675 bytes; application/pdf
Search Relevance: 37.431511%

This dissertation explores how consumers learn from repeated experiences with a product offering. It develops a new Bayesian consumer learning model, the unstable learning model. This model expands on existing models that explore learning when quality is stable, by considering when quality is changing. Further, the dissertation examines situations in which consumers may act as if quality is changing when it is stable or vice versa. This examination proceeds in two essays.

The first essay uses two experiments to examine how consumers learn when product quality is stable or changing. Collecting repeated measures of expectations alongside experiences provides the information needed for estimation to discriminate between stable and unstable learning. The key conclusions are that (1) most consumers act as if quality is unstable, even when it is stable, and (2) consumers respond to the environment they face, adjusting their learning in the correct direction. These conclusions have important implications for the formation and value of brand equity.

Based on the conclusions of this first essay, the second essay develops a choice model of consumer learning when consumers believe quality is changing, even though it is not. A Monte Carlo experiment tests the efficacy of this model versus the standard model. The key conclusion is that both models perform similarly well when the model assumptions match the way consumers actually learn...
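
The unstable-learning idea can be sketched as a scalar Kalman filter: if the consumer believes quality follows a random walk, old experiences are discounted and the learning rate never falls to zero. The parameter values below are illustrative, not the dissertation's estimated model.

```python
def update_belief(mean, var, signal, noise_var=1.0, drift_var=0.25):
    """One Bayesian update of a consumer's quality belief.

    drift_var > 0 encodes the 'unstable' model: the consumer assumes quality
    follows a random walk, so older information is gradually discounted.
    Setting drift_var = 0 recovers the standard stable-quality learner.
    """
    var = var + drift_var                      # quality may have drifted
    gain = var / (var + noise_var)             # weight placed on the new experience
    mean = mean + gain * (signal - mean)
    return mean, (1.0 - gain) * var

# Four noisy experiences of a product whose true quality is around 3.
mean, var = 0.0, 10.0                          # diffuse prior belief
for signal in [3.0, 3.2, 2.8, 3.1]:
    mean, var = update_belief(mean, var, signal)
```

With `drift_var > 0`, the posterior variance settles at a strictly positive level, so the consumer keeps reacting to new experiences; the stable learner's gain instead decays toward zero.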

## A two phase approach to Bayesian network model selection and comparison between the MDL and DGM scoring heuristics

Kane, Michael; Sahin, Ferat; Savakis, Andreas
Source: Institute of Electrical and Electronics Engineers (IEEE). Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Type: Proceedings
Language: Portuguese
Search Relevance: 37.431528%
This paper presents an efficient algorithm for learning a Bayesian belief network (BBN) structure from a database, as well as a comparison between two BBN structure fitness functions. A Bayesian belief network is a directed acyclic graph representing conditional dependencies among variables. In this paper, we propose a two-phase algorithm: the first phase uses asymptotically correct structure learning for efficient search-space exploration, while the second phase uses greedy model selection for accurate search-space exploration. The minimum description length (MDL) structure fitness function is also compared with the database-given model probability (DGM) fitness function in the second phase. The model selection algorithms are applied to the ALARM network to provide a comparison of the accuracy of the techniques.; "A two phase approach to Bayesian network model selection and comparison between the MDL and DGM scoring heuristics," IEEE International Conference on Systems, Man and Cybernetics. Institute of Electrical and Electronics Engineers. Held in Washington, D.C.: 5-8 October 2003. ©2003 Institute of Electrical and Electronics Engineers (IEEE). Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists...
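
The MDL fitness function compared here can be sketched for binary variables as a penalized log-likelihood. This is a generic illustration of an MDL score, not the paper's implementation, and the toy data are made up.

```python
from math import log

def mdl_score(data, parents):
    """MDL score for binary variables (higher is better): maximum log-likelihood
    of the data minus (log N / 2) times the number of free parameters."""
    n = len(data)
    score = 0.0
    for child, pa in parents.items():
        counts = {}
        for row in data:
            key = tuple(row[p] for p in pa)        # parent configuration
            counts.setdefault(key, {0: 0, 1: 0})
            counts[key][row[child]] += 1
        for dist in counts.values():
            total = sum(dist.values())
            score += sum(c * log(c / total) for c in dist.values() if c > 0)
        score -= (log(n) / 2) * (2 ** len(pa))     # one free parameter per config
    return score

# Y copies X exactly, so the X -> Y structure should win despite the penalty.
data = [{"X": 0, "Y": 0}] * 4 + [{"X": 1, "Y": 1}] * 4
independent = mdl_score(data, {"X": (), "Y": ()})
dependent = mdl_score(data, {"X": (), "Y": ("X",)})
```

On this toy data the X -> Y structure scores higher than the empty structure despite its extra parameter, because the dependence sharply raises the likelihood term.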
