Page 6 of the results: 2413 digital items found in 0.011 seconds

## Physarum Learner: A bio-inspired way of learning structure from data

Schön, T.; Stetter, M.; Tomé, A. M.; Puntonet, C. G.; Lang, E. W.
Type: Scientific Journal Article
Portuguese
Search relevance: 37.56512%
A novel Score-based Physarum Learner algorithm for learning Bayesian Network structure from data is introduced and shown to outperform common score-based structure learning algorithms on some benchmark data sets. The Score-based Physarum Learner first initializes a fully connected Physarum-Maze with random conductances. In each Physarum Solver iteration, the source and sink nodes are changed randomly, and the conductances are updated. Connections exceeding a predefined conductance threshold are considered as Bayesian Network edges, and the scores of the connected nodes are examined in both directions. Positive or negative feedback is given to the edge conductance based on the calculated scores. Due to randomness in selecting connections for evaluation, an ensemble of Score-based Physarum Learners is used to build the final Bayesian Network structure.
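The Physarum Solver iteration described above can be sketched roughly as follows. This is a minimal illustration that assumes a standard Physarum dynamics (dD/dt = |Q| - mu*D), unit edge lengths, and an arbitrary conductance threshold; it is not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
# Fully connected "Physarum-Maze" with random initial conductances
# (assumption: a symmetric conductance matrix with unit edge lengths).
D = rng.uniform(0.5, 1.0, size=(n, n))
D = (D + D.T) / 2
np.fill_diagonal(D, 0.0)

def physarum_step(D, source, sink, mu=1.0, dt=0.1):
    """One Physarum Solver iteration: solve for node pressures, compute edge
    flows, then reinforce conductances in proportion to |flow|. The update
    rule dD/dt = |Q| - mu*D is a common Physarum dynamics, assumed here."""
    n = D.shape[0]
    L = np.diag(D.sum(axis=1)) - D           # graph Laplacian
    b = np.zeros(n)
    b[source], b[sink] = 1.0, -1.0           # unit flow injected at the source
    keep = [i for i in range(n) if i != sink]  # ground the sink node
    p = np.zeros(n)
    p[keep] = np.linalg.solve(L[np.ix_(keep, keep)], b[keep])
    Q = D * (p[:, None] - p[None, :])        # edge flows
    return D + dt * (np.abs(Q) - mu * D)     # reinforce used edges, decay the rest

for _ in range(50):
    # Source and sink are re-drawn at random each iteration, as in the abstract.
    s, t = rng.choice(n, size=2, replace=False)
    D = physarum_step(D, s, t)

# Connections above a (hypothetical) conductance threshold become candidate
# BN edges; the abstract then scores each candidate edge in both directions.
edges = D > 0.2
```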

## Gene regulatory network reconstruction by Bayesian integration of prior knowledge and/or different experimental conditions

Type: Scientific Journal Article
Portuguese
Search relevance: 37.56512%
There have been various attempts to improve the reconstruction of gene regulatory networks from microarray data by the systematic integration of biological prior knowledge. Our approach is based on pioneering work by Imoto et al., where the prior knowledge is expressed in terms of energy functions, from which a prior distribution over network structures is obtained in the form of a Gibbs distribution. The hyperparameters of this distribution represent the weights associated with the prior knowledge relative to the data. We have derived and tested a Markov chain Monte Carlo (MCMC) scheme for sampling networks and hyperparameters simultaneously from the posterior distribution, thereby automatically learning how to trade off information from the prior knowledge and the data. We have extended this approach to a Bayesian coupling scheme for learning gene regulatory networks from a combination of related data sets, which were obtained under different experimental conditions and are therefore potentially associated with different active subpathways. The proposed coupling scheme is a compromise between (1) learning networks from the different subsets separately, whereby no information between the different experiments is shared; and (2) learning networks from a monolithic fusion of the individual data sets...
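A toy sketch of the sampling scheme: a Metropolis-Hastings chain over a binary edge vector with a Gibbs-style prior proportional to exp(-beta * E(G)), alternating structure flips with a random walk on the hyperparameter beta. The score and energy functions below are invented stand-ins, and the prior's partition function is ignored for brevity (the work described above treats it properly).

```python
import numpy as np

rng = np.random.default_rng(1)
n_edges = 6
# Toy stand-ins (assumptions): both the data score and the prior energy of a
# candidate structure are Hamming-style counts over a binary edge vector.
data_pattern  = np.array([1, 1, 0, 1, 0, 0])
prior_pattern = np.array([1, 0, 0, 1, 0, 1])

def log_score(g):   # hypothetical data log-score of structure g
    return -2.0 * np.sum(g != data_pattern)

def energy(g):      # prior energy E(G); Gibbs prior is proportional to exp(-beta*E(G))
    return float(np.sum(g != prior_pattern))

def log_post(g, beta):
    return log_score(g) - beta * energy(g)

g = rng.integers(0, 2, n_edges)
beta = 1.0
samples = []
for _ in range(2000):
    # Structure move: flip one random edge, accept by Metropolis-Hastings.
    g_new = g.copy()
    g_new[rng.integers(n_edges)] ^= 1
    if np.log(rng.uniform()) < log_post(g_new, beta) - log_post(g, beta):
        g = g_new
    # Hyperparameter move: random walk on beta with a flat positive prior,
    # omitting the partition function Z(beta) for simplicity.
    b_new = beta + rng.normal(scale=0.2)
    if b_new > 0 and np.log(rng.uniform()) < (beta - b_new) * energy(g):
        beta = b_new
    samples.append(g.copy())

edge_freq = np.mean(samples[500:], axis=0)  # posterior edge probabilities
```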

## Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

Gal, Yarin; Ghahramani, Zoubin
Type: Scientific Journal Article
Portuguese
Search relevance: 37.579844%
Deep learning tools have gained tremendous attention in applied machine learning. However, such tools for regression and classification do not capture model uncertainty. In comparison, Bayesian models offer a mathematically grounded framework to reason about model uncertainty, but usually come with a prohibitive computational cost. In this paper we develop a new theoretical framework casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes. A direct result of this theory gives us tools to model uncertainty with dropout NNs -- extracting information from existing models that has been thrown away so far. This mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy. We perform an extensive study of the properties of dropout's uncertainty. Various network architectures and non-linearities are assessed on tasks of regression and classification, using MNIST as an example. We show a considerable improvement in predictive log-likelihood and RMSE compared to existing state-of-the-art methods, and finish by using dropout's uncertainty in deep reinforcement learning.; Comment: 11 pages, 6 figures; Minor corrections in experiments section
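The core recipe, keeping dropout active at test time and averaging stochastic forward passes, can be sketched as follows. The tiny network and its fixed random weights are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
# A tiny two-layer network with fixed random weights (assumed for illustration).
W1 = rng.normal(size=(1, 32)); b1 = np.zeros(32)
W2 = rng.normal(size=(32, 1)); b2 = np.zeros(1)

def forward(x, p_drop=0.5):
    """One stochastic forward pass with dropout kept ON at test time."""
    h = np.maximum(0.0, x @ W1 + b1)            # ReLU hidden layer
    mask = rng.uniform(size=h.shape) > p_drop   # Bernoulli dropout mask
    h = h * mask / (1.0 - p_drop)               # inverted-dropout scaling
    return h @ W2 + b2

x = np.array([[0.3]])
T = 200
preds = np.array([forward(x) for _ in range(T)])  # T stochastic passes
mean = preds.mean()   # predictive mean
std = preds.std()     # spread across passes = approximate model uncertainty
```

The sample standard deviation across passes is the dropout uncertainty estimate the abstract refers to; the paper grounds this in a deep-GP approximation.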

## PAC-Bayesian Analysis of Martingales and Multiarmed Bandits

Seldin, Yevgeny; Laviolette, François; Shawe-Taylor, John; Peters, Jan; Auer, Peter
Type: Scientific Journal Article
Portuguese
Search relevance: 37.6098%
We present two alternative ways to apply PAC-Bayesian analysis to sequences of dependent random variables. The first is based on a new lemma that enables to bound expectations of convex functions of certain dependent random variables by expectations of the same functions of independent Bernoulli random variables. This lemma provides an alternative tool to Hoeffding-Azuma inequality to bound concentration of martingale values. Our second approach is based on integration of Hoeffding-Azuma inequality with PAC-Bayesian analysis. We also introduce a way to apply PAC-Bayesian analysis in situation of limited feedback. We combine the new tools to derive PAC-Bayesian generalization and regret bounds for the multiarmed bandit problem. Although our regret bound is not yet as tight as state-of-the-art regret bounds based on other well-established techniques, our results significantly expand the range of potential applications of PAC-Bayesian analysis and introduce a new analysis tool to reinforcement learning and many other fields, where martingales and limited feedback are encountered.

## Bayesian Efficient Multiple Kernel Learning

Gönen, Mehmet
Type: Scientific Journal Article
Search relevance: 37.579844%
Multiple kernel learning algorithms are proposed to combine kernels in order to obtain a better similarity measure or to integrate feature representations coming from different data sources. Most of the previous research on such methods is focused on the computational efficiency issue. However, it is still not feasible to combine many kernels using existing Bayesian approaches due to their high time complexity. We propose a fully conjugate Bayesian formulation and derive a deterministic variational approximation, which allows us to combine hundreds or thousands of kernels very efficiently. We briefly explain how the proposed method can be extended for multiclass learning and semi-supervised learning. Experiments with large numbers of kernels on benchmark data sets show that our inference method is quite fast, requiring less than a minute. On one bioinformatics and three image recognition data sets, our method outperforms previously reported results with better generalization performance.; Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

## A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning

Brochu, Eric; Cora, Vlad M.; de Freitas, Nando
Type: Scientific Journal Article
Search relevance: 37.6098%
We present a tutorial on Bayesian optimization, a method of finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments---active user modelling with preferences, and hierarchical reinforcement learning---and a discussion of the pros and cons of Bayesian optimization based on our experiences.
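A minimal sketch of the loop the tutorial describes: a Gaussian-process posterior over a toy objective, plus an expected-improvement acquisition to pick the next evaluation. The kernel, length-scale, objective, and candidate grid are illustrative assumptions.

```python
import numpy as np
from math import erf, sqrt

def k(a, b, ls=0.3):
    """Squared-exponential kernel (assumed GP prior covariance)."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xs, jitter=1e-6):
    """GP posterior mean and std at test points Xs given observations (X, y)."""
    K = k(X, X) + jitter * np.eye(len(X))
    Ks = k(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = np.diag(k(Xs, Xs)) - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mu, sigma, best):
    """EI for minimization: trades off exploitation against exploration."""
    z = (best - mu) / sigma
    Phi = np.array([0.5 * (1.0 + erf(v / sqrt(2.0))) for v in z])  # normal CDF
    phi = np.exp(-z ** 2 / 2.0) / sqrt(2.0 * np.pi)                # normal PDF
    return (best - mu) * Phi + sigma * phi

def f(x):                              # toy "expensive" objective (assumed)
    return (x - 0.6) ** 2

Xs = np.linspace(0.0, 1.0, 101)        # candidate grid
X = np.array([0.1, 0.5, 0.9])          # initial design
y = f(X)
for _ in range(10):
    mu, sigma = gp_posterior(X, y, Xs)
    x_next = Xs[np.argmax(expected_improvement(mu, sigma, y.min()))]
    X, y = np.append(X, x_next), np.append(y, f(x_next))
best_x = X[np.argmin(y)]
```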

## Bayesian Model Averaging Using the k-best Bayesian Network Structures

Tian, Jin; He, Ru; Ram, Lavanya
Type: Scientific Journal Article
Search relevance: 37.6098%
We study the problem of learning Bayesian network structures from data. We develop an algorithm for finding the k-best Bayesian network structures. We propose to compute the posterior probabilities of hypotheses of interest by Bayesian model averaging over the k-best Bayesian networks. We present empirical results on structural discovery over several real and synthetic data sets and show that the method outperforms the model selection method and the state of-the-art MCMC methods.; Comment: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)
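The model-averaging step can be illustrated in a few lines. The log marginal-likelihood scores and the hypothesis indicator below are made-up values; a uniform prior over structures is assumed.

```python
import numpy as np

# Hypothetical log marginal-likelihood scores of the k best structures
# (values assumed for illustration), plus an indicator of whether each
# structure contains a hypothesis of interest, e.g. a particular edge.
log_scores = np.array([-100.2, -101.0, -101.5, -103.0])
has_edge   = np.array([1, 1, 0, 1], dtype=float)

# Posterior weights over the k-best networks, via normalized scores
# (log-sum-exp trick; uniform structure prior assumed).
w = np.exp(log_scores - log_scores.max())
w /= w.sum()

# Model-averaged posterior probability of the hypothesis.
p_edge = float(w @ has_edge)
```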

## Approximate Learning in Complex Dynamic Bayesian Networks

Settimi, Raffaella; Smith, Jim Q.; Gargoum, A. S.
Type: Scientific Journal Article
Search relevance: 37.617983%
In this paper we extend the work of Smith and Papamichail (1999) and present fast approximate Bayesian algorithms for learning in complex scenarios where, at any time frame, the relationships between explanatory state space variables can be described by a Bayesian network that evolves dynamically over time, and the observations taken are not necessarily Gaussian. The approach uses recent developments in approximate Bayesian forecasting methods in combination with more familiar Gaussian propagation algorithms on junction trees. The procedure for learning state parameters from data is given explicitly for common sampling distributions, and the methodology is illustrated through a real application. The efficiency of the dynamic approximation is explored by using the Hellinger divergence measure, and theoretical bounds for the efficacy of such a procedure are discussed.; Comment: Appears in Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI1999)

## Heteroscedastic Treed Bayesian Optimisation

Assael, John-Alexander M.; Wang, Ziyu; Shahriari, Bobak; de Freitas, Nando
Type: Scientific Journal Article
Portuguese
Search relevance: 37.617983%
Optimising black-box functions is important in many disciplines, such as tuning machine learning models, robotics, finance and mining exploration. Bayesian optimisation is a state-of-the-art technique for the global optimisation of black-box functions which are expensive to evaluate. At the core of this approach is a Gaussian process prior that captures our belief about the distribution over functions. However, in many cases a single Gaussian process is not flexible enough to capture non-stationarity in the objective function. Consequently, heteroscedasticity negatively affects performance of traditional Bayesian methods. In this paper, we propose a novel prior model with hierarchical parameter learning that tackles the problem of non-stationarity in Bayesian optimisation. Our results demonstrate substantial improvements in a wide range of applications, including automatic machine learning and mining exploration.

## Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks

Hernández-Lobato, José Miguel; Adams, Ryan P.
Type: Scientific Journal Article
Portuguese
Search relevance: 37.617983%
Large multilayer neural networks trained with backpropagation have recently achieved state-of-the-art results in a wide range of problems. However, using backprop for neural net learning still has some disadvantages, e.g., having to tune a large number of hyperparameters to the data, lack of calibrated probabilistic predictions, and a tendency to overfit the training data. In principle, the Bayesian approach to learning neural networks does not have these problems. However, existing Bayesian techniques lack scalability to large dataset and network sizes. In this work we present a novel scalable method for learning Bayesian neural networks, called probabilistic backpropagation (PBP). Similar to classical backpropagation, PBP works by computing a forward propagation of probabilities through the network and then doing a backward computation of gradients. A series of experiments on ten real-world datasets show that PBP is significantly faster than other techniques, while offering competitive predictive abilities. Our experiments also show that PBP provides accurate estimates of the posterior variance on the network weights.

## Near-Optimal Bayesian Active Learning with Noisy Observations

Golovin, Daniel; Krause, Andreas; Ray, Debajyoti
Type: Scientific Journal Article
Portuguese
Search relevance: 37.617983%
We tackle the fundamental problem of Bayesian active learning with noise, where we need to adaptively select from a number of expensive tests in order to identify an unknown hypothesis sampled from a known prior distribution. In the case of noise-free observations, a greedy algorithm called generalized binary search (GBS) is known to perform near-optimally. We show that if the observations are noisy, perhaps surprisingly, GBS can perform very poorly. We develop EC2, a novel, greedy active learning algorithm and prove that it is competitive with the optimal policy, thus obtaining the first competitiveness guarantees for Bayesian active learning with noisy observations. Our bounds rely on a recently discovered diminishing returns property called adaptive submodularity, generalizing the classical notion of submodular set functions to adaptive policies. Our results hold even if the tests have non-uniform cost and their noise is correlated. We also propose EffECXtive, a particularly fast approximation of EC2, and evaluate it on a Bayesian experimental design problem involving human subjects, intended to tease apart competing economic theories of how people make decisions under uncertainty.; Comment: 15 pages. Version 2 contains only one major change...
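The noise-free baseline the abstract starts from, generalized binary search, can be sketched as follows; the test-outcome table is random and hypothetical, and EC2 itself (the paper's noise-robust algorithm) is more involved.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hyp, n_tests = 8, 10
# Hypothetical deterministic test outcomes: outcomes[h, t] is the (noise-free)
# result of test t if hypothesis h is the true one.
outcomes = rng.integers(0, 2, size=(n_hyp, n_tests))
prior = np.full(n_hyp, 1.0 / n_hyp)   # known prior over hypotheses
truth = 3                             # unknown hypothesis to identify

alive = np.ones(n_hyp, dtype=bool)    # hypotheses still consistent
n_queries = 0
while alive.sum() > 1 and n_queries < n_tests:
    # GBS rule: choose the test whose outcome splits the remaining
    # prior mass most evenly (closest to a 50/50 split).
    mass1 = outcomes[alive].T @ prior[alive] / prior[alive].sum()
    t = int(np.argmin(np.abs(mass1 - 0.5)))
    result = outcomes[truth, t]             # noise-free observation
    alive &= (outcomes[:, t] == result)     # discard inconsistent hypotheses
    n_queries += 1
```

With noisy observations this hard elimination step fails, which is exactly the regime where the abstract shows GBS can perform poorly and EC2 is needed.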

## Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo

Wang, Yu-Xiang; Fienberg, Stephen E.; Smola, Alex
Type: Scientific Journal Article
Portuguese
Search relevance: 37.617983%
We consider the problem of Bayesian learning on sensitive datasets and present two simple but somewhat surprising results that connect Bayesian learning to "differential privacy", a cryptographic approach to protecting individual-level privacy while permitting database-level utility. Specifically, we show that under standard assumptions, getting one single sample from a posterior distribution is differentially private "for free". We will see that this estimator is statistically consistent, near optimal and computationally tractable whenever the Bayesian model of interest is consistent, optimal and tractable. Similarly but separately, we show that a recent line of work that uses stochastic gradients for Hybrid Monte Carlo (HMC) sampling also preserves differential privacy with minor or no modifications of the algorithmic procedure. These observations lead to an "anytime" algorithm for Bayesian learning under privacy constraints. We demonstrate that it performs much better than the state-of-the-art differentially private methods on synthetic and real datasets.
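The "one posterior sample" idea can be illustrated with a conjugate Beta-Bernoulli model. The data here are simulated, and the actual privacy guarantee depends on boundedness conditions on the log-likelihood that this sketch does not verify.

```python
import numpy as np

rng = np.random.default_rng(0)
# Sensitive binary data (simulated), e.g. whether each individual has a condition.
data = rng.integers(0, 2, size=100)

# Beta-Bernoulli model: with a Beta(1, 1) prior, the posterior over the
# Bernoulli rate is Beta(1 + successes, 1 + failures).
a_post = 1 + int(data.sum())
b_post = 1 + len(data) - int(data.sum())

# "One posterior sample" estimator: a single draw from the posterior is
# released as the (differentially private) estimate. The privacy level
# depends on bounding the log-likelihood, which is not shown here.
theta_private = rng.beta(a_post, b_post)
```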

## Learning Topic Models and Latent Bayesian Networks Under Expansion Constraints

Type: Scientific Journal Article
Portuguese
Search relevance: 37.617983%
Unsupervised estimation of latent variable models is a fundamental problem central to numerous applications of machine learning and statistics. This work presents a principled approach for estimating broad classes of such models, including probabilistic topic models and latent linear Bayesian networks, using only second-order observed moments. The sufficient conditions for identifiability of these models are primarily based on weak expansion constraints on the topic-word matrix, for topic models, and on the directed acyclic graph, for Bayesian networks. Because no assumptions are made on the distribution among the latent variables, the approach can handle arbitrary correlations among the topics or latent factors. In addition, a tractable learning method via $\ell_1$ optimization is proposed and studied in numerical experiments.; Comment: 38 pages, 6 figures, 2 tables, applications in topic models and Bayesian networks are studied. Simulation section is added

## Practical Bayesian Optimization of Machine Learning Algorithms

Snoek, Jasper; Larochelle, Hugo; Adams, Ryan P.
Type: Scientific Journal Article
Portuguese
Search relevance: 37.602437%
Machine learning algorithms frequently require careful tuning of model hyperparameters, regularization terms, and optimization parameters. Unfortunately, this tuning is often a "black art" that requires expert experience, unwritten rules of thumb, or sometimes brute-force search. Much more appealing is the idea of developing automatic approaches which can optimize the performance of a given learning algorithm to the task at hand. In this work, we consider the automatic tuning problem within the framework of Bayesian optimization, in which a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP). The tractable posterior distribution induced by the GP leads to efficient use of the information gathered by previous experiments, enabling optimal choices about what parameters to try next. Here we show how the effects of the Gaussian process prior and the associated inference procedure can have a large impact on the success or failure of Bayesian optimization. We show that thoughtful choices can lead to results that exceed expert-level performance in tuning machine learning algorithms. We also describe new algorithms that take into account the variable cost (duration) of learning experiments and that can leverage the presence of multiple cores for parallel experimentation. We show that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization on a diverse set of contemporary algorithms including latent Dirichlet allocation...

## Stochastic complexity of Bayesian networks

Yamazaki, Keisuke; Watanabe, Sumio
Type: Scientific Journal Article
Search relevance: 37.6098%
Bayesian networks are now used in numerous fields, for example, diagnosis of a system, data mining, clustering and so on. In spite of their wide range of applications, their statistical properties have not yet been clarified, because the models are non-identifiable and non-regular. In a Bayesian network, the parameter set of a smaller model is an analytic set with singularities in the parameter space of larger models. Because of these singularities, the Fisher information matrices are not positive definite. In other words, the mathematical foundation for learning has not been constructed. In recent years, however, we have developed a method to analyze non-regular models using algebraic geometry. This method revealed the relation between a model's singularities and its statistical properties. In this paper, applying this method to Bayesian networks with latent variables, we clarify the order of the stochastic complexities. Our result claims that the upper bound of these is smaller than the dimension of the parameter space. This means that the Bayesian generalization error is also far smaller than that of regular models, and that Schwarz's model selection criterion BIC needs to be improved for Bayesian networks.; Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

## Unification of field theory and maximum entropy methods for learning probability densities

Kinney, Justin B.
Type: Scientific Journal Article
Portuguese
Search relevance: 37.6098%
The need to estimate smooth probability distributions (a.k.a. probability densities) from finite sampled data is ubiquitous in science. Many approaches to this problem have been described, but none is yet regarded as providing a definitive solution. Maximum entropy estimation and Bayesian field theory are two such approaches. Both have origins in statistical physics, but the relationship between them has remained unclear. Here I unify these two methods by showing that every maximum entropy density estimate can be recovered in the infinite smoothness limit of an appropriate Bayesian field theory. I also show that Bayesian field theory estimation can be performed without imposing any boundary conditions on candidate densities, and that the infinite smoothness limit of these theories recovers the most common types of maximum entropy estimates. Bayesian field theory is thus seen to provide a natural test of the validity of the maximum entropy null hypothesis. Bayesian field theory also returns a lower entropy density estimate when the maximum entropy hypothesis is falsified. The computations necessary for this approach can be performed rapidly for one-dimensional data, and software for doing this is provided. Based on these results, I argue that Bayesian field theory is poised to provide a definitive solution to the density estimation problem in one dimension.; Comment: 16 pages...

## Exploiting correlation and budget constraints in Bayesian multi-armed bandit optimization

Hoffman, Matthew W.; Shahriari, Bobak; de Freitas, Nando
Type: Scientific Journal Article
Portuguese
Search relevance: 37.6098%
We address the problem of finding the maximizer of a nonlinear smooth function, which can only be evaluated point-wise, subject to constraints on the number of permitted function evaluations. This problem is also known as fixed-budget best arm identification in the multi-armed bandit literature. We introduce a Bayesian approach for this problem and show that it empirically outperforms both the existing frequentist counterpart and other Bayesian optimization methods. The Bayesian approach places emphasis on detailed modelling, including the modelling of correlations among the arms. As a result, it can perform well in situations where the number of arms is much larger than the number of allowed function evaluations, whereas the frequentist counterpart is inapplicable. This feature enables us to develop and deploy practical applications, such as automatic machine learning toolboxes. The paper presents comprehensive comparisons of the proposed approach, Thompson sampling, classical Bayesian optimization techniques, more recent Bayesian bandit approaches, and state-of-the-art best arm identification methods. This is the first comparison of many of these methods in the literature and allows us to examine the relative merits of their different features.
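One of the compared baselines, Thompson sampling for Bernoulli arms under a fixed budget, is easy to sketch; the arm means and budget below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])   # hidden Bernoulli arm means (assumed)
budget = 300                             # fixed evaluation budget
alpha = np.ones(3)                       # Beta(1, 1) posterior per arm
beta = np.ones(3)

for _ in range(budget):
    # Thompson sampling: draw one mean per arm from its posterior,
    # then pull the arm whose draw is largest.
    draws = rng.beta(alpha, beta)
    arm = int(np.argmax(draws))
    reward = rng.uniform() < true_means[arm]
    alpha[arm] += reward
    beta[arm] += 1 - reward

# Recommend the arm with the highest posterior mean once the budget is spent.
best_arm = int(np.argmax(alpha / (alpha + beta)))
```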

## On the Sample Complexity of Learning Bayesian Networks

Friedman, Nir; Yakhini, Zohar
Type: Scientific Journal Article
Search relevance: 37.602437%
In recent years there has been an increasing interest in learning Bayesian networks from data. One of the most effective methods for learning such networks is based on the minimum description length (MDL) principle. Previous work has shown that this learning procedure is asymptotically successful: with probability one, it will converge to the target distribution, given a sufficient number of samples. However, the rate of this convergence has been hitherto unknown. In this work we examine the sample complexity of MDL based learning procedures for Bayesian networks. We show that the number of samples needed to learn an epsilon-close approximation (in terms of entropy distance) with confidence delta is O((1/epsilon)^(4/3) log(1/epsilon) log(1/delta) loglog(1/delta)). This means that the sample complexity is a low-order polynomial in the error threshold and sub-linear in the confidence bound. We also discuss how the constants in this term depend on the complexity of the target distribution. Finally, we address questions of asymptotic minimality and propose a method for using the sample complexity results to speed up the learning process.; Comment: Appears in Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI1996)
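The shape of the bound can be evaluated numerically to see the "low-order polynomial in 1/epsilon" behavior. The constant c below is an assumption, since the true constants depend on the complexity of the target distribution.

```python
import numpy as np

def mdl_sample_bound(eps, delta, c=1.0):
    """Evaluate the asymptotic sample-complexity shape from the abstract,
    N = O((1/eps)^(4/3) * log(1/eps) * log(1/delta) * loglog(1/delta)),
    with an assumed constant c (the true constants are distribution-dependent)."""
    return (c * (1.0 / eps) ** (4.0 / 3.0)
            * np.log(1.0 / eps)
            * np.log(1.0 / delta)
            * np.log(np.log(1.0 / delta)))

# Halving eps multiplies the polynomial factor by 2^(4/3), roughly 2.52,
# far gentler than the factor of 4 a 1/eps^2 rate would give.
n1 = mdl_sample_bound(0.10, 0.01)
n2 = mdl_sample_bound(0.05, 0.01)
```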

## Bayesian Multiregression Dynamic Models with Applications in Finance and Business

Zhao, Yi
Type: Dissertation
Search relevance: 37.6098%

This thesis discusses novel developments in Bayesian analytics for high-dimensional multivariate time series. The focus is on the class of multiregression dynamic models (MDMs), which can be decomposed into sets of univariate models processed in parallel yet coupled for forecasting and decision making. Parallel processing greatly speeds up the computations and vastly expands the range of time series to which the analysis can be applied.

I begin by defining a new sparse representation of the dependence between the components of a multivariate time series. Using this representation, innovations involve sparse dynamic dependence networks, idiosyncrasies in time-varying auto-regressive lag structures, and flexibility of discounting methods for stochastic volatilities.

For exploration of the model space, I define a variant of the Shotgun Stochastic Search (SSS) algorithm. Under the parallelizable framework, this new SSS algorithm allows the stochastic search to move in each dimension simultaneously at each iteration, and thus it moves much faster to high probability regions of model space than does traditional SSS.

For the assessment of model uncertainty in MDMs, I propose an innovative method that converts model uncertainties from the multivariate context to the univariate context using Bayesian Model Averaging and power discounting techniques. I show that this approach can succeed in effectively capturing time-varying model uncertainties on various model parameters...

Zhou, Mingyuan