Page 1 of results: 26 digital items found in 0.006 seconds

MR-Radix: a multi-relational data mining algorithm

Valêncio, Carlos ; Oyama, Fernando ; Scarpelini Neto, Paulo ; Colombini, Angelo ; Cansian, Adriano ; Souza, Rogéria de; Corrêa, Pedro Luiz Pizzigatti
Source: Biblioteca Digital da Produção Intelectual da USP Publisher: Biblioteca Digital da Produção Intelectual da USP
Type: Journal Article
Portuguese
Search Relevance
120.60001%
Background: The multi-relational approach has emerged as an alternative for analyzing structured data such as relational databases, since it allows data mining to be applied to multiple tables directly, thus avoiding expensive join operations and semantic losses; this work proposes an algorithm that follows the multi-relational approach. Methods: To compare the performance of the traditional and multi-relational approaches to mining association rules, this paper discusses an empirical study of PatriciaMine, a traditional algorithm, and its proposed multi-relational counterpart, MR-Radix. Results: This work showed the performance advantages of the multi-relational approach over several tables, which avoids the high cost of join operations across multiple tables and semantic losses. MR-Radix proved faster than PatriciaMine, despite handling complex multi-relational patterns. Memory usage shows a more conservative growth curve for MR-Radix than for PatriciaMine...

Comparative study of algorithms for mining association rules: Traditional approach versus multi-relational approach

Valêncio, Carlos Roberto; Oyama, Fernando Takeshi; Neto, Paulo Scarpelini; De Souza, Rogéria Cristiane Gratão
Source: Universidade Estadual Paulista Publisher: Universidade Estadual Paulista
Type: Conference or Conference Object Format: 275-280
Portuguese
Search Relevance
110.60314%
The multi-relational data mining approach has emerged as an alternative for the analysis of structured data, such as relational databases. Unlike traditional algorithms, multi-relational proposals allow multiple tables to be mined directly, avoiding costly join operations. This paper presents a comparative study of the traditional PatriciaMine algorithm and its proposed multi-relational counterpart, MR-Radix, in order to evaluate the performance of the two approaches to mining association rules in relational databases. The study makes two original contributions: the proposal of the multi-relational algorithm MR-Radix, which is efficient for relational databases in terms of both execution time and memory usage; and an empirical demonstration of the performance advantage of the multi-relational approach over several tables, which avoids costly join operations across multiple tables. © 2011 IEEE.

Multi-relational algorithm for mining association rules in large databases

Valêncio, Carlos Roberto; Oyama, Fernando Takeshi; Ichiba, Fernando Tochio; De Souza, Rogéria Cristiane Gratão
Source: Universidade Estadual Paulista Publisher: Universidade Estadual Paulista
Type: Conference or Conference Object Format: 269-274
Portuguese
Search Relevance
110.39793%
Multi-relational data mining enables pattern mining from multiple tables. Existing multi-relational algorithms for mining association rules cannot process large volumes of data, because the amount of memory required exceeds the amount available. The proposed algorithm, MR-Radix, presents a framework that optimizes memory usage. It also uses the concept of partitioning to handle large volumes of data. The original contribution of this proposal is to enable superior performance compared with related algorithms and, moreover, to complete the task of mining association rules in large databases successfully, bypassing the problem of available memory. One of the tests showed that MR-Radix uses fourteen times less memory than GFP-growth. © 2011 IEEE.
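
The partitioning idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the classic two-pass partition scheme for frequent itemsets (any itemset that is frequent in the whole database must be frequent in at least one partition) and is not the MR-Radix implementation: no radix tree is built, and the names `frequent_itemsets` and `partitioned_mining` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count itemsets of up to max_size items and keep those whose
    relative support is at least min_support."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(transactions)
    return {s: c for s, c in counts.items() if c >= threshold}


def partitioned_mining(transactions, min_support, n_parts=2, max_size=2):
    """Two-pass partition scheme (an assumption, not the paper's code)."""
    # Ceiling division, so at most n_parts chunks are produced.
    size = max(1, -(-len(transactions) // n_parts))
    parts = [transactions[i:i + size]
             for i in range(0, len(transactions), size)]

    # Pass 1: each partition is mined on its own, so only one
    # partition has to be processed at a time.
    candidates = set()
    for part in parts:
        candidates |= set(frequent_itemsets(part, min_support, max_size))

    # Pass 2: count only the candidates against the full data.
    counts = Counter()
    for t in transactions:
        items = set(t)
        for c in candidates:
            if c <= items:
                counts[c] += 1
    threshold = min_support * len(transactions)
    return {c: n for c, n in counts.items() if n >= threshold}


tx = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
result = partitioned_mining(tx, min_support=0.5)
```

In this toy run the partitioned result matches what a single pass over all transactions would give; the trade-off is a second scan of the data in exchange for a smaller working set during the first pass.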

Mineração multirrelacional de regras de associação em grandes bases de dados [Multi-relational mining of association rules in large databases]

Oyama, Fernando Takeshi
Source: Universidade Estadual Paulista (UNESP) Publisher: Universidade Estadual Paulista (UNESP)
Type: Master's Dissertation Format: 126 leaves : ill.
Portuguese
Search Relevance
90.28593%
Graduate Program in Computer Science - IBILCE; The continuing advance and availability of computational resources make it feasible to store and manipulate large databases. Typical data mining techniques can extract patterns provided the data are stored in a single table. Multi-relational data mining, in turn, is a more recent approach that allows patterns to be sought across multiple tables, and is suited to relational databases. However, existing multi-relational algorithms for mining association rules are unable to carry out the mining task on large volumes of data, since the amount of memory required to complete processing exceeds the amount available. The aim of this work is to present a multi-relational algorithm for extracting association rules, focused on application to large relational databases. To that end, the proposed algorithm, MR-RADIX, uses a structure called a Radix-tree that represents the database in memory in compressed form. In addition, the algorithm uses the concept of partitioning to subdivide the database...

MR-Radix: a multi-relational data mining algorithm

Valencio, Carlos R.; Oyama, Fernando T.; Scarpelini, Paulo; Colombini, Angelo C.; Souza, Rogeria C. G. de; Cansian, Adriano Mauro
Source: Universidade Estadual Paulista Publisher: Universidade Estadual Paulista
Type: Journal Article
Portuguese
Search Relevance
120.60001%
Background: The multi-relational approach has emerged as an alternative for analyzing structured data such as relational databases, since it allows data mining to be applied to multiple tables directly, thus avoiding expensive join operations and semantic losses; this work proposes an algorithm that follows the multi-relational approach. Methods: To compare the performance of the traditional and multi-relational approaches to mining association rules, this paper discusses an empirical study of PatriciaMine, a traditional algorithm, and its proposed multi-relational counterpart, MR-Radix. Results: This work showed the performance advantages of the multi-relational approach over several tables, which avoids the high cost of join operations across multiple tables and semantic losses. MR-Radix proved faster than PatriciaMine, despite handling complex multi-relational patterns. Memory usage shows a more conservative growth curve for MR-Radix than for PatriciaMine, which shows that an increase in the number of frequent items does not cause significant growth in memory usage in MR-Radix as it does in PatriciaMine. Conclusion: The comparative study of PatriciaMine and MR-Radix confirmed the efficacy of the multi-relational approach in the data mining process in terms of both execution time and memory usage. Besides that...

Spatial Data Mining to Support Environmental Management and Decision Making - A Case Study in Brazil

Valêncio, Carlos Roberto; Ichiba, Fernando Tochio; Daniel, Guilherme Prióli; Souza, Rogeria Cristiane Gratão de; Neves, Leandro Alves; Colombini, Angelo Cesar
Source: Universidade Estadual Paulista Publisher: Universidade Estadual Paulista
Type: Journal Article Format: 25-32
Portuguese
Search Relevance
48.637007%

k-RNN: k-Relational Nearest Neighbour Algorithm

Fonseca, Nuno A.; Costa, Vítor Santos; Rocha, Ricardo; Camacho, Rui
Source: Universidade do Porto Publisher: Universidade do Porto
Type: Journal Article
Portuguese
Search Relevance
58.766167%
The amount of data collected and stored in databases is growing considerably in almost all areas of human activity. In complex applications the data involves several relations and propositionalization is not a suitable approach. Multi-Relational Data Mining algorithms can analyze data from multiple relations, with no need to transform the data into a single table, but are computationally more expensive. In this paper a novel relational classification algorithm based on the k-nearest neighbour algorithm is presented and evaluated.

ILP: Compute Once, Reuse Often

Nuno A Fonseca; Ricardo Rocha; Rui Camacho; Vítor Santos Costa
Source: Universidade do Porto Publisher: Universidade do Porto
Type: Journal Article
Portuguese
Search Relevance
48.115444%
Inductive Logic Programming (ILP) is a powerful and well-developed abstraction for multi-relational data mining techniques. However, ILP systems are not particularly fast: most of their execution time is spent evaluating the hypotheses they construct. The evaluation time needed to assess the quality of each hypothesis depends mainly on the number of examples and the theorem proving effort required to determine if an example is entailed by the hypothesis. We propose a technique that reduces the theorem proving effort to a bare minimum and stores valuable information to compute the number of examples entailed by each hypothesis (using a tree data structure). The information is computed only once (pre-compiled) per example. Evaluation of hypotheses requires only basic and efficient operations on trees. This proposal avoids re-computation of a hypothesis' value in theory-level search and cross-validation algorithms, whenever the same data set is used with different parameters. In an empirical evaluation the technique yielded considerable speedups.

Improving the efficiency of inductive logic programming systems

Fonseca, Nuno A.; Costa, Vítor Santos; Rocha, Ricardo; Camacho, Rui; Silva, Fernando
Source: Universidade do Porto Publisher: Universidade do Porto
Type: Journal Article
Portuguese
Search Relevance
48.185073%
Inductive logic programming (ILP) is a sub-field of machine learning that provides an excellent framework for multi-relational data mining applications. The advantages of ILP have been successfully demonstrated in complex and relevant industrial and scientific problems. However, to produce valuable models, ILP systems often require long running times and large amounts of memory. In this paper we address fundamental issues that have direct impact on the efficiency of ILP systems. Namely, we discuss how improvements in the indexing mechanisms of an underlying logic programming system benefit ILP performance. Furthermore, we propose novel data structures to reduce memory requirements and we suggest a new lazy evaluation technique to search the hypothesis space more efficiently. These proposals have been implemented in the April ILP system and evaluated using several well-known data sets. The results observed show significant improvements in running time without compromising the accuracy of the models generated. Indeed, the combined techniques achieve speedups of several orders of magnitude in some data sets. Moreover, memory requirements are reduced in nearly half of the data sets. Copyright (C) 2008 John Wiley & Sons, Ltd.

A pipelined data-parallel algorithm for ILP

Fonseca, Nuno A.; Silva, Fernando; Costa, Vítor Santos; Camacho, Rui
Source: Universidade do Porto Publisher: Universidade do Porto
Type: Journal Article
Portuguese
Search Relevance
58.566416%
The amount of data collected and stored in databases is growing considerably in almost all areas of human activity. Processing this amount of data is very expensive, both in human and in computational terms. This justifies the increased interest both in the automatic discovery of useful knowledge from databases and in using parallel processing for this task. Multi-Relational Data Mining (MRDM) techniques, such as Inductive Logic Programming (ILP), can learn rules from relational databases consisting of multiple tables. However, current ILP systems are designed to run in main memory and can have long running times. We propose a pipelined data-parallel algorithm for ILP. The algorithm was implemented and evaluated on a commodity PC cluster with 8 processors. The results show that our algorithm yields excellent speedups, while preserving the quality of learning.

Study on the Absorbed Fingerprint-Efficacy of Yuanhu Zhitong Tablet Based on Chemical Analysis, Vasorelaxation Evaluation and Data Mining

Xu, Haiyu; Li, Ke; Chen, Yanjun; Zhang, Yingchun; Tang, Shihuan; Wang, Shanshan; Shen, Dan; Wang, Xuguang; Lei, Yun; Li, Defeng; Zhang, Yi; Jin, Lan; Yang, Hongjun; Huang, Luqi
Source: Public Library of Science Publisher: Public Library of Science
Type: Journal Article
Published on 10/12/2013 Portuguese
Search Relevance
48.39027%
Yuanhu Zhitong Tablet (YZT) is an example of a typical and relatively simple clinical herb formula that is widely used in clinics. It is generally believed that YZT exerts its therapeutic effect in vivo through the synergism of multiple constituents. Thus, it is necessary to establish the relationship between the absorbed fingerprints and bioactivity so as to ensure quality, safety and efficacy. In this study, a new combinative method, an intestinal absorption test coupled with an in vitro vasorelaxation bioactivity experiment, was used as a simple, sensitive, and feasible technique to study the absorbed fingerprint-efficacy of YZT based on chemical analysis, vasorelaxation evaluation and data mining. As part of this method, an everted intestinal sac method was performed to determine the intestinal absorption of YZT solutions. YZT was dissolved in solution (n = 12), and the portion of the solution that was absorbed into intestinal sacs was analyzed using rapid-resolution liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (RRLC-Q-TOF/MS). Semi-quantitative analysis indicated the presence of 34 compounds. The effect of the intestinally absorbed solution on vasorelaxation of rat aortic rings with endothelium attached was then evaluated in vitro. The results showed that samples grouped by HCA from chemical profiles have similar bioactivity, while samples in different groups displayed very different bioactivity. Moreover...

Novel drug target identification for the treatment of dementia using multi-relational association mining

Nguyen, Thanh-Phuong; Priami, Corrado; Caberlotto, Laura
Source: Nature Publishing Group Publisher: Nature Publishing Group
Type: Journal Article
Published on 08/07/2015 Portuguese
Search Relevance
69.01392%
Dementia is a neurodegenerative condition of the brain in which there is a progressive and permanent loss of cognitive and mental performance. Despite the fact that the number of people with dementia worldwide is steadily increasing and regardless of the advances in the molecular characterization of the disease, current medical treatments for dementia are purely symptomatic and hardly effective. We present a novel multi-relational association mining method that integrates the huge amount of scientific data accumulated in recent years to predict potential novel targets for innovative therapeutic treatment of dementia. Owing to its ability to process large volumes of heterogeneous data, our method achieves high performance and predicts numerous drug targets, including several serine/threonine kinases and a G-protein coupled receptor. The predicted drug targets are mainly functionally related to metabolism, cell surface receptor signaling pathways, immune response, apoptosis, and long-term memory. Among the highly represented kinase family and among the G-protein coupled receptors, DLG4 (PSD-95) and the bradykinin receptor 2 are also highlighted for their proposed role in memory and cognition, as described in previous studies. These novel putative targets hold promise for the development of novel therapeutic approaches for the treatment of dementia.

Modular multi-relational framework for gene group function prediction

García-Jiménez, Beatriz; Ledezma, Agapito; Sanchis, Araceli
Source: Universidad Carlos III de Madrid Publisher: Universidad Carlos III de Madrid
Type: Lecture Format: application/pdf; text/plain
Published in 2009 Portuguese
Search Relevance
89.34276%
Determining the functions of genes is essential for understanding how metabolisms work, and for trying to solve their malfunctions. Genes usually work in groups rather than in isolation, so functions should be assigned to gene groups and not to individual genes. Moreover, genetic knowledge has many relations and changes very frequently. Thus, a propositional ad-hoc approach is not appropriate for the gene group function prediction domain. We propose the Modular Multi-Relational Framework (MMRF), which approaches the problem from a relational and flexible point of view. The MMRF consists of several modules covering all the domain tasks involved (grouping, representing and learning using computational prediction techniques). A specific application is described, including a relational representation language, where each module of MMRF is individually instantiated and refined to obtain a prediction under specific given conditions.; The research reported here has been supported by CICYT, TRA2007-67374-C02-02 project.; Poster of: 19th International Conference on Inductive Logic Programming (ILP 2009), Leuven, Belgium, 2 - 4 Jul, 2009

S.cerevisiae complex function prediction with modular multi-relational framework

García-Jiménez, Beatriz; Ledezma, Agapito; Sanchis, Araceli
Source: Springer Publisher: Springer
Type: Conference or Conference Object Format: text/plain; application/pdf
Published in 2010 Portuguese
Search Relevance
89.41021%
Determining the functions of genes is essential for understanding how metabolisms work, and for trying to solve their malfunctions. Genes usually work in groups rather than in isolation, so functions should be assigned to gene groups and not to individual genes. Moreover, genetic knowledge has many relations and changes very frequently. Thus, a propositional ad-hoc approach is not appropriate for the gene group function prediction domain. We propose the Modular Multi-Relational Framework (MMRF), which approaches the problem from a relational and flexible point of view. The MMRF consists of several modules covering all the domain tasks involved (grouping, representing and learning using computational prediction techniques). A specific application is described, including a relational representation language, where each module of MMRF is individually instantiated and refined to obtain a prediction under specific given conditions.; This research work has been supported by CICYT, TRA 2007-67374-C02-02 project and by the expert biological knowledge of the Structural Computational Biology Group in Spanish National Cancer Research Centre (CNIO). The authors would like to thank members of Tilde tool developer group in K.U.Leuven for providing their help and many useful suggestions.; Proceeding of: 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems...

MMRF for proteome annotation applied to human protein disease prediction

García-Jiménez, Beatriz; Ledezma, Agapito; Sanchis, Araceli
Source: Springer Publisher: Springer
Type: info:eu-repo/semantics/acceptedVersion; info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/bookPart Format: application/pdf
Published in 2011 Portuguese
Search Relevance
69.52883%
Knowledge of the biological processes in which each gene and protein participates is essential for designing disease treatments. At present, these annotations are still unknown for many genes and proteins. Since making annotations from in-vivo experiments is costly, computational predictors are needed for different kinds of annotation such as metabolic pathway, interaction network, protein family, tissue, disease and so on. Biological data has an intrinsic relational structure, including genes and proteins, which can be grouped by many criteria. This hinders the possibility of finding good hypotheses when an attribute-value representation is used. Hence, we propose the generic Modular Multi-Relational Framework (MMRF) to predict different kinds of gene and protein annotation using Relational Data Mining (RDM). The specific MMRF application to annotating human proteins with diseases verifies that group knowledge (mainly protein-protein interaction pairs) improves the prediction, in particular doubling the area under the precision-recall curve; Proceedings of: 20th International Conference, ILP 2010, Florence, Italy, June 27-30, 2010

Fast Search for Dynamic Multi-Relational Graphs

Choudhury, Sutanay; Holder, Lawrence; Chin, George; Feo, John
Source: Cornell University Publisher: Cornell University
Type: Journal Article
Published on 11/06/2013 Portuguese
Search Relevance
49.12124%
Acting on time-critical events by processing ever-growing social media or news streams is a major technical challenge. Many of these data sources can be modeled as multi-relational graphs. Continuous queries, or techniques to search for rare events that typically arise in monitoring applications, have been studied extensively for relational databases. This work is dedicated to answering the question that emerges naturally: how can we efficiently execute a continuous query on a dynamic graph? This paper presents an exact subgraph search algorithm that exploits the temporal characteristics of representative queries for online news or social media monitoring. The algorithm is based on a novel data structure called the Subgraph Join Tree (SJ-Tree) that leverages the structural and semantic characteristics of the underlying multi-relational graph. The paper concludes with extensive experimentation on several real-world datasets that demonstrates the validity of this approach.; Comment: SIGMOD Workshop on Dynamic Networks Management and Mining (DyNetMM), 2013

Multi Relational Data Mining Approaches: A Data Mining Technique

Padhy, Neelamadhab; Panigrahi, Rasmita
Source: Cornell University Publisher: Cornell University
Type: Journal Article
Published on 16/11/2012 Portuguese
Search Relevance
90.33861%
The multi-relational data mining (MRDM) approach has developed as an alternative way of handling structured data such as that held in an RDBMS. It allows mining to be performed on multiple tables directly. In MRDM the patterns are spread across multiple tables (relations) of a relational database. Because the data are distributed over many tables, many problems arise in the practice of data mining. To deal with this, one either constructs a single table by propositionalisation or uses a multi-relational data mining algorithm. MRDM approaches have been successfully applied in the area of bioinformatics. Three popular pattern-finding techniques, classification, clustering and association, are frequently used in MRDM. The multi-relational approach has developed as an alternative for analyzing structured data such as relational databases. MRDM allows data mining to be applied directly to multiple tables. The MRDM technique is used to avoid expensive join operations and semantic losses. This paper focuses on some of the application areas of MRDM and on future directions, as well as a comparison of ILP, GM, SSDM and MRDM; Comment: 10 pages, 1 Figure, 3 Tables "Published with International Journal of Computer Applications (IJCA)"
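
The choice between propositionalisation and mining the tables directly can be illustrated with a minimal sketch. The toy tables, field names, and helper functions below are all hypothetical; the example only shows one commonly cited form of semantic loss, where a join repeats the "one" side of a one-to-many relation and so skews support counts.

```python
# Hypothetical two-table database: one customer has many orders.
customers = [
    {"id": 1, "city": "Lisbon"},
    {"id": 2, "city": "Porto"},
]
orders = [
    {"customer_id": 1, "item": "book"},
    {"customer_id": 1, "item": "pen"},
    {"customer_id": 2, "item": "book"},
]


def propositionalise(customers, orders):
    """Flatten both tables into one by joining on the customer key."""
    by_id = {c["id"]: c for c in customers}
    return [{**by_id[o["customer_id"]], "item": o["item"]} for o in orders]


def support_in_joined_table(rows, city):
    """Support of 'city' measured over rows of the flattened table."""
    return sum(r["city"] == city for r in rows) / len(rows)


def support_per_customer(customers, city):
    """Support of 'city' measured over customers, the real entities."""
    return sum(c["city"] == city for c in customers) / len(customers)


joined = propositionalise(customers, orders)
joined_support = support_in_joined_table(joined, "Lisbon")
entity_support = support_per_customer(customers, "Lisbon")
```

Here the flattened table has three rows for two customers, so the Lisbon customer is counted twice: the support is 2/3 over joined rows but 1/2 over customers. A multi-relational algorithm that counts per entity avoids both the join and this distortion.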

Granular association rules for multi-valued data

Min, Fan; Zhu, William
Source: Cornell University Publisher: Cornell University
Type: Journal Article
Published on 06/05/2013 Portuguese
Search Relevance
48.52619%
Granular association rules are a new approach to revealing patterns hidden in many-to-many relationships of relational databases. Different types of data, such as nominal, numeric and multi-valued data, should be dealt with in the process of rule mining. In this paper, we study multi-valued data and develop techniques to filter out strong but uninteresting rules. An example of such a rule might be "male students rate movies released in 1990s that are NOT thriller." Rules of this kind, called negative granular association rules, often overwhelm positive ones, which are more useful. To address this issue, we filter out negative granules such as "NOT thriller" in the process of granule generation. In this way, only positive granular association rules are generated and strong ones are mined. Experimental results on the movielens data set indicate that most rules are negative, and that our technique is effective in filtering them out.; Comment: Proceedings of The 2013 Canadian Conference on Electrical and Computer Engineering (to appear)

Interesting Multi-Relational Patterns

Spyropoulou, Eirini; De Bie, Tijl
Source: Cornell University Publisher: Cornell University
Type: Journal Article
Portuguese
Search Relevance
70.34952%
Mining patterns from multi-relational data is a problem attracting increasing interest within the data mining community. Traditional data mining approaches are typically developed for highly simplified types of data, such as an attribute-value table or a binary database, such that those methods are not directly applicable to multi-relational data. Nevertheless, multi-relational data is a more truthful and therefore often also a more powerful representation of reality. Mining patterns of a suitably expressive syntax directly from this representation is thus a research problem of great importance. In this paper we introduce a novel approach to mining patterns in multi-relational data. We propose a new syntax for multi-relational patterns as complete connected subgraphs in a representation of the database as a K-partite graph. We show how this pattern syntax is generally applicable to multi-relational data, while it reduces to well-known tiles [7] when the data is a simple binary or attribute-value table. We propose RMiner, an efficient algorithm to mine such patterns, and we introduce a method for quantifying their interestingness when contrasted with prior information of the data miner. Finally, we illustrate the usefulness of our approach by discussing results on real-world and synthetic databases.; Comment: Accepted at ICDM'11

Structural advances for pattern discovery in multi-relational databases

Kanodia, Juveria
Source: Rochester Institute of Technology Publisher: Rochester Institute of Technology
Type: Doctoral Thesis Format: 15299 bytes; 1045261 bytes; 148995 bytes; 1 bytes; 2061 bytes; 698 bytes; 5468 bytes; 49 bytes; 15299 bytes; 1045261 bytes; application/pdf; application/pdf; text/plain; text/plain; text/plain; application/octet-stream; application/octet-stream; applicati
Portuguese
Search Relevance
110.83564%
With ever-growing storage needs and the drift towards very large relational storage settings, multi-relational data mining has become a prominent and pertinent field for discovering unique and interesting relational patterns. As a consequence, a whole suite of multi-relational data mining techniques is being developed. These techniques may either be extensions of existing single-table mining techniques or be developed from scratch. For traditionalists, single-table mining algorithms can be made to work in multi-relational settings by performing inelegant and time-consuming joins of all target relations. However, complex relational patterns cannot be expressed in a single-table format and thus cannot be discovered. This work presents a new multi-relational frequent pattern mining algorithm termed Multi-Relational Frequent Pattern Growth (MRFP Growth). MRFP Growth is capable of mining multiple relations, linked with referential integrity, for frequent patterns that satisfy a user-specified support threshold. Empirical results on MRFP Growth performance and its comparison with state-of-the-art multi-relational data mining algorithms such as WARMR and Decentralized Apriori are discussed at length. MRFP Growth scores over the latter two techniques in number of patterns generated and speed. The realm of multi-relational clustering is also explored in this thesis. A multi-Relational Item Clustering approach based on Hypergraphs (RICH) is proposed. Experimentally, RICH combined with MRFP Growth proves to be a competitive approach for clustering multi-relational data. The performance and quality of clusters generated by RICH are compared with other clustering algorithms. Finally...