
Portuguese regional unemployment patterns: a k-means cluster analysis approach

Nunes, Alcina; Barros, Elisa
Source: Ovidius University Press Publisher: Ovidius University Press
Type: Scientific journal article
Portuguese
Search relevance
66.550015%
Publication indexed in RePEc and DOAJ (Directory of Open Access Journals); The k-means cluster analysis technique is an important ally in the study of economic patterns in a multivariate framework. Aware of its analytical importance, this paper adopts this method to identify groups of Portuguese administrative regions that share similar patterns in the characteristics of registered unemployed individuals. The regional distribution of the characteristics of unemployed individuals is of core importance for the development of public policies aimed at fighting unemployment, especially in times of crisis. Preliminary results show a clear division of the territory into four regions (north and south, urban and rural areas), which stresses the importance of designing well-targeted public labour policies.

Alzheimer electroencephalogram temporal events detection by K-means

Rodrigues, Pedro Miguel; Freitas, Diamantino; Teixeira, João Paulo
Source: Elsevier Publisher: Elsevier
Type: Scientific journal article
Portuguese
Search relevance
66.550015%
Alzheimer's disease (AD) is a chronic, progressive and irreversible neurodegenerative brain disorder. Its diagnostic accuracy is relatively low, and no biomarker can detect AD without invasive tests. This study presents a new approach to obtaining electroencephalogram (EEG) temporal events in order to improve AD diagnosis. To that end, K-means was used, and the results suggest that there are sequences of EEG energy variation that appear more frequently in AD patients than in healthy subjects.

Efficiency issues of evolutionary k-means

NALDI, M. C.; CAMPELLO, R. J. G. B.; HRUSCHKA, E. R.; CARVALHO, A. C. P. L. F.
Source: ELSEVIER SCIENCE BV Publisher: ELSEVIER SCIENCE BV
Type: Scientific journal article
Portuguese
Search relevance
66.785405%
One of the top ten most influential data mining algorithms, k-means, is known for being simple and scalable. However, it is sensitive to the initialization of prototypes and requires that the number of clusters be specified in advance. This paper shows that evolutionary techniques conceived to guide the application of k-means can be more computationally efficient than systematic (i.e., repetitive) approaches that try to get around the above-mentioned drawbacks by repeatedly running the algorithm from different configurations for the number of clusters and initial positions of prototypes. To do so, a modified version of a (k-means based) fast evolutionary algorithm for clustering is employed. Theoretical complexity analyses for the systematic and evolutionary algorithms of interest are provided. Computational experiments and statistical analyses of the results are presented for artificial and text mining data sets. (C) 2010 Elsevier B.V. All rights reserved.; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES); Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
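The "systematic" baseline this abstract refers to can be illustrated with a minimal sketch: plain Lloyd-style k-means restarted from several random initializations, keeping the run with the lowest sum of squared errors. This is not the evolutionary algorithm from the paper; the function names (`lloyd_kmeans`, `kmeans_with_restarts`) and the parameters `n_restarts` and `seed` are assumptions made for illustration only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd_kmeans(points, k, rng, max_iter=100):
    """One k-means run from a random initialization (hypothetical helper)."""
    centroids = rng.sample(points, k)  # initial prototypes drawn from the data
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    sse = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


def kmeans_with_restarts(points, k, n_restarts=10, seed=0):
    """Repeat k-means from different initializations and keep the best run."""
    rng = random.Random(seed)
    runs = (lloyd_kmeans(points, k, rng) for _ in range(n_restarts))
    return min(runs, key=lambda run: run[2])
```

Repeating this loop over several candidate values of `k` as well is what makes the systematic approach expensive, which is the cost the abstract contrasts with the evolutionary alternative.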

Evolutionary k-means for distributed data sets

Naldi, M. C.; Campello, Ricardo José Gabrielli Barreto
Source: Elsevier; Amsterdam Publisher: Elsevier; Amsterdam
Type: Scientific journal article
Portuguese
Search relevance
66.750015%
One of the challenges for clustering lies in dealing with data distributed across separate repositories, because most clustering techniques require the data to be centralized. One of them, k-means, has been named one of the most influential data mining algorithms for being simple, scalable and easily adaptable to a variety of contexts and application domains. Although distributed versions of k-means have been proposed, the algorithm is still sensitive to the selection of the initial cluster prototypes and requires the number of clusters to be specified in advance. In this paper, we propose the use of evolutionary algorithms to overcome the limitations of k-means and, at the same time, to deal with distributed data. Two different distribution approaches are adopted: the first obtains a final model identical to the centralized version of the clustering algorithm; the second generates and selects clusters for each distributed data subset and combines them afterwards. The algorithms are compared from two perspectives: the theoretical one, through asymptotic complexity analyses; and the experimental one, through a comparative evaluation of results obtained from a collection of experiments and statistical tests. The results indicate which variant is more suitable for each application scenario.; CNPq; FAPESP; FAPEMIG; Selected papers from the XII Brazilian Symposium on Neural Networks (SBRN 2012). Curitiba...

QK-Means: A clustering technique based on community detection and K-Means for deployment of cluster head nodes

Ferreira, Leonardo N.; Pinto, A. R.; Zhao, Liang
Source: Universidade Estadual Paulista Publisher: Universidade Estadual Paulista
Type: Conference paper or conference object
Portuguese
Search relevance
66.7804%
Wireless Sensor Networks (WSN) are a special kind of ad hoc network that is usually deployed in a monitoring field in order to detect some physical phenomenon. Due to the low dependability of individual nodes, small radio coverage and the large areas to be monitored, nodes are generally organized in small clusters. Moreover, a large number of WSN nodes is usually deployed in the monitoring area to increase WSN dependability. Therefore, the best cluster head positioning is a desirable characteristic in a WSN. In this paper, we propose a hybrid clustering algorithm based on community detection in complex networks and the traditional K-means clustering technique: the QK-Means algorithm. Simulation results show that QK-Means detects communities and sub-communities, so that the lost message rate decreases and WSN coverage increases. © 2012 IEEE.

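Segmentation of individuals on Facebook who like music: an exploratory approach comparing two algorithms, k-means and fuzzy c-means (original title: Segmentação de individuos no Facebook que gostam de música: abordagem exploratória, recorrendo à comparação entre dois algoritmos, k-means e fuzzy c-means)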

Quinteiro, José António Teixeira
Source: Instituto Superior de Economia e Gestão Publisher: Instituto Superior de Economia e Gestão
Type: Master's dissertation
Published on /09/2011 Portuguese
Search relevance
66.80134%
Master's in Management/MBA; In order to define the best strategic plans, the marketing decisions that have to be taken to tackle the market, choose the best advertising campaign, and select the segment and the type of product or service to offer must be based on a sound technical analysis of the available information or data. The choice of segmentation method is of primary importance, since the resulting data can change the target-market selection strategy and the positioning strategy for products or services, in addition to the costs inherent in making the decision. This study looks for differences between two descriptive post-hoc segmentation methods (k-means and Fuzzy C-Means) in the clusters obtained, based on the Portuguese population that likes music and has an active Facebook account. As part of this work, the known literature was reviewed and the sample obtained was segmented using the two algorithms. The study was complemented with a descriptive analysis of the frequencies of mode, acquisition and listening for the various types of music.

Unsupervised grouping of industrial textile dyes using K-means algorithm and optical fibre spectroscopy

Cubillas de Cos, Ana María; Conde Portilla, Olga María; Anuarbe Cortés, Pedro; Quintela Incera, Antonio; López Higuera, José Miguel
Source: SPIE Society of Photo-Optical Instrumentation Engineers Publisher: SPIE Society of Photo-Optical Instrumentation Engineers
Type: info:eu-repo/semantics/conferenceObject; publishedVersion
Portuguese
Search relevance
66.550015%
A method for the unsupervised clustering of optically thick textile dyes based on their spectral properties is demonstrated in this paper. The system utilizes optical fibre sensor techniques in the Ultraviolet-Visible-Near Infrared (UV-Vis-NIR) to evaluate the absorption spectrum and thus the colour of textile dyes. A multivariate method is first applied to calculate the optimum dilution factor needed to reduce the high absorbance of the dye samples. Then, the grouping algorithm used combines Principal Component Analysis (PCA), for data compression, and K-means for unsupervised clustering of the different dyes. The feasibility of the proposed method for textile applications is also discussed in the paper.

Classification using k-modes for the case of categorical variables (original title: Clasificación mediante k-modas para el caso de variables categóricas)

Pastrán Ramírez, Luisa Fernanda; Roa Peña, Nataly Jineth
Source: Ibagué: Universidad del Tolima, 2015.; 170 COL CO Publisher: Ibagué: Universidad del Tolima, 2015.; 170 COL CO
Type: Undergraduate thesis (Trabajo de grado - Pregrado); Text; info:eu-repo/semantics/bachelorThesis; info:eu-repo/semantics/updatedVersion Format: application/pdf
Portuguese
Search relevance
56.788613%
60 pages; The first part of the work presents the basic elements of the theory of categorical variables and the k-means method, which is the prototype technique that, with suitable changes to handle categorical variables, gives rise to the k-modes method. The second part provides the essence of the k-modes method, along with the algorithm that implements it and the programming routines needed to apply it. Finally, these topics are applied to a particular case: a sample taken from health service providers (Empresas Prestadoras de Salud) in Tolima; Contents: Acknowledgements; Abstract; Introduction; Objectives; 1. Preliminaries; 1.1. Probability distribution for categorical variables; 1.2. Classification; 1.2.1. Similarity...
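As an illustration of how k-means is typically adapted to categorical data, here is a minimal sketch of the commonly described k-modes scheme: simple-matching (Hamming) dissimilarity replaces Euclidean distance, and per-attribute modes replace means. It is not the thesis's own routine, which the listing does not show; all function names and the `seed` and `max_iter` parameters are assumptions.

```python
import random
from collections import Counter


def hamming(a, b):
    """Simple-matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Most frequent category in each attribute (ties: first one seen)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(records, k, seed=0, max_iter=100):
    """k-means-style loop with modes as cluster prototypes (illustrative)."""
    rng = random.Random(seed)
    modes = rng.sample(records, k)  # initial prototypes drawn from the data
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest mode under simple matching.
        new_labels = [
            min(range(k), key=lambda j: hamming(r, modes[j])) for r in records
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: recompute each cluster's mode attribute by attribute.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mode
                modes[j] = column_modes(members)
    return modes, labels
```

Records are expected as equal-length tuples of category labels, for example `("red", "small", "round")`.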

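An alternative for accelerating the fuzzy K-Means algorithm applied to vector quantization (original title: Uma alternativa de aceleração do algoritmo fuzzy K-Means aplicado à quantização vetorial)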

Madeiro,F.; Galvão,R.R.A.; Ferreira,F.A.B.S.; Cunha,D.C.
Source: Sociedade Brasileira de Matemática Aplicada e Computacional Publisher: Sociedade Brasileira de Matemática Aplicada e Computacional
Type: Scientific journal article Format: text/html
Published on 01/01/2012 Portuguese
Search relevance
66.66925%
Signal compression, digital watermarking and pattern recognition are examples of applications of vector quantization (VQ). A relevant problem in VQ is codebook design. This work presents an alternative for accelerating the fuzzy K-Means algorithm applied to codebook design. Simulation results involving VQ of images and of signals with a Gauss-Markov distribution show that the proposed method increases the convergence speed (reduces the number of iterations) of the fuzzy K-Means algorithm without compromising the quality of the designed codebooks.
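For readers unfamiliar with the baseline being accelerated, the following is a minimal sketch of the standard fuzzy K-Means (fuzzy c-means) iteration, in which each center plays the role of a codevector. The acceleration proposed in the paper is not described in this listing and is not reproduced here; the function names, the fuzzifier default `m=2.0` and the fixed iteration count `n_iter` are assumptions.

```python
def fcm_memberships(points, centers, m=2.0):
    """Standard update: u_ij = 1 / sum_l (d_ij / d_il) ** (2 / (m - 1))."""
    exponent = 1.0 / (m - 1.0)  # applied to squared distances
    memberships = []
    for p in points:
        d2 = [sum((x - c) ** 2 for x, c in zip(p, center)) for center in centers]
        if any(d == 0.0 for d in d2):
            # The point coincides with one or more centers: share membership
            # among those centers only, avoiding a division by zero.
            zeros = [d == 0.0 for d in d2]
            row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
        else:
            inv = [(1.0 / d) ** exponent for d in d2]
            total = sum(inv)
            row = [v / total for v in inv]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, n_clusters, m=2.0):
    """Each center is the mean of all points weighted by u_ij ** m."""
    dim = len(points[0])
    centers = []
    for j in range(n_clusters):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)  # assumed non-zero in this sketch
        centers.append(
            tuple(sum(w * p[d] for w, p in zip(weights, points)) / total
                  for d in range(dim))
        )
    return centers


def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(n_iter):
        u = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, u, len(centers), m)
    return centers, fcm_memberships(points, centers, m)
```

A real codebook design loop would normally stop on a distortion threshold rather than after a fixed `n_iter`; the fixed count only keeps the sketch short.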

Scalable Kernel Clustering: Approximate Kernel k-means

Chitta, Radha; Jin, Rong; Havens, Timothy C.; Jain, Anil K.
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Published on 16/02/2014 Portuguese
Search relevance
56.77296%
Kernel-based clustering algorithms have the ability to capture the non-linear structure in real-world data. Among various kernel-based clustering algorithms, kernel k-means has gained popularity due to its simple iterative nature and ease of implementation. However, its run-time complexity and memory footprint increase quadratically with the size of the data set, and hence large data sets cannot be clustered efficiently. In this paper, we propose an approximation scheme based on randomization, called the Approximate Kernel k-means. We approximate the cluster centers using the kernel similarity between a few sampled points and all the points in the data set. We show that the proposed method achieves better clustering performance than traditional clustering schemes based on low-rank kernel approximation. We also demonstrate that its running time and memory requirements are significantly lower than those of kernel k-means, with only a small reduction in the clustering quality on several public-domain large data sets. We then employ ensemble clustering techniques to further enhance the performance of our algorithm.; Comment: 15 pages, 6 figures, extension of the work "Approximate Kernel k-means: Solution to large scale kernel clustering" published in KDD 2011
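The quadratic cost mentioned in this abstract can be read off the standard kernel k-means distance, a textbook identity rather than a formula taken from the paper. The notation is assumed here: $K$ is the $n \times n$ kernel matrix, $\phi$ the implicit feature map, $C_j$ the set of points in cluster $j$, and $\mu_j$ its center in feature space.

```latex
\[
\lVert \phi(x_i) - \mu_j \rVert^2
= K_{ii}
- \frac{2}{|C_j|} \sum_{l \in C_j} K_{il}
+ \frac{1}{|C_j|^2} \sum_{l \in C_j} \sum_{m \in C_j} K_{lm},
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{l \in C_j} \phi(x_l)
\]
```

Because the centers are never formed explicitly, every assignment step reads entries of the full matrix $K$, which has $n^2$ entries. Approximating the centers from a few sampled points, as the abstract describes, means only the kernel values between those samples and the data are needed.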

Integrating K-means with Quadratic Programming Feature Selection

Prasad, Yamuna; Biswas, K. K.
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Portuguese
Search relevance
56.839272%
Several data mining problems are characterized by high-dimensional data. One popular way to reduce the dimensionality of the data is to perform feature selection, i.e., to select a subset of relevant and non-redundant features. Recently, Quadratic Programming Feature Selection (QPFS) has been proposed, which formulates the feature selection problem as a quadratic program. It has been shown to outperform many existing feature selection methods in a variety of applications. Although QPFS is better than many existing approaches, its running time complexity is cubic in the number of features, which can be computationally expensive even for moderately sized datasets. In this paper we propose a novel method for feature selection that integrates k-means clustering with QPFS. The basic variant of our approach runs k-means to reduce the number of features that need to be passed on to QPFS. We then enhance this idea by gradually refining the feature space from a very coarse clustering to a fine-grained one, interleaving steps of QPFS with k-means clustering. Every step of QPFS helps identify the clusters of irrelevant features (which can then be thrown away), whereas every step of k-means further refines the clusters that are potentially relevant. We show that our iterative refinement of clusters is guaranteed to converge. We provide bounds on the number of distance computations involved in the k-means algorithm. Further...

Robust seed selection algorithm for k-means type algorithms

Pavan, K. Karteeka; Rao, Allam Appa; Rao, A. V. Dattatreya; Sridhar, G. R.
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Published on 07/02/2012 Portuguese
Search relevance
56.70101%
The selection of initial seeds greatly affects the quality of the clusters in k-means type algorithms. Most seed selection methods produce different results in different independent runs. We propose a single, optimal, outlier-insensitive seed selection algorithm for k-means type algorithms as an extension to k-means++. The experimental results on synthetic, real and microarray data sets demonstrate the effectiveness of the new algorithm in producing the clustering results; Comment: 17 pages, 5 tables, 9 figures
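Since the abstract builds on k-means++, a minimal sketch of standard k-means++ seeding may help: each new seed is drawn with probability proportional to its squared distance from the nearest seed already chosen. This is the baseline procedure, not the robust extension proposed in the paper; the function name and the `seed` parameter are assumptions, and the sketch assumes the data contain at least `k` distinct points.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, seed=0):
    """Standard k-means++ seeding (illustrative sketch)."""
    rng = random.Random(seed)
    seeds = [rng.choice(points)]  # first seed: uniform at random
    # d2[i] holds the squared distance from points[i] to its nearest seed.
    d2 = [squared_distance(p, seeds[0]) for p in points]
    while len(seeds) < k:
        # Points far from every existing seed are more likely to be drawn;
        # already-chosen points have weight zero.
        chosen = rng.choices(points, weights=d2, k=1)[0]
        seeds.append(chosen)
        d2 = [min(old, squared_distance(p, chosen)) for old, p in zip(d2, points)]
    return seeds
```

Because the draw is random, different values of `seed` can give different seeds, which is the run-to-run variability the abstract sets out to remove.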

Fuzzy soft rough K-Means clustering approach for gene expression data

Dhanalakshmi, K.; Inbarani, H. Hannah
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Published on 21/12/2012 Portuguese
Search relevance
56.70101%
Clustering is one of the widely used data mining techniques for medical diagnosis. Clustering can be considered the most important unsupervised learning technique. Most clustering methods group data based on distance, and a few methods cluster data based on similarity. Clustering algorithms classify gene expression data into clusters, and functionally related genes are grouped together in an efficient manner. The groupings are constructed such that the degree of relationship is strong among members of the same cluster and weak among members of different clusters. In this work, we focus on a similarity relationship among genes with similar expression patterns so that a consequential and simple analytical decision can be made from the proposed Fuzzy Soft Rough K-Means algorithm. The algorithm is developed based on Fuzzy Soft sets and Rough sets. The proposed work is compared with benchmark algorithms such as K-Means and Rough K-Means, and the efficiency of the proposed algorithm is illustrated using various cluster validity measures such as the DB index and the Xie-Beni index.; Comment: 7 pages, IJSER Vol.3 Issue: 10 Oct 2012

Learning Manifolds with K-Means and K-Flats

Canas, Guillermo D.; Poggio, Tomaso; Rosasco, Lorenzo
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Portuguese
Search relevance
56.82376%
We study the problem of estimating a manifold from random samples. In particular, we consider piecewise constant and piecewise linear estimators induced by k-means and k-flats, and analyze their performance. We extend previous results for k-means in two separate directions. First, we provide new results for k-means reconstruction on manifolds and, secondly, we prove reconstruction bounds for higher-order approximation (k-flats), for which no known results were previously available. While the results for k-means are novel, some of the technical tools are well-established in the literature. In the case of k-flats, both the results and the mathematical tools are new.; Comment: 19 pages, 2 figures; Advances in Neural Information Processing Systems, NIPS 2012

Embed and Conquer: Scalable Embeddings for Kernel k-Means on MapReduce

Elgohary, Ahmed; Farahat, Ahmed K.; Kamel, Mohamed S.; Karray, Fakhri
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Portuguese
Search relevance
56.86603%
The kernel $k$-means is an effective method for data clustering which extends the commonly used $k$-means algorithm to work on a similarity matrix over complex data structures. The kernel $k$-means algorithm is, however, computationally very complex, as it requires the complete data matrix to be calculated and stored. Further, the kernelized nature of the kernel $k$-means algorithm hinders the parallelization of its computations on modern infrastructures for distributed computing. In this paper, we define a family of kernel-based low-dimensional embeddings that allows for scaling kernel $k$-means on MapReduce via an efficient and unified parallelization strategy. Afterwards, we propose two methods for low-dimensional embedding that adhere to our definition of the embedding family. Exploiting the proposed parallelization strategy, we present two scalable MapReduce algorithms for kernel $k$-means. We demonstrate the effectiveness and efficiency of the proposed algorithms through an empirical evaluation on benchmark data sets.; Comment: Appears in Proceedings of the SIAM International Conference on Data Mining (SDM), 2014

Performance Evaluation of Incremental K-means Clustering Algorithm

Chakraborty, Sanjay; Nagwani, N. K.
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Published on 18/06/2014 Portuguese
Search relevance
56.897886%
The incremental K-means clustering algorithm was proposed and analysed in [Chakraborty and Nagwani, 2011]. It is an innovative approach applicable to periodically incremental environments that must deal with bulk updates. In this paper, the performance of this incremental K-means clustering algorithm is evaluated using an air pollution database. The paper also compares the performance of existing K-means clustering and incremental K-means clustering on that database. It also determines the point of change in the database up to which incremental K-means clustering performs much better than existing K-means clustering. That point of change is known as the "threshold value" or "% delta change in the database". This paper also defines the basic methodology for the incremental K-means clustering algorithm.
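The abstract does not spell out the incremental update, so the following is only a generic sketch of one common way to absorb a new record without re-clustering: assign it to the nearest existing centroid and adjust that centroid with a running mean. The function name `add_point` and the `counts` bookkeeping are assumptions, and this should not be read as the authors' algorithm.

```python
def add_point(centroids, counts, point):
    """Assign one new point to its nearest centroid and update it in place.

    centroids: list of coordinate tuples, one per cluster
    counts:    list with the number of points already in each cluster
    Returns the index of the cluster that received the point.
    """
    j = min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )
    counts[j] += 1
    # Running-mean update: new_mean = old_mean + (x - old_mean) / n
    centroids[j] = tuple(
        c + (x - c) / counts[j] for c, x in zip(centroids[j], point)
    )
    return j
```

Each insertion costs time proportional to the number of clusters, whereas re-running K-means revisits every record; that difference is what makes a threshold on the amount of change in the database a natural point of comparison.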

Performance Comparison of Incremental K-means and Incremental DBSCAN Algorithms

Chakraborty, Sanjay; Nagwani, N. K.; Dey, Lopamudra
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Published on 18/06/2014 Portuguese
Search relevance
56.70101%
Incremental K-means and DBSCAN are two important and popular clustering techniques for today's large dynamic databases (data warehouses, the WWW and so on), where data change in a random fashion. The performance of incremental K-means and incremental DBSCAN differs according to their time analysis characteristics. Both algorithms are efficient compared to their existing counterparts with respect to time, cost and effort. In this paper, the performance of the incremental DBSCAN clustering algorithm is evaluated and, most importantly, compared with the performance of the incremental K-means clustering algorithm; the paper also explains the characteristics of these two algorithms based on the changes to the data in the database. It also explains some logical differences between these two popular clustering algorithms. An air pollution database is used as the original database on which the experiment is performed.

Weather Forecasting using Incremental K-means Clustering

Chakraborty, Sanjay; Nagwani, N. K.; Dey, Lopamudra
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Published on 18/06/2014 Portuguese
Search relevance
56.70101%
Clustering is a powerful tool that has been used in several forecasting tasks, such as time series forecasting, real-time storm detection, flood forecasting and so on. In this paper, a generic methodology for weather forecasting is proposed with the help of the incremental K-means clustering algorithm. Weather forecasting plays an important role in day-to-day applications. The weather forecasting in this paper is based on the incremental air pollution database of West Bengal for the years 2009 and 2010. The paper applies typical K-means clustering to the main air pollution database, and a list of weather categories is developed based on the maximum mean values of the clusters. When new data arrive, incremental K-means is used to group them into the clusters whose weather category has already been defined. This builds up a strategy to predict the weather for the upcoming data of the upcoming days. The forecasting database is based entirely on the weather of West Bengal, and the forecasting methodology is developed to mitigate the impacts of air pollution and to launch focused modeling computations for the prediction and forecasting of weather events. The accuracy of this approach is also measured.

Detection of Masses in Digital Mammograms using K-means and Support Vector Machine

Oliveira Martins, Leonardo de; Braz Junior, Geraldo; Corrêa Silva, Aristófanes; Cardoso de Paiva, Anselmo; Gattass, Marcelo
Source: Autonomous University of Barcelona Publisher: Autonomous University of Barcelona
Type: Scientific journal article Format: application/pdf
Published on //2009 Portuguese
Search relevance
66.550015%
Breast cancer is a serious public health problem in several countries. Computer Aided Detection/Diagnosis systems (CAD/CADx) have been used with relative success to aid health care professionals. The goal of such systems is to contribute to the specialist's task by aiding in the detection of different types of cancer at an early stage. This work presents a methodology for mass detection in digitized mammograms using the K-means algorithm for image segmentation and a co-occurrence matrix to describe the texture of the segmented structures. These structures are classified with Support Vector Machines, which use shape and texture descriptors to separate them into two groups: masses and non-masses. The methodology achieved 85% accuracy.

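Electrode Selection Based on k-means for the Classification of Motor Activity in EEG (original title: Selección de Electrodos Basada en k-means para la Clasificación de Actividad Motora en EEG)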

Lemuz-López,R.; Gómez-López,W.; Ayaquica-Martínez,I.; Guillén-Galván,C.
Source: Sociedad Mexicana de Ingeniería Biomédica Publisher: Sociedad Mexicana de Ingeniería Biomédica
Type: Scientific journal article Format: text/html
Published on 01/01/2014 Portuguese
Search relevance
66.550015%
An algorithm is presented for selecting the group of electrodes related to motor imagery. The algorithm uses the clustering technique called k-means to form groups of sensors and selects the group corresponding to the highest correlated activity. To evaluate the electrode selection, the classification rate is computed by applying the projective decomposition called common spatial patterns and a linear discriminant in a single-trial test to distinguish imagined movement of the left hand from that of the right foot. This proposal significantly reduces the number of electrodes from 118 to 35, while also improving the classification rate.