Page 14 of results | 471 digital items found in 0.067 seconds

Linear dimensionality reduction applied to SIFT and SURF feature descriptors

Ricardo Eugenio González Valenzuela
Source: Biblioteca Digital da Unicamp | Publisher: Biblioteca Digital da Unicamp
Type: Master's thesis | Format: application/pdf
Published on 20/02/2014 | Language: Portuguese
Search relevance
370.54918%
Robust local descriptors typically consist of high-dimensional feature vectors that describe discriminative attributes in images. The high dimensionality of a feature vector entails considerable costs in computational time and storage requirements, affecting the performance of several tasks that rely on feature descriptors, such as image matching, retrieval, and classification. To address these problems, dimensionality reduction techniques can be applied, essentially by constructing a projection matrix that adequately captures the significance of the data in other bases. This dissertation applies linear dimensionality reduction techniques to the SIFT and SURF descriptors. Its main objective is to demonstrate that, even at the risk of reducing the accuracy of the feature vectors, dimensionality reduction can yield a suitable trade-off between computational time and storage resources. Linear dimensionality reduction is performed using techniques such as random projections (RP), principal component analysis (PCA), linear discriminant analysis (LDA), and partial least squares (PLS)...
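
To make the trade-off concrete, the sketch below projects synthetic 128-dimensional SIFT-like vectors down to 32 dimensions with PCA, one of the techniques listed above. The target dimension, the random data, and the scikit-learn usage are illustrative assumptions, not the dissertation's actual setup.

    # Minimal sketch: projecting 128-D SIFT-like descriptors to 32-D with PCA.
    # The 32-D target and the random data are placeholders, not the thesis setup.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    descriptors = rng.normal(size=(10000, 128))   # stand-in for real SIFT vectors

    pca = PCA(n_components=32)                    # learns a 128x32 projection matrix
    reduced = pca.fit_transform(descriptors)

    print(reduced.shape)                          # (10000, 32): 4x less storage
    print(pca.explained_variance_ratio_.sum())    # variance retained by 32 dims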

sTeXIDE: An Integrated Development Environment for sTeX Collections

Jucovschi, Constantin; Kohlhase, Michael
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 29/05/2010 | Language: Portuguese
Search relevance
371.04156%
Authoring documents in MKM formats like OMDoc is a very tedious task. After years of working on a semantically annotated corpus of sTeX documents (GenCS), we identified a set of common, time-consuming subtasks, which can be supported in an integrated authoring environment. We have adapted the modular Eclipse IDE into sTeXIDE, an authoring solution for enhancing productivity in contributing to sTeX-based corpora. sTeXIDE supports context-aware command completion, module management, semantic macro retrieval, and theory graph navigation.; Comment: To appear in The 9th International Conference on Mathematical Knowledge Management: MKM 2010

The CALBC RDF Triple Store: retrieval over large literature content

Croset, Samuel; Grabmüller, Christoph; Li, Chen; Kavaliauskas, Silvestras; Rebholz-Schuhmann, Dietrich
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 07/12/2010 | Language: Portuguese
Search relevance
371.04156%
Integrating the scientific literature into a biomedical research infrastructure requires processing the literature, identifying the contained named entities (NEs) and concepts, and representing the content in a standardised way. The CALBC project partners (PPs) have produced a large-scale annotated biomedical corpus with four different semantic groups through the harmonisation of annotations from automatic text mining solutions (Silver Standard Corpus, SSC). The four semantic groups were chemical entities and drugs (CHED), genes and proteins (PRGE), diseases and disorders (DISO) and species (SPE). The content of the SSC has been fully integrated into an RDF triple store (4,568,678 triples) and has been aligned with content from the GeneAtlas (182,840 triples), UniProtKb (12,552,239 triples for human) and the lexical resource LexEBI (BioLexicon). The triple store enables querying the scientific literature and bioinformatics resources at the same time for evidence of genetic causes, such as drug targets and disease involvement.; Comment: in Adrian Paschke, Albert Burger, Andrea Splendiani, M. Scott Marshall, Paolo Romano: Proceedings of the 3rd International Workshop on Semantic Web Applications and Tools for the Life Sciences...
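
The kind of query the abstract describes can be illustrated with SPARQL over a small in-memory RDF graph via rdflib; this is only a sketch, and every URI and predicate name below is a hypothetical placeholder, since the CALBC store's real vocabulary is not given here.

    # Hedged sketch: a SPARQL query over SSC-style triples with rdflib.
    # The ex: vocabulary is invented for illustration.
    from rdflib import Graph, Literal, Namespace

    EX = Namespace("http://example.org/")
    g = Graph()
    g.add((EX.doc1, EX.mentions, EX.BRCA1))               # document mentions a gene
    g.add((EX.BRCA1, EX.semanticGroup, Literal("PRGE")))  # gene/protein group

    query = """
        PREFIX ex: <http://example.org/>
        SELECT ?doc ?gene
        WHERE {
            ?doc ex:mentions ?gene .
            ?gene ex:semanticGroup "PRGE" .
        }
    """
    for doc, gene in g.query(query):
        print(doc, gene)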

Clustering is Efficient for Approximate Maximum Inner Product Search

Auvolat, Alex; Chandar, Sarath; Vincent, Pascal; Larochelle, Hugo; Bengio, Yoshua
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Language: Portuguese
Search relevance
371.04156%
Efficient Maximum Inner Product Search (MIPS) is an important task with wide applicability in recommendation systems and classification with a large number of classes. Solutions based on locality-sensitive hashing (LSH) as well as tree-based solutions have been investigated in the recent literature to perform approximate MIPS in sublinear time. In this paper, we compare these to another extremely simple approach for solving approximate MIPS, based on variants of the k-means clustering algorithm. Specifically, we propose to train a spherical k-means model, after having reduced the MIPS problem to a Maximum Cosine Similarity Search (MCSS). Experiments on two standard recommendation system benchmarks as well as on large vocabulary word embeddings show that this simple approach yields much higher speedups, for the same retrieval precision, than current state-of-the-art hashing-based and tree-based methods. This simple method also yields more robust retrievals when the query is corrupted by noise.; Comment: 10 pages, Under review at ICLR 2016
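
A minimal sketch of the approach as summarised above: reduce MIPS to MCSS by appending one coordinate so all database vectors share the same norm, cluster the normalised vectors, and search only the clusters nearest to the query. Cluster and probe counts are illustrative choices, and plain Euclidean k-means on unit vectors stands in for spherical k-means.

    import numpy as np
    from sklearn.cluster import KMeans

    def augment(X):
        # MIPS -> MCSS: append sqrt(m^2 - ||x||^2); every augmented vector then
        # has norm m, so cosine ranking equals inner-product ranking.
        norms = np.linalg.norm(X, axis=1)
        m = norms.max()
        return np.hstack([X, np.sqrt(m**2 - norms**2)[:, None]])

    rng = np.random.default_rng(0)
    base = rng.normal(size=(5000, 64))
    aug = augment(base)
    unit = aug / np.linalg.norm(aug, axis=1, keepdims=True)

    # Euclidean k-means on unit vectors as a stand-in for spherical k-means.
    km = KMeans(n_clusters=50, n_init=10, random_state=0).fit(unit)

    def approx_mips(q, probes=5):
        qa = np.append(q, 0.0)                # query keeps its direction
        qa /= np.linalg.norm(qa)
        nearest = np.argsort(-(km.cluster_centers_ @ qa))[:probes]
        cand = np.flatnonzero(np.isin(km.labels_, nearest))
        return cand[np.argmax(base[cand] @ q)]  # exact MIPS within candidates

    q = rng.normal(size=64)
    print(approx_mips(q), np.argmax(base @ q))  # approximate vs exact answer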

The Tomaco Hybrid Matching Framework for SAWSDL Semantic Web Services

Stavropoulos, Thanos G.; Andreadis, Stelios; Bassiliades, Nick; Vrakas, Dimitris; Vlahavas, Ioannis
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 21/10/2014 | Language: Portuguese
Search relevance
371.04156%
This work aims to resolve issues related to Web Service retrieval, also known as Service Selection, Discovery or, essentially, Matching, in two directions. Firstly, a novel matching algorithm for SAWSDL is introduced. The algorithm is hybrid in nature, combining novel and known concepts, such as a logic-based strategy and syntactic text-similarity measures over semantic annotations and textual descriptions. A plugin for the S3 contest environment was developed in order to position Tomaco amongst the state of the art in an objective, reproducible manner. Evaluation showed that Tomaco ranks high amongst state-of-the-art systems, especially at early recall levels. Secondly, this work introduces the Tomaco web application, which aims to accelerate the widespread adoption of Semantic Web Service technologies and algorithms while addressing the lack of user-friendly applications in this field. Tomaco integrates a variety of configurable matching algorithms proposed in this paper. Finally, it allows discovery of both existing and user-contributed service collections and ontologies, serving also as a service registry.; Comment: Under review. Keywords: Web Services Discovery, Intelligent Web Services and Semantic Web, Internet reasoning services, Web-based services
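
A toy illustration of the hybrid idea: a made-up logic-based matcher over sets of annotation concepts is combined with a generic string-similarity measure. The scoring scale, the weighting, and both inputs are invented for the example and do not reproduce Tomaco's actual strategies.

    from difflib import SequenceMatcher

    LOGIC_SCORE = {"exact": 1.0, "plugin": 0.8, "subsumes": 0.6, "fail": 0.0}

    def match_annotations(req, off):
        # toy logic-based matcher over sets of ontology concept names
        if req == off:
            return "exact"
        if req <= off:
            return "plugin"      # offer covers everything requested, and more
        if off <= req:
            return "subsumes"
        return "fail"

    def hybrid_score(request, offer, alpha=0.5):
        logic = LOGIC_SCORE[match_annotations(request["annotations"], offer["annotations"])]
        text = SequenceMatcher(None, request["text"], offer["text"]).ratio()
        return alpha * logic + (1 - alpha) * text   # weighted combination

    request = {"annotations": {"Book", "Price"}, "text": "get the price of a book"}
    offer = {"annotations": {"Book", "Price", "Currency"}, "text": "book price lookup service"}
    print(hybrid_score(request, offer))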

Simplifying Complex Software Assembly: The Component Retrieval Language and Implementation

Seidel, Eric L.; Allen, Gabrielle; Brandt, Steven; Löffler, Frank; Schnetter, Erik
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 07/09/2010 | Language: Portuguese
Search relevance
371.04156%
Assembling simulation software along with the associated tools and utilities is a challenging endeavor, particularly when the components are distributed across multiple source code versioning systems. It is problematic for researchers compiling and running the software across many different supercomputers, as well as for novices in a field who are often presented with a bewildering list of software to collect and install. In this paper, we describe a language (CRL) for specifying software components with the details needed to obtain them from source code repositories. The language supports public and private access. We describe a tool called GetComponents which implements CRL and can be used to assemble software. We demonstrate the tool for application scenarios with the Cactus Framework on the NSF TeraGrid resources. The tool itself is distributed with an open source license and freely available from our web page.; Comment: 8 pages, 5 figures, TeraGrid 2010
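
The flavour of such a specification can be sketched with an invented INI-style component list; the real CRL syntax is defined in the paper, and the component names and URLs below are placeholders.

    import configparser

    # Hypothetical stand-in for a CRL component list: each entry names a
    # component and where to fetch it from. Not the actual CRL grammar.
    SPEC = """
    [flesh]
    tool = svn
    url = https://example.org/svn/flesh/trunk

    [einsteinbase]
    tool = git
    url = https://example.org/git/einsteinbase
    """

    cfg = configparser.ConfigParser()
    cfg.read_string(SPEC)
    for name in cfg.sections():
        tool, url = cfg[name]["tool"], cfg[name]["url"]
        cmd = [tool, "checkout" if tool == "svn" else "clone", url, name]
        print(" ".join(cmd))   # a real tool would run this via subprocess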

Improving zero-shot learning by mitigating the hubness problem

Dinu, Georgiana; Lazaridou, Angeliki; Baroni, Marco
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Language: Portuguese
Search relevance
371.04156%
The zero-shot paradigm exploits vector-based word representations extracted from text corpora with unsupervised methods to learn general mapping functions from other feature spaces onto word space, where the words associated with the nearest neighbours of the mapped vectors are used as their linguistic labels. We show that the neighbourhoods of the mapped elements are strongly polluted by hubs, vectors that tend to be near a high proportion of items, pushing their correct labels down the neighbour list. After illustrating the problem empirically, we propose a simple method to correct it by taking the proximity distribution of potential neighbours across many mapped vectors into account. We show that this correction leads to consistent improvements in realistic zero-shot experiments in the cross-lingual, image labeling and image retrieval domains.
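
A sketch of one plausible rank-based correction in the spirit described: score each candidate label by the rank the query obtains among that label's similarities to many mapped pivot vectors, so that hubs (near everything) are demoted. All data and sizes are synthetic placeholders.

    import numpy as np

    rng = np.random.default_rng(0)
    targets = rng.normal(size=(1000, 50))   # word-space label vectors
    pivots = rng.normal(size=(200, 50))     # many mapped query vectors

    def cosines(A, B):
        A = A / np.linalg.norm(A, axis=1, keepdims=True)
        B = B / np.linalg.norm(B, axis=1, keepdims=True)
        return A @ B.T

    piv_sims = cosines(targets, pivots)     # (targets, pivots) similarities

    def corrected_nn(query):
        q_sim = cosines(targets, query[None, :])[:, 0]
        # rank of the query among each target's pivot similarities:
        # a hub is similar to many pivots, so the query ranks low for it.
        rank = (piv_sims > q_sim[:, None]).sum(axis=1)
        return np.lexsort((-q_sim, rank))[0]  # best rank wins, cosine breaks ties

    print(corrected_nn(rng.normal(size=50)))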

Generating Weather Forecast Texts with Case Based Reasoning

Adeyanju, Ibrahim
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 03/09/2015 | Language: Portuguese
Search relevance
371.04156%
Several techniques have been used to generate weather forecast texts. In this paper, case-based reasoning (CBR) is proposed for weather forecast text generation because similar weather conditions occur over time and should have similar forecast texts. CBR-METEO, a system for generating weather forecast texts, was developed using a generic framework (jCOLIBRI) which provides modules for the standard components of the CBR architecture. The advantage of a CBR approach is that systems can be built in minimal time with far less human effort after initial consultation with experts. The approach depends heavily on the quality of the retrieval and revision components of the CBR process. We evaluated CBR-METEO with NIST, an automated metric which has been shown to correlate well with human judgements in this domain. The system shows comparable performance with other NLG systems that perform the same task.; Comment: 6 pages
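
The retrieve step of such a system can be sketched as nearest-neighbour lookup over weather-attribute vectors; the attributes, case base, and distance function below are illustrative stand-ins, not CBR-METEO's actual representation.

    import math

    # toy case base: (weather attributes, forecast text) pairs
    case_base = [
        ({"wind": 10, "temp": 18, "rain": 0.0}, "Light winds. Dry and mild."),
        ({"wind": 35, "temp": 12, "rain": 0.8}, "Strong winds with heavy showers."),
    ]

    def distance(a, b):
        return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))

    def retrieve(query):
        # nearest stored case under the attribute distance
        return min(case_base, key=lambda case: distance(case[0], query))

    conditions = {"wind": 30, "temp": 13, "rain": 0.6}
    print(retrieve(conditions)[1])   # reuse the nearest case's text, then revise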

Future Robotics Database Management System along with Cloud TPS

S, Vijaykumar; G, Saravanakumar S
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 09/12/2011 | Language: Portuguese
Search relevance
371.04156%
This paper deals with memory management issues in robotics, and our proposal tackles one of the major issues in creating a humanoid. Database design is a complicated part of a robotics schema, so we suggest a NoSQL database for effective data retrieval, giving humanoid robots massive thinking ability when searching for items using chained instructions. Query transactions in robotics require effective consistency, so we use a recent technology called CloudTPS, which guarantees full ACID properties; the robot can thus make its queries using multi-item transactions and obtain data consistency in retrieval. In addition, we include map-reduce concepts: the job is split among the respective workers so that the data can be processed in parallel.; Comment: 12 pages, 5 figures, First model; International Journal on Cloud Computing: Services and Architecture (IJCCSA), Vol.1, No.3, November 2011

Web Usage Mining Using Artificial Ant Colony Clustering and Genetic Programming

Abraham, Ajith; Ramos, Vitorino
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 17/12/2004 | Language: Portuguese
Search relevance
371.04156%
The rapid growth of e-commerce has put both the business community and customers in a new situation. Due to intense competition on the one hand, and the customer's option to choose from several alternatives on the other, the business community has realized the necessity of intelligent marketing strategies and relationship management. Web usage mining attempts to discover useful knowledge from the secondary data obtained from the interactions of the users with the Web. Web usage mining has become very critical for effective Web site management, creating adaptive Web sites, business and support services, personalization, network traffic flow analysis, and so on. The study of ant colony behaviour and its self-organizing capabilities is of interest to knowledge retrieval/management and decision support systems sciences, because it provides models of distributed adaptive organization, which are useful for solving difficult optimization, classification, and distributed control problems, among others. In this paper, we propose an ant clustering algorithm to discover Web usage patterns (data clusters) and a linear genetic programming approach to analyze visitor trends. Empirical results clearly show that ant colony clustering performs well when compared to a self-organizing map (for clustering Web usage patterns), even though its accuracy is lower than that of the evolutionary-fuzzy clustering (i-miner) approach. KEYWORDS: Web Usage Mining...

Ontological Representations of Software Patterns

Rosengard, Jean-Marc; Ursu, Marian
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 14/05/2006 | Language: Portuguese
Search relevance
371.04156%
This paper builds on and advocates the trend in software engineering of extending the use of software patterns as a means of structuring solutions to software development problems (be they motivated by best practice or by company interests and policies). The paper argues, on the one hand, that this development requires tools for the automatic organisation, retrieval and explanation of software patterns, and, on the other hand, that the existence of such tools will itself facilitate the further development and employment of patterns in the software development process. The paper analyses existing pattern representations and concludes that they are inadequate for the kind of automation intended here. Adopting a standpoint similar to that taken in the semantic web, the paper proposes that feasible solutions can be built on the basis of ontological representations.; Comment: 7 pages

Euclidean Distance Matrices: Essential Theory, Algorithms and Applications

Dokmanic, Ivan; Parhizkar, Reza; Ranieri, Juri; Vetterli, Martin
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Language: Portuguese
Search relevance
371.04156%
Euclidean distance matrices (EDM) are matrices of squared distances between points. The definition is deceptively simple: thanks to their many useful properties they have found applications in psychometrics, crystallography, machine learning, wireless sensor networks, acoustics, and more. Despite the usefulness of EDMs, they seem to be insufficiently known in the signal processing community. Our goal is to rectify this omission in a concise tutorial. We review the fundamental properties of EDMs, such as rank or (non)definiteness. We show how various EDM properties can be used to design algorithms for completing and denoising distance data. Along the way, we demonstrate applications to microphone position calibration, ultrasound tomography, room reconstruction from echoes and phase retrieval. By spelling out the essential algorithms, we hope to fast-track the readers in applying EDMs to their own problems. Matlab code for all the described algorithms, and to generate the figures in the paper, is available online. Finally, we suggest directions for further research.; Comment: - 17 pages, 12 figures, to appear in IEEE Signal Processing Magazine - change of title in the last revision
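
Two of the fundamental properties mentioned above are easy to verify numerically: an EDM built from n points in d dimensions has rank at most d + 2, and classical multidimensional scaling recovers the point set (up to a rigid transform) from the squared distances alone. A short sketch:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 8, 3
    X = rng.normal(size=(n, d))

    G = X @ X.T                                        # Gram matrix
    edm = np.diag(G)[:, None] - 2 * G + np.diag(G)[None, :]

    print(np.linalg.matrix_rank(edm))                  # at most d + 2 = 5

    # classical MDS: double-centre the EDM to recover a Gram matrix
    J = np.eye(n) - np.ones((n, n)) / n
    G_rec = -0.5 * J @ edm @ J
    vals, vecs = np.linalg.eigh(G_rec)
    X_rec = vecs[:, -d:] * np.sqrt(np.maximum(vals[-d:], 0))

    # X_rec matches X up to rotation/translation; pairwise distances agree
    D_rec = np.linalg.norm(X_rec[:, None] - X_rec[None, :], axis=2) ** 2
    print(np.allclose(D_rec, edm))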

A Dataflow Language for Decentralised Orchestration of Web Service Workflows

Jaradat, Ward; Dearle, Alan; Barker, Adam
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 08/05/2013 | Language: Portuguese
Search relevance
371.04156%
Orchestrating centralised service-oriented workflows presents significant scalability challenges that include: the consumption of network bandwidth, degradation of performance, and single points of failure. This paper presents a high-level dataflow specification language that attempts to address these scalability challenges. This language provides simple abstractions for orchestrating large-scale web service workflows, and separates the workflow logic from its execution. It is based on a data-driven model that permits parallelism to improve the workflow performance. We provide a decentralised architecture that allows the computation logic to be moved "closer" to services involved in the workflow. This is achieved through partitioning the workflow specification into smaller fragments that may be sent to remote orchestration services for execution. The orchestration services rely on proxies that exploit connectivity to services in the workflow. These proxies perform service invocations and compositions on behalf of the orchestration services, and carry out data collection, retrieval, and mediation tasks. The evaluation of our architecture implementation concludes that our decentralised approach reduces the execution time of workflows...

Large Scale Visual Recommendations From Street Fashion Images

Jagadeesh, Vignesh; Piramuthu, Robinson; Bhardwaj, Anurag; Di, Wei; Sundaresan, Neel
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 08/01/2014 | Language: Portuguese
Search relevance
371.04156%
We describe a completely automated large scale visual recommendation system for fashion. Our focus is to efficiently harness the availability of large quantities of online fashion images and their rich meta-data. Specifically, we propose four data driven models in the form of Complementary Nearest Neighbor Consensus, Gaussian Mixture Models, Texture Agnostic Retrieval and Markov Chain LDA for solving this problem. We analyze relative merits and pitfalls of these algorithms through extensive experimentation on a large-scale data set and baseline them against existing ideas from color science. We also illustrate key fashion insights learned through these experiments and show how they can be employed to design better recommendation systems. Finally, we also outline a large-scale annotated data set of fashion images (Fashion-136K) that can be exploited for future vision research.

A network file system over HTTP: remote access and modification of files and "files"

Kiselyov, Oleg
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 15/03/2000 | Language: Portuguese
Search relevance
371.04156%
The goal of the present HTTPFS project is to enable access to remote files, directories, and other containers through an HTTP pipe. The HTTPFS system permits retrieval, creation and modification of these resources as if they were regular files and directories on a local filesystem. The remote host can be any UNIX or Win9x/WinNT box that is capable of running a Perl CGI script and accessible either directly or via a web proxy or a gateway. HTTPFS runs entirely in user space. The current implementation fully supports reading as well as creating, writing, appending, and truncating of files on a remote HTTP host. HTTPFS provides an isolation level for concurrent file access stronger than the one mandated by POSIX file system semantics, closer to that of AFS. Both an API with familiar open(), read(), write(), close(), etc. calls, and an interactive interface, via the popular Midnight Commander file browser, are provided.; Comment: This present document combines a paper and a Freenix Track talk presented at a 1999 USENIX Annual Technical Conference, June 6-11, 1999; Monterey, CA, USA; 6 HTML files. The paper alone appeared in Proc. FREENIX Track: 1999 USENIX Annual Technical Conference, June 6-11, 1999; Monterey, CA, USA, pp. 75-80
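
The underlying idea of file access through an HTTP pipe can be sketched in a few lines of Python; this is not the HTTPFS code itself (a Perl CGI script plus a client-side API), and the gateway URL and query-parameter scheme below are invented placeholders.

    import urllib.request

    BASE = "http://example.org/cgi-bin/httpfs.pl"   # hypothetical gateway

    def read_file(path):
        # fetch a remote file's contents through the HTTP pipe
        with urllib.request.urlopen(f"{BASE}?read={path}") as resp:
            return resp.read()

    def write_file(path, data):
        # replace a remote file's contents; PUT is one plausible choice
        req = urllib.request.Request(f"{BASE}?write={path}", data=data, method="PUT")
        with urllib.request.urlopen(req) as resp:
            return resp.status

    # example usage (requires such a gateway to actually be running):
    # print(read_file("/etc/motd")[:80])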

Two Steps Feature Selection and Neural Network Classification for the TREC-8 Routing

Stricker, Mathieu; Vichot, Frantz; Dreyfus, Gerard; Wolinski, Francis
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 11/07/2000 | Language: Portuguese
Search relevance
371.04156%
For the TREC-8 routing, one specific filter is built for each topic. Each filter is a classifier trained to recognize the documents that are relevant to the topic. When presented with a document, each classifier estimates the probability for the document to be relevant to the topic for which it has been trained. Since the procedure for building a filter is topic-independent, the system is fully automatic. By making use of a sample of documents that have previously been evaluated as relevant or not relevant to a particular topic, a term selection is performed, and a neural network is trained. Each document is represented by a vector of frequencies of a list of selected terms. This list depends on the topic to be filtered; it is constructed in two steps. The first step defines the characteristic words used in the relevant documents of the corpus; the second one chooses, among the previous list, the most discriminant ones. The length of the vector is optimized automatically for each topic. At the end of the term selection, a vector of typically 25 words is defined for the topic, so that each document which has to be processed is represented by a vector of term frequencies. This vector is subsequently input to a classifier that is trained from the same sample. After training...
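The overall pipeline shape (term selection followed by a per-topic classifier outputting a relevance probability) might look as follows; the chi-squared selection and the tiny toy sample are simplified stand-ins for the paper's two-step procedure and training corpus.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.feature_selection import SelectKBest, chi2
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline

    # toy labelled sample for one topic (1 = relevant)
    docs = ["oil price falls", "crude oil exports rise",
            "football cup final", "match won late"]
    relevant = [1, 1, 0, 0]

    pipe = make_pipeline(
        CountVectorizer(),                 # candidate characteristic words
        SelectKBest(chi2, k=3),            # keep the most discriminant terms
        MLPClassifier(hidden_layer_sizes=(5,), max_iter=2000, random_state=0),
    )
    pipe.fit(docs, relevant)
    print(pipe.predict_proba(["oil market news"])[0, 1])   # estimated relevance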

A vulnerability in Google AdSense: Automatic extraction of links to ads

Ochando, Manuel Blázquez
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 25/09/2015 | Language: Portuguese
Search relevance
371.04156%
On the basis of XSS (Cross-Site Scripting) and web crawler techniques, it is possible to bypass the barriers of the Google AdSense advertising system by obtaining the validated links of the ads published on a website. This method involves obtaining the source code built by the Google Java applet for publishing and handling ads and for the final link retrieval. Once the links of the ads have been obtained, user sessions visiting other websites can be used to load such links in the background, by a simple redirection through a hidden iframe, so that the IP addresses clicking are different in each case.; Comment: 8 figures

Institutional repository 'eKMAIR': establishing and populating a research repository for the National University "Kyiv Mohyla Academy"

Yaroshenko, Tetiana
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Language: Portuguese
Search relevance
371.04156%
University libraries have an increasingly important role to play in supporting open access publishing and dissemination of research outputs. In particular, many libraries are playing a leading role in establishing and managing institutional repositories. Institutional repositories are, most often, Open Access Initiative (OAI)-compliant databases of a university or other research institution's intellectual output, most typically research papers, although many other forms of digital media can also be stored and disseminated. Their main function is to provide improved access to the full text of research articles and improve retrieval of relevant research. The National University "Kyiv Mohyla Academy" is a small-sized institution with approximately 3,000 students and 500 academic staff. Although it is a teaching-intensive university, developing research and knowledge-transfer capacity is a strategic priority and four research institutes have been established, with further research activity going on in the academic schools and research centres.; Comment: This paper has been administratively withdrawn because it was submitted under false pretenses by someone other than Tetiana Yaroshenko and she did not write any of the text

Deep Belief Nets for Topic Modeling

Maaloe, Lars; Arngren, Morten; Winther, Ole
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 18/01/2015 | Language: Portuguese
Search relevance
371.04156%
Applying traditional collaborative filtering to digital publishing is challenging because user data is very sparse due to the high volume of documents relative to the number of users. Content-based approaches, on the other hand, are attractive because textual content is often very informative. In this paper we describe large-scale content-based collaborative filtering for digital publishing. To solve the digital publishing recommender problem we compare two approaches: latent Dirichlet allocation (LDA) and deep belief nets (DBN) that both find low-dimensional latent representations for documents. Efficient retrieval can be carried out in the latent representation. We work both on public benchmarks and digital media content provided by Issuu, an online publishing platform. This article also comes with a newly developed deep belief nets toolbox for topic modeling tailored towards performance evaluation of the DBN model and comparisons to the LDA model.; Comment: Accepted to the ICML-2014 Workshop on Knowledge-Powered Deep Learning for Text Mining
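
Retrieval in a latent representation, as described above, can be sketched with scikit-learn's LDA in place of the paper's DBN toolbox: documents are mapped to topic mixtures, and the nearest document to a query is found in that space. The corpus and topic count are toy placeholders.

    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    docs = [
        "stock market trading shares",
        "football league match goals",
        "market prices and shares fall",
        "the match ended with two goals",
    ]
    vec = CountVectorizer()
    counts = vec.fit_transform(docs)

    lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
    latent = lda.transform(counts)               # documents as topic mixtures

    query = lda.transform(vec.transform(["share prices on the market"]))
    sims = (latent @ query.T).ravel() / (
        np.linalg.norm(latent, axis=1) * np.linalg.norm(query)
    )
    print(docs[int(np.argmax(sims))])            # nearest document in topic space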

Exhaustive and Efficient Constraint Propagation: A Semi-Supervised Learning Perspective and Its Applications

Lu, Zhiwu; Ip, Horace H. S.; Peng, Yuxin
Source: Universidade Cornell | Publisher: Universidade Cornell
Type: Journal article
Published on 21/09/2011 | Language: Portuguese
Search relevance
371.04156%
This paper presents a novel pairwise constraint propagation approach by decomposing the challenging constraint propagation problem into a set of independent semi-supervised learning subproblems which can be solved in quadratic time using label propagation based on k-nearest neighbor graphs. Considering that this time cost is proportional to the number of all possible pairwise constraints, our approach actually provides an efficient solution for exhaustively propagating pairwise constraints throughout the entire dataset. The resulting exhaustive set of propagated pairwise constraints is further used to adjust the similarity matrix for constrained spectral clustering. Beyond the traditional constraint propagation on single-source data, our approach is also extended to more challenging constraint propagation on multi-source data where each pairwise constraint is defined over a pair of data points from different sources. This multi-source constraint propagation has an important application to cross-modal multimedia retrieval. Extensive results have shown the superior performance of our approach.; Comment: The short version of this paper appears as oral paper in ECCV 2010
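
One of the semi-supervised subproblems can be sketched as classic label propagation on a k-NN graph, iterating F ← αSF + (1 − α)Y to spread a few seed labels; in the paper one such propagation is solved per row/column of the constraint matrix. The data, graph parameters, and seeds below are synthetic.

    import numpy as np
    from sklearn.neighbors import kneighbors_graph

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(5, 1, (30, 2))])

    W = kneighbors_graph(X, n_neighbors=5, mode="connectivity").toarray()
    W = np.maximum(W, W.T)                          # symmetrise the k-NN graph
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))                 # normalised affinity matrix

    Y = np.zeros((60, 2))
    Y[0, 0] = Y[59, 1] = 1                          # one seed label per class

    F, alpha = Y.copy(), 0.9
    for _ in range(100):
        F = alpha * S @ F + (1 - alpha) * Y         # label propagation iteration

    labels = F.argmax(axis=1)
    print((labels[:30] == 0).mean(), (labels[30:] == 1).mean())  # both near 1.0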