[arXiv] BigDataFr recommends: Amplifying Inter-message Distance: On Information Divergence Measures in Big Data

Subjects: Information Theory (cs.IT). Message identification (M-I) divergence is an important measure of the information distance between probability distributions, similar to Kullback-Leibler (K-L) and Rényi divergence. In fact, M-I divergence with a variable parameter can have an effect on the characterization of distinction […]

[arXiv] BigDataFr recommends: Visualization of Big Spatial Data using Coresets for Kernel Density Estimates

Subjects: Human-Computer Interaction (cs.HC); Computational Geometry (cs.CG). The size of large, geo-located datasets has reached scales where visualization of all data points is inefficient. Random sampling is a method to reduce the size of a dataset, yet it can introduce unwanted […]

[arXiv] BigDataFr recommends: A European research roadmap for optimizing societal impact of big data on environment and energy efficiency

We present a roadmap to guide European research efforts towards a socially responsible big data economy that maximizes the positive impact of big data in environment and energy efficiency. The goal of the roadmap is to […]

[arXiv] BigDataFr recommends: Massively-Parallel Feature Selection for Big Data

We present the Parallel, Forward-Backward with Pruning (PFBP) algorithm for feature selection (FS) in Big Data settings (high dimensionality and/or sample size). To tackle the challenges of Big Data FS, PFBP partitions the data matrix both in terms of rows (samples, training examples) as well as […]

[arXiv] BigDataFr recommends: Strategies for Big Data Analytics through Lambda Architectures in Volatile Environments

Expectations regarding the future growth of Internet of Things (IoT)-related technologies are high. These expectations require the realization of a sustainable general-purpose application framework that is capable of handling these kinds of environments with their complexity in terms of heterogeneity […]

[arXiv – Ariane Carrance] BigDataFr recommends: Uniform random colored complexes

We present here random distributions on (D+1)-edge-colored, bipartite graphs with a fixed number of vertices 2p. These graphs are dual to D-dimensional orientable colored complexes. We investigate the behavior of quantities related to those random graphs, such as their number of connected components or the number of vertices […]

[arXiv] BigDataFr recommends: Bayesian Nonlinear Support Vector Machines for Big Data

We propose a fast inference method for Bayesian nonlinear support vector machines that leverages stochastic variational inference and inducing points. Our experiments show that the proposed method is faster than competing Bayesian approaches and scales easily to millions of data points. It provides additional […]

[arXiv] BigDataFr recommends: Big Data vs. complex physical models: a scalable inference algorithm

The data torrent unleashed by current and upcoming instruments requires scalable analysis methods. Machine Learning approaches scale well. However, separating the instrument measurement from the physical effects of interest, dealing with variable errors, and deriving parameter uncertainties is usually an afterthought. Classic […]

[arXiv] BigDataFr recommends: A K-means clustering algorithm for multivariate big data with correlated components

Common clustering algorithms require multiple scans of all the data to achieve convergence, and this is prohibitive when large databases, with millions of data points, must be processed. Some algorithms to extend the popular K-means method to the analysis of big data […]

[arXiv] BigDataFr recommends: From Big Data to Big Displays: High-Performance Visualization at Blue Brain

Blue Brain has pushed high-performance visualization (HPV) to complement its HPC strategy since its inception in 2007. In 2011, this strategy was accelerated to develop innovative visualization solutions through increased funding and strategic partnerships with other research institutions. We present […]