Catalogue Search | MBRL
Search Results
97 results for "Algorismes"
Comparison of algorithms for the detection of cancer drivers at subgene resolution
by Porta-Pardo, Eduard; Valencia, Alfonso; López Bigas, Núria
in Algorismes; Carcinogènesi; Cromosomes
2017
Understanding genetic events that lead to cancer initiation and progression remains one of the biggest challenges in cancer biology. Traditionally, most algorithms for cancer-driver identification look for genes that have more mutations than expected from the average background mutation rate. However, there is now a wide variety of methods that look for nonrandom distribution of mutations within proteins as a signal for the driving role of mutations in cancer. Here we classify and review such subgene-resolution algorithms, compare their findings on four distinct cancer data sets from The Cancer Genome Atlas and discuss how predictions from these algorithms can be interpreted in the emerging paradigms that challenge the simple dichotomy between driver and passenger genes.
E.P.-P. and A.G. acknowledge support from Cancer Center grants P30 CA030199 (to our institute) and R35 GM118187 (A.G.). A.K. was supported by startup funds of G.G. and by a collaboration with Bayer AG. D.T. is supported by project SAF2015-74072-JIN, which is funded by the Agencia Estatal de Investigación (AEI) and the Fondo Europeo de Desarrollo Regional (FEDER). N.L.-B. acknowledges funding from the European Research Council (consolidator grant 682398). A.V. and T.P. acknowledge funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 305444 (RD-Connect).
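As a concrete illustration of the subgene-resolution idea this review surveys (a toy example, not any specific reviewed algorithm), one can ask whether a protein region harbors more mutations than the gene-wide rate predicts, via a binomial test. All numbers below are hypothetical; requires scipy ≥ 1.7.

```python
# Toy subgene-resolution enrichment test: does a protein region carry
# more mutations than expected under a uniform gene-wide rate?
# Hypothetical numbers; not taken from the reviewed study.
from scipy.stats import binomtest

gene_length = 500        # protein length in residues
region = (280, 300)      # candidate hotspot region (half-open residue range)
total_mutations = 40     # missense mutations observed across the gene
region_mutations = 12    # of which this many fall inside the region

region_length = region[1] - region[0]
p_uniform = region_length / gene_length  # expected fraction under uniformity

result = binomtest(region_mutations, total_mutations, p_uniform, alternative="greater")
print(f"P(>= {region_mutations} of {total_mutations} mutations in region) = {result.pvalue:.2e}")
```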
Journal Article
SVM-RFE: selection and visualization of the most relevant features through non-linear kernels
2018
Background
Support vector machines (SVM) are a powerful tool for analyzing data in which the number of predictors is approximately equal to or larger than the number of observations. Originally, however, the application of SVM to biomedical data was limited because SVM was not designed to evaluate the importance of predictor variables. Creating predictor models based on only the most relevant variables is essential in biomedical research. Substantial work has since been done to allow the assessment of variable importance in SVM models, but it has focused on SVM implemented with linear kernels. The power of SVM as a prediction model is associated with the flexibility generated by the use of non-linear kernels. Moreover, SVM has been extended to model survival outcomes. This paper extends the Recursive Feature Elimination (RFE) algorithm by proposing three approaches to rank variables based on non-linear SVM and on SVM for survival analysis.
Results
The proposed algorithms allow visualization of each RFE iteration and, hence, identification of the most relevant predictors of the response variable. Using simulation studies based on time-to-event outcomes and three real datasets, we evaluate the three methods, which are based on pseudo-samples and kernel principal component analysis, and compare them with the original SVM-RFE algorithm for non-linear kernels. In simulation studies, the three proposed algorithms generally performed better than the gold-standard RFE for non-linear kernels when the truly most relevant variables were compared with the variable ranks each algorithm produced. The RFE-pseudo-samples approach generally outperformed the other three methods, even when variables were assumed to be correlated, in all tested scenarios.
Conclusions
The proposed approaches can be used to accurately select variables and to assess the direction and strength of associations between predictors and outcomes in analyses of biomedical data using SVM for categorical or time-to-event responses. The RFE-pseudo-samples approach in particular performs better than the classical RFE of Guyon in scenarios that are realistic for the structure of biomedical data.
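A minimal sketch of the classical linear-kernel SVM-RFE of Guyon, the baseline this paper extends, may help fix ideas. The paper's own contribution replaces the |w|-based ranking below with pseudo-sample and kernel-PCA criteria for non-linear and survival SVM, which are not reproduced here.

```python
# Classical (linear-kernel) SVM-RFE: repeatedly fit a linear SVM and
# eliminate the feature with the smallest weight magnitude.
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=20, n_informative=5, random_state=0)
active = list(range(X.shape[1]))  # indices of features still in the model
ranking = []                      # filled from least to most relevant

while active:
    clf = SVC(kernel="linear").fit(X[:, active], y)
    weights = np.abs(clf.coef_).ravel()   # importance = |w_j| (linear kernel only)
    worst = int(np.argmin(weights))       # least useful remaining feature
    ranking.append(active.pop(worst))     # eliminate it and record its rank

print("features, least to most relevant:", ranking)
```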
Journal Article
Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes
2015
Benjamin Raphael and colleagues report an analysis of altered subnetworks of somatic aberrations in TCGA pan-cancer data sets, including 3,281 samples from 12 cancer types, using a newly developed HotNet2 algorithm. They identify 16 significantly mutated subnetworks and provide a more comprehensive view into altered pathways, including those with known roles in cancer development.
Cancers exhibit extensive mutational heterogeneity, and the resulting long-tail phenomenon complicates the discovery of genes and pathways that are significantly mutated in cancer. We perform a pan-cancer analysis of mutated networks in 3,281 samples from 12 cancer types from The Cancer Genome Atlas (TCGA) using HotNet2, a new algorithm to find mutated subnetworks that overcomes the limitations of existing single-gene, pathway and network approaches. We identify 16 significantly mutated subnetworks that comprise well-known cancer signaling pathways as well as subnetworks with less characterized roles in cancer, including cohesin, condensin and others. Many of these subnetworks exhibit co-occurring mutations across samples. These subnetworks contain dozens of genes with rare somatic mutations across multiple cancers; many of these genes have additional evidence supporting a role in cancer. By illuminating these rare combinations of mutations, pan-cancer network analyses provide a roadmap to investigate new diagnostic and therapeutic opportunities across cancer types.
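The insulated heat-diffusion step underlying HotNet2 can be sketched on a toy network; the β, heat scores, and threshold δ below are arbitrary illustrative values, and the real algorithm then reports strongly connected components of the thresholded graph.

```python
# Sketch of HotNet2-style insulated heat diffusion:
#   F = beta * (I - (1 - beta) * W)^-1,  W = column-normalized adjacency,
#   E = F @ diag(h) is the exchanged-heat matrix.
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # toy gene-interaction network
W = A / A.sum(axis=0)                       # W[i, j] = A[i, j] / deg(j)
beta = 0.4                                  # restart (insulation) probability
h = np.array([5.0, 0.1, 2.0, 0.1])          # per-gene heat, e.g. mutation scores

F = beta * np.linalg.inv(np.eye(4) - (1 - beta) * W)
E = F @ np.diag(h)                          # E[i, j]: heat gene j passes to gene i

delta = 0.3                                 # edges with exchanged heat > delta survive
print((E > delta).astype(int))
```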
Journal Article
Learnheuristics: hybridizing metaheuristics with machine learning for optimization with dynamic inputs
by Masip, David; Armas, Jésica de; Juan, Angel A.
in algorismes híbrids; Algorithms; algoritmos híbridos
2017
This paper reviews the existing literature on the combination of metaheuristics with machine learning methods and then introduces the concept of learnheuristics, a novel type of hybrid algorithm. Learnheuristics can be used to solve combinatorial optimization problems with dynamic inputs (COPDIs). In these COPDIs, the problem inputs (elements located either in the objective function or in the constraint set) are not fixed in advance as usual. On the contrary, they might vary in a predictable (non-random) way as the solution is partially built according to some heuristic-based iterative process. For instance, a consumer's willingness to spend on a specific product might change as the availability of this product decreases and its price rises. Thus, these inputs might take different values depending on the current solution configuration. These variations in the inputs might require coordination between the learning mechanism and the metaheuristic algorithm: at each iteration, the learning method updates the input model used by the metaheuristic.
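A generic skeleton of that coordination, with a hypothetical value model and a jittered greedy heuristic standing in for the paper's metaheuristic, might look like this:

```python
# Learnheuristic skeleton: a randomized constructive heuristic whose
# inputs are refreshed by a learning model at every construction step.
# The decay-with-scarcity value model is a hypothetical placeholder.
import random

def update_inputs(model, chosen):
    """Learning step: re-estimate each item's value given the partial solution."""
    scarcity = len(chosen)  # toy signal: value decays as the solution grows
    return {item: base * (1.0 - model["elasticity"] * scarcity)
            for item, base in model["base_value"].items()}

def learnheuristic(items, model, iterations=100):
    best, best_value = None, float("-inf")
    for _ in range(iterations):
        chosen, remaining, total = [], set(items), 0.0
        while remaining:
            values = update_inputs(model, chosen)   # learning updates the inputs
            pick = max(remaining,                    # greedy with random jitter
                       key=lambda i: values[i] + 0.1 * random.random())
            if values[pick] <= 0:                    # stop once value is exhausted
                break
            total += values[pick]
            chosen.append(pick)
            remaining.remove(pick)
        if total > best_value:
            best, best_value = chosen, total
    return best, best_value

model = {"base_value": {"a": 5.0, "b": 4.0, "c": 3.0}, "elasticity": 0.2}
print(learnheuristic(list("abc"), model))
```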
Journal Article
Fast Computation and Applications of Genome Mappability
by Derrien, Thomas; Ribeca, Paolo; Knowles, David G.
in Algorismes; Algorithms; Biochemistry, Molecular Biology
2012
We present a fast mapping-based algorithm to compute the mappability of each region of a reference genome up to a specified number of mismatches. Knowing the mappability of a genome is crucial for the interpretation of massively parallel sequencing experiments. We investigate the properties of the mappability of eukaryotic DNA/RNA both as a whole and at the level of the gene family, providing for various organisms tracks which allow the mappability information to be visually explored. In addition, we show that mappability varies greatly between species and gene classes. Finally, we suggest several practical applications where mappability can be used to refine the analysis of high-throughput sequencing data (SNP calling, gene expression quantification and paired-end experiments). This work highlights mappability as an important concept which deserves to be taken into full account, in particular when massively parallel sequencing technologies are employed. The GEM mappability program belongs to the GEM (GEnome Multitool) suite of programs, which can be freely downloaded for any use from its website (http://gemlibrary.sourceforge.net).
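The definition of mappability can be illustrated for the exact (0-mismatch) case; this toy version ignores reverse complements and the efficiency techniques of the GEM implementation.

```python
# Exact (0-mismatch) k-mer mappability: the mappability of a position is
# 1 / (number of times its k-mer occurs in the genome). Toy illustration
# of the definition only; GEM handles mismatches and large genomes.
from collections import Counter

def mappability(genome: str, k: int):
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)                      # occurrences of each k-mer
    return [1.0 / counts[km] for km in kmers]    # 1.0 means uniquely mappable

genome = "ACGTACGTTTACGT"
print([round(m, 2) for m in mappability(genome, k=4)])
```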
Journal Article
Predicting motor insurance claims using telematics data-XGBoost versus logistic regression
by Guillén, Montserrat; Alcañiz, Manuela; Pesantez-Narvaez, Jessica
in Algorismes; Algorithms; Anàlisi de regressió
2019
XGBoost is recognized as an algorithm with exceptional predictive capacity. Models for a binary response indicating the existence of accident claims versus no claims can be used to identify the determinants of traffic accidents. This study compared the relative performance of logistic regression and XGBoost approaches for predicting the existence of accident claims using telematics data. The dataset contained information from an insurance company about the individuals' driving patterns, including total annual distance driven and the percentage of total distance driven in urban areas. Our findings showed that logistic regression is a suitable model given its interpretability and good predictive capacity. XGBoost requires numerous model-tuning procedures to match the predictive performance of the logistic regression model and demands greater effort to interpret.
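A sketch of such a comparison on synthetic data might look as follows; the feature construction and claim-propensity model are hypothetical stand-ins for the study's real insurance records.

```python
# Logistic regression vs. XGBoost on a synthetic binary claims indicator.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.gamma(2.0, 5000.0, n),    # total annual distance driven (km)
    rng.uniform(0.0, 1.0, n),     # share of distance driven in urban areas
])
logit = -4.0 + 0.00005 * X[:, 0] + 2.0 * X[:, 1]          # toy claim propensity
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for name, clf in [("logistic", LogisticRegression(max_iter=1000)),
                  ("xgboost", XGBClassifier(n_estimators=200, max_depth=3))]:
    clf.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")
```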
Journal Article
Network-based features for retinal fundus vessel structure analysis
by Amil, Pablo; Sendiña-Nadal, Irene; Guzmán-Vargas, Lev
in Algorismes; Algorismes de segmentació; Algorithms
2019
Retinal fundus imaging is a non-invasive method that allows visualizing the structure of the blood vessels in the retina, whose features may indicate the presence of diseases such as diabetic retinopathy (DR) and glaucoma. Here we present a novel method to analyze and quantify changes in the retinal blood vessel structure in patients diagnosed with glaucoma or with DR. First, we use an automatic unsupervised segmentation algorithm to extract a tree-like graph from the retinal blood vessel structure. The nodes of the graph represent branching (bifurcation) points and endpoints, while the links represent vessel segments that connect the nodes. Then, we quantify structural differences between the graphs extracted from the groups of healthy and non-healthy patients. We also use fractal analysis to characterize the extracted graphs. Applying these techniques to three retinal fundus image databases, we find significant differences between the healthy and non-healthy groups (p-values lower than 0.005 or 0.001, depending on the method and on the database). The results are sensitive to the segmentation method (manual or automatic) and to the resolution of the images.
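Once a vessel graph is in hand, structural descriptors of the kind compared across groups can be computed with standard graph tooling; the edge list below is a hypothetical toy tree, not data from the study.

```python
# Structural descriptors of a vessel graph whose nodes are bifurcation
# points/endpoints and whose edges are vessel segments.
import networkx as nx

edges = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5), (5, 6)]  # toy vessel tree
G = nx.Graph(edges)

features = {
    "n_branch_points": sum(1 for _, d in G.degree() if d >= 3),
    "n_endpoints": sum(1 for _, d in G.degree() if d == 1),
    "mean_degree": sum(d for _, d in G.degree()) / G.number_of_nodes(),
    "diameter": nx.diameter(G),   # longest shortest path in the tree
}
print(features)
```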
Journal Article
An Adaptive Remeshing Procedure for Proximity Maneuvering Spacecraft Formations
2012
We consider a methodology for optimally obtaining reconfigurations of spacecraft formations. It is based on discretizing the time interval into subintervals (called the mesh) and obtaining local solutions on them through a variational method. Applied to a libration point orbit scenario, this work focuses on how to find optimal meshes using an adaptive remeshing procedure and on determining the parameter that governs it.
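A generic adaptive-remeshing loop, with a placeholder error estimator standing in for the paper's local variational solutions, might be sketched as:

```python
# Bisect time subintervals whose local error estimate exceeds a tolerance;
# `tol` plays the role of the governing parameter. The error estimator is
# a hypothetical stand-in.
def refine_mesh(mesh, error_estimate, tol, max_passes=10):
    """mesh: sorted list of node times [t0, ..., tf]."""
    for _ in range(max_passes):
        new_mesh, refined = [mesh[0]], False
        for a, b in zip(mesh, mesh[1:]):
            if error_estimate(a, b) > tol:       # local solution too coarse
                new_mesh.append((a + b) / 2.0)   # bisect the subinterval
                refined = True
            new_mesh.append(b)
        mesh = new_mesh
        if not refined:
            break
    return mesh

# toy estimator: error grows with interval length, worst near t = 0.5
err = lambda a, b: (b - a) * (2.0 if a <= 0.5 <= b else 1.0)
print(refine_mesh([0.0, 0.5, 1.0], err, tol=0.4))
```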
Journal Article
Model-Free Reconstruction of Excitatory Neuronal Connectivity from Calcium Imaging Signals
by Soriano, Jordi; Geisel, Theo; Battaglia, Demian
in Algorismes computacionals; Animals; Bioinformatics
2012
A systematic assessment of global neural network connectivity through direct electrophysiological assays has remained technically infeasible, even in simpler systems like dissociated neuronal cultures. We introduce an improved algorithmic approach based on Transfer Entropy to reconstruct structural connectivity from network activity monitored through calcium imaging. In this study we focus on the inference of excitatory synaptic links. Based on information theory, our method requires no prior assumptions on the statistics of neuronal firing and neuronal connections. The performance of our algorithm is benchmarked on surrogate time series of calcium fluorescence generated by the simulated dynamics of a network with known ground-truth topology. We find that the functional network topology revealed by Transfer Entropy depends qualitatively on the time-dependent dynamic state of the network (bursting or non-bursting). Thus, by conditioning with respect to the global mean activity, we improve the performance of our method. This allows us to focus the analysis on specific dynamical regimes of the network in which the inferred functional connectivity is shaped by monosynaptic excitatory connections rather than by collective synchrony. Our method can discriminate between actual causal influences between neurons and spurious non-causal correlations due to light-scattering artifacts, which inherently affect the quality of fluorescence imaging. Compared to other reconstruction strategies such as cross-correlation or Granger causality methods, our method based on improved Transfer Entropy is remarkably more accurate. In particular, it provides a good estimation of the excitatory network clustering coefficient, allowing for discrimination between weakly and strongly clustered topologies. Finally, we demonstrate the applicability of our method to analyses of real recordings of in vitro disinhibited cortical cultures, where we suggest that excitatory connections are characterized by an elevated level of clustering compared to a random graph (although not extreme) and can be markedly non-local.
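A bare-bones, histogram-based transfer entropy (without the paper's conditioning on global mean activity or its calcium-specific extensions) can be sketched as follows; the lagged test signal is synthetic.

```python
# TE(X -> Y) = sum p(y_t+1, y_t, x_t) * log2[ p(y_t+1 | y_t, x_t) / p(y_t+1 | y_t) ]
# estimated from binned signals.
import numpy as np
from collections import Counter

def transfer_entropy(x, y, bins=2):
    x = np.digitize(x, np.histogram(x, bins)[1][1:-1])  # discretize signals
    y = np.digitize(y, np.histogram(y, bins)[1][1:-1])
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))        # (y_t+1, y_t, x_t)
    pairs = Counter(zip(y[1:], y[:-1]))                  # (y_t+1, y_t)
    y_past = Counter(y[:-1])
    xy_past = Counter(zip(y[:-1], x[:-1]))
    n = len(y) - 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_joint = c / n
        p_cond_full = c / xy_past[(y0, x0)]              # p(y_t+1 | y_t, x_t)
        p_cond_past = pairs[(y1, y0)] / y_past[y0]       # p(y_t+1 | y_t)
        te += p_joint * np.log2(p_cond_full / p_cond_past)
    return te

rng = np.random.default_rng(1)
x = rng.random(5000)
y = np.roll(x, 1) + 0.1 * rng.random(5000)   # y follows x with a one-step lag
print(f"TE(x->y) = {transfer_entropy(x, y):.3f} bits,"
      f" TE(y->x) = {transfer_entropy(y, x):.3f} bits")
```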
Journal Article
On the Use of Biased-Randomized Algorithms for Solving Non-Smooth Optimization Problems
by Corlu, Canan Gunes; Ferrer, Albert; Tordecilla, Rafael David
in Algorismes; algorismes aleatoris esbiaixats; Algorithms
2020
Soft constraints are quite common in real-life applications. For example, in freight transportation, the fleet size can be enlarged by outsourcing part of the distribution service, and some deliveries to customers can be postponed as well; in inventory management, it is possible to consider stock-outs generated by unexpected demands; and in manufacturing processes and project management, it is frequent that some deadlines cannot be met due to delays in critical steps of the supply chain. However, capacity-, size-, and time-related limitations are included in many optimization problems as hard constraints, while it would usually be more realistic to consider them as soft ones, i.e., constraints that can be violated to some extent by incurring a penalty cost. Most of the time, this penalty cost will be nonlinear and even discontinuous, which might transform the objective function into a non-smooth one. Despite their many practical applications, non-smooth optimization problems are quite challenging, especially when the underlying optimization problem is NP-hard in nature. In this paper, we propose the use of biased-randomized algorithms as an effective methodology to cope with NP-hard and non-smooth optimization problems in many practical applications. Biased-randomized algorithms extend constructive heuristics by introducing a nonuniform randomization pattern into them. Hence, they can be used to explore promising areas of the solution space without the limitations of gradient-based approaches, which assume the existence of smooth objective functions. Moreover, biased-randomized algorithms can be easily parallelized, thus employing short computing times while exploring a large number of promising regions. This paper discusses these concepts in detail, reviews existing work in different application areas, and highlights current trends and open research lines.
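A common way to realize biased randomization, sampling from the greedy-sorted candidate list with a geometric distribution, can be sketched on a toy non-smooth sequencing problem; all problem data below are hypothetical.

```python
# Biased randomization of a constructive heuristic: sample candidates from
# the greedy-sorted list with a geometric bias, so better candidates are
# favored without requiring a smooth objective.
import math
import random

def biased_pick(sorted_candidates, beta=0.3):
    """Geometric bias: position k chosen with probability ~ beta*(1-beta)^k."""
    k = int(math.log(1.0 - random.random()) / math.log(1.0 - beta))
    return sorted_candidates[k % len(sorted_candidates)]

def biased_randomized_greedy(costs, cap=10.0, n_iters=500):
    """Toy job sequencing with a soft constraint: a flat penalty applies to
    every job finishing after cumulative cost exceeds `cap`, making the
    objective non-smooth."""
    ranked = sorted(costs, key=costs.get)        # greedy order: cheapest first
    best, best_cost = None, float("inf")
    for _ in range(n_iters):
        remaining, order = list(ranked), []
        while remaining:
            job = biased_pick(remaining)         # usually, not always, the best
            order.append(job)
            remaining.remove(job)
        total, cum = 0.0, 0.0
        for job in order:
            cum += costs[job]
            total += costs[job] + (10.0 if cum > cap else 0.0)  # soft constraint
        if total < best_cost:
            best, best_cost = order, total
    return best, best_cost

print(biased_randomized_greedy({"a": 4.0, "b": 7.0, "c": 2.0, "d": 6.0}))
```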
Journal Article