Catalogue Search | MBRL
3,249 result(s) for "Data release"
Early Results from GLASS-JWST. XVIII. A First Morphological Atlas of the 1 < z < 5 Universe in the Rest-frame Optical
2023
We present a rest-frame optical morphological analysis of galaxies observed with the NIRCam imager on the James Webb Space Telescope (JWST) as part of the GLASS-JWST Early Release Science program. We select 388 sources at redshifts 0.8 < z < 5.4 and use the seven 0.9–5 μm NIRCam filters to generate rest-frame gri composite color images, and conduct visual morphological classification. Compared to Hubble Space Telescope (HST)–based work we find a higher incidence of disks and bulges than expected at z > 1.5, revealed by rest-frame optical imaging. We detect 123 clear disks (58 at z > 1.5) of which 76 have bulges. No evolution of bulge fraction with redshift is evident: 61% at z < 2 (N = 110) versus 60% at z ≥ 2 (N = 13). A stellar mass dependence is evident, with bulges visible in 80% of all disk galaxies with mass > 10^9.5 M⊙ (N = 41) but only 52% at M < 10^9.5 M⊙ (N = 82). We supplement visual morphologies with nonparametric measurements of Gini and asymmetry coefficients in the rest-frame i band. Our sources are more asymmetric than local galaxies, with slightly higher Gini values. When compared to high-z rest-frame ultraviolet measurements with HST, JWST shows more regular morphological types such as disks, bulges, and spiral arms at z > 1.5, with smoother (i.e., lower Gini) and more symmetrical light distributions.
Journal Article
Bounds on the sample complexity for private learning and private data release
by Kasiviswanathan, Shiva Prasad; Beimel, Amos; Brenner, Hai
in Algorithmics. Computability. Computer arithmetics; Applied sciences; Artificial Intelligence
2014
Learning is a task that generalizes many of the analyses that are applied to collections of data, in particular, to collections of sensitive individual information. Hence, it is natural to ask what can be learned while preserving individual privacy. Kasiviswanathan et al. (in SIAM J. Comput., 40(3):793–826, 2011) initiated such a discussion. They formalized the notion of private learning, as a combination of PAC learning and differential privacy, and investigated what concept classes can be learned privately. Somewhat surprisingly, they showed that for finite, discrete domains (ignoring time complexity), every PAC learning task could be performed privately with polynomially many labeled examples; in many natural cases this could even be done in polynomial time.
While these results seem to equate non-private and private learning, there is still a significant gap: the sample complexity of (non-private) PAC learning is crisply characterized in terms of the VC-dimension of the concept class, whereas this relationship is lost in the constructions of private learners, which exhibit, generally, a higher sample complexity.
Looking into this gap, we examine several private learning tasks and give tight bounds on their sample complexity. In particular, we show strong separations between sample complexities of proper and improper private learners (such separation does not exist for non-private learners), and between sample complexities of efficient and inefficient proper private learners. Our results show that VC-dimension is not the right measure for characterizing the sample complexity of proper private learning.
We also examine the task of private data release (as initiated by Blum et al. in STOC, pp. 609–618, 2008), and give new lower bounds on the sample complexity. Our results show that the logarithmic dependence on size of the instance space is essential for private data release.
Journal Article
Views of Ethical Best Practices in Sharing Individual-Level Data From Medical and Public Health Research
by Bull, Susan; Roberts, Nia; Parker, Michael
in Attitude; Best practice; Biomedical Research - ethics
2015
There is increasing support for sharing individual-level data generated by medical and public health research. This scoping review of empirical research and conceptual literature examined stakeholders’ perspectives of ethical best practices in data sharing, particularly in low- and middle-income settings. Sixty-nine empirical and conceptual articles were reviewed, of which, only five were empirical studies and eight were conceptual articles focusing on low- and middle-income settings. We conclude that support for sharing individual-level data is contingent on the development and implementation of international and local policies and processes to support ethical best practices. Further conceptual and empirical research is needed to ensure data sharing policies and processes in low- and middle-income settings are appropriately informed by stakeholders’ perspectives.
Journal Article
Genotype Data and Derived Genetic Instruments of Adolescent Brain Cognitive Development Study® for Better Understanding of Human Brain Development
by Fan, Chun Chieh; Friedman, Naomi; LeBlanc, Kimberly
in Adolescent development; Adolescents; Brain
2023
The data release of Adolescent Brain Cognitive Development® (ABCD) Study represents an extensive resource for investigating factors relating to child development and mental wellbeing. The genotype data of ABCD has been used extensively in the context of genetic analysis, including genome-wide association studies and polygenic score predictions. However, there are unique opportunities provided by ABCD genetic data that have not yet been fully tapped. The diverse genomic variability, the enriched relatedness among ABCD subsets, and the longitudinal design of the ABCD challenge researchers to perform novel analyses to gain deeper insight into human brain development. Genetic instruments derived from the ABCD genetic data, such as genetic principal components, can help to better control confounds beyond the context of genetic analyses. To facilitate the use of genomic information in the ABCD for inference, we here detail the processing procedures, quality controls, general characteristics, and the corresponding resources in the ABCD genotype data of release 4.0.
Journal Article
Design and Implementation of a Novel IoT Architecture for Data Release System Between Multiple Platforms: Case of Smart Offshores
by Atangana, Jacques; Wang, Lei; Tabi Fouda, Bernard Marie
in Aquaculture; Automation; Communication
2025
The evolution of automation has reached marine operations in general and offshore operations in particular. Many facilities in these areas use the Internet of Things (IoT) to consolidate processes and improve data release systems. In addition, the IEC60870-5-104 protocol (IEC104) enables remote data release. This paper introduces and develops a novel IoT architecture that enables the continuous acquisition, evaluation, and release of data between platforms. Continuous data release is based on a dynamic configuration (DC) approach using the IEC104 protocol (DC-IEC104). The proposed approach thoroughly analyzes the structural model and communication process and then proposes a set of design tables according to the information object (type and amount) of the data to be released. In the application case, the data of the photoelectric composite submarine cables were successfully released with an average mean square error of 3.78 and an average processing time of 1.083 s. These results have been proven to be better compared to those obtained using three other approaches for data release.
Journal Article
Toward Answering Federated Spatial Range Queries Under Local Differential Privacy
2024
Federated analytics (FA) over spatial data with local differential privacy (LDP) has attracted considerable research attention recently. Existing solutions for this problem mostly employ a uniform grid (UG) structure, which recursively decomposes the whole spatial domain into fine‐grained regions in the distributed setting. In each round, the sampled clients perturb their locations using a random response mechanism with a fixed probability. This approach, however, cannot encode the client’s location effectively and will lead to ill‐suited query results. To address the deficiency of existing solutions, we propose LDP‐FSRQ, a spatial range query algorithm that relies on a hybrid spatial structure composed of the UG and quad‐tree with nonuniform perturbation (NUP) probability to encode and perturb clients’ locations. In each iteration of LDP‐FSRQ, each client adopts the quad‐tree to encode his/her location into a binary string and uses four local perturbation mechanisms to protect the encoded string. Then, the collector prunes the quad‐tree of the current round according to the clients’ reports and shares the pruned tree with the clients of the next round. We demonstrate the application of LDP‐FSRQ on Beijing, Landmark, Check‐in, and NYC datasets, and the experimental results show that our approach outperforms its competitors in terms of queries’ utility.
Journal Article
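The encode-then-perturb step described in the abstract above can be illustrated with a minimal sketch. This is not the LDP-FSRQ algorithm itself: it assumes a unit-square domain, a plain quad-tree encoding with two bits per level, and a single uniform randomized-response mechanism, whereas the paper uses a hybrid grid and quad-tree structure with nonuniform perturbation and four local mechanisms. The function names and parameters are hypothetical.

```python
import math
import random


def quadtree_encode(x, y, depth):
    """Encode a point in the unit square as bits, two per quad-tree level.

    At each level the current cell is split into four quadrants; the first
    bit records the right half, the second bit records the upper half.
    """
    bits = []
    x0, y0, size = 0.0, 0.0, 1.0
    for _ in range(depth):
        size /= 2.0
        right = x >= x0 + size
        upper = y >= y0 + size
        bits.append(1 if right else 0)
        bits.append(1 if upper else 0)
        if right:
            x0 += size
        if upper:
            y0 += size
    return bits


def randomized_response(bits, epsilon, rng):
    """Keep each bit with probability e^eps / (1 + e^eps), else flip it.

    Assumption: epsilon is spent per bit, so the total privacy budget of the
    report grows with the length of the bit string.
    """
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return [b if rng.random() < keep else 1 - b for b in bits]


if __name__ == "__main__":
    rng = random.Random(0)
    encoded = quadtree_encode(0.75, 0.25, depth=2)
    report = randomized_response(encoded, epsilon=1.0, rng=rng)
    print(encoded, report)
```

A collector receiving many such reports would then debias the per-bit counts before answering range queries; that step is omitted here.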
Sharing Research Data in Collaborative Material Science and Engineering Projects
2025
The objective of this paper was to examine the potential for the automated release of research data within the context of material science consortium projects, with adherence to the stipulated rules and contractual agreements, while also considering all relevant framework conditions, including those pertaining to protection and confidentiality. The investigation further aimed to explore the utilisation of the release process as a means for ensuring the quality of research data, employing an integrated review procedure. The study commenced with an examination of the regulations governing the sharing and reusing of research data, and the associated benefits. The first phase of the study involved an evaluation of current release processes in common research data infrastructures. This was followed by the development of a methodological approach to the release of research data according to the needs of researchers in collaborative projects, such as automation of the release process, quality queries, reusability of data and more. The implementation of the main functions of this methodological approach was then undertaken in Kadi4Mat, an open-source research data infrastructure originally developed in the context of materials science. This implementation took the form of a prototypical and modular plugin.
Journal Article
MinION Analysis and Reference Consortium: Phase 2 data release and analysis of R9.0 chemistry version 1; peer review: 1 approved, 2 approved with reservations
by Snutch, Terrance P; Eccles, David A; Zalunin, Vadim
in Archives & records; Base pairs; Clonal deletion
2017
Background: Long-read sequencing is rapidly evolving and reshaping the suite of opportunities for genomic analysis. For the MinION in particular, as both the platform and chemistry develop, the user community requires reference data to set performance expectations and maximally exploit third-generation sequencing. We performed an analysis of MinION data derived from whole genome sequencing of Escherichia coli K-12 using the R9.0 chemistry, comparing the results with the older R7.3 chemistry.
Methods: We computed the error-rate estimates for insertions, deletions, and mismatches in MinION reads.
Results: Run-time characteristics of the flow cell and run scripts for R9.0 were similar to those observed for R7.3 chemistry, but with an 8-fold increase in bases per second (from 30 bps in R7.3 and SQK-MAP005 library preparation, to 250 bps in R9.0) processed by individual nanopores, and less drop-off in yield over time. The 2-dimensional ("2D") N50 read length was unchanged from the prior chemistry. Using the proportion of alignable reads as a measure of base-call accuracy, 99.9% of "pass" template reads from 1-dimensional ("1D") experiments were mappable and ~97% from 2D experiments. The median identity of reads was ~89% for 1D and ~94% for 2D experiments. The total error rate (miscall + insertion + deletion) decreased for 2D "pass" reads from 9.1% in R7.3 to 7.5% in R9.0 and for template "pass" reads from 26.7% in R7.3 to 14.5% in R9.0.
Conclusions: These Phase 2 MinION experiments serve as a baseline by providing estimates for read quality, throughput, and mappability. The datasets further enable the development of bioinformatic tools tailored to the new R9.0 chemistry and the design of novel biological applications for this technology.
Abbreviations: K: thousand, Kb: kilobase (one thousand base pairs), M: million, Mb: megabase (one million base pairs), Gb: gigabase (one billion base pairs).
Journal Article
Research and application of privacy protection technology based on big data environment
2021
With the development of the Internet of Things, the popularization of the mobile Internet, and the rapid promotion of social networks, data growth has become explosive, and the development of information technology has produced a torrent of big data. The continuous development of technology brings people convenience, speed, and comfort, but it also brings hidden risks. Combining differential privacy with clustering, a clustering-based differentially private universal dataset release method for mixed datasets is proposed: the k-prototype clustering algorithm groups records in the mixed dataset to reduce the sensitivity of differential privacy queries and the amount of noise to be added, improving data utility while providing data privacy protection; attribute differences are calculated for numerical and categorical attributes, combined with weights, and their information loss is measured separately. Finally, experimental results show that this algorithm can improve the usability of data publishing.
Journal Article
The Bermuda Triangle: The Pragmatics, Policies, and Principles for Data Sharing in the History of the Human Genome Project
by Ankeny, Rachel A.; Jones, Kathryn Maxson; Cook-Deegan, Robert
in BASIC BIOLOGICAL SCIENCES; Bayh-Dole Act; Bermuda
2018
The Bermuda Principles for DNA sequence data sharing are an enduring legacy of the Human Genome Project (HGP). They were adopted by the HGP at a strategy meeting in Bermuda in February of 1996 and implemented in formal policies by early 1998, mandating daily release of HGP-funded DNA sequences into the public domain. The idea of daily sharing, we argue, emanated directly from strategies for large, goal-directed molecular biology projects first tested within the "community" of C. elegans researchers, and were introduced and defended for the HGP by the nematode biologists John Sulston and Robert Waterston. In the C. elegans community, and subsequently in the HGP, daily sharing served the pragmatic goals of quality control and project coordination. Yet in the HGP human genome, we also argue, the Bermuda Principles addressed concerns about gene patents impeding scientific advancement, and were aspirational and flexible in implementation and justification. They endured as an archetype for how rapid data sharing could be realized and rationalized, and permitted adaptation to the needs of various scientific communities. Yet in addition to the support of Sulston and Waterston, their adoption also depended on the clout of administrators at the US National Institutes of Health (NIH) and the UK nonprofit charity the Wellcome Trust, which together funded 90% of the HGP human sequencing effort. The other nations wishing to remain in the HGP consortium had to accommodate to the Bermuda Principles, requiring exceptions from incompatible existing or pending data access policies for publicly funded research in Germany, Japan, and France. We begin this story in 1963, with the biologist Sydney Brenner's proposal for a nematode research program at the Laboratory of Molecular Biology (LMB) at the University of Cambridge. We continue through 2003, with the completion of the HGP human reference genome, and conclude with observations about policy and the historiography of molecular biology.
Journal Article