330 results for "Su, Weijie"
Exploring deep neural networks via layer-peeled model
In this paper, we introduce the Layer-Peeled Model, a nonconvex, yet analytically tractable, optimization program, in a quest to better understand deep neural networks that are trained for a sufficiently long time. As the name suggests, this model is derived by isolating the topmost layer from the remainder of the neural network, followed by imposing certain constraints separately on the two parts of the network. We demonstrate that the Layer-Peeled Model, albeit simple, inherits many characteristics of well-trained neural networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep-learning training. First, when working on class-balanced datasets, we prove that any solution to this model forms a simplex equiangular tight frame, which, in part, explains the recently discovered phenomenon of neural collapse [V. Papyan, X. Y. Han, D. L. Donoho, Proc. Natl. Acad. Sci. U.S.A. 117, 24652–24663 (2020)]. More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto-unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep-learning models on the minority classes. In addition, we use the Layer-Peeled Model to gain insights into how to mitigate Minority Collapse. Interestingly, this phenomenon is first predicted by the Layer-Peeled Model before being confirmed by our computational experiments.
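The simplex equiangular tight frame (ETF) that the Layer-Peeled Model predicts for class-balanced training has a standard closed-form construction, which makes the prediction easy to inspect numerically. The sketch below (not code from the paper; the class count K = 10 is a hypothetical choice) builds a simplex ETF in NumPy and checks its two defining properties: unit-norm vectors with identical pairwise inner products of −1/(K − 1).

```python
# A minimal sketch, assuming K = 10 classes: construct the K x K matrix whose
# rows form a simplex equiangular tight frame (they span a (K-1)-dimensional
# subspace), then verify the defining properties numerically.
import numpy as np

K = 10  # hypothetical number of classes

# Center the standard basis and rescale so every row has unit norm.
M = np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)

G = M @ M.T  # Gram matrix of the K frame vectors (rows of M)
print(np.allclose(np.diag(G), 1.0))          # equal (unit) norms
off_diag = G[~np.eye(K, dtype=bool)]
print(np.allclose(off_diag, -1 / (K - 1)))   # equal, maximally separated angles
```

Neural collapse, as documented in the cited work of Papyan, Han, and Donoho, is the observation that late in training on balanced data the class-mean features and classifier rows converge to such a configuration, up to rotation and scaling.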
SLOPE is adaptive to unknown sparsity and asymptotically minimax
We consider high-dimensional sparse regression problems in which we observe y = Xβ + z, where X is an n × p design matrix and z is an n-dimensional vector of independent Gaussian errors, each with variance σ². Our focus is on the recently introduced SLOPE estimator [Ann. Appl. Stat. 9 (2015) 1103–1140], which regularizes the least-squares estimates with the rank-dependent penalty λ₁|β̂|(1) + λ₂|β̂|(2) + ⋯ + λₚ|β̂|(p), where |β̂|(i) is the ith largest magnitude of the fitted coefficients. Under Gaussian designs, where the entries of X are i.i.d. N(0, 1/n), we show that SLOPE, with weights λᵢ just about equal to σ·Φ⁻¹(1 − iq/(2p)) [Φ⁻¹(α) is the αth quantile of a standard normal and q is a fixed number in (0, 1)], achieves a squared error of estimation obeying sup_{‖β‖₀≤k} ℙ(‖β̂_SLOPE − β‖² > (1 + ε)2σ²k log(p/k)) → 0 as the dimension p increases to ∞, where ε > 0 is an arbitrarily small constant. This holds under a weak assumption on the ℓ₀-sparsity level, namely, k/p → 0 and (k log p)/n → 0, and is sharp in the sense that this is the best possible error any estimator can achieve. A remarkable feature is that SLOPE does not require any knowledge of the degree of sparsity, and yet automatically adapts to yield optimal total squared errors over a wide range of ℓ₀-sparsity classes. We are not aware of any other estimator with this property.
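The weight sequence and the penalty in this abstract are fully explicit, so they are straightforward to compute. The following minimal sketch (not the authors' implementation; p, q, σ, and the test vector are arbitrary choices) evaluates the weights λᵢ = σ·Φ⁻¹(1 − iq/(2p)) and the sorted-ℓ₁ SLOPE penalty Σᵢ λᵢ|β̂|(i).

```python
# A hedged sketch of the quantities defined in the abstract, not a SLOPE solver.
import numpy as np
from scipy.stats import norm

def slope_weights(p, q=0.1, sigma=1.0):
    """Weights lambda_i = sigma * Phi^{-1}(1 - i*q/(2p)) for i = 1, ..., p."""
    i = np.arange(1, p + 1)
    return sigma * norm.ppf(1 - i * q / (2 * p))

def slope_penalty(beta_hat, lam):
    """Sum of lambda_i times the i-th largest coefficient magnitude."""
    mags = np.sort(np.abs(beta_hat))[::-1]  # |beta|_(1) >= ... >= |beta|_(p)
    return float(lam @ mags)

p = 1000
lam = slope_weights(p, q=0.1)               # decreasing and positive for q < 1
beta_hat = np.random.default_rng(0).normal(size=p)
print(slope_penalty(beta_hat, lam))
```

Because the weights decrease with rank, the largest coefficients receive the heaviest penalties, which is the sorted-ℓ₁ mechanism behind SLOPE's adaptivity to unknown sparsity.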
Tackling copyright issues in AI image generation through originality estimation and genericization
The rapid progress of generative AI technology has sparked significant copyright concerns, leading to numerous lawsuits filed against AI developers. Notably, generative AI’s capacity for generating images of copyrighted characters has been well documented in the literature, and while various techniques for mitigating copyright issues have been studied, significant risks remain. Here, we propose a genericization method that modifies the outputs of a generative model to make them more generic and less likely to imitate distinctive features of copyrighted materials. To achieve this, we introduce a metric that quantifies the level of originality of data, which is estimated by drawing samples from a generative model and is then used in the genericization process. As a practical implementation, we introduce PREGen (Prompt Rewriting-Enhanced Genericization), which combines our genericization method with an existing mitigation technique. Compared to the existing method, PREGen reduces the likelihood of generating copyrighted characters by more than half when the names of copyrighted characters are used as the prompt. Additionally, while generative models can produce copyrighted characters even when their names are not directly mentioned in the prompt, PREGen almost entirely prevents the generation of such characters in these cases. Ultimately, this study advances computational approaches for quantifying and strengthening copyright protection, thereby providing practical methodologies to promote responsible generative AI development.
Envisioning future deep learning theories: some basic concepts and characteristics
To advance deep learning methodologies in the next decade, a theoretical framework for reasoning about modern neural networks is needed. While efforts to demystify why deep learning is so effective are increasing, a comprehensive picture remains lacking, suggesting that a better theory is possible. We argue that a future deep learning theory should inherit three characteristics: a hierarchically structured network architecture, parameters iteratively optimized using stochastic gradient-based methods, and information from the data that evolves compressively. As an instantiation, we integrate these characteristics into a graphical model called neurashed. This model effectively explains some common empirical patterns in deep learning. In particular, neurashed enables insights into implicit regularization, the information bottleneck, and local elasticity. Finally, we discuss how neurashed can guide the development of deep learning theories.
Statistical inference for the population landscape via moment-adjusted stochastic gradients
Modern statistical inference tasks often require iterative optimization methods to compute the solution. Convergence analysis from an optimization viewpoint informs us only how well the solution is approximated numerically but overlooks the sampling nature of the data. In contrast, recognizing the randomness in the data, statisticians are keen to provide uncertainty quantification, or confidence, for the solution obtained using iterative optimization methods. This paper makes progress in this direction by introducing moment-adjusted stochastic gradient descent: a new stochastic optimization method for statistical inference. We establish non-asymptotic theory that characterizes the statistical distribution for certain iterative methods with optimization guarantees. On the statistical front, the theory allows for model mis-specification, with very mild conditions on the data. On the optimization front, the theory is flexible enough to cover both convex and non-convex cases. Remarkably, the moment-adjusting idea, motivated by ‘error standardization’ in statistics, achieves an effect similar to acceleration in first-order optimization methods used to fit generalized linear models. We also demonstrate this acceleration effect in the non-convex setting through numerical experiments.
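The abstract does not spell out the update rule, so the sketch below should be read only as a generic illustration of the flavor of moment adjustment: it standardizes each stochastic gradient by a running estimate of its coordinatewise second moment (essentially an RMSProp-style step) on a least-squares toy problem. It is an assumption-laden stand-in, not the estimator analyzed in the paper.

```python
# Illustration ONLY: coordinatewise second-moment standardization of SGD on a
# least-squares toy problem. The paper's actual method is not reproduced here.
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.normal(size=(n, d))
theta_star = rng.normal(size=d)
y = X @ theta_star + 0.5 * rng.normal(size=n)

theta = np.zeros(d)
v = np.ones(d)                # running estimate of E[g_j^2] per coordinate
eta, decay, eps = 0.05, 0.99, 1e-8

for _ in range(20000):
    i = rng.integers(n)
    g = (X[i] @ theta - y[i]) * X[i]     # stochastic gradient of 0.5*(x.theta - y)^2
    v = decay * v + (1 - decay) * g**2   # update the second-moment estimate
    theta -= eta * g / np.sqrt(v + eps)  # standardized ("moment-adjusted") step

print(np.abs(theta - theta_star).max())  # small, up to noise of order eta
```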
GPNMB Modulates Autophagy to Enhance Functional Recovery After Spinal Cord Injury in Rats
Spinal cord injury (SCI) severely affects the quality of life and autonomy of patients, and effective treatments are currently lacking. Autophagy, an essential cellular metabolic process, plays a crucial role in neuroprotection and repair after SCI. Glycoprotein non-metastatic melanoma protein B (GPNMB) has been shown to promote neural regeneration and synapse reconstruction, potentially through the facilitation of autophagy. However, the specific role of GPNMB in autophagy after SCI is still unclear. In this study, we used spinal cord transection to establish a rat model of SCI and overexpressed GPNMB using adenoviral vectors. We assessed tissue damage using hematoxylin and eosin (H&E) and Nissl staining, and observed cell apoptosis using TUNEL staining. We evaluated the inflammatory response by measuring inflammatory factors with enzyme-linked immunosorbent assay (ELISA). In addition, we measured reactive oxygen species (ROS) levels using 2′,7′-dichlorodihydrofluorescein diacetate (DCFH-DA), and assessed oxidative stress by measuring malondialdehyde (MDA) and glutathione (GSH) with ELISA. To evaluate autophagy, we performed immunofluorescence staining for the autophagy marker Beclin-1 and Western blot analysis for autophagy-related proteins. We also assessed limb recovery through functional evaluation. In parallel, we induced cell injury with lipopolysaccharide (LPS) and added an autophagy inhibitor to verify that GPNMB acts on SCI through autophagy modulation. The results demonstrated that GPNMB alleviated the inflammatory response, reduced oxidative stress, inhibited cell apoptosis, and promoted autophagy following SCI. Inhibiting autophagy reversed these effects of GPNMB. These findings suggest that GPNMB promotes neural repair after SCI, potentially by attenuating the inflammatory response, reducing oxidative stress, and inhibiting cell apoptosis.
Association between transcription factors expression and growth patterns of nonfunctioning pituitary adenomas
Transcription factors (TFs), including steroidogenic factor-1 (SF-1), T-box transcription factor (TPIT), and pituitary transcription factor-1 (PIT-1), play a pivotal role in the cytodifferentiation of the adenohypophysis. However, the impact of TFs on the growth patterns of nonfunctioning pituitary adenomas (NFPAs) remains unclear. This study aims to investigate the correlation between TF expression and NFPA growth patterns. Preoperative MRI scans from 171 patients who underwent surgery for nonfunctioning pituitary macroadenomas were analyzed to determine tumor growth patterns. Immunohistochemical staining for the transcription factors PIT-1, TPIT, and SF-1 was performed on all samples. Extrasellar growth was divided into three principal directions: infrasellar, suprasellar, and lateral cavernous sinus invasion (CSI). Suprasellar extension was defined as tumor extension superior to the tuberculum sellae-dorsum sellae line, infrasellar extension as invasion through the sellar floor into the sphenoid sinus or clivus, and CSI as a Knosp grade of 3–4. Statistical comparisons between groups were conducted using Fisher's exact test and the t-test. TPIT-expressing tumors were more likely to exhibit combined infrasellar extension (55.17 vs 17.70%, p < 0.0001), as well as isolated infrasellar extension (18.97 vs 0%, p < 0.0001), compared with SF-1-expressing tumors. Conversely, SF-1-expressing tumors were more likely to exhibit combined suprasellar extension (92.92 vs 77.59%, p = 0.0061), as well as isolated suprasellar extension (75.22 vs 41.38%, p < 0.0001). TPIT-expressing tumors also had a significantly higher rate of CSI (55.17 vs 35.40%, p = 0.0148). The mean maximal tumor diameter was similar in TPIT and SF-1 macroadenomas (28 vs 26 mm, p = 0.1213). The expression of TFs affects the extrasellar growth pattern of NFPAs. TPIT tumors exhibit a higher propensity for bone invasion and CSI, while SF-1 tumors tend to extend into the suprasellar region. Isolated infrasellar extension is specific to TPIT tumors and can serve as a radiologic sign to distinguish TPIT tumors from SF-1 tumors.
Association of circulating saturated fatty acids with the risk of pregnancy-induced hypertension: a nested case–control study
Circulating saturated fatty acids (SFAs) have been associated with cardiovascular disease. However, little is known about the relationship of SFAs with the risk of pregnancy-induced hypertension (PIH). We conducted a nested case–control study to examine the associations between circulating SFAs and the risk of PIH. A total of 92 PIH cases were matched to 184 controls by age (±2 years) and infant sex from a birth cohort study conducted in Wuhan, China. Levels of circulating fatty acids in plasma were measured using gas chromatography–mass spectrometry. Conditional logistic regressions were conducted to calculate odds ratios (ORs) and 95% confidence intervals (95% CIs). Even-chain SFAs, including myristic acid (14:0) and palmitic acid (16:0), were positively associated with the risk of PIH [ORs (95% CIs): 2.92 (1.27, 6.74) for 14:0 and 2.85 (1.18, 6.89) for 16:0, % by wt]. In contrast, higher levels of very-long-chain SFAs, including arachidic acid (20:0), behenic acid (22:0), and lignoceric acid (24:0), were associated with a lower risk of PIH [ORs (95% CIs): 0.40 (0.17, 0.92) for 20:0, 0.30 (0.12, 0.71) for 22:0 and 0.26 (0.11, 0.64) for 24:0, μg/mL]. For odd-chain SFAs, including pentadecanoic acid (15:0) and heptadecanoic acid (17:0), no significant associations were observed. Our results provide evidence that different subclasses of SFAs have distinct associations with the risk of PIH, suggesting that dietary very-long-chain SFAs may offer a novel means of preventing hypertension. Future studies are required to confirm these associations and elucidate the underlying mechanisms.
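As a rough illustration of the analysis pipeline described above, the sketch below fits a conditional logistic regression on synthetic 1:2 matched case-control strata with statsmodels and converts the coefficient to an odds ratio with a 95% CI. The data, the single exposure variable, and the effect size are all made up; the study's actual covariates and matching details are not reproduced.

```python
# Hedged sketch: conditional logistic regression on fake 1:2 matched strata.
import numpy as np
from statsmodels.discrete.conditional_models import ConditionalLogit

rng = np.random.default_rng(1)
n_strata = 92                              # one case and two controls per stratum
groups = np.repeat(np.arange(n_strata), 3)
y = np.tile([1, 0, 0], n_strata)           # first member of each stratum is the case
x = rng.normal(size=(y.size, 1)) + 0.6 * y[:, None]  # hypothetical SFA level

res = ConditionalLogit(y, x, groups=groups).fit(disp=False)
print("OR:", np.exp(res.params))           # odds ratio per unit of exposure
print("95% CI:", np.exp(res.conf_int()))   # CI transformed to the OR scale
```

Conditioning on the matched stratum removes the stratum-specific baseline risk, which is why this model, rather than ordinary logistic regression, is the standard choice for matched case-control designs.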
Image Inpainting with Fractional Laplacian Regularization: An Lp Norm Approach
This study presents an image inpainting model based on an energy functional that incorporates the Lᵖ norm of the fractional Laplacian operator as a regularization term and the H⁻¹ norm as a fidelity term. Using the properties of the fractional Laplacian operator, the Lᵖ norm is employed with an adjustable parameter p to enhance the operator's ability to restore fine details in various types of images. Replacing the conventional L² norm with the H⁻¹ norm enables better preservation of global structures in denoising and restoration tasks. The paper introduces a diffusion partial differential equation by adding an intermediate term and provides a theoretical proof of the existence and uniqueness of its solution in Sobolev spaces. Furthermore, it demonstrates that the solution converges to the minimizer of the energy functional as time approaches infinity. Numerical experiments comparing the proposed method with traditional and deep-learning models validate its effectiveness in image inpainting tasks.
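For readers who want to experiment, the fractional Laplacian in the regularizer has a convenient spectral form on periodic domains: applying (−Δ)^(s/2) amounts to multiplying the Fourier coefficients of the image by |ξ|ˢ. The sketch below is one plausible discrete implementation (an assumption, not the paper's code; the exponent convention and the values of p and s are illustrative choices), together with the resulting Lᵖ-type regularization term.

```python
# A minimal sketch, assuming periodic boundaries and the multiplier |xi|^s:
# compute (-Delta)^(s/2) u via the FFT, then an Lp-type regularization value.
import numpy as np

def fractional_laplacian(u, s):
    """Scale the Fourier modes of a 2-D array u by |xi|^s (spectral definition)."""
    n, m = u.shape
    kx = 2 * np.pi * np.fft.fftfreq(n)
    ky = 2 * np.pi * np.fft.fftfreq(m)
    KX, KY = np.meshgrid(kx, ky, indexing="ij")
    multiplier = (KX**2 + KY**2) ** (s / 2)   # |xi|^s, zero at the DC mode
    return np.real(np.fft.ifft2(multiplier * np.fft.fft2(u)))

u = np.random.default_rng(2).normal(size=(64, 64))      # stand-in image
p, s = 1.5, 1.2                                         # illustrative parameters
reg = np.mean(np.abs(fractional_laplacian(u, s)) ** p)  # Lp regularizer (p-th power)
print(reg)
```

Roughly speaking, smaller p makes such a regularizer more tolerant of sharp, sparse detail while larger p favors smoothness, which reflects the adjustability the abstract highlights.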