44 result(s) for "Mukherjee, Manisha"
Engineering controllable biofilms for biotechnological applications
Intercellular communication through quorum sensing (QS) signalling molecules plays an important role in biofilm formation for many bacteria, where intensified QS promotes the development of highly structured biofilms (Davies et al., 1998) while disrupting QS, or quorum quenching (QQ), reduces biofilm formation (Fetzner, 2015). [...] QS serves as one target for controlling biofilm formation. A decreased concentration of c-di-GMP reduces biofilm formation, and a high level of c-di-GMP promotes it (Hengge, 2009). Because c-di-GMP regulates biofilm formation and dispersal in a wide variety of bacteria, c-di-GMP-targeted biofilm engineering is expected to be compatible with diverse bacterial hosts. ( 2019) overexpressed a cAMP synthase gene in Shewanella oneidensis, which greatly improved current generation by the electrochemically active biofilm in microbial fuel cells. [...] by tuning the intracellular concentration of cAMP, the cAMP-CRP regulatory system could be manipulated to achieve engineered biofilms with better performance. [...] an important factor influencing the performance of synthetic gene circuits is interference from the host's endogenous regulatory network. [...] it is therefore desirable that the host itself possess a low level of native signalling messengers/molecules to achieve enhanced performance of the introduced gene circuit.
Biofilm development and enhanced stress resistance of a model, mixed-species community biofilm
Most studies of biofilm biology have taken a reductionist approach, where single-species biofilms have been extensively investigated. However, biofilms in nature mostly comprise multiple species, where interspecies interactions can shape the development, structure and function of these communities differently from biofilm populations. Hence, a reproducible mixed-species biofilm comprising Pseudomonas aeruginosa , Pseudomonas protegens and Klebsiella pneumoniae was adapted to study how interspecies interactions affect biofilm development, structure and stress responses. Each species was fluorescently tagged to determine its abundance and spatial localization within the biofilm. The mixed-species biofilm exhibited distinct structures that were not observed in comparable single-species biofilms. In addition, development of the mixed-species biofilm was delayed 1–2 days compared with the single-species biofilms. Composition and spatial organization of the mixed-species biofilm also changed along the flow cell channel, where nutrient conditions and growth rate of each species could have a part in community assembly. Intriguingly, the mixed-species biofilm was more resistant to the antimicrobials sodium dodecyl sulfate and tobramycin than the single-species biofilms. Crucially, such community level resilience was found to be a protection offered by the resistant species to the whole community rather than selection for the resistant species. In contrast, community-level resilience was not observed for mixed-species planktonic cultures. These findings suggest that community-level interactions, such as sharing of public goods, are unique to the structured biofilm community, where the members are closely associated with each other.
An integer-order SIS epidemic model having variable population and fear effect: comparing the stability with fractional order
This paper investigates the dynamics of an integer-order and a fractional-order SIS epidemic model with birth in both susceptible and infected populations, constant recruitment, and the effect of fear levels due to infectious diseases. The existence, uniqueness, non-negativity, and boundedness of the solutions for both proposed models have been discussed. We have established the existence of various equilibrium points and derived sufficient conditions that ensure local stability under two cases in both the integer- and fractional-order models. Global stability has been established using the Dulac–Bendixson criterion in the integer-order model. The forward transcritical bifurcation near the disease-free equilibrium has been investigated, and the effect of the fear level on infected density has been observed. We have performed numerical simulations in MATLAB to verify the theoretical results, examined the impact of the fear level on the dynamic behaviour of the infected population, and obtained bifurcation diagrams with respect to the constant recruitment and the fear level. Finally, we have compared the stability of the population in the integer- and fractional-order systems.
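For orientation, a generic integer-order SIS model with constant recruitment and a fear effect, of the kind this abstract describes, can be sketched as follows. This is an illustrative formulation in standard notation; the specific fear term and parameter names are assumptions, not the paper's exact equations:

\begin{aligned}
\frac{dS}{dt} &= \Lambda + b_1 S - \frac{\beta S I}{1 + k I} - \mu S + \gamma I, \\
\frac{dI}{dt} &= b_2 I + \frac{\beta S I}{1 + k I} - (\mu + \gamma) I,
\end{aligned}

where \Lambda is the constant recruitment rate, b_1 and b_2 are birth rates into the susceptible and infected classes, \beta is the transmission rate, k \ge 0 is the fear level (damping transmission as infection rises), \gamma is the recovery rate, and \mu is natural mortality. Setting k = 0 recovers a classical SIS model, which is why the fear level enters the stability conditions only through the incidence term.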
Interspecific diversity reduces and functionally substitutes for intraspecific variation in biofilm communities
Diversity has a key role in the dynamics and resilience of communities and both interspecific (species) and intraspecific (genotypic) diversity can have important effects on community structure and function. However, a critical and unresolved question for understanding the ecology of a community is to what extent these two levels of diversity are functionally substitutable? Here we show, for a mixed-species biofilm community composed of Pseudomonas aeruginosa , P. protegens and Klebsiella pneumoniae, that increased interspecific diversity reduces and functionally substitutes for intraspecific diversity in mediating tolerance to stress. Biofilm populations generated high percentages of genotypic variants, which were largely absent in biofilm communities. Biofilms with either high intra- or interspecific diversity were more tolerant to SDS stress than biofilms with no or low diversity. Unexpectedly, genotypic variants decreased the tolerance of biofilm communities when experimentally introduced into the communities. For example, substituting P. protegens wild type with its genotypic variant within biofilm communities decreased SDS tolerance by twofold, apparently due to perturbation of interspecific interactions. A decrease in variant frequency was also observed when biofilm populations were exposed to cell-free effluents from another species, suggesting that extracellular factors have a role in selection against the appearance of intraspecific variants. This work demonstrates the functional substitution of inter- and intraspecific diversity for an emergent property of biofilms. It also provides a potential explanation for a long-standing paradox in microbiology, in which morphotypic variants are common in laboratory grown biofilm populations, but are rare in diverse, environmental biofilm communities.
Soil phyllosilicate and iron oxide inhibit the quorum sensing of Chromobacterium violaceum
Microorganisms respond to various adverse environmental conditions and regulate different physiological functions by secreting and sensing signal molecules through quorum sensing (QS) systems. Phyllosilicates and iron oxides present in soils and sediments may have a substantial impact on bacterial activity and QS due to their unique reactivity and close association with microorganisms. This research explored the effect of goethite, montmorillonite and kaolinite (0.05-2 g L -1) on the growth and QS of a bacterial model, Chromobacterium violaceum. The results showed that kaolinite and goethite caused cellular damage at low mineral concentrations. Violacein production and biofilm formation by C. violaceum were inhibited by the minerals in the order kaolinite > goethite > montmorillonite. The possible underlying mechanisms for QS inhibition by the different minerals were investigated. Specifically, kaolinite repressed QS function through downregulating the expression of the signal-molecule synthesis gene cviI, while goethite and montmorillonite interfered with QS by adsorbing extracellular signal molecules. This work provides a better understanding of the interactions between bacteria and minerals and proposes that inhibition of the QS system is an overlooked mechanism of bacterial toxicity by phyllosilicates and iron oxides.
Inference-Time Safety For Code LLMs Via Retrieval-Augmented Revision
Large Language Models (LLMs) are increasingly deployed for code generation in high-stakes software development, yet their limited transparency in security reasoning and brittleness to evolving vulnerability patterns raise critical trustworthiness concerns. Models trained on static datasets cannot readily adapt to newly discovered vulnerabilities or changing security standards without retraining, leading to the repeated generation of unsafe code. We present a principled approach to trustworthy code generation by design that operates as an inference-time safety mechanism. Our approach employs retrieval-augmented generation to surface relevant security risks in generated code and retrieve related security discussions from a curated Stack Overflow knowledge base, which are then used to guide an LLM during code revision. This design emphasizes three aspects relevant to trustworthiness: (1) interpretability, through transparent safety interventions grounded in expert community explanations; (2) robustness, by allowing adaptation to evolving security practices without model retraining; and (3) safety alignment, through real-time intervention before unsafe code reaches deployment. Across real-world and benchmark datasets, our approach improves the security of LLM-generated code compared to prompting alone, while introducing no new vulnerabilities as measured by static analysis. These results suggest that principled, retrieval-augmented inference-time interventions can serve as a complementary mechanism for improving the safety of LLM-based code generation, and highlight the ongoing value of community knowledge in supporting trustworthy AI deployment.
SOSecure: Safer Code Generation with RAG and StackOverflow Discussions
Large Language Models (LLMs) are widely used for automated code generation. Their reliance on infrequently updated pretraining data leaves them unaware of newly discovered vulnerabilities and evolving security standards, making them prone to producing insecure code. In contrast, developer communities on Stack Overflow (SO) provide an ever-evolving repository of knowledge, where security vulnerabilities are actively discussed and addressed through collective expertise. These community-driven insights remain largely untapped by LLMs. This paper introduces SOSecure, a Retrieval-Augmented Generation (RAG) system that leverages the collective security expertise found in SO discussions to improve the security of LLM-generated code. We build a security-focused knowledge base by extracting SO answers and comments that explicitly identify vulnerabilities. Unlike common uses of RAG, SOSecure triggers after code has been generated to find discussions that identify flaws in similar code. These are used in a prompt to an LLM to consider revising the code. Evaluation across three datasets (SALLM, LLMSecEval, and LMSys) shows that SOSecure achieves strong fix rates of 71.7%, 91.3%, and 96.7% respectively, compared to prompting GPT-4 without relevant discussions (49.1%, 56.5%, and 37.5%), and outperforms multiple other baselines. SOSecure operates as a language-agnostic complement to existing LLMs, without requiring retraining or fine-tuning, making it easy to deploy. Our results underscore the importance of maintaining active developer forums, whose usage has dropped substantially with LLM adoption.
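The retrieve-then-revise flow that SOSecure describes (trigger *after* generation, find discussions flagging flaws in similar code, then prompt an LLM to revise) can be sketched as below. This is a toy illustration under stated assumptions: the knowledge-base entries, function names, and bag-of-words retriever are all hypothetical stand-ins, not the paper's actual corpus, embeddings, or prompts.

```python
import math
from collections import Counter

# Hypothetical security knowledge base: (advice, example of flawed code).
# In SOSecure this would be curated Stack Overflow answers/comments
# that explicitly identify vulnerabilities.
KNOWLEDGE_BASE = [
    ("yaml.load without a Loader can construct arbitrary objects; "
     "use yaml.safe_load instead.", "yaml.load(data)"),
    ("Building SQL by string concatenation enables injection; "
     "use parameterized queries.", "cursor.execute('SELECT ... WHERE id=' + uid)"),
    ("subprocess with shell=True on user input allows command injection.",
     "subprocess.run(cmd, shell=True)"),
]

def _vector(text):
    # Bag-of-words term frequencies; a real system would use code-aware retrieval.
    return Counter(text.lower().replace("(", " ").replace(")", " ").split())

def _cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    denom = (math.sqrt(sum(v * v for v in a.values()))
             * math.sqrt(sum(v * v for v in b.values())))
    return num / denom if denom else 0.0

def retrieve_discussions(generated_code, k=1):
    """Rank knowledge-base entries by similarity to the already-generated code."""
    query = _vector(generated_code)
    scored = [(_cosine(query, _vector(snippet)), advice)
              for advice, snippet in KNOWLEDGE_BASE]
    scored.sort(reverse=True)
    return [advice for score, advice in scored[:k] if score > 0]

def build_revision_prompt(generated_code, discussions):
    """Assemble the post-generation revision prompt sent to the LLM."""
    context = "\n".join(f"- {d}" for d in discussions)
    return (f"Community security notes on similar code:\n{context}\n\n"
            f"Revise the following code to address these issues:\n{generated_code}")

code = "data = yaml.load(user_input)"
prompt = build_revision_prompt(code, retrieve_discussions(code))
```

The key design choice mirrored here is that retrieval is keyed on the generated code itself rather than on the user's request, which is what lets the intervention catch flaws the model did not anticipate.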
Skill over Scale: The Case for Medium, Domain-Specific Models for SE
Recent advancements in AI have sparked a trend in constructing large, generalist language models that handle a multitude of tasks, including many code-related ones. While these models are expensive to train and are often closed-source, they have enjoyed broad adoption because they tend to outperform smaller, domain-specific models of code. In this work, we argue that this is not a foregone conclusion. We show that modestly sized domain-specific models can outperform much larger ones on code labeling tasks, provided they are trained to the same standards. Concretely, we focus on StackOverflow (SO), which offers large volumes of aligned code and text data. We align established best-practices for pre-training large language models with properties of SO as a data source, especially using a large context window (2,048 tokens), coupled with a powerful toolkit (Megatron-LM) to train two models: SOBertBase (125M parameters) and SOBertLarge (762M parameters), at a budget of just $374 and $1600 each. We compare the performance of our models with a prior domain-specific model which did not adopt many of these practices (BERTOverflow), as well as two general-purpose BERT models and two models in OpenAI's GPT series (GPT-3.5 and GPT-4). We study four labeling tasks: question quality prediction, closed question prediction, NER and obsoletion prediction. The final task is a new benchmark we introduce, on which we additionally compare SOBert with a fine-tuned CodeLlama and StackLlama (models with 10x more parameters than SOBertLarge). Our models consistently outperform all baselines. In contrast, BERTOverflow is outperformed by generalist models in most tasks. These results demonstrate that pre-training both extensively and properly on in-domain data can yield a powerful and affordable alternative to leveraging closed-source general-purpose models. Both models are released to the public on Hugging Face.
"Medium" LMs of Code in the Era of LLMs: Lessons From StackOverflow
Large pre-trained neural language models have brought immense progress to both NLP and software engineering. Models in OpenAI's GPT series now dwarf Google's BERT and Meta's RoBERTa, which previously set new benchmarks on a wide range of NLP applications. These models are trained on massive corpora of heterogeneous data from web crawls, which enables them to learn general language patterns and semantic relationships. However, the largest models are both expensive to train and deploy and are often closed-source, so we lack access to their data and design decisions. We argue that this trend towards large, general-purpose models should be complemented with single-purpose, more modestly sized pre-trained models. In this work, we take StackOverflow (SO) as a domain example in which large volumes of rich aligned code and text data are available. We adopt standard practices for pre-training large language models, including using a very large context size (2,048 tokens), batch size (0.5M tokens) and training set (27B tokens), coupled with a powerful toolkit (Megatron-LM), to train two models: SOBertBase, with 109M parameters, and SOBertLarge, with 762M parameters, at a budget of just $187 and $800 each. We compare the performance of our models with the previous SOTA model trained exclusively on SO data, as well as general-purpose BERT models and OpenAI's ChatGPT, on four SO-specific downstream tasks: question quality prediction, closed question prediction, named entity recognition and obsoletion prediction (a new task we introduce). Not only do our models consistently outperform all baselines, but the smaller model is often sufficient for strong results. Both models are released to the public. These results demonstrate that pre-training both extensively and properly on in-domain data can yield a powerful and affordable alternative to leveraging closed-source general-purpose models.