Asset Details
Transformer-Enhanced Text Classification in Cybersecurity: GPT-Augmented Synthetic Data Generation, BERT-Based Semantic Encoding, and Multiclass Analysis
by
Houston, Robert A
in
Artificial intelligence
/ Computer Engineering
/ Engineering
2024
Dissertation
Overview
This research investigates the use of Large Language Models (LLMs) for generative and encoding tasks in the domain of cybersecurity. It demonstrates critical uses of transformer-derived deep learning architectures on two fronts: Generative Pretrained Transformer 2 (GPT-2) is used to generate synthetic data that balances otherwise imbalanced datasets for multi-class cybersecurity classification, and the family of Bidirectional Encoder Representations from Transformers (BERT) models is used to produce word embeddings.

Three factors link these research areas. First, the BERT embeddings are derived from a dataset synthetically balanced by GPT-2's generative capabilities. Second, both BERT and GPT-2 are pretrained models that are subsequently fine-tuned on each minority class of the dataset. Third, both architectures derive from the transformer model introduced in the seminal paper Attention Is All You Need (Vaswani et al., 2017). In this research, the synthetic datasets and the embedding models produced with transformer models are evaluated using traditional Machine Learning (ML) models and a novel weighted aggregation of the F1 score, developed to account for the disparate risk inherent in different classification classes in cybersecurity applications.

Another aspect of this research centers on publicly licensed models with open-source base-training weights. The objective is to analyze lightweight LLMs that are both deployable in computationally constrained environments and usable in secure environments, with no need to pass data to public cloud enclaves via commercial API calls to monetized foundational models.

This research addresses essential needs relevant to using AI models in cybersecurity applications. The primary goals address the need for balanced, high-quality training datasets and for semantically aware Natural Language Processing (NLP) vectorization methods for ML models. As a backdrop, there is an emphasis on lightweight, democratized models that are not monetized and are widely available with open-source model weights.
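The risk-weighted F1 aggregation described above can be sketched as follows. This is a minimal illustrative implementation, not the dissertation's actual metric: the class labels, risk weights, and normalization scheme shown here are hypothetical, since the abstract does not specify them. The idea is simply that per-class F1 scores are combined with weights reflecting the security impact of misclassifying each class, rather than with uniform or support-based weights.

```python
# Sketch of a risk-weighted aggregation of per-class F1 scores.
# Class names and risk weights are hypothetical placeholders.

def per_class_f1(y_true, y_pred, label):
    """Standard one-vs-rest F1 for a single class label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def risk_weighted_f1(y_true, y_pred, risk_weights):
    """Aggregate per-class F1 with weights reflecting the relative risk
    of misclassifying each class; normalized by the total weight."""
    total = sum(risk_weights.values())
    return sum(
        w * per_class_f1(y_true, y_pred, label)
        for label, w in risk_weights.items()
    ) / total

# Hypothetical attack categories, with higher weight on higher-risk classes.
weights = {"benign": 1.0, "phishing": 2.0, "malware": 3.0}
y_true = ["benign", "malware", "phishing", "benign", "malware"]
y_pred = ["benign", "malware", "benign", "benign", "malware"]
print(round(risk_weighted_f1(y_true, y_pred, weights), 3))  # prints 0.633
```

Because the weight on a rare but dangerous class (here, "malware") is larger, a model that misses that class is penalized more heavily than the ordinary macro-averaged F1 would penalize it.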
Publisher
ProQuest Dissertations & Theses
ISBN
9798381111774