Journal Article

Comparison of different feature extraction methods for applicable automated ICD coding

2022
Overview
Background: Automated ICD coding of medical texts via machine learning has been a hot topic. Related studies in the medical field rely heavily on conventional bag-of-words (BoW) for feature extraction and do not commonly use more complex methods, such as word2vec (W2V) and large pretrained models like BERT. This study aimed to identify the most effective feature extraction methods for coding models by comparing BoW, W2V and BERT variants.

Methods: We experimented with a Chinese dataset from Fuwai Hospital, which contains 6947 records and 1532 unique ICD codes, and a public Spanish dataset, which contains 1000 records and 2557 unique ICD codes. We designed coding tasks with different code frequency thresholds (denoted f_s), with a lower threshold indicating a more complex task. Using traditional classifiers, we compared BoW, W2V and BERT variants on these coding tasks.

Results: When f_s was equal to or greater than 140 for the Fuwai dataset and 60 for the Spanish dataset, the BERT variants with the whole network fine-tuned were the best method, reaching a Micro-F1 of 93.9% on the Fuwai dataset when f_s = 200 and a Micro-F1 of 85.41% on the Spanish dataset when f_s = 180. When f_s fell below 140 for the Fuwai dataset and 60 for the Spanish dataset, BoW turned out to be the best, reaching a Micro-F1 of 83% on the Fuwai dataset when f_s = 20 and a Micro-F1 of 39.1% on the Spanish dataset when f_s = 20. Our experiments also showed that both the BERT variants and BoW possessed good interpretability, which is important for medical applications of coding models.

Conclusions: This study sheds light on building promising machine learning models for automated ICD coding by revealing the most effective feature extraction methods. Concretely, our results indicate that fine-tuning the whole network of the BERT variants was the optimal method for tasks covering only frequent codes, especially codes that represent unspecified diseases, while BoW was the best for tasks involving both frequent and infrequent codes. The frequency threshold at which the best-performing method changed differed between datasets due to factors such as language and codeset.
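
The role of the code frequency threshold f_s and of the Micro-F1 score can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy records, and two details are assumptions made here for illustration, namely that a code is kept when it occurs in at least f_s records, and that records left without any kept code are dropped from the task.

```python
from collections import Counter


def filter_by_code_frequency(records, f_s):
    """Build a coding task that only covers codes seen in at least f_s records.

    `records` is a list of (text, codes) pairs. The ">= f_s" rule and the
    dropping of records with no remaining code are assumptions of this sketch.
    """
    # Count in how many records each code appears (once per record).
    counts = Counter(code for _, codes in records for code in set(codes))
    kept = {code for code, n in counts.items() if n >= f_s}

    filtered = []
    for text, codes in records:
        remaining = sorted(set(codes) & kept)
        if remaining:
            filtered.append((text, remaining))
    return filtered, kept


def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1: pool TP, FP and FN over all records and codes."""
    tp = fp = fn = 0
    for true_codes, pred_codes in zip(true_sets, pred_sets):
        t, p = set(true_codes), set(pred_codes)
        tp += len(t & p)   # codes predicted and correct
        fp += len(p - t)   # codes predicted but not assigned
        fn += len(t - p)   # codes assigned but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data, invented for illustration only.
records = [
    ("chest pain on exertion", ["I20.9", "I10"]),
    ("hypertension follow-up", ["I10"]),
    ("heart failure admission", ["I50.9", "I10"]),
    ("stable angina", ["I20.9"]),
]

filtered, kept = filter_by_code_frequency(records, f_s=2)
print(len(filtered), sorted(kept))  # 4 ['I10', 'I20.9']

score = micro_f1(
    [{"I10", "I20.9"}, {"I10"}],   # reference codes
    [{"I10"}, {"I10", "I50.9"}],   # hypothetical model output
)
print(round(score, 3))  # 0.667
```

Lowering f_s in this sketch admits rarer codes into the label set, which matches the abstract's description of a lower threshold producing a more complex task. Because Micro-F1 pools counts over all codes, frequent codes dominate the score.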