315 result(s) for "Online social networks Research Data processing."
Subjective well-being and social media
"Subjective Well-Being and Social Media shows how, by exploiting the unprecedented amount of information provided by the social networking sites, it is possible to build new composite indicators of subjective well-being. These new social media indicators are complementary to official statistics and surveys, whose data are collected at very low temporary and geographical resolution. The book also explains in full details how to solve the problem of selection bias coming from social media data. Mixing textual analysis, machine learning and time series analysis, the book also shows how to extract both the structural and the temporary components of subjective well-being. Cross-country analysis confirms that well-being is a complex phenomenon that is governed by macroeconomic and health factors, ageing, temporary shocks and cultural and psychological aspects. As an example, the last part of the book focuses on the impact of the prolonged stress due to the COVID-19 pandemic on subjective well-being in both Japan and Italy. Through a data science approach, the results show that a consistent and persistent drop occurred throughout 2020 in the overall level of well-being in both countries. The methodology presented in this book: enables social scientists and policy makers to know what people think about the quality of their own life, minimizing the bias induced by the interaction between the researcher and the observed individuals; being language-free, it allows for comparing the well-being perceived in different linguistic and socio-cultural contexts, disentangling differences due to objective events and life conditions from dissimilarities related to social norms or language specificities; provides a solution to the problem of selection bias in social media data through a systematic approach based on time-space small area estimation models. The book comes also with replication R scripts and data. Stefano M. Iacus is full professor of Statistics at the University of Milan, on leave at the Joint Research Centre of the European Commission. Former R-core member (1999-2017) and R Foundation Member. Giuseppe Porro is full professor of Economic Policy at the University of Insubria. An earlier version of this project was awarded the Italian Institute of Statistics-Google prize for 'official statistics and big data'"-- Provided by publisher.
A performance comparison of supervised machine learning models for Covid-19 tweets sentiment analysis
The spread of Covid-19 has resulted in worldwide health concerns. Social media is increasingly used to share news and opinions about it. A realistic assessment of the situation is necessary to utilize resources optimally and appropriately. In this research, we perform Covid-19 tweets sentiment analysis using a supervised machine learning approach. Identification of Covid-19 sentiments from tweets would allow informed decisions for better handling the current pandemic situation. The dataset used is extracted from Twitter using IDs as provided by the IEEE data port. Tweets are extracted by an in-house built crawler that uses the Tweepy library. The dataset is cleaned using the preprocessing techniques and sentiments are extracted using the TextBlob library. The contribution of this work is the performance evaluation of various machine learning classifiers using our proposed feature set. This set is formed by concatenating the bag-of-words and the term frequency-inverse document frequency. Tweets are classified as positive, neutral, or negative. Performance of classifiers is evaluated on the accuracy, precision, recall, and F1 score. For completeness, further investigation is made on the dataset using the Long Short-Term Memory (LSTM) architecture of the deep learning model. The results show that Extra Trees Classifiers outperform all other models by achieving a 0.93 accuracy score using our proposed concatenated features set. The LSTM achieves low accuracy as compared to machine learning classifiers. To demonstrate the effectiveness of our proposed feature set, the results are compared with the Vader sentiment analysis technique based on the GloVe feature extraction approach.
Advertising Content and Consumer Engagement on Social Media: Evidence from Facebook
We describe the effect of social media advertising content on customer engagement using data from Facebook. We content-code 106,316 Facebook messages across 782 companies, using a combination of Amazon Mechanical Turk and natural language processing algorithms. We use this data set to study the association of various kinds of social media marketing content with user engagement—defined as Likes, comments, shares, and click-throughs—with the messages. We find that inclusion of widely used content related to brand personality—like humor and emotion—is associated with higher levels of consumer engagement (Likes, comments, shares) with a message. We find that directly informative content—like mentions of price and deals—is associated with lower levels of engagement when included in messages in isolation, but higher engagement levels when provided in combination with brand personality–related attributes. Also, certain directly informative content, such as deals and promotions, drive consumers’ path to conversion (click-throughs). These results persist after incorporating corrections for the nonrandom targeting of Facebook’s EdgeRank (News Feed) algorithm and so reflect more closely user reaction to content than Facebook’s behavioral targeting. Our results suggest that there are benefits to content engineering that combines informative characteristics that help in obtaining immediate leads (via improved click-throughs) with brand personality–related content that helps in maintaining future reach and branding on the social media site (via improved engagement). These results inform content design strategies. Separately, the methodology we apply to content-code text is useful for future studies utilizing unstructured data such as advertising content or product reviews. The online appendix is available at https://doi.org/10.1287/mnsc.2017.2902. This paper was accepted by Chris Forman, information systems.
Word2vec convolutional neural networks for classification of news articles and tweets
Big web data from sources including online news and Twitter are good resources for investigating deep learning. However, collected news articles and tweets almost certainly contain data unnecessary for learning, and this disturbs accurate learning. This paper explores the performance of word2vec Convolutional Neural Networks (CNNs) to classify news articles and tweets into related and unrelated ones. Using two word embedding algorithms of word2vec, Continuous Bag-of-Word (CBOW) and Skip-gram, we constructed CNN with the CBOW model and CNN with the Skip-gram model. We measured the classification accuracy of CNN with CBOW, CNN with Skip-gram, and CNN without word2vec models for real news articles and tweets. The experimental results indicated that word2vec significantly improved the accuracy of the classification model. The accuracy of the CBOW model was higher and more stable when compared to that of the Skip-gram model. The CBOW model exhibited better performance on news articles, and the Skip-gram model exhibited better performance on tweets. Specifically, CNN with word2vec models was more effective on news articles when compared to that on tweets because news articles are typically more uniform when compared to tweets.
Social influence and political mobilization: Further evidence from a randomized experiment in the 2012 U.S. presidential election
A large-scale experiment during the 2010 U.S. Congressional Election demonstrated a positive effect of an online get-out-the-vote message on real world voting behavior. Here, we report results from a replication of the experiment conducted during the U.S. Presidential Election in 2012. In spite of the fact that get-out-the-vote messages typically yield smaller effects during high-stakes elections due to saturation of mobilization efforts from many sources, a significant increase in voting was again observed. Voting also increased significantly among the close friends of those who received the message to go to the polls, and the total effect on the friends was likely larger than the direct effect, suggesting that understanding social influence effects is potentially even more important than understanding the direct effects of messaging. These results replicate earlier work and they add to growing evidence that online social networks can be instrumental for spreading offline behaviors.
Methods for Analyzing the Contents of Social Media for Health Care: Scoping Review
Given the rapid development of social media, effective extraction and analysis of the contents of social media for health care have attracted widespread attention from health care providers. As far as we know, most of the reviews focus on the application of social media, and there is a lack of reviews that integrate the methods for analyzing social media information for health care. This scoping review aims to answer the following 4 questions: (1) What types of research have been used to investigate social media for health care, (2) what methods have been used to analyze the existing health information on social media, (3) what indicators should be applied to collect and evaluate the characteristics of methods for analyzing the contents of social media for health care, and (4) what are the current problems and development directions of methods used to analyze the contents of social media for health care? A scoping review following Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines was conducted. We searched PubMed, the Web of Science, EMBASE, the Cumulative Index to Nursing and Allied Health Literature, and the Cochrane Library for the period from 2010 to May 2023 for primary studies focusing on social media and health care. Two independent reviewers screened eligible studies against inclusion criteria. A narrative synthesis of the included studies was conducted. Of 16,161 identified citations, 134 (0.8%) studies were included in this review. These included 67 (50.0%) qualitative designs, 43 (32.1%) quantitative designs, and 24 (17.9%) mixed methods designs. 
The applied research methods were classified based on the following aspects: (1) manual analysis methods (content analysis methodology, grounded theory, ethnography, classification analysis, thematic analysis, and scoring tables) and computer-aided analysis methods (latent Dirichlet allocation, support vector machine, probabilistic clustering, image analysis, topic modeling, sentiment analysis, and other natural language processing technologies), (2) categories of research contents, and (3) health care areas (health practice, health services, and health education). Based on an extensive literature review, we investigated the methods for analyzing the contents of social media for health care to determine the main applications, differences, trends, and existing problems. We also discussed the implications for the future. Traditional content analysis is still the mainstream method for analyzing social media content, and future research may be combined with big data research. With the progress of computers, mobile phones, smartwatches, and other smart devices, social media information sources will become more diversified. Future research can combine new sources, such as pictures, videos, and physiological signals, with online social networking to adapt to the development trend of the internet. More medical information talents need to be trained in the future to better solve the problem of network information analysis. Overall, this scoping review can be useful for a large audience that includes researchers entering the field.
Investigating the Efficient Use of Word Embedding with Neural-Topic Models for Interpretable Topics from Short Texts
With the rapid proliferation of social networking sites (SNS), automatic topic extraction from various text messages posted on SNS is becoming an important source of information for understanding current social trends or needs. Latent Dirichlet Allocation (LDA), a probabilistic generative model, is one of the popular topic models in the area of Natural Language Processing (NLP) and has been widely used in information retrieval, topic extraction, and document analysis. Unlike long texts from formal documents, messages on SNS are generally short. Traditional topic models such as LDA or pLSA (probabilistic latent semantic analysis) suffer performance degradation for short-text analysis due to a lack of word co-occurrence information in each short text. To cope with this problem, various techniques are evolving for interpretable topic modeling for short texts; pretrained word embedding with an external corpus combined with topic models is one of them. Due to recent developments of deep neural networks (DNN) and deep generative models, neural-topic models (NTM) are emerging to achieve flexibility and high performance in topic modeling. However, there are very few research works on neural-topic models with pretrained word embedding for generating high-quality topics from short texts. In this work, in addition to pretrained word embedding, a fine-tuning stage with an original corpus is proposed for training neural-topic models in order to generate semantically coherent, corpus-specific topics. An extensive study with eight neural-topic models has been completed to check the effectiveness of additional fine-tuning and pretrained word embedding in generating interpretable topics by simulation experiments with several benchmark datasets. The extracted topics are evaluated by different metrics of topic coherence and topic diversity. We have also studied the performance of the models in classification and clustering tasks. 
Our study concludes that though auxiliary word embedding with a large external corpus improves the topic coherency of short texts, an additional fine-tuning stage is needed for generating more corpus-specific topics from short-text data.
Reproducible molecular networking of untargeted mass spectrometry data using GNPS
Global Natural Product Social Molecular Networking (GNPS) is an interactive online small molecule–focused tandem mass spectrometry (MS²) data curation and analysis infrastructure. It is intended to provide as much chemical insight as possible into an untargeted MS² dataset and to connect this chemical insight to the user’s underlying biological questions. This can be performed within one liquid chromatography (LC)-MS² experiment or at the repository scale. GNPS-MassIVE is a public data repository for untargeted MS² data with sample information (metadata) and annotated MS² spectra. These publicly accessible data can be annotated and updated with the GNPS infrastructure keeping a continuous record of all changes. This knowledge is disseminated across all public data; it is a living dataset. Molecular networking—one of the main analysis tools used within the GNPS platform—creates a structured data table that reflects the molecular diversity captured in tandem mass spectrometry experiments by computing the relationships of the MS² spectra as spectral similarity. This protocol provides step-by-step instructions for creating reproducible, high-quality molecular networks. For training purposes, the reader is led through a 90- to 120-min procedure that starts by recalling an example public dataset and its sample information and proceeds to creating and interpreting a molecular network. Each data analysis job can be shared or cloned to disseminate the knowledge gained, thus propagating information that can lead to the discovery of molecules, metabolic pathways, and ecosystem/community interactions. Global Natural Product Social Molecular Networking (GNPS) is an online tandem mass spectrometry (MS²) data curation and analysis infrastructure. This protocol describes how to use GNPS to explore uploaded metabolomics data.
Forecasting influenza in Hong Kong with Google search queries and statistical model fusion
The objective of this study is to investigate the predictive utility of online social media and web search queries, particularly Google search data, to forecast new cases of influenza-like-illness (ILI) in general outpatient clinics (GOPC) in Hong Kong. To mitigate the impact of sensitivity to self-excitement (i.e., fickle media interest) and other artifacts of online social media data, in our approach we fuse multiple offline and online data sources. Four individual models: generalized linear model (GLM), least absolute shrinkage and selection operator (LASSO), autoregressive integrated moving average (ARIMA), and deep learning (DL) with Feedforward Neural Networks (FNN) are employed to forecast ILI-GOPC both one week and two weeks in advance. The covariates include Google search queries, meteorological data, and previously recorded offline ILI. To our knowledge, this is the first study that introduces deep learning methodology into surveillance of infectious diseases and investigates its predictive utility. Furthermore, to exploit the strength of each individual forecasting model, we use statistical model fusion, using Bayesian model averaging (BMA), which allows a systematic integration of multiple forecast scenarios. For each model, an adaptive approach is used to capture the recent relationship between ILI and covariates. DL with FNN appears to deliver the most competitive predictive performance among the four considered individual models. Combining all four models in a comprehensive BMA framework further improves predictive evaluation metrics such as root mean squared error (RMSE) and mean absolute predictive error (MAPE). Nevertheless, DL with FNN remains the preferred method for predicting locations of influenza peaks. The proposed approach can be viewed as a feasible alternative to forecast ILI in Hong Kong or other countries where ILI has no constant seasonal trend and influenza data resources are limited. 
The proposed methodology is easily tractable and computationally efficient.
A Comparative Analysis on Suicidal Ideation Detection Using NLP, Machine, and Deep Learning
Social networks are essential resources to obtain information about people’s opinions and feelings towards various issues as they share their views with their friends and family. Suicidal ideation detection via online social network analysis has emerged as an essential research topic with significant difficulties in the fields of NLP and psychology in recent years. With the proper exploitation of the information in social media, the complicated early symptoms of suicidal ideations can be discovered and hence, it can save many lives. This study offers a comparative analysis of multiple machine learning and deep learning models to identify suicidal thoughts from the social media platform Twitter. The principal purpose of our research is to achieve better model performance than prior research works to recognize early indications with high accuracy and avoid suicide attempts. We applied text pre-processing and feature extraction approaches such as CountVectorizer and word embedding, and trained several machine learning and deep learning models for such a goal. Experiments were conducted on a dataset of 49,178 instances retrieved from live tweets by 18 suicidal and non-suicidal keywords using Python Tweepy API. Our experimental findings reveal that the RF model can achieve the highest classification score among machine learning algorithms, with an accuracy of 93% and an F1 score of 0.92. However, training the deep learning classifiers with word embedding increases the performance of ML models, where the BiLSTM model reaches an accuracy of 93.6% and a 0.93 F1 score.