
BERTweet

Pre-trained Language Model Optimized for English Tweets

Product Description

BERTweet is a large-scale language model pre-trained for English Tweets using the RoBERTa pre-training procedure. It is trained on a corpus of 850 million tweets, including 5 million tweets related to the COVID-19 pandemic, to improve performance on Tweet-based NLP tasks. The pre-trained models, such as `bertweet-base` and `bertweet-large`, can be loaded with the `transformers` and `fairseq` libraries for use in deep learning applications. BERTweet also relies on a tweet normalization step, which makes noisy tweet text more consistent for downstream analysis and prediction, supporting both research and practical usage.
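As a rough illustration of the normalization idea (a simplified sketch, not the official BERTweet normalizer, which also handles emoji and tokenization details), user mentions are mapped to the special token `@USER` and URLs to `HTTPURL`:

```python
import re

def normalize_tweet(tweet: str) -> str:
    """Simplified BERTweet-style tweet normalization:
    user mentions become @USER and URLs become HTTPURL."""
    tweet = re.sub(r"@\w+", "@USER", tweet)          # mask user mentions
    tweet = re.sub(r"https?://\S+", "HTTPURL", tweet)  # mask URLs
    return tweet

print(normalize_tweet("@jack check this out https://example.com :)"))
# → @USER check this out HTTPURL :)
```

When using `transformers`, the BERTweet tokenizer can apply this kind of normalization automatically by passing `normalization=True` to `AutoTokenizer.from_pretrained("vinai/bertweet-base")`.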
Project Details