Introduction to the How-to-use-Transformers Project
The How-to-use-Transformers project is an essential resource for anyone interested in natural language processing (NLP) using the Transformers library developed by Hugging Face. This Python library supports many pre-trained language models, including popular ones like BERT and GPT, making it a favorite among developers building NLP applications.
The project serves as a code repository for the tutorial "Quick Start with the Transformers Library". It is organized into two main sections:
- Data: Contains the datasets used throughout the tutorials.
- Source: Houses the example code, organized by task into individual folders so that each example can be downloaded and run independently.
Quick Start with the Transformers Library
The tutorial is structured into four major parts:
Part One: Background Knowledge
- Natural Language Processing: An introduction to NLP and its applications.
- Transformer Models: An overview of transformer models and their significance.
- Attention Mechanism: Understanding the attention mechanism at the heart of transformers; see the formula sketch after this list.
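For orientation, the core computation behind the attention mechanism is scaled dot-product attention, where Q, K, and V are the query, key, and value matrices and d_k is the key dimension:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$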
Part Two: Getting to Know Transformers
- Pipelines: Run pre-built models out of the box; a minimal sketch follows this list.
- Models and Tokenizers: Dive into the core components of the library.
- Basic PyTorch Knowledge: The PyTorch fundamentals needed to use Transformers.
- Fine-tuning Pre-trained Models: Adapt pre-trained models to specific tasks.
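As a taste of what this part covers, here is a minimal sketch of the pipeline and tokenizer/model APIs (the bert-base-uncased checkpoint is illustrative; any compatible model works):

```python
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

# A pipeline bundles a tokenizer and a model behind a single call.
classifier = pipeline("sentiment-analysis")
print(classifier("Transformers makes NLP easy to use."))

# The same pieces can also be loaded and driven explicitly.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
inputs = tokenizer("Transformers makes NLP easy to use.", return_tensors="pt")
outputs = model(**inputs)
# Note: this classification head is freshly initialized, so the logits are
# untrained until the model is fine-tuned (the topic of the last chapter here).
print(outputs.logits)
```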
Part Three: Practical Applications
- Fast Tokenizers: Efficient tokenization with token-to-character offset tracking; see the sketch after this list.
- Sequence Labeling Tasks: Implement tasks such as named entity recognition.
- Translation Tasks: Set up models for language translation.
- Text Summarization Tasks: Generate concise summaries of texts.
- Extractive QA: Build systems that answer questions by extracting spans from a text.
- Prompting for Sentiment Analysis: Use prompts to perform sentiment analysis.
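The offset mapping exposed by fast tokenizers is what makes span-based tasks such as NER and extractive QA practical. A minimal sketch (the checkpoint name is illustrative):

```python
from transformers import AutoTokenizer

# Fast (Rust-backed) tokenizers can map every token back to character offsets.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", use_fast=True)
encoding = tokenizer("Hugging Face is based in New York City.",
                     return_offsets_mapping=True)

for token, (start, end) in zip(encoding.tokens(), encoding["offset_mapping"]):
    print(token, (start, end))  # special tokens like [CLS] map to (0, 0)
```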
Part Four: The Era of Large Language Models
- Introduction to Large Language Models (LLMs): An overview of LLMs.
- Pre-training LLMs: Insight into the pre-training process.
- Using LLMs: Strategies for employing LLMs effectively; a small generation sketch follows this list.
- Instruction Tuning for FlanT5: Fine-tune the FlanT5 model on instruction data.
- Instruction Tuning for Llama2: Fine-tune the Llama2 model on instruction data.
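To give a flavor of this part, here is a minimal sketch of prompting an instruction-tuned model through the generation API (google/flan-t5-small is used only because it is a small public checkpoint):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load a small instruction-tuned sequence-to-sequence model.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

prompt = "Translate to German: How old are you?"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate a response to the instruction.
output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```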
Sample Code
The repository includes several example projects:
- Pairwise Classification AFQMC: A sentence-pair classification task that judges whether two financial sentences have the same meaning.
- Sequence Labeling NER CPD: A named entity recognition task.
- Seq2Seq Translation: Handles Chinese-English translation.
- Seq2Seq Summarization: Generates summaries of texts.
- Extractive QA CMRC: Builds an extractive question answering system; a pipeline-based sketch follows this list.
- Text Classification Prompt Sentiment CHNSentiCorp: Sentiment analysis using prompts.
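For a quick end-to-end picture of what the extractive QA project does, the same task can be sketched with a pipeline (the English SQuAD checkpoint below is illustrative; the repository's project trains on the Chinese CMRC data):

```python
from transformers import pipeline

# An extractive QA model selects the answer span from a given context.
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

result = qa(
    question="What does the Transformers library provide?",
    context="The Transformers library provides pre-trained models "
            "for text, vision, and audio tasks.",
)
print(result["answer"], result["score"])
```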
Important Updates
- 2024-07-06: Improved the text presentation in Chapter 1 on Natural Language Processing, added visuals, and included an introduction to large language models.
- 2024-07-27: Completed the initial draft of chapters 14 to 16 covering the introduction and use of large language models.