Introduction to KerasHub
KerasHub, formerly known as KerasNLP, is a library of pre-trained models and modeling components for natural language processing (NLP), computer vision, audio, and multimodal tasks. It runs on TensorFlow, JAX, and PyTorch, so the same model code can be developed and deployed on whichever framework suits the project.
What Makes KerasHub Unique?
KerasHub provides an extensive repository of pre-trained models alongside a collection of foundational building blocks for various machine learning tasks. Because it is built on Keras 3, a model can be trained and saved in one framework and then reused in another without an extensive migration.
Core Features
- Multi-Framework Support: A single model definition works across JAX, TensorFlow, and PyTorch, and runs efficiently on both GPUs and TPUs.
- Scalable Fine-tuning: Models can be fine-tuned at scale using model-parallel and data-parallel training, which lets them adapt to different hardware accelerators.
- User-Friendly Architecture: KerasHub extends the core Keras API, and its components are ordinary Keras Layers and Models, so anyone already familiar with Keras can pick it up quickly.
Getting Started with KerasHub
For newcomers, KerasHub offers a simple yet powerful Python API. For example, the following script fine-tunes a BERT classifier on IMDb movie reviews:
import os
os.environ["KERAS_BACKEND"] = "jax"  # Or "tensorflow" or "torch"!

import keras_hub
import tensorflow_datasets as tfds

imdb_train, imdb_test = tfds.load(
    "imdb_reviews",
    split=["train", "test"],
    as_supervised=True,
    batch_size=16,
)

# Load a BERT model.
classifier = keras_hub.models.Classifier.from_preset(
    "bert_base_en",
    num_classes=2,
    activation="softmax",
)

# Fine-tune on IMDb movie reviews.
classifier.fit(imdb_train, validation_data=imdb_test)

# Predict two new examples.
classifier.predict(["What an amazing movie!", "A total waste of my time."])
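Because the classifier is created with num_classes=2 and a softmax activation, predict should return one row per input with one probability per class. The sketch below shows one way to turn such rows into readable labels. The probability values are invented for illustration, and the helper name to_labels is hypothetical; only the label order (0 = negative, 1 = positive) comes from the imdb_reviews dataset.

```python
# Hypothetical predict() output for the two sentences above: one row per
# input, one softmax probability per class. These numbers are made up.
probabilities = [
    [0.03, 0.97],
    [0.91, 0.09],
]

# imdb_reviews encodes label 0 as negative and label 1 as positive.
LABEL_NAMES = ["negative", "positive"]


def to_labels(rows):
    """Map each row of class probabilities to the name of its largest entry."""
    labels = []
    for row in rows:
        best_index = max(range(len(row)), key=row.__getitem__)
        labels.append(LABEL_NAMES[best_index])
    return labels


labels = to_labels(probabilities)
print(labels)
```

With a real NumPy array from predict, calling argmax along the last axis would give the same class indices more directly.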
For more detailed guides and examples, see the KerasHub documentation and website.
Installation and Backend Configuration
To install the latest KerasHub release, run:
pip install keras-hub
KerasHub is designed to be used alongside TensorFlow for data preprocessing via the tf.data API, though training can be conducted on any backend. Set the preferred backend (JAX, TensorFlow, or PyTorch) through the KERAS_BACKEND environment variable before importing any Keras libraries, because the backend cannot be changed once Keras has been imported.
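The ordering requirement is easy to get wrong in larger scripts, so a small guard can help. The following is a minimal sketch using only the standard library; the helper name select_backend and its error messages are hypothetical, not part of Keras or KerasHub. It assumes the three backend strings used earlier ("jax", "tensorflow", "torch").

```python
import os
import sys

# Backend names accepted by the KERAS_BACKEND environment variable.
SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")


def select_backend(name):
    """Set KERAS_BACKEND, rejecting unknown names and calls made too late."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    # Keras reads the variable at import time, so setting it afterwards
    # would silently have no effect.
    if "keras" in sys.modules:
        raise RuntimeError("Keras is already imported; select the backend first.")
    os.environ["KERAS_BACKEND"] = name
    return name


select_backend("jax")
# Import Keras libraries only after this point, for example:
# import keras_hub
```

Failing loudly here is a design choice: a late call would otherwise leave the script running on a different backend than the one requested.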
Compatibility and Contribution
KerasHub follows Semantic Versioning principles, aiming to offer backward compatibility for both code and saved models. However, the library is still in its 0.y.z pre-release phase and under active development, so occasional changes may affect compatibility.
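Under Semantic Versioning, a 0.y.z version makes no stability promise, so any upgrade within that range deserves a look at the release notes. The sketch below illustrates that rule with a conservative check. The function names and the version strings are hypothetical, and it assumes plain "MAJOR.MINOR.PATCH" strings without suffixes such as "rc1".

```python
def parse_version(version):
    """Split a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_pre_release(version):
    """Major version zero marks the 0.y.z initial-development phase."""
    return parse_version(version)[0] == 0


def upgrade_may_break(installed, candidate):
    """Return True when the upgrade is allowed to break compatibility.

    In the 0.y.z phase any version change may break; from 1.0.0 onward
    only a change of major version is expected to.
    """
    old, new = parse_version(installed), parse_version(candidate)
    if old == new:
        return False
    if is_pre_release(installed) or is_pre_release(candidate):
        return True
    return old[0] != new[0]


# Illustrative version numbers, not actual KerasHub releases.
print(upgrade_may_break("0.17.0", "0.18.0"))
print(upgrade_may_break("1.2.0", "1.3.0"))
```

In practice, pinning an exact version in a requirements file is the simplest way to act on this while the library remains in pre-release.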
The project welcomes community contributions, including ideas and improvements from developers worldwide. Detailed guides are available for those interested in contributing.
Acknowledgements
KerasHub owes its success to the numerous contributors who have dedicated their time and expertise to improve and expand its capabilities. The platform's vibrant community continues to grow, enhancing its value and application across different domains.
In summary, KerasHub is an adaptable and scalable library that combines multi-framework compatibility with a rich repository of pre-trained models, making it a practical choice for NLP, computer vision, and beyond.