Introducing the Dolly Project
Databricks has developed an exciting project called Dolly, a large language model designed to follow instructions effectively. Unlike many other AI models, Dolly is licensed for commercial use, which means businesses can leverage its capabilities without worrying about licensing barriers.
What is Dolly?
Dolly is built upon a model known as pythia-12b and has been trained on about 15,000 instruction-response pairs. These pairs come from a dataset known as databricks-dolly-15k. The instructions in this set cover several areas the AI can assist with, such as brainstorming, classification, question answering, and summarization. Although Dolly might not top the charts in terms of state-of-the-art performance, it stands out for its ability to closely follow instructions, yielding high-quality results from simple prompts.
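Each record in databricks-dolly-15k pairs an instruction with a human-written response, along with an optional context passage and a category label. A minimal sketch of parsing one such record (the field names match the published dataset; the values below are invented for illustration):

```python
import json

# A single dataset record in the JSON shape used by databricks-dolly-15k.
# Field names (instruction, context, response, category) match the dataset;
# the values here are invented for illustration.
raw = '''{"instruction": "What is nuclear fission?",
          "context": "",
          "response": "Nuclear fission is the splitting of a heavy atomic nucleus.",
          "category": "open_qa"}'''

record = json.loads(raw)
print(record["category"])     # the task type this example trains
print(record["instruction"])  # the prompt the model learns to follow
```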
Commitment to AI Accessibility
Databricks envisions a future where artificial intelligence's transformative power benefits everyone equally. Dolly is an important step in this direction, offering its instruction-following capabilities to a wide audience. The model is openly accessible on Hugging Face, a popular platform for AI models, under the name databricks/dolly-v2-12b.
Model Architecture and Training
Dolly is a causal language model with 12 billion parameters. It stems from EleutherAI’s Pythia-12b model but has been fine-tuned specifically for instruction-based tasks using the data provided by Databricks employees. This fine-tuning process has equipped Dolly with a unique ability to carry out instructions, differentiating it from its foundational model.
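During fine-tuning, each instruction-response pair is rendered into a fixed prompt template so the model learns where the instruction ends and its answer should begin. A minimal sketch of such a template (the exact wording Dolly uses may differ; this follows the common Alpaca-style layout):

```python
# Alpaca-style instruction template. Dolly's actual training template may
# differ in wording, but the structure (intro, instruction marker, response
# marker) is the same idea.
INTRO = ("Below is an instruction that describes a task. "
         "Write a response that appropriately completes the request.")

def build_prompt(instruction: str) -> str:
    """Render a raw instruction into the fine-tuning prompt format."""
    return f"{INTRO}\n\n### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_prompt("Summarize the Dolly project in one sentence.")
print(prompt)
```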
Limitations and Challenges
Performance Constraints
Dolly isn't the most advanced generative language model available today. Benchmarking is ongoing, but it doesn't yet perform on par with newer models trained on larger datasets or built on more modern architectures. Among its limitations, Dolly might struggle with complex sentence structures, programming tasks, math problems, factual consistency, and humor.
Dataset Constraints
Every language model reflects the data it is trained on, and Dolly is no exception. Its training set, derived from Internet sources and created by Databricks employees, might mirror the biases contained in those sources. For instance, Wikipedia, part of the dataset, may introduce specific biases or errors. Therefore, users should be cautious about possible inaccuracies or unintended biases in Dolly's responses.
Using Dolly for Inference and Training
Users interested in trying out Dolly can easily do so by accessing the model on platforms that support Hugging Face models. Here's how it looks when using it with the transformers library:
import torch
from transformers import pipeline

# Load Dolly's custom instruction-following pipeline. bfloat16 roughly halves
# memory use versus float32, and device_map="auto" spreads the weights across
# the available GPUs.
instruct_pipeline = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")

# The pipeline takes a plain-text instruction and returns the generated response.
result = instruct_pipeline("Explain the difference between nuclear fission and fusion.")
Deploying in Various Environments
Dolly can be run on different types of GPUs, though the most efficient setups are geared toward A100 GPUs due to their memory capacity. However, with adjustments such as reduced precision, it can also run on other GPU types, such as the A10 and V100, albeit with some performance trade-offs due to memory constraints and model size.
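One way to express those adjustments in code is to choose the pipeline's loading arguments based on the available GPU. The helper below is a hypothetical sketch, not part of the Dolly release: half precision for the A10, and 8-bit quantization (via transformers' bitsandbytes integration) for the memory-constrained V100. Dtypes are given as strings, which recent transformers versions accept, so the sketch itself needs no torch import.

```python
def loading_kwargs(gpu: str) -> dict:
    """Hypothetical helper: pick pipeline() keyword arguments by GPU type."""
    if gpu == "A100":
        # Plenty of memory: run in bfloat16.
        return {"torch_dtype": "bfloat16", "device_map": "auto"}
    if gpu == "A10":
        # Less memory: float16 halves the footprint versus float32.
        return {"torch_dtype": "float16", "device_map": "auto"}
    if gpu == "V100":
        # Tightest fit: 8-bit weights via the bitsandbytes integration.
        return {"load_in_8bit": True, "device_map": "auto"}
    raise ValueError(f"No preset for GPU type: {gpu}")

# e.g. pipeline(model="databricks/dolly-v2-12b", trust_remote_code=True,
#               **loading_kwargs("A10"))
print(loading_kwargs("A10"))
```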
Conclusion
Databricks' Dolly project is an impressive step forward in making AI technology widely accessible and commercially viable. Despite its current limitations, it represents a commitment to expanding AI’s reach and potential to help organizations and individuals alike. By continuing to refine and develop models like Dolly, Databricks is setting the stage for a more inclusive AI-driven future.