Access instruction-tuned and preference-aligned language models for chat and task completion
Instella-3B-SFT and Instella-3B-Instruct variants provide supervised fine-tuning and direct preference optimization (DPO) alignment. Users can use these variants for instruction-following, conversational tasks, and problem-solving without relying on closed-source models or commercial APIs.
