Haven't heard of it until this post. What about it impressed you over something like Llama, Mistral, or Qwen?
For anyone who wants more info: it's a 7B Mixture-of-Experts model released under Apache 2.0!
" Granite-4-Tiny-Preview is a 7B parameter fine-grained hybrid mixture-of-experts (MoE) instruct model fine-tuned from Granite-4.0-Tiny-Base-Preview using a combination of open source instruction datasets with permissive license and internally collected synthetic datasets tailored for solving long context problems. This model is developed using a diverse set of techniques with a structured chat format, including supervised fine-tuning, and model alignment using reinforcement learning."
Supported Languages: English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. However, users may fine-tune this Granite model for languages beyond these 12 languages.
Intended Use: This model is designed to handle general instruction-following tasks and can be integrated into AI assistants across various domains, including business applications.
Capabilities
Thinking
Summarization
Text classification
Text extraction
Question-answering
Retrieval Augmented Generation (RAG)
Code related tasks
Function-calling tasks (see the sketch after this list)
Multilingual dialog use cases
Long-context tasks including long document/meeting summarization, long document QA, etc.
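Regarding the function-calling item above, here is a minimal sketch of how that capability could be exercised through transformers' generic `tools=` support in `apply_chat_template`. The `get_weather` helper is hypothetical, and whether the preview's chat template renders tool schemas this way is an assumption rather than something confirmed in the thread.

```python
# Hedged sketch: function calling via transformers' generic tools support.
# Assumption: the granite-4.0-tiny-preview chat template accepts a `tools` list;
# `get_weather` is a hypothetical helper used only for illustration.
from transformers import AutoTokenizer

model_id = "ibm-granite/granite-4.0-tiny-preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny, 22 C"  # placeholder implementation

messages = [{"role": "user", "content": "What's the weather in Prague right now?"}]

# transformers converts typed, docstring-annotated functions into JSON tool
# schemas; the model is then expected to respond with a structured tool call
# that the application parses and executes before replying to the user.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)
```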
https://huggingface.co/ibm-granite/granite-4.0-tiny-preview
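If you just want to poke at it, here's a minimal sketch of loading that checkpoint with Hugging Face transformers. It assumes a recent transformers release that supports the Granite 4.0 architecture; the dtype and device settings are assumptions about your hardware, not requirements from the model card.

```python
# Minimal sketch: load the preview checkpoint and run one chat-formatted prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-tiny-preview"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights fit on the target device
    device_map="auto",
)

# Build the prompt with the model's own chat template.
messages = [
    {"role": "user", "content": "Summarize the key ideas of mixture-of-experts models in two sentences."}
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```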
There are also “small” and “micro” variants, which are a 32B-A6B MoE and a 3B dense model, respectively.