[patched] - Maguro-003
Maguro-003 is licensed under a custom non-commercial research license, though commercial licenses are reportedly available for enterprises based in Japan or partnering with local universities. Based on metadata extracted from Hugging Face staging repositories:
| Property | Value | |----------|-------| | Format | JSONL, ShareGPT-style | | Size | 3.2 GB compressed | | Tokens | ~780M (Japanese: 92%, English: 7%, other: 1%) | | Avg response length | 128 tokens | | Train/validation split | 95/5 | | Toxicity filter threshold | 0.03 (using Japanese hate speech classifier) | maguro-003
Whether it becomes the gold standard or a footnote depends on adoption. But one thing is certain: in the race to build smaller, smarter, more respectful models, maguro-003 has set a new bar for what “premium” means. This article is based on available technical documentation, developer testimonials, and public code repositories as of April 14, 2026. The author has no affiliation with Wakaba Labs or any commercial AI entity. This article is based on available technical documentation,