Mistral Large is Officially Released – Partners With Microsoft

Mistral has finally released their largest model to date, Mistral Large. It’s a cutting-edge language model with top-tier reasoning capabilities. It is proficient in English, French, Spanish, German, and Italian, excelling in tasks like text understanding, transformation, and code generation. Mistral Large ranks as the world’s second-best model available through an API, just behind GPT-4. It offers a 32K-token context window for precise information recall and supports function calling. Mistral AI has partnered with Microsoft to make their models available on Azure, providing access through Azure AI Studio and Azure Machine Learning. Mistral Large outperforms other models in multilingual tasks and excels in coding and math challenges. You can test the model yourself on their site.

Mistral Comparison

Mistral Large is a cutting-edge text generation model with top-tier reasoning capabilities. It comes on the heels of the 7B model Mistral released late last year and the 8x7B MoE model that followed shortly after; the company is clearly moving fast. Mistral Large excels in complex multilingual tasks like text understanding, transformation, and code generation, and it ranks as the world’s second-best model available through an API, just behind GPT-4. Detailed benchmarks show strong performance across a range of tasks, making it a powerful tool for developers and researchers.

Key Features of Mistral Large:

  1. Multilingual Proficiency: Fluent in English, French, Spanish, German, and Italian with a deep understanding of grammar and cultural nuances.
  2. Large Context Window: With a 32K tokens context window, it can recall precise information from extensive documents.
  3. Precise Instruction-Following: Enables developers to create custom moderation policies efficiently, as demonstrated in setting up system-level moderation for le Chat.
  4. Function Calling Capability: Built-in function calling, combined with constrained output mode on la Plateforme, facilitates application development and modernization of tech stacks at scale (see the sketch after this list).
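
To make the function-calling point concrete, here is a minimal sketch against la Plateforme’s chat completions endpoint. It assumes the OpenAI-style tools schema Mistral documented at launch; the get_weather function and its parameters are hypothetical placeholders for whatever tool your application actually exposes.

```python
import os
import json
import requests

# Minimal sketch: ask Mistral Large to call a (hypothetical) weather function.
# Assumes the chat completions endpoint and an OpenAI-style "tools" schema.
API_URL = "https://api.mistral.ai/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}",
    "Content-Type": "application/json",
}

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical helper your application would run locally
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

payload = {
    "model": "mistral-large-latest",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=30)
response.raise_for_status()
message = response.json()["choices"][0]["message"]

# If the model chose to call the function, its arguments arrive as a JSON string.
for tool_call in message.get("tool_calls") or []:
    args = json.loads(tool_call["function"]["arguments"])
    print(tool_call["function"]["name"], args)
```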

Side note: Mistral Large is priced roughly 20% cheaper than GPT-4 Turbo, though it is also a slightly weaker model. It will be interesting to see how things play out and whether that trade-off is worthwhile for many applications; the open question is whether 20% is enough of a selling point.
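
As a back-of-the-envelope illustration of what that 20% means, the sketch below assumes the launch-era list prices of roughly $8/$24 per million input/output tokens for Mistral Large and $10/$30 for GPT-4 Turbo; treat these numbers as illustrative and check the current pricing pages before relying on them.

```python
# Rough cost comparison (USD per 1M tokens, launch-era list prices, illustrative only).
PRICES = {
    "mistral-large": {"input": 8.00, "output": 24.00},
    "gpt-4-turbo":   {"input": 10.00, "output": 30.00},
}

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of a monthly workload for a given model."""
    p = PRICES[model]
    return (input_tokens / 1e6) * p["input"] + (output_tokens / 1e6) * p["output"]

# Example workload: 50M input tokens and 10M output tokens per month.
for model in PRICES:
    print(model, f"${monthly_cost(model, 50_000_000, 10_000_000):,.2f}")
# mistral-large -> $640.00, gpt-4-turbo -> $800.00: a flat ~20% saving.
```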

Mistral Large Reasoning Capabilities

Mistral AI compares Mistral Large’s performance against the top-leading LLMs on commonly used benchmarks, showcasing its strong reasoning capabilities. The accompanying figure reports the performance of pre-trained models on standard benchmarks.

Mistral-Microsoft Partnership

The partnership between Microsoft and Mistral AI aims to accelerate AI innovation by leveraging Azure’s cutting-edge AI infrastructure to develop and deploy next-generation large language models (LLMs). Mistral AI’s flagship commercial model, Mistral Large, is now available on Azure AI, offering state-of-the-art reasoning and knowledge capabilities for various text-based applications. This collaboration focuses on supercomputing infrastructure support, scaling premium models through Models as a Service (MaaS), and exploring AI research and development opportunities, including training purpose-specific models for select customers like the European public sector. Here is a tweet by Microsoft’s CEO Satya Nadella.

This partnership between Microsoft and Mistral AI is particularly interesting, considering Microsoft’s significant investment and role as a computing provider to OpenAI. The collaboration brings together the strengths of both companies, with Mistral AI focusing on developing advanced large language models and Microsoft providing its powerful Azure AI infrastructure.
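
For teams already on Azure, access goes through a deployment created in Azure AI Studio rather than Mistral’s own platform. Below is a minimal sketch, assuming a serverless "Models as a Service" deployment that exposes a Mistral-compatible chat completions route; the endpoint URL and key are placeholders you would copy from your own deployment.

```python
import os
import requests

# Placeholder values taken from your own Azure AI Studio deployment of Mistral Large.
AZURE_ENDPOINT = os.environ["AZURE_MISTRAL_ENDPOINT"]  # base URL of the deployed endpoint
AZURE_KEY = os.environ["AZURE_MISTRAL_KEY"]

payload = {
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the Mistral-Microsoft partnership in one sentence."},
    ],
    "max_tokens": 128,
}

# Assumes the deployment exposes a chat completions route compatible with the Mistral API.
resp = requests.post(
    f"{AZURE_ENDPOINT}/v1/chat/completions",
    headers={"Authorization": f"Bearer {AZURE_KEY}", "Content-Type": "application/json"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```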

Mistral’s previous two models are seen as a positive example of open sourcing leading to commercial success with LLMs. However, some may feel conflicted given the company’s strong pro-open-source stance and the potential influence of Microsoft, which has now acquired a stake. There is uncertainty about Mistral’s future open-sourcing practices. If they do stop, releasing the full weights of Miqu for community fine-tuning would be a good gesture, especially since Mixtral proved disappointing to fine-tune.

Closing Thoughts

Another set of releases and, again, no AI has definitively beaten GPT-4, which was in private beta well over a year ago. Gemini Advanced is the only model at a similar level, and Mistral Large sits below it. On deck: Gemini 1.5 Ultra… and GPT-5 (maybe Llama 3? Grok 2? Claude 3?). Sadly, they chose not to open-source Mistral Medium. Previously, Mistral AI offered open-source models like open-mistral-7B and open-mixtral-8x7b, aligning with their earlier promise of openness and contributing to the open-source community. Despite moving toward a more commercially oriented stance, Mistral AI still maintains elements of openness, allowing users to deploy and manage their models independently, supporting portability across clouds and infrastructures, and enabling extensive customization and fine-tuning.

Mistral has always maintained that they would retain the largest models for their own use. In all honesty, it would be foolish for them to simply replicate OpenAI’s approach. Although Mistral Large is a capable model, it falls short of GPT-4 in intelligence and lacks the flexibility of Gemini 1.5 Pro. It therefore wouldn’t be logical to invest in the third-best option when alternatives exist that offer superior intelligence and a larger context window.
