Microsoft Phi-3-Mini
- 25 Apr 2024
Why is it in the News?
Days after Meta unveiled its Llama 3 Large Language Model (LLM), Microsoft unveiled the latest version of its ‘lightweight’ AI model – the Phi-3-Mini.
What is Phi-3-Mini?
- Phi-3 refers to a series of language models developed by Microsoft, with Phi-3-mini being a notable addition.
- Phi-3-mini is a 3.8 billion parameter language model trained on 3.3 trillion tokens, designed to be as powerful as larger models while being small enough to be deployed on a phone.
- Despite its compact size, Phi-3-mini boasts impressive performance, rivaling that of larger models such as GPT-3.5.
- Furthermore, Phi-3-mini can be quantized to 4 bits, occupying approximately 1.8GB of memory, making it suitable for deployment on mobile devices.
- The model’s training data, a scaled-up version of the one used for Phi-2, is composed of heavily filtered web data and synthetic data, contributing to its remarkable capabilities.
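The ~1.8GB figure cited above can be sanity-checked directly from the parameter count. The sketch below is a rough lower bound only: a real on-device deployment also spends memory on activations, the KV cache, and any layers kept at higher precision, all of which are ignored here.

```python
# Rough memory estimate for a 4-bit quantized 3.8B-parameter model.
# This counts weight storage only; runtime overhead is ignored.

def quantized_size_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 2**30 bytes)."""
    total_bytes = n_params * bits_per_param / 8
    return total_bytes / 2**30

phi3_mini_params = 3.8e9  # parameter count from the article above

size_4bit = quantized_size_gb(phi3_mini_params, 4)
size_fp16 = quantized_size_gb(phi3_mini_params, 16)

print(f"4-bit: ~{size_4bit:.2f} GB")  # close to the ~1.8GB cited above
print(f"fp16:  ~{size_fp16:.2f} GB")  # ~4x larger without quantization
```

The 4-bit estimate lands at roughly 1.77 GB, consistent with the article's ~1.8GB figure, while an unquantized 16-bit copy of the same weights would need about four times as much memory.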
Advantages and Challenges of Phi-3-Mini:
- Phi-3-mini exhibits strengths in its compact size, impressive performance, and the ability to be deployed on mobile devices.
- Its training on high-quality data and subsequent chat fine-tuning contribute to its success, allowing it to rival larger models in language understanding and reasoning.
- However, the model is fundamentally limited by its size for certain tasks.
- It cannot store extensive “factual knowledge,” leading to lower performance on tasks such as TriviaQA.
- Nevertheless, efforts to resolve this weakness are underway, including augmenting the model with a search engine and exploring multilingual capabilities for Small Language Models.
- Safety: Phi-3-mini was developed with a strong emphasis on safety and responsible AI principles, in alignment with Microsoft’s guidelines.
- The approach to ensuring safety involved various measures such as safety alignment in post-training, red-teaming, and automated testing.
- It also involved evaluations across multiple categories of responsible AI (RAI) harm.
How is Phi-3-Mini Different From LLMs?
- Phi-3-mini is a Small Language Model (SLM). Simply put, SLMs are more streamlined versions of large language models.
- Compared to Large Language Models (LLMs), smaller AI models are cheaper to develop and operate, and they perform better on smaller devices like laptops and smartphones.
- SLMs are great for “resource-constrained environments including on-device and offline inference scenarios”.
- Such models are good for scenarios where fast response times are critical, say for chatbots or virtual assistants.
- Moreover, they are ideal for cost-constrained use cases, particularly with simpler tasks.
- While LLMs are trained on massive general data, SLMs stand out with their specialisation.
- Through fine-tuning, SLMs can be customised for specific tasks and achieve accuracy and efficiency in doing them.
- Most SLMs undergo targeted training, demanding considerably less computing power and energy compared to LLMs.
- SLMs also differ when it comes to inference speed and latency.
- Their compact size allows for quicker processing and their cost makes them appealing to smaller organisations and research groups.
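The gap in computing power can be made concrete with the widely used training-compute approximation FLOPs ≈ 6 × N × D, where N is the parameter count and D the number of training tokens. The Phi-3-mini figures (3.8B parameters, 3.3T tokens) come from this article; the Llama 3 70B figures used for comparison are assumptions drawn from outside it and serve only to illustrate the scale difference.

```python
# Back-of-the-envelope training compute using the common approximation
# FLOPs ~= 6 * N * D (N = parameters, D = training tokens).

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute in FLOPs."""
    return 6 * n_params * n_tokens

phi3_mini = training_flops(3.8e9, 3.3e12)   # figures from this article
llama3_70b = training_flops(70e9, 15e12)    # assumed figures, for scale only

print(f"Phi-3-mini : {phi3_mini:.2e} FLOPs")
print(f"Llama3-70B : {llama3_70b:.2e} FLOPs")
print(f"ratio      : ~{llama3_70b / phi3_mini:.0f}x")
```

Under these assumptions the larger model needs on the order of 80x more training compute, which is the sense in which targeted SLM training demands "considerably less computing power and energy".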