NEW
Nous-Hermes 2 Mixtral 8x7B Surpasses Mixtral Instruct in Benchmark - Blockchain.News

Nous-Hermes 2 Mixtral 8x7B Surpasses Mixtral Instruct in Benchmark

Zach Anderson Jan 24, 2024 07:53

Nous-Hermes 2 Mixtral 8x7B surpasses Mixtral Instruct in benchmarks, offering SFT and SFT+DPO models with ChatML prompt format, enhancing AI performance and user experience.

Nous-Hermes 2 Mixtral 8x7B Surpasses Mixtral Instruct in Benchmark

The Large Language Model (LLM) known as Nous-Hermes 2 Mixtral 8x7B was recently presented by Nous Research when it was released. An important step forward in artificial intelligence capabilities is represented by the fact that this sophisticated model is the first one developed by the firm to be refined via the use of Reinforcement Learning from Human Feedback (RLHF). Furthermore, it is the first model to exceed the well-known Mixtral Instruct across a wide range of prominent benchmarks. 

The Nous-Hermes 2 Mixtral 8x7B is available in two distinct variants: the first is simply equipped with Supervised Fine-Tuning (SFT), while the second is a more advanced combination of SFT and Decentralised Policy Optimisation (DPO). An additional qlora adaptor that is tailored to the DPO version has also been made available by the firm. Through HuggingFace, these models are made available to the general public, giving people the chance to try and choose which choice is the most suitable for meeting their needs.

The performance of the model has been consistently good across a variety of benchmarks, with an average score of 75.70% in the ARC Challenge, AGIEval, and BigBench benchmarks. In particular, it was able to attain a high level of accuracy in tasks such as BoolQ, PIQA, and Winogrande. When it comes to engaging the LLM in multi-turn chat discussions, Nous-Hermes 2 makes use of ChatML as the prompt format, which provides a more systematic way for doing so. This format incorporates system prompts that enable steerability, so directing the model's rules, roles, and stylistic choices.

For the purpose of satisfying a wide range of VRAM constraints and inference quality criteria, the Nous-Hermes 2 model offers a variety of quantization choices, including 3-bit and 8-bit quantization, as well as a range of group sizes and act orders available.

Users are able to download and make use of the model by using the Hugging Face Hub Python library. Additionally, the library enables downloading from several branches in order to cater to different requirements. Those who are using the text-generation-webui are provided with an overview of a straightforward model download procedure, which makes it easier to obtain and make use of the model.

Putting it all together, Nous-Hermes 2 Mixtral 8x7B is a big step forward in the development of open-source artificial intelligence. It bridges the gap between proprietary and open-source AI solutions because to its better performance and user-friendly design, making it an appealing alternative for artificial intelligence applications.

Image source: Shutterstock