published by devthanos

LLMs

author: devthanos
slug: llms
status: Published
tags: Introduction
summary: Large Language Models
type: en
lng: EN
Base Model: OpenAI
Variants: gpt-3.5-turbo, gpt-4 (coming soon)
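As a rough sketch of how the OpenAI variants are typically called, assuming the official `openai` Python client (v1 or newer) and an API key in the environment:

```python
# Minimal sketch: calling gpt-3.5-turbo through the official openai Python client.
# Assumes the `openai` package (v1+) is installed and OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what a large language model is in one sentence."},
    ],
)

print(response.choices[0].message.content)
```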
Base Model: Llama
Variants: meta-llama/Llama-2-70b-chat-hf, meta-llama/Llama-2-13b-chat-hf, meta-llama/Llama-2-7b-chat-hf
Description: Meta developed and publicly released the Llama 2 family of large language models (LLMs), a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. The fine-tuned LLMs, called Llama-2-Chat, are optimized for dialogue use cases. Llama-2-Chat models outperform open-source chat models on most benchmarks Meta tested, and in Meta's human evaluations for helpfulness and safety they are on par with some popular closed-source models such as ChatGPT and PaLM. Llama 2 comes in a range of parameter sizes (7B, 13B, and 70B) as well as pretrained and fine-tuned variations.
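A minimal sketch of running the smallest chat variant locally with Hugging Face transformers; this assumes torch and accelerate are installed, a GPU is available, and you have accepted Meta's license for the gated meta-llama repositories (HF token configured):

```python
# Minimal sketch: generating a reply with Llama-2-7b-chat-hf using its built-in chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain the difference between pretraining and fine-tuning."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=200)
# Strip the prompt tokens and print only the newly generated reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```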
Base Model: Llama
Variants: codellama/CodeLlama-34b-Instruct-hf, codellama/CodeLlama-70b-Instruct-hf
Description: Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. The variants listed here are the 34B and 70B parameter versions, fine-tuned to follow instructions. These models are designed for general code synthesis and understanding.
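A similar sketch for the instruction-tuned Code Llama checkpoints; the 34B and 70B variants need substantial GPU memory (the smaller codellama/CodeLlama-7b-Instruct-hf checkpoint also exists if resources are limited), and the prompt wrapping assumes the same [INST] ... [/INST] format used by Llama-2-Chat:

```python
# Minimal sketch: asking an instruction-tuned Code Llama variant to write code.
# Assumes transformers, torch, and accelerate are installed and enough GPU memory is available.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="codellama/CodeLlama-34b-Instruct-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Code Llama's instruct variants use the same [INST] ... [/INST] wrapping as Llama-2-Chat.
prompt = "[INST] Write a Python function that checks whether a string is a palindrome. [/INST]"
result = generator(prompt, max_new_tokens=256, do_sample=False)
print(result[0]["generated_text"])
```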
Base Model: Mistral
Variants: mistralai/Mistral-7B-Instruct-v0.1
Description: The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.1 generative text model, tuned on a variety of publicly available conversation datasets. This model supports function calling and JSON mode.
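JSON mode and function calling are usually features of the serving layer rather than of the raw checkpoint, so the sketch below assumes the model is hosted behind an OpenAI-compatible endpoint; the base URL and API key are placeholders for whatever host you use:

```python
# Minimal sketch of JSON mode against an OpenAI-compatible endpoint serving Mistral-7B-Instruct.
# The base_url and api_key below are placeholders, not a real service.
from openai import OpenAI

client = OpenAI(base_url="https://your-endpoint.example/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    messages=[
        {"role": "user", "content": "Return the capital of France as JSON with a 'capital' key."},
    ],
    response_format={"type": "json_object"},  # ask the server to constrain output to valid JSON
)
print(response.choices[0].message.content)
```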
Base Model: Mistral
Variants: mistralai/Mixtral-8x7B-Instruct-v0.1
Description: The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts; the Instruct variant listed here is fine-tuned to follow instructions.
Base Model: Mistral
Variants: Open-Orca/Mistral-7B-OpenOrca
Description: Open Orca used their own OpenOrca dataset to fine-tune on top of Mistral 7B. The dataset is an attempt to reproduce the dataset generated for Microsoft Research's Orca paper. The model was trained with OpenChat packing, using Axolotl, on a curated, filtered subset of most of the GPT-4-augmented data; it is the same subset that was used in the OpenOrcaxOpenChat-Preview2-13B model. HF Leaderboard evals placed this model as #1 among all models smaller than 30B at release time, outperforming all other 7B and 13B models. This release provides a first: a fully open model with class-breaking performance, capable of running fully accelerated on even moderate consumer GPUs, with thanks to the Mistral team for leading the way. The model is affectionately codenamed "MistralOrca".
Base Model: Mistral
Variants: HuggingFaceH4/zephyr-7b-beta
Description: Zephyr is a series of language models trained to act as helpful assistants. Zephyr-7B-β is the second model in the series: a fine-tuned version of mistralai/Mistral-7B-v0.1 trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). The Hugging Face H4 team found that removing the in-built alignment of these datasets boosted performance on MT Bench and made the model more helpful; however, this means the model is likely to generate problematic text when prompted to do so.
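A minimal sketch of chatting with Zephyr through the transformers pipeline, including a system message to steer its persona; this assumes a recent transformers release with chat-template support, plus torch and accelerate:

```python
# Minimal sketch: Zephyr accepts a system message, so its persona can be steered per request.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-beta",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a friendly chatbot that answers concisely."},
    {"role": "user", "content": "What does Direct Preference Optimization do?"},
]

# Render the conversation with the model's own chat template, then generate.
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```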