StarCoder2 – open source code completion models

StarCoder2 is a family of code generation models (3B, 7B, and 15B), trained on 600+ programming languages from The Stack v2 and some natural language text such as Wikipedia, Arxiv, and GitHub issues. The models use Grouped Query Attention, a context window of 16,384 tokens, with sliding window attention of 4,096 tokens. The 3B & 7B models were trained on 3+ trillion tokens, while the 15B was trained on 4+ trillion tokens. For more details check out the paper.

StarCoder2 @ Github

StarCoder2 is a family of open LLMs for code and comes in 3 different sizes with 3B, 7B and 15B parameters. The flagship StarCoder2-15B model is trained on over 4 trillion tokens and 600+ programming languages from The Stack v2. All models use Grouped Query Attention, a context window of 16,384 tokens with a sliding window attention of 4,096 tokens, and were trained using the Fill-in-the-Middle objective.

StarCoder2 offers three model sizes: a 3 billion-parameter model trained by ServiceNow, a 7 billion-parameter model trained by Hugging Face, and a 15 billion-parameter model trained by NVIDIA using NVIDIA NeMo on NVIDIA accelerated infrastructure:

StarCoder2 @ Hugging Face

 

From Medium :: Run Very Large Language Models on Your Computer | by Benjamin Marie | Towards AI

New large language models are publicly released almost every month. They are getting better and larger.

You may assume that these models can only be run on big clusters or in the cloud.

Fortunately, this is not the case. Recent versions of PyTorch propose several mechanisms that make the use of large language models relatively easy on a standard computer and without much engineering, thanks to the Hugging Face Accelerate package.

Source: Run Very Large Language Models on Your Computer | by Benjamin Marie | Towards AI