AI – <CONTENT /> v.6

March 8, 2024

StarCoder2 – open source code completion models

StarCoder2 is a family of code generation models (3B, 7B, and 15B), trained on 600+ programming languages from The Stack v2 and some natural language text such as Wikipedia, Arxiv, and GitHub issues. The models use Grouped Query Attention, a context window of 16,384 tokens, with sliding window attention of 4,096 tokens. The 3B & 7B models were trained on 3+ trillion tokens, while the 15B was trained on 4+ trillion tokens. For more details check out the paper.

— StarCoder2 @ Github

StarCoder2 is a family of open LLMs for code and comes in 3 different sizes with 3B, 7B and 15B parameters. The flagship StarCoder2-15B model is trained on over 4 trillion tokens and 600+ programming languages from The Stack v2. All models use Grouped Query Attention, a context window of 16,384 tokens with a sliding window attention of 4,096 tokens, and were trained using the Fill-in-the-Middle objective.

StarCoder2 offers three model sizes: a 3 billion-parameter model trained by ServiceNow, a 7 billion-parameter model trained by Hugging Face, and a 15 billion-parameter model trained by NVIDIA using NVIDIA NeMo on NVIDIA accelerated infrastructure:

— StarCoder2 @ Hugging Face

October 25, 2023

💥 Training Neural Nets on Larger Batches: Practical Tips for 1-GPU, Multi-GPU & Distributed setups | by Thomas Wolf | HuggingFace | Medium

Training neural networks with larger batches in PyTorch: gradient accumulation, gradient checkpointing, multi-GPUs and distributed setups…

Source: 💥 Training Neural Nets on Larger Batches: Practical Tips for 1-GPU, Multi-GPU & Distributed setups | by Thomas Wolf | HuggingFace | Medium

October 25, 2023

Productionizing and scaling Python ML workloads simply | Ray

Scale your compute-intensive Python workloads. From reinforcement learning to large-scale model serving, Ray makes the power of distributed compute easy and accessible to every engineer.

Source: Productionizing and scaling Python ML workloads simply | Ray

October 23, 2023

Building a Multi-User Chatbot with Langchain and Pinecone in Next.JS

In this example, we’ll imagine that our chatbot needs to answer questions about the content of a website. To do that, we’ll need a way to store and access that information when the chatbot generates its response.

Source: Building a Multi-User Chatbot with Langchain and Pinecone in Next.JS

September 22, 2023

AudioCraft: A simple one-stop shop for audio modeling

AudioCraft is a simple framework that generates high-quality, realistic audio and music from text-based user inputs after training on raw audio signals as opposed to MIDI or piano rolls.

Source: AudioCraft: A simple one-stop shop for audio modeling

August 18, 2023

Notes on better search 8/18/2023

Goal: better, more focused search for www.cali.org.

In general the plan is to scrape the site to a vector database, enable embeddings of the vector db in Llama 2, provide API endpoints to search/find things.

Hints and pointers.

Llama2-webui – Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere
FastAPI – web framework for building APIs with Python 3.7+ based on standard Python type hints
Danswer – Ask Questions in natural language and get Answers backed by private sources. It makes use of
- PostgreSQL – a powerful, open source object-relational database system
- QDrant – Vector Database for the next generation of AI applications.
- Typesense – a modern, privacy-friendly, open source search engine built from the ground up using cutting-edge search algorithms, that take advantage of the latest advances in hardware capabilities.

The challenge is to wire together these technologies and then figure out how to get it to play nice with Drupal. One possibility is just to build this with an API and then use the API to interact with Drupal. That approach also offers the possibility of allowing the membership to interact with the API too.

July 6, 2023July 6, 2023

Demystifying Text Data with the unstructured Python Library | Saeed Esmaili

In the world of data, textual data stands out as being particularly complex. It doesn’t fall into neat rows and columns like numerical data does. As a side project, I’m in the process of developing my own personal AI assistant. The objective is to use the data within my notes and documents to answer my questions. The important benefit is all data processing will occure locally on my computer, ensuring that no documents are uploaded to the cloud, and my documents will remain private.

To handle such unstructured data, I’ve found the unstructured Python library to be extremely useful. It’s a flexible tool that works with various document formats, including Markdown, , XML, and HTML documents.
Demystifying Text Data with the unstructured Python Library — https://saeedesmaili.com/demystifying-text-data-with-the-unstructured-python-library/

July 6, 2023July 6, 2023

AI Reading List 7/6/2023

What I’m reading today.

Researchers from Peking University Introduce ChatLaw: An Open-Source Legal Large Language Model with Integrated External Knowledge Bases — This includes links to the article and Github repo
Why Embeddings Usually Outperform TF-IDF: Exploring the Power of NLP
Fine-tune an LLM on your personal data: create a “The Lord of the Rings” storyteller
Open Assistant — In the same way that Stable Diffusion helped the world make art and images in new ways, we want to improve the world by providing amazing conversational AI. Github repo

July 5, 2023

AI Reading List 7/5/2023

The longer holiday weekend edition.

Opportunities and Risks of LLMs for Scalable Deliberation with Polis — Polis is a platform that leverages machine intelligence to scale up deliberative processes. In this paper, we explore the opportunities and risks associated with applying Large Language Models (LLMs) towards challenges with facilitating, moderating and summarizing the results of Polis engagements.
How I Use PandasAI to Complete 10 Most Frequent Tasks in Data Science —
A Quick Introduction and Development Guide For Pandas AI
Introduction to Haystack — Haystack is an open-source framework for building search systems that work intelligently over large document collections. Learn more about Haystack and how it works.
Master Semantic Search at Scale: Index Millions of Documents with Lightning-Fast Inference Times using FAISS and Sentence Transformers — Dive into an end-to-end demo of a high-performance semantic search engine leveraging GPU acceleration, efficient indexing techniques, and robust sentence encoders on datasets up to 1M documents, achieving 50 ms inference times
Natural Language to SQL using an Open Source LLM
Leveraging LangChain, Pinecone, and LLMs for Document Question Answering: An Integrated Approach — Document Question Answering (DQA) is a crucial task in Natural Language Processing(NLP), aiming to develop automated systems capable of understanding and extracting relevant information from textual documents to answer user queries. With recent advancements in Large Language Models (LLMs) like ChatGPT and innovative tools and technologies such as LangChain and Pinecone, a new integrated approach to DQA has emerged.
LlamaIndex: the ultimate LLM framework for indexing and retrieval — LlamaIndex, previously known as the GPT Index, is a remarkable data framework aimed at helping you build applications with LLMs by providing essential tools that facilitate data ingestion, structuring, retrieval, and integration with various application frameworks.

June 28, 2023

AI Reading List 6/28/2023

What I’m reading today.

Semantic Search with Few Lines of Code — Use the sentence transformers library to implement a semantic search engine in minutes
Choosing the Right Embedding Model: A Guide for LLM Applications — Optimizing LLM Applications with Vector Embeddings, affordable alternatives to OpenAI’s API and how we move from LlamaIndex to Langchain
Making a Production LLM Prompt for Text-to-SQL Translation

S	M	T	W	T	F	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30