AudioCraft is a simple framework that generates high-quality, realistic audio and music from text prompts. It is trained on raw audio signals rather than MIDI or piano rolls.
Goal: better, more focused search for www.cali.org.
In general, the plan is to scrape the site into a vector database, use the embeddings in the vector DB with Llama 2, and provide API endpoints to search and find things.
Hints and pointers.
- Llama2-webui – Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere
- FastAPI – web framework for building APIs with Python 3.7+ based on standard Python type hints
- Danswer – Ask Questions in natural language and get Answers backed by private sources.
- PostgreSQL – a powerful, open source object-relational database system
- Qdrant – Vector Database for the next generation of AI applications.
- Typesense – a modern, privacy-friendly, open source search engine built from the ground up using cutting-edge search algorithms that take advantage of the latest advances in hardware capabilities.
The challenge is to wire together these technologies and then figure out how to get it to play nice with Drupal. One possibility is just to build this with an API and then use the API to interact with Drupal. That approach also offers the possibility of allowing the membership to interact with the API too.
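The scrape-to-search flow above can be sketched end to end. This is a toy stand-in, not a working build: a real version would embed with a sentence-embedding model, store vectors in Qdrant, and wrap `search()` in a FastAPI endpoint; here bag-of-words counts fake the embeddings so the retrieval logic runs as-is, and the page texts are invented.

```python
# Sketch of the scrape -> embed -> store -> search flow.
# embed() is a stand-in for a real embedding model; the "store" is a plain
# list where Qdrant would sit; search() is what a FastAPI endpoint would wrap.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# "Scraped" pages -> vector store (invented example content).
pages = {
    "https://www.cali.org/lessons": "interactive lessons for law students",
    "https://www.cali.org/about": "about the CALI organization and membership",
}
store = [(url, embed(text)) for url, text in pages.items()]

def search(query: str, top_k: int = 1) -> list:
    """Rank stored vectors by similarity to the query."""
    ranked = sorted(store, key=lambda item: cosine(embed(query), item[1]), reverse=True)
    return [url for url, _ in ranked[:top_k]]

print(search("lessons for students"))  # -> ['https://www.cali.org/lessons']
```

Swapping the toy pieces for real ones (an embedding model, Qdrant, FastAPI) changes the plumbing but not this shape, which is why the API-first approach plays well with Drupal or any other front end.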
In the world of data, textual data stands out as being particularly complex. It doesn’t fall into neat rows and columns like numerical data does. As a side project, I’m in the process of developing my own personal AI assistant. The objective is to use the data within my notes and documents to answer my questions. The important benefit is that all data processing will occur locally on my computer, ensuring that no documents are uploaded to the cloud and my documents will remain private. — Demystifying Text Data with the unstructured Python Library, https://saeedesmaili.com/demystifying-text-data-with-the-unstructured-python-library/
To handle such unstructured data, I’ve found the unstructured Python library to be extremely useful. It’s a flexible tool that works with various document formats, including Markdown, XML, and HTML documents.
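The library's job is essentially to turn a document into a flat list of text elements. As a rough stdlib illustration of the HTML case only (the real library handles many more formats and returns typed Element objects from its `partition()` function), the HTML sample here is invented:

```python
# Rough stdlib sketch of what "partitioning" an HTML document means:
# collect the document's text content as a flat list of elements.
from html.parser import HTMLParser

class TextElements(HTMLParser):
    """Collect the text content of an HTML document as a flat list."""
    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.elements.append(text)

parser = TextElements()
parser.feed("<h1>Notes</h1><p>First note.</p><p>Second note.</p>")
print(parser.elements)  # -> ['Notes', 'First note.', 'Second note.']
```

That flat list of elements is exactly the shape you want before chunking and embedding documents for a local assistant.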
What I’m reading today.
- Researchers from Peking University Introduce ChatLaw: An Open-Source Legal Large Language Model with Integrated External Knowledge Bases — This includes links to the article and GitHub repo
- Why Embeddings Usually Outperform TF-IDF: Exploring the Power of NLP
- Fine-tune an LLM on your personal data: create a “The Lord of the Rings” storyteller
- Open Assistant — In the same way that Stable Diffusion helped the world make art and images in new ways, we want to improve the world by providing amazing conversational AI. GitHub repo
The longer holiday weekend edition.
- Opportunities and Risks of LLMs for Scalable Deliberation with Polis — Polis is a platform that leverages machine intelligence to scale up deliberative processes. In this paper, we explore the opportunities and risks associated with applying Large Language Models (LLMs) towards challenges with facilitating, moderating and summarizing the results of Polis engagements.
- How I Use PandasAI to Complete 10 Most Frequent Tasks in Data Science — A Quick Introduction and Development Guide For Pandas AI
- Introduction to Haystack — Haystack is an open-source framework for building search systems that work intelligently over large document collections. Learn more about Haystack and how it works.
- Master Semantic Search at Scale: Index Millions of Documents with Lightning-Fast Inference Times using FAISS and Sentence Transformers — Dive into an end-to-end demo of a high-performance semantic search engine leveraging GPU acceleration, efficient indexing techniques, and robust sentence encoders on datasets up to 1M documents, achieving 50 ms inference times
- Natural Language to SQL using an Open Source LLM
- Leveraging LangChain, Pinecone, and LLMs for Document Question Answering: An Integrated Approach — Document Question Answering (DQA) is a crucial task in Natural Language Processing (NLP), aiming to develop automated systems capable of understanding and extracting relevant information from textual documents to answer user queries. With recent advancements in Large Language Models (LLMs) like ChatGPT and innovative tools and technologies such as LangChain and Pinecone, a new integrated approach to DQA has emerged.
- LlamaIndex: the ultimate LLM framework for indexing and retrieval — LlamaIndex, previously known as the GPT Index, is a remarkable data framework aimed at helping you build applications with LLMs by providing essential tools that facilitate data ingestion, structuring, retrieval, and integration with various application frameworks.
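The FAISS item above hinges on one trick worth spelling out: if you L2-normalize the embeddings, a plain inner product equals cosine similarity, so a fast inner-product index doubles as a semantic-similarity index. A brute-force stdlib sketch of that top-k step (FAISS replaces this linear scan with an optimized index; the vectors and doc IDs are invented):

```python
# Brute-force version of the top-k retrieval that FAISS accelerates.
# Vectors are L2-normalized first so the inner product equals cosine
# similarity. Toy 3-d vectors stand in for sentence embeddings.
import heapq
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

docs = {
    "doc-a": normalize([1.0, 0.0, 0.0]),
    "doc-b": normalize([0.0, 1.0, 0.1]),
    "doc-c": normalize([0.9, 0.1, 0.0]),
}

def top_k(query, k=2):
    q = normalize(query)
    scored = ((sum(a * b for a, b in zip(q, v)), doc_id) for doc_id, v in docs.items())
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]

print(top_k([1.0, 0.0, 0.0]))  # -> ['doc-a', 'doc-c']
```

At a million documents the linear scan gets slow, which is where FAISS's indexing structures and GPU support earn the 50 ms figures the article reports.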
What I’m reading today.
- Semantic Search with Few Lines of Code — Use the sentence transformers library to implement a semantic search engine in minutes
- Choosing the Right Embedding Model: A Guide for LLM Applications — Optimizing LLM Applications with Vector Embeddings, affordable alternatives to OpenAI’s API and how we move from LlamaIndex to Langchain
- Making a Production LLM Prompt for Text-to-SQL Translation
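Most production text-to-SQL prompts share the same skeleton: the schema, a few guardrail rules, and the user's question. A minimal sketch along those lines (the table, columns, and wording are invented for the example, not taken from the article):

```python
# Minimal text-to-SQL prompt template: schema + rules + question.
# Schema and rules are illustrative; a production prompt would also add
# few-shot examples and dialect notes.
SCHEMA = """CREATE TABLE cases (
    id INTEGER PRIMARY KEY,
    title TEXT,
    decided DATE
);"""

TEMPLATE = """You are a SQL generator. Given the schema below, answer the
question with a single SQLite SELECT statement and nothing else.

Schema:
{schema}

Rules:
- Use only the tables and columns in the schema.
- Never modify data (no INSERT/UPDATE/DELETE).

Question: {question}
SQL:"""

def build_prompt(question: str) -> str:
    return TEMPLATE.format(schema=SCHEMA, question=question)

print(build_prompt("How many cases were decided in 2023?"))
```

The "nothing else" and read-only rules matter in production: they make the model's output parseable and safe to run against a live database.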
What I’m reading today.
- How Unstructured and LlamaIndex can help bring the power of LLM’s to your own data
- All You Need to Know to Build Your First LLM App — A Step-by-Step Tutorial to Document Loaders, Embeddings, Vector Stores and Prompt Templates
- Answering Questions about any kind of Documents using Langchain (Not GPT3/GPT4) — Unlocking the Power of Langchain: A Comprehensive Python Guide to Answer Questions about Your Documents from Local Files, URLs, YouTube Videos, and Websites
- Build A Capable Machine For LLM and AI — Build A Dual GPUs PC for Machine Learning and AI with Minimum cost
- LlamaIndex: How to use Index correctly.
- Building a Question-Answer Bot With Langchain, Vicuna, and Sentence Transformers — A Q/A bot with open source
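A step common to nearly all of these tutorials is splitting loaded documents into overlapping chunks before embedding them. A stdlib sketch of a fixed-size character splitter (frameworks like LangChain ship fancier, separator-aware versions; the sizes here are tiny just for demonstration):

```python
# Fixed-size character splitter with overlap, the simplest form of the
# chunking step used before embedding documents into a vector store.
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list:
    step = chunk_size - overlap  # overlap keeps context across chunk edges
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, overlap=2)
print(chunks)  # -> ['abcdefghij', 'ijklmnopqr', 'qrstuvwxyz', 'yz']
```

The overlap is the point: without it, a sentence cut at a chunk boundary is unretrievable from either side.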
Large language models (LLMs) are notoriously huge and expensive to work with. An LLM requires a lot of specialized hardware to train and manipulate. We’ve seen efforts to transform and quantize the models that result in smaller footprints and models that run more readily on commodity hardware, but at the cost of performance. Now we’re seeing efforts to make the models smaller while still performing as well as the full model.
This paper, A Simple and Effective Pruning Approach for Large Language Models, introduces us to Wanda (Pruning by Weights and activations). Here’s the synopsis:
As their size increases, Large Language Models (LLMs) are natural candidates for network pruning methods: approaches that drop a subset of network weights while striving to preserve performance. Existing methods, however, require either retraining, which is rarely affordable for billion-scale LLMs, or solving a weight reconstruction problem reliant on second-order information, which may also be computationally expensive. In this paper, we introduce a novel, straightforward yet effective pruning method, termed Wanda (Pruning by Weights and activations), designed to induce sparsity in pretrained LLMs. Motivated by the recent observation of emergent large magnitude features in LLMs, our approach prunes weights with the smallest magnitudes multiplied by the corresponding input activations, on a per-output basis. Notably, Wanda requires no retraining or weight update, and the pruned LLM can be used as is. We conduct a thorough evaluation of our method on LLaMA across various language benchmarks. Wanda significantly outperforms the established baseline of magnitude pruning and competes favorably against recent methods involving intensive weight update. Code is available at this https URL.
As noted, the code behind the paper is readily available on GitHub at https://github.com/locuslab/wanda for everyone to try.
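The pruning rule from the abstract is simple to state concretely: score each weight by its magnitude times the L2 norm of its input activations, then zero the lowest-scoring weights within each output row. A toy pure-Python sketch of just that metric (the real implementation operates on transformer layers with calibration data, not hand-typed matrices):

```python
# Toy sketch of Wanda's pruning metric: score[i][j] = |W[i][j]| * ||X[:, j]||_2,
# then zero the lowest-scoring fraction of weights in each output row.
# No retraining or weight update follows, which is the method's selling point.
import math

def wanda_prune(W, X, sparsity=0.5):
    # Per-input-feature activation norms ||X[:, j]||_2 from calibration data X.
    norms = [math.sqrt(sum(row[j] ** 2 for row in X)) for j in range(len(W[0]))]
    pruned = []
    for row in W:
        scores = [abs(w) * n for w, n in zip(row, norms)]
        k = int(len(row) * sparsity)  # weights to drop in this row
        cutoff = sorted(scores)[k - 1] if k else float("-inf")
        pruned.append([0.0 if s <= cutoff else w for w, s in zip(row, scores)])
    return pruned

W = [[0.1, -2.0, 0.5, 0.05]]   # one output row, four input features
X = [[1.0, 0.1, 1.0, 1.0]]     # calibration activations
print(wanda_prune(W, X))  # -> [[0.0, -2.0, 0.5, 0.0]]
```

Note how the large weight (-2.0) survives even though its activation norm is tiny, while small weights on active inputs are dropped; that weight-times-activation interaction is what distinguishes Wanda from plain magnitude pruning.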
I think these advances in working with large language models are going to make it more economical for us to host our models and incorporate various NLP and deep learning techniques into our work.
One can use this notebook to build a pipeline to parse and extract data from OCRed PDF files: https://github.com/colarusso/entity_extraction/blob/main/PDF%20Entity%20Extraction%20with%20Regex%20and%20LLMs.ipynb

Warning: When using LLMs for entity extraction, be sure to perform extensive quality control. They are very susceptible to distracting language (latching on to text that sounds “kind of like” what you’re looking for) and missing language (making up content to fill any holes), and importantly, they do NOT provide any hints as to when they may be erring. You need to make sure random audits are part of your workflow! Below is a workflow using regular expressions and LLMs to parse data from zoning board orders, but the process is generalizable.
- Collect a set of PDFs
- Place OCRed PDFs into the data folder
- Write regexes to pull out data
- Write LLM prompts to pull out data
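The "write regexes to pull out data" step, sketched on an invented snippet of zoning-order text. The field names and patterns here are illustrative only; the notebook pairs this kind of extraction with LLM prompts for the messier fields that regexes can't reach:

```python
# Regex extraction of structured fields from an invented zoning-order snippet.
# Fields a regex can pin down reliably stay out of the LLM's hands entirely.
import re

order_text = """
BOARD OF APPEALS
Case No. 2021-047
Decision date: March 3, 2021
The variance is GRANTED subject to conditions.
"""

patterns = {
    "case_number": r"Case No\.\s*(\S+)",
    "decision_date": r"Decision date:\s*(.+)",
    "outcome": r"\b(GRANTED|DENIED)\b",
}

record = {}
for field, pattern in patterns.items():
    match = re.search(pattern, order_text)
    record[field] = match.group(1).strip() if match else None

print(record)
# -> {'case_number': '2021-047', 'decision_date': 'March 3, 2021', 'outcome': 'GRANTED'}
```

Anything the regexes capture is deterministic and auditable, which is exactly why you want them handling as many fields as possible before the LLM gets involved.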
A Jupyter notebook to extract data from PDFs — useful stuff.
LLMs promise to fundamentally change how we use AI across all industries. However, actually serving these models is challenging and can be surprisingly slow even on expensive hardware. Today we are excited to introduce vLLM, an open-source library for fast LLM inference and serving. vLLM utilizes PagedAttention, our new attention algorithm that effectively manages attention keys and values. vLLM equipped with PagedAttention redefines the new state of the art in LLM serving: it delivers up to 24x higher throughput than HuggingFace Transformers, without requiring any model architecture changes. — https://vllm.ai/
vLLM has been developed at UC Berkeley and deployed at Chatbot Arena and Vicuna Demo for the past two months. It is the core technology that makes LLM serving affordable even for a small research team like LMSYS with limited compute resources. Try out vLLM now with a single command at our GitHub repository.
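PagedAttention's core idea is borrowed from virtual memory: instead of reserving one contiguous max-length KV-cache buffer per sequence, it hands out fixed-size blocks on demand, eliminating most fragmentation. A toy allocator showing just that bookkeeping (this is a hypothetical sketch of the idea, not vLLM's actual code; the block size is illustrative):

```python
# Toy sketch of PagedAttention-style KV-cache bookkeeping: each sequence
# gets fixed-size blocks on demand instead of one contiguous buffer sized
# for the maximum possible length. Not vLLM's actual code, just the idea.
BLOCK_SIZE = 16  # tokens of KV state per block (illustrative)

class PagedKVCache:
    def __init__(self):
        self.next_block = 0   # stand-in for a shared free-block pool
        self.tables = {}      # sequence id -> (token_count, [block ids])

    def append_token(self, seq_id):
        """Reserve KV space for one more token, adding a block only when full."""
        count, blocks = self.tables.get(seq_id, (0, []))
        if count % BLOCK_SIZE == 0:  # current block full (or first token)
            blocks = blocks + [self.next_block]
            self.next_block += 1
        self.tables[seq_id] = (count + 1, blocks)

cache = PagedKVCache()
for _ in range(40):  # 40 tokens -> ceil(40 / 16) = 3 blocks
    cache.append_token("seq-0")
print(len(cache.tables["seq-0"][1]))  # -> 3
```

Because memory is only committed as sequences actually grow, far more concurrent requests fit on one GPU, which is where the throughput gains over naive contiguous allocation come from.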