Mervin Praison

Ollama

Ollama Modelfile Llama Qwen Mistral Phi deepseek-r1

Post author By praison
Post date February 4, 2025

# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM llama3.2:latest

FROM /Users/praison/.ollama/models/blobs/sha256-dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff
TEMPLATE """<|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023

{{ if .System }}{{ .System }}
{{- end }}
{{- if .Tools }}When you receive a tool call response, use the output to format an answer to the orginal user question.

You are a helpful assistant with tool calling capabilities.
{{- end }}<|eot_id|>
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.

{{ range $.Tools }}
{{- . }}
{{ end }}
{{ .Content }}<|eot_id|>
{{- else }}

{{ .Content }}<|eot_id|>
{{- end }}{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}
{{ range .ToolCalls }}
{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }}

{{ .Content }}
{{- end }}{{ if not $last }}<|eot_id|>{{ end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>

{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- end }}
{{- end }}"""
PARAMETER stop <|start_header_id|>
PARAMETER stop <|end_header_id|>
PARAMETER stop <|eot_id|>

# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM qwen2.5-large:7b

FROM /Users/praison/.ollama/models/blobs/sha256-ced7796abcbb47ef96412198ebd31ac1eca21e8bbc831d72a31df69e4a30aad5
TEMPLATE """{{- if .Suffix }}<|fim_prefix|>{{ .Prompt }}<|fim_suffix|>{{ .Suffix }}<|fim_middle|>
{{- else if .Messages }}
{{- if or .System .Tools }}<|im_start|>system
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}

# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{{- range .Tools }}
{"type": "function", "function": {{ .Function }}}
{{- end }}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>
{{- end }}<|im_end|>
{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}<|im_start|>assistant
{{ if .Content }}{{ .Content }}
{{- else if .ToolCalls }}<tool_call>
{{ range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{ end }}</tool_call>
{{- end }}{{ if not $last }}<|im_end|>
{{ end }}
{{- else if eq .Role "tool" }}<|im_start|>user
<tool_response>
{{ .Content }}
</tool_response><|im_end|>
{{ end }}
{{- if and (ne .Role "assistant") $last }}<|im_start|>assistant
{{ end }}
{{- end }}
{{- else }}
{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ end }}{{ .Response }}{{ if .Response }}<|im_end|>{{ end }}"""
SYSTEM You are Qwen, created by Alibaba Cloud. You are a helpful assistant.
PARAMETER num_ctx 32768
PARAMETER stop <|endoftext|>

# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM mistral:latest

FROM /Users/praison/.ollama/models/blobs/sha256-ff82381e2bea77d91c1b824c7afb83f6fb73e9f7de9dda631bcdbca564aa5435
TEMPLATE [INST] {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }} [/INST]
PARAMETER stop [INST]
PARAMETER stop [/INST]
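
To make the Mistral template above concrete, here is a minimal Python sketch that reproduces the string it renders. The function name `render_mistral_prompt` is hypothetical. Ollama itself evaluates the template with Go's text/template, so this is only an illustration of the output for this one template, not Ollama's implementation.

```python
def render_mistral_prompt(prompt: str, system: str = "") -> str:
    """Mimic: [INST] {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }} [/INST]"""
    # The system text, when present, is followed by a single space
    system_part = f"{system} " if system else ""
    return f"[INST] {system_part}{prompt} [/INST]"


if __name__ == "__main__":
    print(render_mistral_prompt("Why is the sky blue?"))
    print(render_mistral_prompt("Why is the sky blue?", system="Answer briefly."))
```

The two stop parameters, `[INST]` and `[/INST]`, match the markers that wrap the prompt here.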

# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM phi4:latest

FROM /Users/praison/.ollama/models/blobs/sha256-fd7b6731c33c57f61767612f56517460ec2d1e2e5a3f0163e0eb3d8d8cb5df20
TEMPLATE """{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
<|im_start|>{{ .Role }}<|im_sep|>
{{ .Content }}{{ if not $last }}<|im_end|>
{{ end }}
{{- if and (ne .Role "assistant") $last }}<|im_end|>
<|im_start|>assistant<|im_sep|>
{{ end }}
{{- end }}"""
PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>
PARAMETER stop <|im_sep|>

# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM deepseek-r1:latest

FROM /Users/praison/.ollama/models/blobs/sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49
TEMPLATE """{{- if .System }}{{ .System }}{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "user" }}<｜User｜>{{ .Content }}
{{- else if eq .Role "assistant" }}<｜Assistant｜>{{ .Content }}{{- if not $last }}<｜end▁of▁sentence｜>{{- end }}
{{- end }}
{{- if and $last (ne .Role "assistant") }}<｜Assistant｜>{{- end }}
{{- end }}"""
PARAMETER stop <｜begin▁of▁sentence｜>
PARAMETER stop <｜end▁of▁sentence｜>
PARAMETER stop <｜User｜>
PARAMETER stop <｜Assistant｜>

Dataset

Replicate Huggingface Dataset for testing

Post author By praison
Post date February 3, 2025

import os
from datasets import load_dataset

# Retrieve your Hugging Face token from the environment
hf_token = os.getenv("HF_TOKEN")
if hf_token is None:
    raise ValueError("Please set your HF_TOKEN environment variable with your Hugging Face token.")

# Load the dataset (using the 'train' split by default)
dataset = load_dataset("mlabonne/FineTome-100k", split='train')

# Select the first 10 rows
subset = dataset.select(range(10))

# Push the subset to your Hugging Face repository
subset.push_to_hub("mervinpraison/FineTome-10rows", token=hf_token)

Praison AI

Node.js AI Agents Framework

Post author By praison
Post date January 31, 2025

https://docs.praison.ai/js/nodejs

npm install praisonai

export OPENAI_API_KEY=xxxxxxxxxx

Single Agent

import { Agent } from 'praisonai';

// Single agent example - Science Explainer
const agent = new Agent({ 
  instructions: `You are a science expert who explains complex phenomena in simple terms.
Provide clear, accurate, and easy-to-understand explanations.`,
  name: "ScienceExplainer",
  verbose: true
});

agent.start("Why is the sky blue?")
  .then(response => {
    console.log('\nExplanation:');
    console.log(response);
  })
  .catch(error => {
    console.error('Error:', error);
  });

Multi Agents

import { Agent, PraisonAIAgents } from 'praisonai';

// Create story agent
const storyAgent = new Agent({
  instructions: "You are a storyteller. Write a very short story (2-3 sentences) about a given topic.",
  name: "StoryAgent",
  verbose: true
});

// Create summary agent
const summaryAgent = new Agent({
  instructions: "You are an editor. Create a one-sentence summary of the given story.",
  name: "SummaryAgent",
  verbose: true
});

// Create and start agents
const agents = new PraisonAIAgents({
  agents: [storyAgent, summaryAgent],
  tasks: [
    "Write a short story about a cat",
    "{previous_result}"  // This will be replaced with the story
  ],
  verbose: true
});

agents.start()
  .then(results => {
    console.log('\nStory:', results[0]);
    console.log('\nSummary:', results[1]);
  })
  .catch(error => console.error('Error:', error));

Task Based Agents

import { Agent, PraisonAIAgents } from 'praisonai';

// Create recipe agent
const recipeAgent = new Agent({
  instructions: `You are a professional chef and nutritionist. Create 5 healthy food recipes that are both nutritious and delicious.
Each recipe should include:
1. Recipe name
2. List of ingredients with quantities
3. Step-by-step cooking instructions
4. Nutritional information
5. Health benefits

Format your response in markdown.`,
  name: "RecipeAgent",
  verbose: true
});

// Create blog agent
const blogAgent = new Agent({
  instructions: `You are a food and health blogger. Write an engaging blog post about the provided recipes.
The blog post should:
1. Have an engaging title
2. Include an introduction about healthy eating
3. Discuss each recipe and its unique health benefits
4. Include tips for meal planning and preparation
5. End with a conclusion encouraging healthy eating habits

Here are the recipes to write about:
{previous_result}

Format your response in markdown.`,
  name: "BlogAgent",
  verbose: true
});

// Create PraisonAIAgents instance with tasks
const agents = new PraisonAIAgents({
  agents: [recipeAgent, blogAgent],
  tasks: [
    "Create 5 healthy and delicious recipes",
    "Write a blog post about the recipes"
  ],
  verbose: true
});

// Start the agents
agents.start()
  .then(results => {
    console.log('\nFinal Results:');
    console.log('\nRecipe Task Results:');
    console.log(results[0]);
    console.log('\nBlog Task Results:');
    console.log(results[1]);
  })
  .catch(error => {
    console.error('Error:', error);
  });

Praison AI

TypeScript AI Agents Framework

Post author By praison
Post date January 30, 2025

https://docs.praison.ai/js/js

npm install praisonai

export OPENAI_API_KEY=xxxxxxxxxxx

Single AI Agent

import { Agent, PraisonAIAgents } from 'praisonai';

async function main() {
    // Create a simple agent (no task specified)
    const agent = new Agent({
        name: "BiologyExpert",
        instructions: "Explain the process of photosynthesis in detail.",
        verbose: true
    });

    // Run the agent
    const praisonAI = new PraisonAIAgents({
        agents: [agent],
        tasks: ["Explain the process of photosynthesis in detail."],
        verbose: true
    });

    try {
        console.log('Starting single agent example...');
        const results = await praisonAI.start();
        console.log('\nFinal Results:', results);
    } catch (error) {
        console.error('Error:', error);
    }
}

// Run the example
if (require.main === module) {
    main();
}

Multi AI Agent

import { Agent, PraisonAIAgents } from 'praisonai';

async function main() {
    // Create multiple agents with different roles
    const researchAgent = new Agent({
        name: "ResearchAgent",
        instructions: "Research and provide detailed information about renewable energy sources.",
        verbose: true
    });

    const summaryAgent = new Agent({
        name: "SummaryAgent",
        instructions: "Create a concise summary of the research findings about renewable energy sources. Use {previous_result} as input.",
        verbose: true
    });

    const recommendationAgent = new Agent({
        name: "RecommendationAgent",
        instructions: "Based on the summary in {previous_result}, provide specific recommendations for implementing renewable energy solutions.",
        verbose: true
    });

    // Run the agents in sequence
    const praisonAI = new PraisonAIAgents({
        agents: [researchAgent, summaryAgent, recommendationAgent],
        tasks: [
            "Research and analyze current renewable energy technologies and their implementation.",
            "Summarize the key findings from the research.",
            "Provide actionable recommendations based on the summary."
        ],
        verbose: true,
        process: 'sequential'  // Agents will run in sequence, passing results to each other
    });

    try {
        console.log('Starting multi-agent example...');
        const results = await praisonAI.start();
        console.log('\nFinal Results:');
        console.log('Research Results:', results[0]);
        console.log('\nSummary Results:', results[1]);
        console.log('\nRecommendation Results:', results[2]);
    } catch (error) {
        console.error('Error:', error);
    }
}

// Run the example
if (require.main === module) {
    main();
}

Ollama

Ollama Reasoning Chatbot with Gradio UI

Post author By praison
Post date January 30, 2025

Install

pip install -U ollama gradio

Basic

import ollama

# Create a chat completion (non-streaming)
completion = ollama.chat(
    model="deepseek-r1:latest",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Why is the sky blue?"}
    ],
)

# Access message content directly from response
response = completion['message']['content']

print(response)

Streaming

import ollama

# Create streaming completion
completion = ollama.chat(
    model="deepseek-r1:latest",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Why is the sky blue?"}
    ],
    stream=True  # Enable streaming
)

# Print the response as it comes in
for chunk in completion:
    if 'message' in chunk and 'content' in chunk['message']:
        content = chunk['message']['content']
        print(content, end='', flush=True)

Gradio

import ollama
import gradio as gr

def chat_with_ollama(message, history):
    # Initialize empty string for streaming response
    response = ""
    
    # Convert history to messages format
    messages = [
        {"role": "system", "content": "You are a helpful assistant."}
    ]
    
    # Add history messages
    for h in history:
        messages.append({"role": "user", "content": h[0]})
        if h[1]:  # Only add assistant message if it exists
            messages.append({"role": "assistant", "content": h[1]})
    
    # Add current message
    messages.append({"role": "user", "content": message})
    
    completion = ollama.chat(
        model="deepseek-r1:latest",
        messages=messages,
        stream=True  # Enable streaming
    )
    
    # Stream the response
    for chunk in completion:
        if 'message' in chunk and 'content' in chunk['message']:
            content = chunk['message']['content']
            # Handle <think> and </think> tags
            content = content.replace("<think>", "Thinking...").replace("</think>", "\n\n Answer:")
            response += content
            yield response

# Create Gradio interface with Chatbot
with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    msg = gr.Textbox(placeholder="Enter your message here...")
    clear = gr.Button("Clear")

    def user(user_message, history):
        return "", history + [[user_message, None]]

    def bot(history):
        history[-1][1] = ""
        for chunk in chat_with_ollama(history[-1][0], history[:-1]):
            history[-1][1] = chunk
            yield history

    msg.submit(user, [msg, chatbot], [msg, chatbot], queue=False).then(
        bot, chatbot, chatbot
    )
    clear.click(lambda: None, None, chatbot, queue=False)

if __name__ == "__main__":
    demo.launch()

Chatbot

LM Studio Reasoning Chatbot Code

Post author By praison
Post date January 30, 2025

Installing

pip install -U chainlit openai streamlit gradio

Basic

# Example: reuse your existing OpenAI setup
from openai import OpenAI

# Point to the local server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

completion = client.chat.completions.create(
  model="deepseek-r1-distill-qwen-7b",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a meal plan for today."}
  ],
  temperature=0.7,
)

response = completion.choices[0].message.content

print(response)

Streaming

# Example: reuse your existing OpenAI setup
from openai import OpenAI

# Point to the local server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# Create streaming completion
completion = client.chat.completions.create(
    model="deepseek-r1-distill-qwen-7b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Give me a meal plan for today."}
    ],
    temperature=0.7,
    stream=True  # Enable streaming
)

# Process the streaming response
for chunk in completion:
    if chunk.choices[0].delta.content is not None:
        content = chunk.choices[0].delta.content
        # Print the content as it comes in
        print(content, end='', flush=True)

LM Studio Streamlit

import streamlit as st
from openai import OpenAI

# Point to the local server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# Set page title
st.title("Chat with LM Studio")

# Initialize chat history in session state if it doesn't exist
if "messages" not in st.session_state:
    st.session_state.messages = [
        {"role": "system", "content": "You are a helpful assistant."}
    ]

# Display chat input
user_input = st.chat_input("Your message:")

# Display chat history and handle new inputs
for message in st.session_state.messages:
    if message["role"] != "system":
        with st.chat_message(message["role"]):
            st.write(message["content"])

if user_input:
    # Display user message
    with st.chat_message("user"):
        st.write(user_input)
    
    # Add user message to history
    st.session_state.messages.append({"role": "user", "content": user_input})
    
    # Get streaming response
    with st.chat_message("assistant"):
        message_placeholder = st.empty()
        full_response = ""
        
        completion = client.chat.completions.create(
            model="deepseek-r1-distill-qwen-7b",
            messages=st.session_state.messages,
            temperature=0.7,
            stream=True
        )
        
        # Process the streaming response
        for chunk in completion:
            if chunk.choices[0].delta.content is not None:
                full_response += chunk.choices[0].delta.content
                message_placeholder.write(full_response + "▌")
        
        message_placeholder.write(full_response)
    
    # Add assistant response to history
    st.session_state.messages.append({"role": "assistant", "content": full_response})

Gradio

import gradio as gr
from openai import OpenAI

# Point to the local server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

def generate_response(message, history):
    # Convert history from tuples to message format
    messages = [
        {"role": "system", "content": "You are a helpful assistant."}
    ]
    
    # Add history messages
    for user_msg, assistant_msg in history:
        messages.append({"role": "user", "content": user_msg})
        if assistant_msg:  # Only add assistant message if it exists
            messages.append({"role": "assistant", "content": assistant_msg})
    
    # Add current message
    messages.append({"role": "user", "content": message})

    # Create streaming completion
    completion = client.chat.completions.create(
        model="deepseek-r1-distill-qwen-7b",
        messages=messages,
        temperature=0.7,
        stream=True
    )

    # Process the streaming response
    partial_message = ""
    for chunk in completion:
        # Check for content in the delta
        if hasattr(chunk.choices[0].delta, 'content') and chunk.choices[0].delta.content is not None:
            content = chunk.choices[0].delta.content
            # Handle <think> and </think> tags
            content = content.replace("<think>", "Thinking...").replace("</think>", "\n\n Answer:")
            partial_message += content
            yield partial_message

# Create the Gradio interface with Blocks
with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    msg = gr.Textbox()
    clear = gr.Button("Clear")

    def user(user_message, history):
        return "", history + [[user_message, None]]

    def bot(history):
        history[-1][1] = ""
        for chunk in generate_response(history[-1][0], history[:-1]):
            history[-1][1] = chunk
            yield history

    msg.submit(user, [msg, chatbot], [msg, chatbot], queue=False).then(
        bot, chatbot, chatbot
    )
    clear.click(lambda: None, None, chatbot, queue=False)

if __name__ == "__main__":
    demo.launch()

Detailed

# Example: reuse your existing OpenAI setup
from openai import OpenAI

# Point to the local server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

completion = client.chat.completions.create(
  model="deepseek-r1-distill-qwen-7b",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a meal plan for today."}
  ],
  temperature=0.7,
)

response = completion.choices[0].message.content

# Split and print the thinking and main response separately
if "</think>" in response:
    # partition() splits on the first closing tag only
    think_part, _, main_response = response.partition("</think>")
    think_part = think_part.replace("<think>", "").strip()
    main_response = main_response.strip()

    print("Thinking process:")
    print("-" * 50)
    print(think_part)
    print("\nMain response:")
    print("-" * 50)
    print(main_response)
else:
    # No reasoning block in the output, so print it unchanged
    print(response)

# Example: reuse your existing OpenAI setup
from openai import OpenAI

# Point to the local server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# Create streaming completion
completion = client.chat.completions.create(
    model="deepseek-r1-distill-qwen-7b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Give me a meal plan for today."}
    ],
    temperature=0.7,
    stream=True  # Enable streaming
)

# Variables to store think and main response parts
think_content = ""
main_content = ""
current_section = "think"  # Track which section we're currently building

# Process the streaming response
for chunk in completion:
    if chunk.choices[0].delta.content is not None:
        content = chunk.choices[0].delta.content
        
        # Print the content as it comes in
        print(content, end='', flush=True)
        
        # Check for section transition
        if "</think>" in content:
            parts = content.split("</think>")
            think_content += parts[0]
            current_section = "main"
            if len(parts) > 1:
                main_content += parts[1]
        else:
            # Add content to appropriate section
            if current_section == "think":
                think_content += content
            else:
                main_content += content

print("\n\n" + "="*50 + "\nFinal formatted output:\n" + "="*50)

# Print the final results
if current_section == "main":
    # A </think> tag was seen, so the two parts can be shown separately
    print("\nThinking process:")
    print("-" * 50)
    print(think_content.replace("<think>", "").strip())
    print("\nMain response:")
    print("-" * 50)
    print(main_content.strip())
else:
    # No </think> tag was seen, so all text was collected in think_content
    print(think_content.replace("<think>", "").strip())
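
The per-chunk checks above look for `</think>` inside a single chunk. If a server happens to split a tag across two chunks, that check would miss it. A minimal sketch of one way around this is to parse the accumulated text instead of each chunk. The helper name `split_think` is hypothetical, and the chunk list below is made-up sample data, not real model output.

```python
from typing import Tuple


def split_think(text: str) -> Tuple[str, str]:
    """Return (thinking, answer) from text that may contain <think>...</think>."""
    open_tag, close_tag = "<think>", "</think>"
    if open_tag not in text:
        # No reasoning block (yet): treat everything as the answer
        return "", text
    before, _, rest = text.partition(open_tag)
    thinking, closed, answer = rest.partition(close_tag)
    if not closed:
        # The closing tag has not arrived yet, so the model is still thinking
        return thinking.strip(), before.strip()
    return thinking.strip(), (before + answer).strip()


if __name__ == "__main__":
    # Made-up chunks with both tags split across chunk boundaries
    chunks = ["<thi", "nk>plan", " steps</th", "ink>\n\nFinal", " answer"]
    accumulated = ""
    for chunk in chunks:
        accumulated += chunk
        thinking, answer = split_think(accumulated)
    print("Thinking:", thinking)
    print("Answer:", answer)
```

Because the function always works on the full accumulated string, it can be called after every chunk in a streaming loop and the result simply becomes more complete over time.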

Ollama

Ollama Reasoning Chatbot Code

Post author By praison
Post date January 30, 2025

Install

pip install -U ollama chainlit streamlit gradio

Basic

import ollama

# Create a chat completion (non-streaming)
completion = ollama.chat(
    model="deepseek-r1:latest",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Why is the sky blue?"}
    ],
)

# Access message content directly from response
response = completion['message']['content']

print(response)

Streaming

import ollama

# Create streaming completion
completion = ollama.chat(
    model="deepseek-r1:latest",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Why is the sky blue?"}
    ],
    stream=True  # Enable streaming
)

# Print the response as it comes in
for chunk in completion:
    if 'message' in chunk and 'content' in chunk['message']:
        content = chunk['message']['content']
        print(content, end='', flush=True)

Gradio

import ollama
import gradio as gr

def chat_with_ollama(message, history):
    # Initialize empty string for streaming response
    response = ""
    
    # Convert history to messages format
    messages = [
        {"role": "system", "content": "You are a helpful assistant."}
    ]
    
    # Add history messages
    for h in history:
        messages.append({"role": "user", "content": h[0]})
        if h[1]:  # Only add assistant message if it exists
            messages.append({"role": "assistant", "content": h[1]})
    
    # Add current message
    messages.append({"role": "user", "content": message})
    
    completion = ollama.chat(
        model="deepseek-r1:latest",
        messages=messages,
        stream=True  # Enable streaming
    )
    
    # Stream the response
    for chunk in completion:
        if 'message' in chunk and 'content' in chunk['message']:
            content = chunk['message']['content']
            # Handle <think> and </think> tags
            content = content.replace("<think>", "Thinking...").replace("</think>", "\n\n Answer:")
            response += content
            yield response

# Create Gradio interface with Chatbot
with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    msg = gr.Textbox(placeholder="Enter your message here...")
    clear = gr.Button("Clear")

    def user(user_message, history):
        return "", history + [[user_message, None]]

    def bot(history):
        history[-1][1] = ""
        for chunk in chat_with_ollama(history[-1][0], history[:-1]):
            history[-1][1] = chunk
            yield history

    msg.submit(user, [msg, chatbot], [msg, chatbot], queue=False).then(
        bot, chatbot, chatbot
    )
    clear.click(lambda: None, None, chatbot, queue=False)

if __name__ == "__main__":
    demo.launch()

Streamlit

import streamlit as st
import ollama

# Set page title
st.title("Chat with Ollama")

# Initialize chat history in session state if it doesn't exist
if "messages" not in st.session_state:
    st.session_state.messages = [
        {"role": "system", "content": "You are a helpful assistant."}
    ]

# Display chat input
user_input = st.chat_input("Your message:")

# Display chat history and handle new inputs
for message in st.session_state.messages:
    if message["role"] != "system":
        with st.chat_message(message["role"]):
            st.write(message["content"])

if user_input:
    # Display user message
    with st.chat_message("user"):
        st.write(user_input)
    
    # Add user message to history
    st.session_state.messages.append({"role": "user", "content": user_input})
    
    # Get streaming response
    with st.chat_message("assistant"):
        message_placeholder = st.empty()
        full_response = ""
        
        completion = ollama.chat(
            model="deepseek-r1:latest",
            messages=st.session_state.messages,
            stream=True
        )
        
        # Process the streaming response
        for chunk in completion:
            if 'message' in chunk and 'content' in chunk['message']:
                content = chunk['message']['content']
                full_response += content
                message_placeholder.write(full_response + "▌")
        
        message_placeholder.write(full_response)
    
    # Add assistant response to history
    st.session_state.messages.append({"role": "assistant", "content": full_response})

Chainlit

import chainlit as cl
import ollama

@cl.on_message
async def main(message: cl.Message):
    # Create a message dictionary instead of using Message objects directly
    messages = [{'role': 'user', 'content': str(message.content)}]
    
    # Create a message first
    msg = cl.Message(content="")
    await msg.send()

    # Create a stream with ollama
    stream = ollama.chat(
        model='deepseek-r1:latest',  # Use a model you have installed
        messages=messages,
        stream=True,
    )

    # Stream the response token by token
    for chunk in stream:
        if token := chunk['message']['content']:
            await msg.stream_token(token)
    
    # Update the message one final time
    await msg.update()

@cl.on_chat_start
async def start():
    await cl.Message(content="Hello! How can I help you today?").send()

Praison AI

Mistral Codestral Chat Locally

Post author By praison
Post date January 15, 2025

export OPENAI_API_BASE=https://codestral.mistral.ai/v1
export OPENAI_API_KEY=xxxxxxxxxxxxxxxxxxxxx
pip install "praisonai[chat]"
praisonai chat

Open: http://localhost:8084/

Set Model name : openai/codestral-latest

Praison AI

Agentic Routing

Post author By praison
Post date January 11, 2025

export OPENAI_API_KEY=xxxxx
pip install praisonaiagents

from praisonaiagents.agent import Agent
from praisonaiagents.task import Task
from praisonaiagents.agents import PraisonAIAgents
import time

def get_time_check():
    current_time = int(time.time())
    result = "even" if current_time % 2 == 0 else "odd"
    print(f"Time check: {current_time} is {result}")
    return result

# Create specialized agents
router = Agent(
    name="Router",
    role="Input Router",
    goal="Evaluate input and determine routing path",
    instructions="Analyze input and decide whether to proceed or exit",
    tools=[get_time_check]
)

processor1 = Agent(
    name="Processor 1",
    role="Secondary Processor",
    goal="Process valid inputs that passed initial check",
    instructions="Process data that passed the routing check"
)

processor2 = Agent(
    name="Processor 2",
    role="Final Processor",
    goal="Perform final processing on validated data",
    instructions="Generate final output for processed data"
)

# Create tasks with routing logic
routing_task = Task(
    name="initial_routing",
    description="check the time and return according to what is returned",
    expected_output="pass or fail based on what is returned",
    agent=router,
    is_start=True,
    task_type="decision",
    condition={
        "pass": ["process_valid"],
        "fail": ["process_invalid"]
    }
)

processing_task = Task(
    name="process_valid",
    description="Process validated input",
    expected_output="Processed data ready for final step",
    agent=processor1,
)

final_task = Task(
    name="process_invalid",
    description="Generate final output",
    expected_output="Final processed result",
    agent=processor2
)

# Create and run workflow
workflow = PraisonAIAgents(
    agents=[router, processor1, processor2],
    tasks=[routing_task, processing_task, final_task],
    process="workflow",
    verbose=True
)

print("\nStarting Routing Workflow...")
print("=" * 50)

results = workflow.start()

print("\nWorkflow Results:")
print("=" * 50)
for task_id, result in results["task_results"].items():
    if result:
        task_name = result.description
        print(f"\nTask: {task_name}")
        print(f"Result: {result.raw}")
        print("-" * 50)
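
The `condition` mapping on the decision task is what drives the routing: the decision output selects which task names run next. The following is a minimal pure-Python sketch of that idea only. It is not PraisonAI's implementation; `run_workflow` and the lambda tasks are hypothetical stand-ins for the agents above.

```python
from typing import Callable, Dict, List, Tuple


def run_workflow(
    tasks: Dict[str, Callable[[], str]],
    conditions: Dict[str, Dict[str, List[str]]],
    start: str,
) -> List[Tuple[str, str]]:
    """Run tasks from `start`, following the route chosen by each task's output."""
    executed = []
    queue = [start]
    while queue:
        name = queue.pop(0)
        output = tasks[name]()
        executed.append((name, output))
        # Tasks without a condition mapping end their branch
        routes = conditions.get(name, {})
        queue.extend(routes.get(output, []))
    return executed


if __name__ == "__main__":
    tasks = {
        "initial_routing": lambda: "pass",  # stand-in for the router's decision
        "process_valid": lambda: "processed",
        "process_invalid": lambda: "final output",
    }
    conditions = {
        "initial_routing": {"pass": ["process_valid"], "fail": ["process_invalid"]}
    }
    for name, output in run_workflow(tasks, conditions, "initial_routing"):
        print(f"{name}: {output}")
```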

Finetuning

Build Reasoning Models

Post author By praison
Post date January 11, 2025

1️⃣ Start with an LLM and fine-tune it on instruction reasoning data (e.g., MATH and Big MATH)

2️⃣ Implement search algorithms (MCTS or A*) to generate synthetic reasoning paths and capture intermediate steps

3️⃣ Train a Process Reward Model (PRM) to evaluate the quality of reasoning steps

4️⃣ Combine PRM rewards with outcome rewards from additional Outcome Reward Models or verifiable rewards (e.g., a correct result)

5️⃣ Add implicit and explicit backtracking and “verifications” to data to teach the model to self-correct and try different approaches

6️⃣ Use online RL methods like PPO, GRPO, RLOO…
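
Steps 4 and 6 can be illustrated with a small sketch. It assumes, purely for illustration, that the process reward is the mean of the per-step PRM scores and that it is mixed with a 0/1 outcome reward using a hypothetical weight `prm_weight`. The second function normalizes rewards within a group of samples for the same prompt, in the spirit of GRPO. Neither function is taken from the source.

```python
from statistics import mean, pstdev
from typing import List


def combined_reward(step_scores: List[float], outcome_correct: bool,
                    prm_weight: float = 0.5) -> float:
    """Mix a process reward (mean PRM step score) with a verifiable outcome reward."""
    process = mean(step_scores) if step_scores else 0.0
    outcome = 1.0 if outcome_correct else 0.0
    # prm_weight is an assumed hyperparameter, not a value from the source
    return prm_weight * process + (1.0 - prm_weight) * outcome


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one group of samples (GRPO-style advantage)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    rewards = [
        combined_reward([0.9, 0.8, 0.7], outcome_correct=True),
        combined_reward([0.6, 0.2, 0.1], outcome_correct=False),
    ]
    print(rewards)
    print(group_relative_advantages(rewards))
```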

Implementation:

Define the reasoning task (e.g., math problem-solving) and provide the question and correct answer. This serves as the ground truth for the MCTS process.
Sample multiple Chain-of-Thought (CoT) reasoning paths (rollouts) for each problem.
Compare each final answer to the ground truth and select the incorrect rollouts.
Locate the error using a binary search: split the reasoning in the middle, take the first half, and sample multiple completions from it. If at least one of them reaches the correct answer, the error is probably not in the first half, so move to the second half and split it again. Repeat until you find the reasoning step that contains the error.
Label each step (node) and calculate its Monte Carlo estimate.
Train the Process Reward Model (PRM) with a pointwise soft-label training objective, using the Monte Carlo estimates on balanced data.
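
The binary search and Monte Carlo labelling described above can be sketched as follows. This is a simplified illustration, not the paper's code: `rollout` is a hypothetical callable that completes a reasoning prefix with the model and returns whether the final answer matches the ground truth, and the demo uses a deterministic stand-in instead of a real model.

```python
from typing import Callable, List

# Hypothetical: completes a reasoning prefix and reports if the answer is correct
Rollout = Callable[[List[str]], bool]


def mc_estimate(prefix: List[str], rollout: Rollout, n_samples: int = 8) -> float:
    """Fraction of sampled completions from `prefix` that reach the correct answer."""
    return sum(rollout(prefix) for _ in range(n_samples)) / n_samples


def locate_first_error(steps: List[str], rollout: Rollout, n_samples: int = 8) -> int:
    """Binary-search the index of the first step after which no rollout succeeds.

    Assumes the full solution `steps` is incorrect and the empty prefix is recoverable.
    """
    lo, hi = 0, len(steps)  # steps[:lo] is recoverable, steps[:hi] is not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mc_estimate(steps[:mid], rollout, n_samples) > 0:
            lo = mid  # some rollout is correct: the error lies later
        else:
            hi = mid  # no rollout is correct: the error is at or before mid
    return hi - 1


if __name__ == "__main__":
    steps = ["s0", "s1", "s2_bad", "s3", "s4"]

    def fake_rollout(prefix: List[str]) -> bool:
        # Stand-in for model sampling: succeeds only if the bad step is absent
        return "s2_bad" not in prefix

    print("First error at step:", locate_first_error(steps, fake_rollout))
    # Soft labels: one Monte Carlo estimate per step prefix
    print([mc_estimate(steps[: i + 1], fake_rollout) for i in range(len(steps))])
```

The per-step estimates are the soft labels a PRM could then be trained on. With a real model the rollouts are stochastic, so the estimates would fall between 0 and 1 rather than being exactly 0 or 1 as in this stand-in.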

Source: https://huggingface.co/papers/2406.06592