
AI - Artificial Intelligence

Today's AI systems are "Narrow AI" - specialized for specific tasks only.

AGI - "Artificial General Intelligence" - matching the full range of what humans can do - does not exist yet.

Learning

https://www.cloudskillsboost.google/paths/118

Run AI with API
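
Most of the services below can also be called programmatically. A minimal sketch of calling an OpenAI-style chat completions REST endpoint - the endpoint URL, model name and `OPENAI_API_KEY` environment variable follow OpenAI's public API conventions and are assumptions here, not taken from this document:

```python
import json
import os
import urllib.request

def build_chat_request(prompt: str, model: str = "gpt-4o-mini") -> dict:
    """Build an OpenAI-style chat completions payload (assumed schema)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str) -> str:
    """POST the payload to the (assumed) OpenAI endpoint and return the reply text."""
    payload = build_chat_request(prompt)
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # requires a real API key exported in the environment
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# build the payload locally - no network call without an API key
payload = build_chat_request("What is RAG?")
print(payload["messages"][0]["role"])
```

The other providers (Anthropic, Google, DeepSeek etc.) expose similar but not identical JSON schemas, so check each vendor's API docs rather than assuming this exact shape.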

Chat LLMs

The various leading LLMs have different strengths and weaknesses.

You are advised to pose the same prompt to several of them to see the difference in their responses.

ChatGPT

https://chat.openai.com/chat

:octocat: Awesome ChatGPT Prompts

By OpenAI - despite the name, closed source - but the first LLM chat service to reach a mass public audience.

OpenAI Cookbook

Deepseek

https://chat.deepseek.com/

Grok

https://x.com/i/grok

By X (formerly Twitter) and Elon Musk.

Claude

https://claude.ai

By Anthropic, known to be better for coding.

TODO: Claude Artifacts

Google Gemini

https://gemini.google.com

Google AI Studio - Gemini

https://aistudio.google.com/

Meta AI

https://www.meta.ai/

Perplexity

https://www.perplexity.ai/

SQL Chat

  • SQL Chat - chat-based interface to querying DBs

Open Source LLMs

Ollama

https://www.ollama.com

https://github.com/ollama/ollama

Ollama is a local LLM engine, with an optional Open WebUI front-end giving a prompt-based chat experience similar to ChatGPT - ask questions, get responses. It runs completely locally and doesn't send anything to the internet.

The engine runs best on GPUs.

On CPUs the query response is very slow, printing only a few words per second.

Performance also declines after consecutive questions - why it degrades after even one query is worth investigating (possibly the growing context window).

Chat LLM Tips

Features & Differences

Different LLMs have different features and levels of sophistication in different areas.

It can be useful to give the same question to different LLMs and see how different their results are.

Few Shot Prompt

Give the prompt more context, instructions and examples to improve its accuracy.

I often also ask more than one question to reduce the number of round trips and time, although these should be in the same context and topic to avoid reducing the model accuracy.
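
A few-shot prompt can be assembled mechanically by prepending worked examples to the real question. A sketch - the sentiment examples are invented purely for illustration:

```python
def few_shot_prompt(instruction: str, examples: list[tuple[str, str]], query: str) -> str:
    """Prepend an instruction and worked examples before the actual question."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # the model completes the final, unanswered "Output:" line
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Classify the sentiment of each input as positive or negative.",
    [("I loved this movie", "positive"), ("Total waste of time", "negative")],
    "The acting was superb",
)
print(prompt)
```

The examples anchor both the task and the exact output format, which is usually what reduces wrong or rambling answers.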

Reliability

LLMs are probabilistic and sometimes hallucinate wrong answers.

You should double-check answers, but this leads to an obvious problem: what is the point of asking something if you are not confident in the answers?

This is safer for simple things which you can immediately verify yourself, such as some bit of knowledge you forgot but recognize when you see it, or where it is giving you citation source links to web pages that you can click through to double check.

Safety

It is not safe to just copy and paste code from AI - it is often not just sub-optimal but actually contains serious security vulnerabilities such as SQL Injection or Code Injection.

Only veteran senior engineers who already know all this stuff and can spot and correct such things should be using AI for coding.

For the same reason, you wouldn't let an AI handle anything important in a field you don't know well yourself - you need to be able to catch mistakes that could have serious real-life consequences.

Current Knowledge - Internet Search

Since pre-training is expensive and time consuming, LLMs' knowledge is often a few months out of date.

More recent LLM services (ChatGPT, Grok, Perplexity) now detect when a question falls outside their training data and search the internet in order to deliver answers on things that are more recent, such as new TV episodes or current events.

If the model cannot do an internet search for the new information, it will usually tell you that it doesn't contain the knowledge you've asked about, since it was trained before the relevant date.

Models with internet search are more useful because you can use them to query many recent web pages and summarize what you want to know very quickly.

Internet search also matters for things that might change, such as current events or a given company or technology's capabilities, which could be added or updated at any time - you want a model that can check the latest information on the internet for you.

Context Windows

Since LLMs are predicting the next word based on tokens, start a New Chat for a clean context when switching topics to improve the accuracy and speed of the response without the model getting distracted or confused by previous tokens in the context window.
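
In API terms a "chat" is just the list of messages resent to the model on every turn, so starting a New Chat simply empties that list. A minimal sketch (the class and method names are hypothetical, and the ~4 characters per token figure is only a common rule of thumb):

```python
class Chat:
    """Tracks the message history that fills the model's context window."""

    def __init__(self):
        self.messages = []

    def add(self, role: str, content: str):
        self.messages.append({"role": role, "content": content})

    def approx_tokens(self) -> int:
        # crude estimate: roughly 4 characters per token
        return sum(len(m["content"]) for m in self.messages) // 4

chat = Chat()
chat.add("user", "Explain context windows in one sentence.")
chat.add("assistant", "The context window is the token budget the model can attend to.")
print(chat.approx_tokens())

# switching topics? start a fresh Chat() instead of appending to the old one
chat = Chat()
print(len(chat.messages))
```

Every earlier message costs tokens and can steer the next prediction, which is exactly why a fresh history improves both accuracy and speed on a new topic.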

Thinking vs Non-Thinking Models

Thinking models have better reasoning but they are slower and more expensive to run, and therefore usually behind paywall subscriptions in the services below.

So for simple knowledge recall, a non-thinking model is sufficient and faster.

Thinking models may give better results for complex problem solving, such as debugging code.

There is usually a drop down or button in the web page to switch between the different generations of models, some of which are thinking, some of which aren't, so you can tune which one is more appropriate to your use case.

Memory

You may have noticed that ChatGPT remembers things between chats.

It can save memory about you automatically.

You can also explicitly ask it to remember something important that might be useful to subsequent chats.

Custom Global Instructions

You can configure an LLM such as ChatGPT in Settings with the traits it should have - ie. how to behave and how to speak to you, eg. be based, give me straightforward truthful answers without being politically correct and without preamble, just get to the point and be concise.

You can also give it some context on yourself, such as what it should know about you, for how to relate to you.

Custom GPT

If you find yourself using a prompt preamble a lot, you can save it as a custom GPT and that way you don't have to repeat the instructions for the context, you can just paste in the unique bit and have it give you the answer with all the instructions and context of how you want it to respond.

Think of this as a pre-loaded template prompt preamble stored under a new Custom GPT name.

Custom Translator

You can use the Custom GPT feature to create a translator that breaks down the translation into its components, and which you can then pose follow-up questions to, to help you learn a language.

This is much better than just using a plain old-fashioned Google Translate or similar.

Tool Use

Some LLMs like ChatGPT and Claude recognize when to outsource to tools to get answers, such as automatically running a Python or JavaScript interpreter and then feeding the result back.

A simple example is a basic multiplication that can be done via memory recall, like we humans do, versus a more complex maths calculation for which it needs to use a calculator or a programming interpreter such as Python or JavaScript.
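
The pattern can be sketched as a dispatcher that routes arithmetic to a calculator tool instead of relying on the model's recall - a toy illustration of tool use, not any vendor's actual implementation:

```python
import ast
import operator

# whitelist of safe arithmetic operators for the "calculator tool"
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculator(expr: str) -> float:
    """Safely evaluate a basic arithmetic expression via the AST (no eval)."""
    def ev(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr, mode="eval").body)

def answer(question: str) -> str:
    """Route maths questions to the calculator tool, everything else to the LLM."""
    if any(c in question for c in "+-*/") and any(c.isdigit() for c in question):
        expr = question.rstrip("?").split()[-1]  # naive extraction of the expression
        return f"(via calculator tool) {calculator(expr)}"
    return "(via LLM recall) ..."

print(answer("What is 123456*789?"))
```

Real tool use works the same way in spirit: the model emits a structured tool call, the host runs it, and the result is fed back into the context for the final answer.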

Audio Querying

Pro (paid-for) LLM versions can listen to audio and reply with audio.

Sci-fi is here (and no, Siri and Alexa previously really didn't count).

If you don't want to pay for a pro version you can use Whisper Apps to pre-load your audio into an LLM text box.

Image Querying

Upload an image, have it transcribe the text and then ask questions about it, eg. the nutritional label from a product.

Tip: first ask the LLM to transcribe it into text to ensure it is correctly "seeing" the same thing you are before you dive into asking questions.

In the free tier there is usually a very limited number of image uploads in a time period so use them sparingly.

Video Querying

You can feed video to AI and query on it, audio responses being most impressive.

At time of writing this is only available on the ChatGPT mobile app where you point your mobile camera to things for ChatGPT to see them, called Advanced Voice + Video.

Text to Speech

Speech to Text

Wordly.ai

https://attend.wordly.ai - I found out about this by a guy sitting next to me getting a real-time translation at BTC Prague 2025.

Otter.ai

https://otter.ai/

Proprietary subscription, not bothering with it, used OpenAI Whisper below for free instead.

Plaud.ai

https://www.plaud.ai/

Portable devices to record and transcribe using AI.

Whisper Apps

  • SuperWhisper
  • WisprFlow
  • MacWhisper

Tip: bind a hotkey to record audio to text into an LLM box and just correct esoteric proper nouns that don't transcribe properly, then hit enter.

OpenAI Whisper

:octocat: openai/whisper

OpenAI Whisper Install

Installs locally, downloads a model and runs on a local video or audio file.

Install on Mac:

brew install openai-whisper

or generic Python install

pip install openai-whisper

Also requires ffmpeg to be installed.

On Mac:

brew install ffmpeg

or on Debian / Ubuntu Linux:

sudo apt update &&
sudo apt install ffmpeg -y

OpenAI Whisper Run CLI

List of Available Models.

Run whisper, using the --turbo model (will take a while to download the model the first time):

whisper "$file" --turbo

Outputs the text transcript from the video or audio file to stdout, as well as creating .txt, .srt, .json, .tsv and .vtt formatted transcripts for further processing.

OpenAI Whisper Run from Python

import whisper

# load the "turbo" model (downloaded on first use)
model = whisper.load_model("turbo")

# transcribe a local audio file; the result dict includes "text" and "segments"
result = model.transcribe("audio.mp3")
print(result["text"])

Grammar

Visual

Image

Image generation AIs - many LLMs can often do this from their prompts, sometimes by outsourcing to another more specialized AI to return the image result to you:

Video

Video generation AIs:

  • Hunyuan
  • Runway Gen-3
  • Alibaba Cloud Wanx
  • Pika
  • Luma Ray 2
  • Hailuo T2V-01

Presentation

Gamma

https://gamma.app/

Designs presentations, websites, social media posts etc.

UI

App Generation

Idea to app in seconds.

Coding

  • Cursor AI - separate Editor that requires download
    • low usage quota in free tier - I used up my limit in 1 hour
    • reads your files on disk to do work appropriate to your project
    • calls APIs to use models including OpenAI's ChatGPT, Anthropic Opus, Google Gemini, Claude Sonnet and its own Composer 1 model
  • GitHub CoPilot
  • TabNine - support for all major IDEs including my favourite IntelliJ, no longer a free tier
  • Agentic
  • Windsurf

RAG - Retrieval Augmented Generation

Combines an LLM with retrieval from an authoritative traditional Knowledge Base, which it references before answering.

https://www.promptingguide.ai/techniques/rag

https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/

https://cloud.google.com/use-cases/retrieval-augmented-generation?hl=en

https://aws.amazon.com/what-is/retrieval-augmented-generation/

https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview
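
The retrieval step can be sketched with a toy keyword-overlap retriever that picks the most relevant document and pastes it into the prompt. Real systems use vector embeddings and a proper document store, and the documents below are invented for illustration:

```python
def retrieve(query: str, docs: dict[str, str]) -> str:
    """Return the doc whose words overlap most with the query (toy scoring)."""
    q = set(query.lower().split())
    best = max(docs, key=lambda name: len(q & set(docs[name].lower().split())))
    return docs[best]

def rag_prompt(query: str, docs: dict[str, str]) -> str:
    """Augment the prompt with the retrieved context before sending to the LLM."""
    context = retrieve(query, docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = {
    "vpn": "Connect to the corporate VPN before accessing internal dashboards.",
    "oncall": "The on-call rotation changes every Monday at 09:00 UTC.",
}
print(rag_prompt("When does the on-call rotation change?", docs))
```

Because the answer is grounded in retrieved text rather than the model's memorized weights, RAG reduces hallucination and lets you update knowledge by updating documents instead of retraining.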

Debugging

Job Search

List of AI Tools By Categories

Memes

LLM - How to Plagiarize Like a Pro

Building AI to Replace Humans

Fewer Devs, Fewer Managers

Copying and Pasting from ChatGPT

Say Powered by AI One More Time

Coding with GPT

Watch out for that quality and not knowing WTF you're doing!

My Code Stack Overflow, ChatGPT

Speak Softly in House

Chuck Norris

Your Future Doctor is using ChatGPT to Pass Exams

I'm a Programmer

AI Finger Hack - Inadmissible Evidence

Sales Guy Wants to Use Blockchain & AI

Vibe Coding is Easy

Guys Who Thank ChatGPT

ChatGPT vs Learning Data Structures and Algorithms

AI vs Jobs

Credit to Claudio Viola on LinkedIn for creating these using the ChatGPT AI.

AI vs Jobs Wavers

Trying to Keep Up With AI News