Jan AI and local AI Systems

I’ve been very interested in AI technology since before ChatGPT was a thing. I play around with a lot of different AI tools, the main two being ChatGPT and Gemini, and I use Claude AI sometimes as well. Let me explain a few things.

ChatGPT: This works great for me for almost everything, especially when I have the Pro plan.

Gemini: The only thing Gemini is great for (compared to ChatGPT) is lighter censorship. As a writer I sometimes want to research and brainstorm topics that would get me flagged for content violations on ChatGPT; Gemini, knowing that I’m a writer, tends to be a lot more lenient.

Grok: My go-to when I want something completely uncensored.

Claude AI: Great to play around with, and if ChatGPT and Gemini didn’t exist, it would be the main one I use. I love it, but I only have so much time to spread across them all.

With all that being said, I recently started digging deeper into Python programming. Because of that, I gained a lot of interest in running LLMs locally. I started my journey by installing Transformers and related Hugging Face libraries. With some effort I was able to download and run almost any model off Hugging Face that didn’t require special setup. I played around with both chat models and image-generation models.
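For anyone curious what that workflow looks like, here is a minimal sketch using the Transformers `pipeline` API. The model name is just a small placeholder checkpoint I picked for illustration; any compatible text-generation model on Hugging Face works the same way.

```python
# Minimal sketch of running a Hugging Face model locally with Transformers.
# Requires: pip install transformers torch
# "distilgpt2" below is only a small example checkpoint, not a recommendation.
from transformers import pipeline

def load_generator(model_name: str = "distilgpt2"):
    # pipeline() downloads and caches the model on first use,
    # then runs inference entirely on your own machine.
    return pipeline("text-generation", model=model_name)

# Example usage (triggers the model download the first time):
# generator = load_generator()
# print(generator("Local LLMs are", max_new_tokens=20)[0]["generated_text"])
```

Once a model is cached, everything runs offline, which is the whole appeal of the local approach.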

Then an associate of mine mentioned “Jan AI,” so I downloaded it and started playing with it.

What is Jan AI?

For those new to this, Jan AI is a fantastic desktop application that allows you to run Large Language Models (LLMs) locally on your computer. Think of it as your personal AI sandbox where you can experiment with different models without being tethered to an internet connection or relying on external servers. This is a game-changer for privacy, customization, and control. It’s freely available for Windows, Mac, and Linux.
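One nice touch: Jan can also expose an OpenAI-compatible local API server, so you can talk to your local models from your own Python scripts. A rough sketch below, using only the standard library; the port (1337 is Jan’s usual default, but check your settings) and the model ID are assumptions from my own setup.

```python
# Sketch: query a model loaded in Jan through its local
# OpenAI-compatible API. Assumes Jan's API server is enabled
# and listening on localhost:1337 (verify in Jan's settings).
import json
import urllib.request

JAN_URL = "http://localhost:1337/v1/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat payload for a locally loaded model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask_jan(model: str, prompt: str) -> str:
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        JAN_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

# Example usage (requires Jan's local server to be running):
# print(ask_jan("mistral-ins-7b-q4", "Say hello in three languages."))
```

Because the endpoint follows the OpenAI wire format, most existing OpenAI client code can be pointed at it with just a base-URL change.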

Local vs. Cloud Models: A Tale of Two Worlds

The AI world is largely divided into two camps: cloud-based models and local models. Understanding the distinction is key to navigating this space.

  • Cloud Models: These are the AI giants hosted on powerful servers by companies like OpenAI (ChatGPT), Google (Gemini), and Anthropic (Claude). Your prompts travel over the internet to their servers, and usage is governed by each provider’s pricing and content policies.
  • Local Models: These are the models you download and run directly on your own machine, like the ones you use with Jan AI. The good thing about these is that they are largely uncensored (except by the models themselves, based on their training), they’re free to use, they can offer fast response times, and a lot more customization is possible.

After letting my OCD get the better of me and downloading every single one of them, I did some testing, uninstalled the ones I didn’t want, and now use a specific set that I think covers all my heavily experimental needs.

Below are the models that I decided to keep installed, at least for the time being, and a little information about them.

General Purpose/Chat:

  1. Aya 23 8B Q4 (aya-23-8b): A multilingual model from Cohere, strong across a wide range of languages. (4.71GB)
  2. Gemma 2 9B Q4 (gemma-2-9b-it): A large, powerful general-purpose model from Google. (5.36GB)
  3. Gemma 1.1 2B Q4 (gemma-1.1-2b-it): A small and fast Gemma model for quicker tasks. (1.52GB)
  4. Llama 3 8B Instruct Q4 (llama3-8b-instruct): My most powerful Llama model, good for a wide range of tasks. (4.58GB)
  5. Llama 3.2 3B Instruct Q8 (llama3.2-3b-instruct): A smaller Llama 3 with less compression, offering a balance of speed and quality. (3.19GB)
  6. LlamaCorn 1.1B Q8 (llamacorn-1.1b): Extremely small and fast Llama, ideal for simple tasks. (1.09GB)
  7. Mistral 7B Instruct Q4 (mistral-ins-7b-q4): A solid performer known for its efficiency. (4.07GB)
  8. Phi-3 Medium Instruct Q4 (phi3-medium): A good mid-sized model, larger than the Phi-3 Mini. (7.79GB)
  9. Qwen2.5 14B Instruct Q4 (qwen2.5-14b-instruct): A large, powerful Qwen model for complex tasks. (8.37GB)
  10. Qwen 2 7B Instruct Q4 (qwen2-7b): A smaller, faster Qwen model. (4.36GB)

Coding:

  1. Codestral 22B Q4 (codestral-22b): My top-tier coding model, very large and powerful. (12.42GB)
  2. Deepseek Coder 1.3B Instruct Q8 (deepseek-coder-1.3b): A small and fast coding model for quick tasks. (1.33GB)
  3. Qwen2.5 Coder 14B Instruct Q4 (qwen2.5-coder-14b-instruct): Another large and specialized coding model. (8.37GB)

Vision/Multimodal:

  1. BakLlava 1 (bakllava-1): A LLaVA-style vision-language model built on a Mistral base, so it can handle images as well as text. (5.36GB)
  2. LlaVa 13B Q4 (llava-13b): A large vision-language model for tasks involving images and text. (7.33GB)

Other:

  1. Stable Zephyr 3B Q8 (stable-zephyr-3b): A model from Stability AI, relatively small but with Q8 quantization. (2.77GB)
  2. TinyLlama Chat 1.1B Q4 (tinyllama-1.1b): An extremely lightweight model for when resource usage is critical. (638.01MB)

Some reasons I kept this specific selection

  • Diversity: I have a good mix of model families (Llama, Gemma, Qwen, Mistral, etc.), sizes, and specializations (general, coding, vision).
  • Performance: I kept the most powerful models in each category, ensuring I have the right tools for demanding tasks.
  • Efficiency: I included smaller, faster models for quick tasks or when I want to conserve resources.
  • Uncensored Potential: The inclusion of models like BakLlava 1, LlaVa 13b, and Deepseek Coder 1.3b gives me options for exploring less restricted content generation, which is valuable for my creative writing research.

These are the ones I’ve found the most useful. Even though I’m still experimenting, I wanted to share the website and some of my findings. See here for the link.

Published in Technology
