
I Used This Open Source Library to Integrate OpenAI, Claude, Gemini to Websites Without API Keys

October 20, 2025 at 05:31

When I started experimenting with AI integrations, I wanted to create a chat assistant on my website, something that could talk like GPT-4, reason like Claude, and even joke like Grok.

But OpenAI, Anthropic, Google, and xAI all require API keys. That means I needed to set up an account on each platform and upgrade to a paid plan before I could start coding, because most of these LLM providers gate API access behind a paid plan. Not to mention, I would also need to cover API usage billing for each of them.

What if I told you there's an easier way to integrate AI into your websites and mobile applications, without requiring any API keys at all? Sounds exciting? Let me share how I did exactly that.

Integrate AI with Puter.js 

Enter Puter.js, an open source JavaScript library that lets you use cloud features like AI models, storage, databases, and user auth, all from the client side. No servers, no API keys, no backend setup needed. What else can you ask for as a developer?

Puter.js is built around Puter’s decentralized cloud platform, which handles all the heavy lifting: key management, routing, usage limits, and billing. Everything’s abstracted away so cleanly that, from your side, it feels like authentication, AI, and LLMs just live in your browser.

Enough talking, let’s see how you can add GPT-5 integration within your web application in less than 10 lines.

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat(`What is puter js?`, {
            model: 'gpt-5-nano',
        }).then(puter.print);
    </script>
</body>
</html>

Yes, that’s it. Unbelievable, right? Save the HTML code into an index.html file and place it in a new, empty directory. Open a terminal, switch to the directory where the index.html file is located, and serve it on localhost with the Python command:

python -m http.server

Then open http://localhost:8000 in your web browser. Click the “Continue” button on the Puter.js prompt when it appears.

Integrate ChatGPT with Puter JS

🚧 It may take some time before you see a response from ChatGPT. Until then, you'll see a blank page.

ChatGPT Nano doesn't know Puter.js yet but it will, soon
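To avoid staring at a blank page while the model thinks, you can ask Puter.js to stream the reply and print it as it arrives. Here is a minimal sketch that reuses the same stream option the comparison demo later in this article relies on:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        (async () => {
            // Ask for a streamed response instead of waiting for the full answer
            const response = await puter.ai.chat('What is puter js?', {
                model: 'gpt-5-nano',
                stream: true,
            });
            // Print each chunk as soon as it arrives
            for await (const part of response) {
                if (part?.text) puter.print(part.text);
            }
        })();
    </script>
</body>
</html>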

You can explore a lot of examples and get an idea of what Puter.js does for you on its playground.

Let’s modify the code to make it more interesting this time. It will take a user query and stream responses from three different LLMs so that users can decide which of the three provides the best result.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>AI Model Comparison</title>
    <script src="https://cdn.twind.style"></script>
    <script src="https://js.puter.com/v2/"></script>
</head>
<body class="bg-gray-900 min-h-screen p-6">
    <div class="max-w-7xl mx-auto">
        <h1 class="text-3xl font-bold text-white mb-6 text-center">AI Model Comparison</h1>
        
        <div class="mb-6">
            <label for="queryInput" class="block text-white mb-2 font-medium">Enter your query:</label>
            <div class="flex gap-2">
                <input
                    type="text"
                    id="queryInput"
                    class="flex-1 px-4 py-3 rounded-lg bg-gray-800 text-white border border-gray-700 focus:outline-none focus:border-blue-500"
                    placeholder="Write a detailed essay on the impact of artificial intelligence on society"
                    value="Write a detailed essay on the impact of artificial intelligence on society"
                />
                <button
                    id="submitBtn"
                    class="px-6 py-3 bg-blue-600 hover:bg-blue-700 text-white rounded-lg font-medium transition-colors"
                >
                    Generate
                </button>
            </div>
        </div>

        <div class="grid grid-cols-1 md:grid-cols-3 gap-4">
            <div class="bg-gray-800 rounded-lg p-4">
                <h2 class="text-xl font-semibold text-blue-400 mb-3">Claude Opus 4</h2>
                <div id="output1" class="text-gray-300 text-sm leading-relaxed h-96 overflow-y-auto whitespace-pre-wrap"></div>
            </div>
            
            <div class="bg-gray-800 rounded-lg p-4">
                <h2 class="text-xl font-semibold text-green-400 mb-3">Claude Sonnet 4</h2>
                <div id="output2" class="text-gray-300 text-sm leading-relaxed h-96 overflow-y-auto whitespace-pre-wrap"></div>
            </div>
            
            <div class="bg-gray-800 rounded-lg p-4">
                <h2 class="text-xl font-semibold text-purple-400 mb-3">Gemini 2.0 Pro</h2>
                <div id="output3" class="text-gray-300 text-sm leading-relaxed h-96 overflow-y-auto whitespace-pre-wrap"></div>
            </div>
        </div>
    </div>

    <script>
        const queryInput = document.getElementById('queryInput');
        const submitBtn = document.getElementById('submitBtn');
        const output1 = document.getElementById('output1');
        const output2 = document.getElementById('output2');
        const output3 = document.getElementById('output3');

        async function generateResponse(query, model, outputElement) {
            outputElement.textContent = 'Loading...';
            
            try {
                const response = await puter.ai.chat(query, {
                    model: model,
                    stream: true
                });
                
                outputElement.textContent = '';
                
                for await (const part of response) {
                    if (part?.text) {
                        outputElement.textContent += part.text;
                        outputElement.scrollTop = outputElement.scrollHeight;
                    }
                }
            } catch (error) {
                outputElement.textContent = `Error: ${error.message}`;
            }
        }

        async function handleSubmit() {
            const query = queryInput.value.trim();
            
            if (!query) {
                alert('Please enter a query');
                return;
            }

            submitBtn.disabled = true;
            submitBtn.textContent = 'Generating...';
            submitBtn.classList.add('opacity-50', 'cursor-not-allowed');

            await Promise.all([
                generateResponse(query, 'claude-opus-4', output1),
                generateResponse(query, 'claude-sonnet-4', output2),
                generateResponse(query, 'google/gemini-2.0-flash-lite-001', output3)
            ]);

            submitBtn.disabled = false;
            submitBtn.textContent = 'Generate';
            submitBtn.classList.remove('opacity-50', 'cursor-not-allowed');
        }

        submitBtn.addEventListener('click', handleSubmit);
        
        queryInput.addEventListener('keypress', (e) => {
            if (e.key === 'Enter') {
                handleSubmit();
            }
        });
    </script>
</body>
</html>

Save the above code in the index.html file as we did in the previous example and then run the server with Python. This is what it looks like now on localhost.

Comparing output from different LLM providers with Puter.js

And here is a sample response from all three models on the query "What is It's FOSS".


Looks like It's FOSS is well trusted by humans as well as AI 😉

My Final Take on Puter.js and LLMs Integration

That’s not bad at all! You can do all this crazy stuff without requiring any API keys.

Puter.js uses a “user pays” model, which means it’s completely free for developers: your application’s users spend credits from their own Puter accounts for the cloud features, like storage and LLMs, that they use. I reached out to them to understand the pricing structure, but at this moment, the team behind it is still working out a pricing plan.

This new Puter.js library is seriously underrated. I’m still amazed by how easy it has made LLM integration. Besides AI, you can use the Puter.js SDK for authentication and storage, much like Firebase.
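Here’s a rough sketch of what sign-in and cloud storage look like with the same library. The puter.auth.signIn(), puter.fs.write(), and puter.fs.read() calls below are based on my reading of Puter’s docs; treat the exact signatures as assumptions and verify them against the official documentation before using this in anything real:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        (async () => {
            // Ask the visitor to sign in with their Puter account
            await puter.auth.signIn();

            // Write a small text file to the user's Puter cloud storage
            await puter.fs.write('notes.txt', 'Hello from Puter.js!');

            // Read it back (puter.fs.read returns a Blob) and print the contents
            const blob = await puter.fs.read('notes.txt');
            puter.print(await blob.text());
        })();
    </script>
</body>
</html>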

Do check out this wonderful open source JavaScript library and explore what else you can build with it.

Puter.js - Free, Serverless, Cloud and AI in One Simple Library
Puter.js provides auth, cloud storage, database, GPT-4o, o1, o3-mini, Claude 3.7 Sonnet, DALL-E 3, and more, all through a single JavaScript library. No backend. No servers. No configuration.

I Switched From Ollama And LM Studio To llama.cpp And Absolutely Loving It

October 11, 2025 at 04:26

My interest in running AI models locally started as a side project, part curiosity and part irritation with cloud limits. There’s something satisfying about running everything on your own box. No API quotas, no censorship, no signups. That’s what pulled me toward local inference.

My struggle with running local AI models

My setup, an AMD GPU on Windows, turned out to be the worst combination for most local AI stacks.

The majority of AI stacks assume NVIDIA + CUDA, and if you don’t have that, you’re basically on your own. ROCm, AMD’s so-called CUDA alternative, doesn’t even work on Windows, and even on Linux, it’s not straightforward. You end up stuck with CPU-only inference or inconsistent OpenCL backends that feel a decade behind.

Why not Ollama and LM Studio?

I started with the usual tools, i.e., Ollama and LM Studio. Both deserve credit for making local AI look plug-and-play. I tried LM Studio first. But soon after, I discovered that LM Studio hijacks my taskbar. I frequently jump from one application window to another with the mouse, and that was getting annoying. Another thing that bothered me was its 528 MB installer size.

I’m a big advocate for keeping things minimal yet functional. I admire a functional text editor that fits under 1 MB (Dred), a reactive JavaScript library and React alternative that fits under 1 KB (Van JS), and a game engine that fits under 100 MB (Godot).

Then I tried Ollama. Being a CLI user (even on Windows), I was impressed with Ollama. I don’t need to spin up an Electron JS application (LM Studio) to run an AI model locally.

With just two commands, you can run any AI model locally with Ollama.

ollama pull tinyllama
ollama run tinyllama 

But once I started testing different AI models, I needed to reclaim disk space. My initial approach was to delete the models manually from File Explorer. I was a bit paranoid! But soon, I discovered these Ollama commands:

ollama rm tinyllama     #remove the model
ollama ls               #lists all models

As for how lightweight Ollama is, it comes close to 4.6 GB on my Windows system. You can slim it down by deleting unnecessary files, since it comes bundled with libraries for every backend (rocm, cuda_v13, and cuda_v12), but that's extra manual housekeeping.

After trying Ollama, I was curious: does LM Studio even provide a CLI? After some research, I found that yes, it does offer a command line interface. Digging further, I found out that LM Studio uses Llama.cpp under the hood.

With these two commands, I can run LM Studio via CLI and chat with an AI model while staying in the terminal:

lms load <model name>   #Load the model
lms chat                #starts the interactive chat

I was generally satisfied with the LM Studio CLI at this point. I also noticed it came with Vulkan support out of the box. I had been looking to add Vulkan support to Ollama, and the only approach I found was to compile Ollama from source and enable Vulkan manually. That’s a real hassle!


I just had three additional complaints at this point. Every time I needed to use the LM Studio CLI (lms), it would take some time to wake up its Windows service. The lms CLI is not feature-rich; it does not even provide a way to delete a model. And the last one was that it takes two steps: load the model first, then chat.

After the chat is over, you need to manually unload the model. This mental model doesn’t make sense to me. 

That’s where I started looking for something more open, something that actually respected the hardware I had. That’s when I stumbled onto Llama.cpp, with its Vulkan backend and refreshingly simple approach. 

Setting up Llama.cpp

🚧
The tutorial was performed on Windows because that's the system I am currently using. I understand that most folks here on It's FOSS are Linux users and that I am committing blasphemy of sorts, but I just wanted to share the knowledge and experience I gained with my local AI setup. You can try a similar setup on Linux, too; just use the equivalent Linux paths and commands.

Step 1: Download from GitHub

Head over to its GitHub releases page and download the latest release for your platform.

📋
If you’ll be using Vulkan support, remember to download the assets suffixed with vulkan-x64.zip, like llama-b6710-bin-ubuntu-vulkan-x64.zip or llama-b6710-bin-win-vulkan-x64.zip.

Step 2: Extract the zip file

Extract the downloaded zip file and, optionally, move the extracted directory to where you usually keep your binaries, like /usr/local/bin on macOS and Linux. On Windows 10, I usually keep it under %USERPROFILE%\.local\bin.

Step 3: Add the Llama.cpp directory to the PATH environment variable

Now, you need to add its directory location to the PATH environment variable. 

On Linux and macOS (replace path-to-llama-cpp-directory with your exact directory location):

export PATH=$PATH:"<path-to-llama-cpp-directory>"

On Windows 10 and Windows 11:

setx PATH "%PATH%;<path-to-llama-cpp-directory>"

Now, Llama.cpp is ready to use.

llama.cpp: The best local AI stack for me

Just grab a .gguf file, point to it, and run. It reminded me why I love tinkering on Linux in the first place: fewer black boxes, more freedom to make things work your way.

With just one command, you can start a chat session with Llama.cpp:

llama-cli.exe -m e:\models\Qwen3-8B-Q4_K_M.gguf --interactive

If you carefully read its verbose output, it clearly shows signs of the GPU being utilized.


With llama-server, you can even download AI models from Hugging Face, like:

llama-server -hf itlwas/Phi-4-mini-instruct-Q4_K_M-GGUF:Q4_K_M

The -hf flag tells llama-server to download the model from the given Hugging Face repository.

You even get a web UI with Llama.cpp. Just run the model with this command:

llama-server -m e:\models\Qwen3-8B-Q4_K_M.gguf --port 8080 --host 127.0.0.1

This starts a web UI on http://127.0.0.1:8080, along with the ability to send an API request from another application to Llama.


Let’s send an API request via curl:

curl http://127.0.0.1:8080/completion -H "Content-Type: application/json" -d "{\"prompt\":\"Explain the difference between OpenCL and SYCL in short.\",\"temperature\":0.7,\"max_tokens\":128}"
  • temperature controls the creativity of the model’s output
  • max_tokens controls whether the output will be short and concise or a paragraph-length explanation.
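If you are calling the server from a web page or a Node.js script rather than curl, the same request can be sent with fetch. This simply mirrors the payload from the curl example above; verify the endpoint and response fields against your llama.cpp build:

// Send the same completion request to llama-server from JavaScript
(async () => {
    const res = await fetch('http://127.0.0.1:8080/completion', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            prompt: 'Explain the difference between OpenCL and SYCL in short.',
            temperature: 0.7,
            max_tokens: 128,
        }),
    });
    const data = await res.json();
    // llama-server's /completion endpoint returns the generated text in "content"
    console.log(data.content);
})();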

llama.cpp for the win

What am I losing by using llama.cpp? Nothing. Like Ollama, I get a feature-rich CLI, plus Vulkan support. It all comes in under 90 MB on my Windows 10 system.

Now, I don’t see the point of using Ollama or LM Studio: I can download any model with llama-server, run it directly with llama-cli, and even interact with it through the web UI and API requests.

In a future post, I’m hoping to benchmark how AI inference on Vulkan performs compared to pure CPU and SYCL implementations. Until then, keep exploring AI tools and the ecosystem to make your life easier. Use AI to your advantage rather than getting into endless debates over questions like: will AI take our jobs?

I Ran Local LLMs on My Android Phone

September 15, 2025 at 13:58
By: Community

Like it or not, AI is here to stay. For those who are concerned about data privacy, there are several local AI options available. Tools like Ollama and LM Studio make things easier.

Now those options are for the desktop user and require significant computing power.

What if you want to use the local AI on your smartphone? Sure, one way would be to deploy Ollama with a web GUI on your server and access it from your phone.

But there is another way: use an application that lets you install and use LLMs (or should I say SLMs, small language models) directly on your phone, instead of relying on a local AI server on another computer.

Allow me to share my experience experimenting with LLMs on a phone.

📋
Smartphones these days have powerful processors and some even have dedicated AI processors on board. Snapdragon 8 Gen 3, Apple’s A17 Pro, and Google Tensor G4 are some of them. Yet, the models that can be run on a phone are often vastly different than the ones you use on a proper desktop or server.

Here's what you'll need:

  • An app that allows you to download the language models and interact with them.
  • Suitable LLMs that have been specifically created for running on mobile devices.

Apps for running LLMs locally on a smartphone

After researching, I decided to explore the following applications for this purpose. Let me share their features and details.

1. MLC Chat

MLC Chat supports top models like Llama 3.2, Gemma 2, Phi 3.5, and Qwen 2.5, offering offline chat, translation, and multimodal tasks through a sleek interface. Its plug-and-play setup with pre-configured models, NPU optimization (e.g., Snapdragon 8 Gen 2+), and beginner-friendly features make it a good choice for on-device AI.

You can download the MLC Chat APK from their GitHub release page.

Android is looking to restrict sideloading of APK files. I don't know what will happen then, but you can use APK files for now.

Put the APK file on your Android device, go into Files and tap the APK file to begin installation. Enable “Install from Unknown Sources” in your device settings if prompted. Follow on-screen instructions to complete the installation.

Enable APK installation

Once installed, open the MLC Chat app and select a model from the list, like Phi-2, Gemma 2B, Llama-3 8B, or Mistral 7B. Tap the download icon to install the model. I recommend opting for smaller models like Phi-2. Models are downloaded on first use and cached locally for offline use.

Click on the download button to download a model

Tap the Chat icon next to the downloaded model. Start typing prompts to interact with the LLM offline. Use the reset icon to start a new conversation if needed.


2. SmolChat (Android)

SmolChat is an open-source Android app that runs any GGUF-format model (like Llama 3.2, Gemma 3n, or TinyLlama) directly on your device, offering a clean, ChatGPT-like interface for fully offline chatting, summarization, rewriting, and more.

Install SmolChat from Google's Play Store. Open the app, choose a GGUF model from the app’s model list or manually download one from Hugging Face. If manually downloading, place the model file in the app’s designated storage directory (check app settings for the path).


3. Google AI Edge Gallery

Google AI Edge Gallery is an experimental open-source Android app (iOS soon) that brings Google's on-device AI power to your phone, letting you run powerful models like Gemma 3n and other Hugging Face models fully offline after download. This application makes use of Google’s LiteRT framework.

You can download it from Google Play Store. Open the app and browse the list of provided models or manually download a compatible model from Hugging Face.

Select the downloaded model and start a chat session. Enter text prompts or upload images (if supported by the model) to interact locally. Explore features like prompt discovery or vision-based queries if available.


Top Mobile LLMs to try out

Here are the best ones I’ve used:

  • Google’s Gemma 3n (2B): Blazing-fast for multimodal tasks, including image captions, translations, and even solving math problems from photos. Best for quick, visual-based AI assistance.
  • Meta’s Llama 3.2 (1B/3B): Strikes the perfect balance between size and smarts; great for coding help and private chats. The 1B version runs smoothly even on mid-range phones. Best for developers and privacy-conscious users.
  • Microsoft’s Phi-3 Mini (3.8B): Shockingly good at summarizing long documents despite its small size. Best for students, researchers, or anyone drowning in PDFs.
  • Alibaba’s Qwen-2.5 (1.8B): Surprisingly strong at visual question answering; ask it about an image, and it actually understands! Best for multimodal experiments.
  • TinyLlama-1.1B: The lightweight champ; it runs on almost any device without breaking a sweat. Best for older phones or users who just need a simple chatbot.

All these models use aggressive quantization (GGUF/safetensors formats), so they’re tiny but still powerful. You can grab them from Hugging Face—just download, load into an app, and you’re set.

Challenges I faced while running LLMs locally on an Android smartphone

Getting large language models (LLMs) to run smoothly on my phone has been equally exhilarating and frustrating.

On my Snapdragon 8 Gen 2 phone, models like Llama 3-4B run at a decent 8-10 tokens per second, which is usable for quick queries. But when I tried the same on my backup Galaxy A54 (6 GB RAM), it choked. Loading even a 2B model pushed the device to its limits. I quickly learned that Phi-3-mini (3.8B) or Gemma 2B are far more practical for mid-range hardware.

The first time I ran a local AI session, I was shocked to see 50% battery gone in under 90 minutes. MLC Chat offers power-saving mode for this purpose. Turning off background apps to free up RAM also helps.

I also experimented with 4-bit quantized models (like Qwen-1.5-2B-Q4) to save storage but noticed they struggle with complex reasoning. For medical or legal queries, I had to switch back to 8-bit versions. It was slower but far more reliable.

Conclusion

I love the idea of having an AI assistant that works exclusively for me, no monthly fees, no data leaks. Need a translator in a remote village? A virtual assistant on a long flight? A private brainstorming partner for sensitive ideas? Your phone becomes all of these while staying offline and untraceable.

I won’t lie, it’s not perfect. Your phone isn’t a data center, so you’ll face challenges like battery drain and occasional overheating. But in return, you get total privacy, zero costs, and offline access.

The future of AI isn’t just in the cloud, it’s also on your device.

Author Info


Bhuwan Mishra is a full-stack developer, with Python and Go as his tools of choice. He takes pride in building and securing web applications, APIs, and CI/CD pipelines, as well as tuning servers for optimal performance. He also has a passion for working with Kubernetes.

5 Local AI Tools to Interact With PDF and Documents

December 23, 2024 at 09:24

We’ve covered a lot of local LLMs on It's FOSS. You can use them as coding assistants or run them on your tiny Raspberry Pi setups.

But recently, I’ve noticed many comments asking about local AI tools to interact with PDFs and documents.

Now, during my research, I stumbled upon countless AI-powered websites that promise to summarize, query, or analyze PDFs.

Some were sleek and polished but unsurprisingly, most were paid or had limited “free tier” options. And let’s be honest, when you’re uploading documents to a cloud service, there’s no real guarantee of privacy.

That’s why I’ve put together this list of open-source AI projects that let you interact with PDFs locally. These tools let your data stay on your machine, offline, and under your control.

Whether you’re summarizing long research papers, extracting key insights, or just searching for specific details, these tools will have your back.

Let’s dive in!

1. Chatd

chatd is a desktop application that allows you to chat with your documents locally using a large language model.

Unlike other tools, chatd comes with a built-in LLM runner, so you don’t need to install anything extra, just download, unzip, and run the executable.


Key features:

  • All your data stays on your computer and is never sent to the cloud.
  • Comes pre-packaged with Ollama, a local LLM server that manages the language model for you. If you already have Ollama running, chatd will automatically use it.
  • Works seamlessly on Windows, macOS, and Linux.
  • Advanced users can enable GPU support or select a custom LLM.

2. localGPT

LocalGPT is an open-source solution that enables you to securely interact with your documents locally.

Built for ultimate privacy, LocalGPT ensures that no data ever leaves your computer, making it a perfect fit for privacy-conscious users.


Key features:

  • All processing happens on your machine, ensuring no external data leaks.
  • Integrates seamlessly with popular open-source models like HF (HuggingFace), GPTQ, GGML, and GGUF.
  • Uses LangChain and ChromaDB to run a fully local Retrieval-Augmented Generation (RAG) pipeline.
  • Comes with two GUIs, one API-based and the other standalone using Streamlit.
  • Optional session-based history to remember your previous questions.
  • Supported File Formats: PDFs, TXT, CSV, DOCX, Markdown, and more. You can add custom loaders via LangChain.

3. PrivateGPT

PrivateGPT is a production-ready, privacy-focused AI project that enables you to interact with your documents using Large Language Models (LLMs), completely offline.

No data ever leaves your local environment, making it ideal for privacy-sensitive industries like healthcare, legal, or finance.

Having personally used this project, I highly recommend it for its privacy and performance once set up.


Key features:

  • 100% offline, no internet connection required.
  • Built on a robust Retrieval-Augmented Generation pipeline.
  • Offers OpenAI-compatible APIs for building private, context-aware AI applications (see the JavaScript sketch after this list).
  • Includes a user-friendly interface (Gradio UI) to interact with your documents.
  • Uses LlamaIndex for document ingestion and RAG pipelines, plus FastAPI, making it extensible and easy to integrate.
  • Provides tools for advanced users to customize embedding generation and document chunk retrieval.
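As a quick illustration of that OpenAI-compatible API, here's a sketch of querying a local PrivateGPT instance from JavaScript. The port (8001), the /v1/chat/completions route, and the use_context flag are assumptions based on PrivateGPT's documentation, so adjust them to match your installation:

// Query a locally running PrivateGPT server via its OpenAI-style API (assumed defaults)
(async () => {
    const res = await fetch('http://localhost:8001/v1/chat/completions', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            messages: [{ role: 'user', content: 'Summarize the ingested documents in two sentences.' }],
            use_context: true   // assumed flag: answer from the ingested documents
        }),
    });
    const data = await res.json();
    console.log(data.choices[0].message.content);
})();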
Setting Up PrivateGPT to Use AI Chat With Your Documents
Set up the PrivateGPT AI tool and interact or summarize your documents with full control on your data.

4. GPT4All

GPT4All is another open-source project that enables you to run large language models (LLMs) offline on everyday desktops or laptops, no internet, API calls, or GPUs required.

The application is designed to run smoothly on a variety of systems. It's perfect for privacy-conscious users who want local AI capabilities to interact with documents or chat seamlessly.


Key features:

  • Run LLMs locally without the need for cloud-based API calls.
  • Works entirely offline, ensuring privacy and control over your data.
  • Download and install the application on Windows, macOS, or Linux to get started immediately.
  • GPT4All offers a Python client for integrating LLMs into your own applications.
  • The LocalDocs feature allows you to privately chat with your documents, offering a secure way to interact with local data.
  • Can be integrated with LangChain for enhanced functionality and access to external databases such as Weaviate.

5. LM Studio (Editor's Choice ⭐)

LM Studio has become my go-to tool for daily use, and it’s easily my favorite project in this space.

With the latest release (version 0.3), it introduced the ability to chat with your documents, a beta feature that has worked exceptionally well for me so far.


Key features:

  • LM Studio lets you download LLMs directly from Hugging Face using its in-app browser.
  • Use a simple, user-friendly interface to chat with AI models for tasks like answering questions, generating text, or analyzing content.
  • Introduced in version 0.3, you can now upload documents and interact with them locally (still in beta).
  • It works as a local server, allowing seamless integration of AI models into your projects without relying on third-party services (see the sketch after this feature list).
  • On-demand model loading helps optimize system resources by loading models only when needed.
  • Explore trending and noteworthy LLMs in the app’s Discover page.
  • It also supports vision-enabled AI capabilities with MistralAI’s Pixtral models for advanced applications.
  • Available for macOS (including Apple Silicon Macs), Windows, and Linux.
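Since the local server speaks the OpenAI API, calling it from your own code is straightforward. Below is a minimal sketch; the default address (localhost:1234) and the /v1/chat/completions route are what LM Studio's server mode typically exposes, but confirm them in the app's server settings:

// Ask a model loaded in LM Studio's local server for a completion (assumed default port 1234)
(async () => {
    const res = await fetch('http://localhost:1234/v1/chat/completions', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            messages: [{ role: 'user', content: 'Summarize this PDF section in one paragraph.' }],
            temperature: 0.7
        }),
    });
    const data = await res.json();
    console.log(data.choices[0].message.content);
})();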
Using LM Studio to Run LLMs Easily, Locally and Privately
LM Studio makes it easier to find and install LLMs locally. You can also interact with them in the same neat graphical user interface.

Wrapping up

Personally, I use LM Studio daily. As a university student, reading through PDFs day in and day out can be quite tiresome. That's why I like to fiddle around with such projects and look for what best suits my workflow.

I started with PrivateGPT, but once I tried LM Studio, I instantly fell in love with its clean UI and the ease of downloading models.

While I’ve also experimented with Ollama paired with Open WebUI, which worked well, LM Studio has truly become my go-to tool for handling documents efficiently.

These are some of the projects I recommend for interacting with or chatting with PDF documents. However, if you know of more tools that offer similar functionality, feel free to comment below and share them with the community!
