Nvidia Releases Chat with RTX

Nvidia has released Chat with RTX, a new technology demo that brings a local AI chatbot to PCs powered by its own GPUs. The demo lets users run open-source large language models on their own machines and use them to interact with local files and documents.

An AI chatbot that runs locally on your PC

Chat with RTX is a tech demo that lets users personalize a chatbot with their own content, accelerated by a local NVIDIA GeForce RTX 30 Series GPU or higher with at least 8GB of video random access memory (VRAM). The tool uses retrieval-augmented generation (RAG), NVIDIA TensorRT-LLM software, and NVIDIA RTX acceleration to bring generative AI capabilities to local, GeForce-powered Windows PCs.

Users can connect local files on a PC as a dataset to an open-source large language model, enabling queries that return quick, contextually relevant answers. The tool supports various file formats and lets users pull in information from YouTube videos and playlists. Because Chat with RTX runs locally on Windows RTX PCs and workstations, results are fast and the user's data stays on the device.

It requires a GeForce RTX 30 Series GPU or higher with a minimum of 8GB of VRAM, Windows 10 or 11, and the latest NVIDIA GPU drivers. The app is built from the TensorRT-LLM RAG developer reference project, available on GitHub, and developers can use that reference project to develop and deploy their own RAG-based applications for RTX, accelerated by TensorRT-LLM.
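For developers curious about what such a reference project does under the hood, the core retrieval-augmented generation loop can be sketched in a few lines of Python. This is a minimal, self-contained illustration rather than Nvidia's actual implementation: the embed and generate functions below are deliberately simplistic stand-ins for the embedding model and the TensorRT-LLM-accelerated LLM a real application would plug in.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in embedding: a normalized bag-of-characters vector.
    # A real application would call a proper embedding model here.
    vec = np.zeros(256)
    for ch in text.lower():
        vec[ord(ch) % 256] += 1.0
    return vec / (np.linalg.norm(vec) or 1.0)

def generate(prompt: str) -> str:
    # Stand-in for a local LLM call (e.g. a TensorRT-LLM-accelerated model).
    return f"[model response to a prompt of {len(prompt)} characters]"

def build_index(documents: list[str]) -> list[tuple[str, np.ndarray]]:
    # Embed every local document once so it can be searched later.
    return [(doc, embed(doc)) for doc in documents]

def answer(question: str, index: list[tuple[str, np.ndarray]], top_k: int = 3) -> str:
    # Retrieve the documents most similar to the question (dot product of
    # normalized vectors = cosine similarity), then augment the prompt.
    q = embed(question)
    ranked = sorted(index, key=lambda pair: float(q @ pair[1]), reverse=True)
    context = "\n\n".join(doc for doc, _ in ranked[:top_k])
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)

if __name__ == "__main__":
    docs = ["Notes on GPU drivers.", "Meeting minutes from March.", "Recipe for soup."]
    print(answer("What did we discuss in March?", build_index(docs), top_k=1))
```

The shape of the loop (index local files, retrieve the most relevant chunks, and feed them to a locally running model) is what lets a tool like this answer questions about a user's own documents without sending any data off the device.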

Open Source Continues

The release of Chat with RTX is a testament to the ongoing commitment Nvidia has to the open-source community. The decision to allow local processing of AI applications opens up a new frontier for developers and enthusiasts alike. By running these models locally, users have greater control over their privacy and data security while still tapping into the power of cutting-edge AI.

With support for open-source models like Mistral and Llama, users can leverage the power of Nvidia GPUs to run sophisticated large language models directly on their PCs. This local approach is a boon not only for privacy but also for performance, as it reduces the latency typically associated with cloud-based services. As users interact with these AI models, their feedback and modifications can contribute to the larger pool of knowledge, fostering a collaborative environment for improvement and growth.
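For readers who want a sense of what running an open-source model locally looks like outside of Nvidia's packaged demo, the sketch below uses the Hugging Face transformers library to load a Mistral checkpoint onto a local GPU. This is a generic illustration rather than Chat with RTX itself; the `mistralai/Mistral-7B-Instruct-v0.2` checkpoint is only an example, and fitting a 7B model into 8GB of VRAM generally calls for quantization rather than the plain half-precision load shown here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative only: any locally downloaded open-source checkpoint could be used.
model_id = "mistralai/Mistral-7B-Instruct-v0.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce VRAM use
    device_map="auto",          # place the weights on the local GPU
)

prompt = "Summarize the key points of my meeting notes."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Everything in this snippet runs on the user's own machine, which is the same privacy and latency argument that applies to Nvidia's demo.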

Closing Thoughts

Nvidia’s latest move with Chat with RTX is nothing short of a bold stride into a future where local AI processing becomes as commonplace as the graphics processing we’ve become accustomed to. The thought of models meticulously optimized for maximum performance on specific hardware is an attractive one. There’s something deeply satisfying about the economy of resources—no excess, no waste—just pure, streamlined efficiency. Nvidia’s understanding of this is clear; by refining their GPUs to tailor-fit the demands of large language models (LLMs), they’re maximizing the value that users get out of their hardware.

This optimization goes beyond sheer performance. It’s the realization that they don’t need to license their GPU architectures to third parties to make an impact in the AI space. Instead, they can be the direct LLM provider, leveraging their hardware expertise to craft a user-friendly AI ecosystem. The integration of desktop retrieval-augmented generation (RAG) is particularly exciting. Historically, local UIs have either overlooked this feature or tacked it on as an afterthought. Nvidia’s holistic approach indicates a keen understanding of what users want and need.

From a personal standpoint, I am thrilled by the user-friendly aspect of this release. The ability to easily feed in and query one's own datasets is often considered an advanced task, yet Nvidia appears to be making it beginner-friendly. For beginners looking to dip their toes into the world of customized AI, this approachability is a significant draw.

Nonetheless, it’s important to recognize that the concept of running a local model, even one powered by RAG, is not new for those with the right hardware. However, Nvidia distinguishes itself not through the novelty of the idea, but through the execution: delivering a seamless, accessible experience to a broad audience.
