Nvidia is introducing a new tool that enables owners of GeForce RTX 30 Series and 40 Series cards to run an AI-powered chatbot offline on their Windows PC.
Named Chat with RTX, the tool lets users personalize a GenAI model in the mold of OpenAI’s ChatGPT by connecting it to their documents, files, and notes, which the model can then draw on to answer questions.
Nvidia explains, “Rather than searching through notes or saved content, users can simply type queries. For example, one could ask, ‘What was the restaurant my partner recommended while in Las Vegas?’ and Chat with RTX will scan local files and provide the answer with context.”
While Chat with RTX ships with support for Meta’s Llama 2 and Mistral’s open-source model, users should keep in mind that downloading all the necessary files takes up 50GB to 100GB of storage, depending on the model(s) selected.
The tool currently works with text, PDF, .doc, .docx, and .xml formats; pointing Chat with RTX at a designated folder loads any supported files into the model’s fine-tuning dataset. The tool can also load transcriptions of the videos in a YouTube playlist when given the playlist’s URL, so their contents can be queried as well (both patterns are sketched below).
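Nvidia hasn’t published the internals, but the point-at-a-folder workflow amounts to retrieval over local files. Below is a minimal, hypothetical Python sketch of that pattern, not Nvidia’s code: it treats every supported file as plain text and ranks documents by keyword overlap, where a real tool would parse PDF and Office files properly and use embedding-based retrieval.

```python
from pathlib import Path

# File extensions Chat with RTX reportedly supports. Reading them all as
# plain text is a simplification; .doc/.docx and PDF files need dedicated
# parsers (e.g. python-docx, pypdf) in practice.
SUPPORTED = {".txt", ".pdf", ".doc", ".docx", ".xml"}

def collect_documents(folder: str) -> dict[str, str]:
    """Gather supported files under `folder`, keyed by path."""
    docs = {}
    for path in Path(folder).rglob("*"):
        if path.is_file() and path.suffix.lower() in SUPPORTED:
            docs[str(path)] = path.read_text(encoding="utf-8", errors="ignore")
    return docs

def naive_search(docs: dict[str, str], query: str) -> list[str]:
    """Rank documents by how often query words appear in them --
    a crude stand-in for the embedding-based retrieval a real tool uses."""
    words = query.lower().split()
    scored = [(sum(doc.lower().count(w) for w in words), path)
              for path, doc in docs.items()]
    return [path for score, path in sorted(scored, reverse=True) if score > 0]

if __name__ == "__main__":
    docs = collect_documents("./my_notes")  # hypothetical folder
    for hit in naive_search(docs, "restaurant Las Vegas"):
        print(hit)
```

The playlist feature can be approximated with third-party libraries. The sketch below assumes the pytube and youtube-transcript-api packages (whose APIs vary between versions) and says nothing about how Nvidia actually fetches transcriptions.

```python
from urllib.parse import parse_qs, urlparse

from pytube import Playlist
from youtube_transcript_api import YouTubeTranscriptApi

def playlist_transcripts(playlist_url: str) -> dict[str, str]:
    """Return {video_id: transcript text} for each video in a playlist."""
    transcripts = {}
    for video_url in Playlist(playlist_url).video_urls:
        video_id = parse_qs(urlparse(video_url).query)["v"][0]
        try:
            segments = YouTubeTranscriptApi.get_transcript(video_id)
        except Exception:
            continue  # skip videos without captions
        transcripts[video_id] = " ".join(seg["text"] for seg in segments)
    return transcripts
```

Once fetched, the transcripts can be dropped into the same document index as the local files and queried the same way.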
There are restrictions, however, which Nvidia outlines in a how-to guide. Chief among them: Chat with RTX can’t remember context, so it won’t take previous questions into account when answering follow-ups. Asking “What’s a common bird in North America?” and then “What are its colors?” won’t work, because the app has no way of knowing that “its” refers to the bird.
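The practical consequence is that every question has to stand on its own. Chat front ends that do keep context typically just resend the running transcript with each request. Here is a minimal sketch of that difference, with a hypothetical generate() stub standing in for any local model call:

```python
# Hypothetical stub standing in for a call to a local LLM backend.
def generate(prompt: str) -> str:
    return f"[model output for a {len(prompt)}-character prompt]"

history: list[str] = []

def ask_stateless(question: str) -> str:
    # A stateless app: the model sees only the latest question, so a
    # follow-up like "What are its colors?" has nothing to resolve against.
    return generate(question)

def ask_with_memory(question: str) -> str:
    # A context-aware app: earlier turns are prepended to every request,
    # trading a longer prompt for conversational continuity.
    history.append(f"User: {question}")
    answer = generate("\n".join(history) + "\nAssistant:")
    history.append(f"Assistant: {answer}")
    return answer
```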
Nvidia also notes that the relevance of the app’s responses can be affected by a range of factors, including how a question is phrased and the performance of the selected model. In other words, Chat with RTX is more a toy than a production tool.
Nonetheless, the World Economic Forum expects a significant increase in affordable devices capable of running GenAI models offline. Offline models promise stronger data privacy, lower latency, and better cost efficiency than cloud-hosted alternatives.
Democratizing the tools to run, and even train, models carries risks, but proponents argue that the benefits outweigh the harms. Time will tell whether they’re right.