
Operationalizing AI: Considerations for Choosing a Commercial vs. Open-Source LLM

The emergence of generative AI has created exciting new opportunities for innovation. Across our portfolio and beyond, leaders of growth-stage companies in all sectors are exploring ways to incorporate generative AI into both internal and customer-facing solutions. Most AI-powered features will rely on a Large Language Model (LLM) or a platform that incorporates an LLM, but the choice of LLM should be carefully considered. In this latest installment of our content series on Operationalizing AI, we aim to help companies identify whether a proprietary model or an open-source solution best suits their needs, highlighting three primary considerations: ease of use, data privacy and cost.

Ease of Use

Proprietary models are designed to be “plug-and-play” to minimize integration time and engineering resources. Developed and owned by for-profit companies, these models provide simple APIs that allow licensed users to send prompts and collect answers. OpenAI's GPT models are the most widely used proprietary LLMs, but Google, Anthropic, Cohere and Amazon have similar offerings. These APIs cover various use cases, including audio conversion, chat-based responses, document similarity calculation, fine-tuning a base model, image generation, and identifying objectionable content.
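
To make the “plug-and-play” point concrete, the sketch below shows how little code a basic proprietary API call requires, using OpenAI’s official Python SDK. The model name and prompts are illustrative placeholders, and an API key is assumed to be available in the environment.

```python
# Minimal sketch of a proprietary LLM call via OpenAI's Python SDK
# (pip install openai). Model name and prompts are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o",  # any licensed chat model can be substituted
    messages=[
        {"role": "system", "content": "You are a concise support assistant."},
        {"role": "user", "content": "Summarize our refund policy in two sentences."},
    ],
)
print(response.choices[0].message.content)
```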

Owners of these models continually enhance their products, and users of these proprietary LLMs naturally benefit from ongoing model improvements – often without having to change any software. In practice, however, the most significant improvements typically come with major version changes (e.g., moving from GPT-3.5 to GPT-4), which introduce different pricing and behaviors that may not be backwards compatible. Significant testing and financial impact modeling may be required before upgrading to the next major version of a proprietary LLM.
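
That upgrade testing can start simply: replay a fixed prompt set against the current and candidate model versions and compare the answers. The sketch below assumes the same OpenAI SDK as above; the prompts and model names are placeholders, and a real evaluation would score outputs against expected answers rather than print them.

```python
# Hypothetical pre-upgrade regression check: replay fixed prompts against
# the current and candidate model versions and compare their answers.
from openai import OpenAI

client = OpenAI()
PROMPTS = ["Summarize our refund policy.", "Draft a renewal reminder email."]

def run(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

for prompt in PROMPTS:
    current, candidate = run("gpt-3.5-turbo", prompt), run("gpt-4o", prompt)
    # In practice, score answers against rubrics or golden outputs and
    # flag behavioral drift for human review before switching versions.
    print(f"{prompt}\n  current:   {current[:80]}\n  candidate: {candidate[:80]}")
```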

As an alternative to proprietary models, an ever-increasing range of open-source LLMs is available, many of them derived from Meta’s Llama and Llama 2 models. These open-source LLMs are far from “plug-and-play,” however, and in our experience typically require engineering and data science expertise that growth-stage companies may lack. These models do not come with APIs; instead, users must download the model files and develop custom code to interact with the model. In addition, users must host the model on specialized hardware (GPUs), either on their own servers or in the cloud, and tune the model configuration to match available computing resources. Despite the extra effort, it is possible to run an open-source LLM with accuracy similar to that of the most advanced proprietary LLMs.
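
For contrast, the sketch below shows the rough shape of self-hosting an open-source model with the widely used Hugging Face transformers library. It assumes a GPU with sufficient memory, the accelerate package for device placement, and approved access to the gated Llama 2 weights; the model name is one illustrative choice among many.

```python
# Rough sketch of self-hosting an open-source LLM with Hugging Face
# transformers (pip install transformers torch accelerate). Assumes a GPU
# with enough memory and granted access to the gated meta-llama weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # illustrative choice of model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory use
    device_map="auto",          # place layers across available GPUs
)

inputs = tokenizer("Summarize our refund policy:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```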

Data Privacy

While proprietary LLMs are easier to use, they come with real data privacy concerns; it’s important to weigh these risks before choosing the path that’s right for your company. Sharing internal information with a third party for any purpose – AI-driven or otherwise – should always be considered risky. Although most commercial LLM owners have taken steps to clarify data ownership policies and terms of service for fine-tuning and prompts, there is no clear legal consensus on who owns generated answers and how those answers can be used. Prompt and training data submitted via APIs are stored by the model owner and potentially used for future training that benefits all customers. This creates a risk of data leaks from the third party that owns the LLM. For example, in one early incident, ChatGPT users reported seeing the chat histories of other users.

Ownership of the data used to train models is another important consideration. Proprietary model developers are engaged in an ever-growing list of licensing negotiations and copyright infringement cases with artists and content owners. The legal standard for model training data ownership remains a work in progress, and there is not yet clear indemnification from copyright violation liability for the end users of proprietary LLMs.

Open-source LLMs also carry risk related to copyrighted data, though in our estimation the third-party data sharing risk is much lower, because companies that host open-source models are fully in control of their own data sharing and storage practices. Although base model training is carried out by third parties (e.g., Llama is trained by Meta and includes anonymized data from Facebook, among many other sources), any prompt, fine-tuning or answer data is clearly owned and managed by open-source LLM users.

Cost

Cost represents a third – but essential – consideration to weigh when comparing LLMs. Proprietary LLMs can be expensive, so it’s important to model costs accurately when evaluating ROI. Pricing is based on the number of word segments, grouped as “tokens,” in the question (“input tokens”) and in the response (“output tokens”), so longer prompts and answers cost more. Fine-tuning a model for a particular use case adds further cost proportional to the number of tokens in the fine-tuning dataset. OpenAI’s GPT Assistants platform is one low-friction way to fine-tune a model, but data must be stored on OpenAI’s infrastructure, where storage costs are roughly 150x the market price of public cloud providers. Costs also depend on the model selected: newer high-performance models cost significantly more than older ones. For example, processing 1 million input tokens and 1 million output tokens costs $1.50 and $2.00, respectively, using the GPT-3.5 Turbo Instruct model, but increases to $5 and $15 using the most performant GPT-4o model.
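
Because pricing is per token, a simple spreadsheet-style model goes a long way. The sketch below encodes the per-million-token prices cited above (a snapshot; prices change frequently) and estimates monthly spend for an assumed request volume and prompt/answer size.

```python
# Back-of-the-envelope API cost model using the per-1M-token prices cited
# above. Prices change frequently; treat these constants as a snapshot.
PRICES_PER_1M = {                       # (input $, output $) per 1M tokens
    "gpt-3.5-turbo-instruct": (1.50, 2.00),
    "gpt-4o": (5.00, 15.00),
}

def monthly_cost(model: str, requests: int, in_tokens: int, out_tokens: int) -> float:
    """Estimate monthly spend for a given volume and prompt/answer size."""
    in_price, out_price = PRICES_PER_1M[model]
    return requests * (in_tokens * in_price + out_tokens * out_price) / 1_000_000

# e.g., 100,000 requests/month with ~500-token prompts and ~250-token answers
for model in PRICES_PER_1M:
    print(f"{model}: ${monthly_cost(model, 100_000, 500, 250):,.2f}/month")
```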

For high-request-volume use cases, hosting an open-source LLM is likely to cost less than integrating a proprietary solution, but LLM hosting costs are still higher than those of general purpose computing. The choice of platform or cloud provider will have the most significant cost impact, and the specialized servers required to run LLMs (equipped with both CPUs and GPUs) are more expensive than general purpose servers (CPUs only). For example, in the case of one GPU-powered AI hosting service, the costs per hour are roughly 5x, 8x and 16x higher than the on-demand hourly cost of the most expensive CPU-only server instances in AWS.
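
Whether self-hosting pays off depends on utilization. The sketch below works through a hypothetical break-even calculation; the GPU hourly rate and per-request API cost are assumptions to be replaced with real quotes, and it ignores engineering time and reserved or spot pricing.

```python
# Hypothetical break-even estimate: at what monthly request volume does a
# dedicated GPU server undercut per-token API pricing? All rates below are
# assumptions for illustration, not quotes from any provider.
GPU_HOURLY_RATE = 4.00          # assumed on-demand GPU instance rate ($/hr)
HOURS_PER_MONTH = 730           # a dedicated GPU box accrues cost even when idle
API_COST_PER_REQUEST = 0.00625  # e.g., 500 input / 250 output tokens at GPT-4o rates

gpu_monthly = GPU_HOURLY_RATE * HOURS_PER_MONTH
break_even = gpu_monthly / API_COST_PER_REQUEST
print(f"Self-hosting breaks even above ~{break_even:,.0f} requests/month")
```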

While it’s easy to get swept up in the hype of generative AI, it’s essential to model the ROI, productivity impacts, ongoing maintenance and data sharing risks before embarking on any new LLM-powered initiative. Once you’ve identified your product opportunity and determined that an LLM is the right solution, consider the long-term cost and product agility impacts of your chosen LLM approach and align them with the skills in your organization. For example, if you already operate a sophisticated SaaS software stack and have data science expertise on your team, the incremental complexity of hosting an open-source LLM may be much easier to absorb. If, on the other hand, you are looking to experiment quickly with LLM capabilities, proprietary APIs may be a good place to start – just be sure to remain vigilant in protecting your data, keep a close watch on costs, and build your solution with the flexibility to switch providers or tune your usage parameters if your LLM-powered solution gains momentum.
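
One lightweight way to preserve that flexibility is to route all LLM calls through a thin internal interface, so application code never depends on a single provider. The sketch below uses illustrative names and a placeholder self-hosted client; swapping providers then becomes a configuration change rather than a rewrite.

```python
# Illustrative provider-agnostic interface; class and method names are
# hypothetical, not part of any vendor SDK.
from typing import Protocol

class LLMClient(Protocol):
    def complete(self, prompt: str) -> str: ...

class OpenAIClient:
    """Adapter over OpenAI's SDK."""
    def complete(self, prompt: str) -> str:
        from openai import OpenAI
        resp = OpenAI().chat.completions.create(
            model="gpt-4o", messages=[{"role": "user", "content": prompt}]
        )
        return resp.choices[0].message.content

class SelfHostedClient:
    """Adapter over an internally hosted open-source model."""
    def complete(self, prompt: str) -> str:
        ...  # e.g., POST to your own Llama 2 inference endpoint

def summarize_for_customer(llm: LLMClient, document: str) -> str:
    # Application code depends only on the interface, so switching
    # providers later is a one-line configuration change.
    return llm.complete(f"Summarize for a customer: {document}")
```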

As of July 16, 2024. The content herein reflects the views of Summit Partners and is intended for executives and operators considering partnering with Summit Partners.
