Snowflake Cortex, a fully managed service that provides access to industry-leading large language models (LLMs), is now generally available. You can use these LLMs directly via LLM Functions in Cortex in select regions, bringing generative AI securely to your governed data. Your team can focus on building AI applications while we handle model optimization and GPU infrastructure to deliver cost-effective performance.
Here is the full set of updates released today as part of our mission to provide efficient, user-friendly and trusted generative AI:
The combination of these updates continues to unlock value across industries, with two use cases in particular:
As state-of-the-art models continue to advance, customers need flexibility to quickly and securely test and evaluate models to get the best results for their use case. This is why Snowflake Cortex is adding support for:
Using any of these models against your data is as simple as changing the model name in the COMPLETE function, which is available in both SQL and Python.
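As a minimal sketch of what that looks like in SQL (the model name, table and column below are illustrative; model availability varies by region):

```sql
-- Call an LLM directly from SQL; swapping models is just a string change.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'mistral-large',    -- example model name; substitute any supported model
    CONCAT('Summarize this customer review in one sentence: ', review_text)
) AS summary
FROM customer_reviews;  -- hypothetical table and column
```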
To accurately answer business questions using LLMs, companies must augment pretrained models with their data. Retrieval-Augmented Generation (RAG) is a popular solution to this problem, as it incorporates factual, real-time data into LLM generation.
Snowflake customers can now effortlessly test and evaluate RAG-oriented use cases, such as document chat experiences, with our fully integrated solution. Arctic embed is now available as an option in the Cortex EMBED function. The EMBED and vector distance functions, alongside VECTOR as a native Snowflake data type, are currently available in public preview, with general availability coming soon. With all of this natively built into the Snowflake platform, there is no need to set up, maintain and govern a separate vector store. This cohesive experience accelerates the path from idea to implementation and broadens the range of use cases organizations can support.
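A hedged sketch of how these pieces fit together for the retrieval step of RAG (the source table, column names and chunking are hypothetical; EMBED_TEXT_768 and VECTOR_COSINE_SIMILARITY are the current function names):

```sql
-- Embed document chunks once and store them in a native VECTOR column.
CREATE OR REPLACE TABLE doc_chunks AS
SELECT
    chunk_text,
    SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', chunk_text) AS embedding
FROM raw_doc_chunks;  -- hypothetical table of pre-chunked documents

-- At query time, retrieve the chunks most similar to the user's question.
SELECT chunk_text
FROM doc_chunks
ORDER BY VECTOR_COSINE_SIMILARITY(
    embedding,
    SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', 'How do I rotate my API keys?')
) DESC
LIMIT 5;
```

The retrieved chunks can then be passed into COMPLETE as context, all without leaving Snowflake.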
Ready to build your own document chatbot in Snowflake? Try this step-by-step quickstart.
We continue to develop more advanced and efficient retrieval so that enterprises can securely talk to their data, and we do so in a way that is open and collaborative to push the industry forward. With this approach in mind, we open-sourced Arctic embed, the world’s best practical text-embedding model for retrieval, and recently announced a partnership with the University of Waterloo to continue evolving retrieval benchmarks.
At Snowflake, we prioritize maintaining high safety standards for gen AI applications. As part of our ongoing partnership with Meta, the Llama Guard model is natively integrated into Snowflake Arctic, with availability expanding to other Cortex models soon, to proactively filter out potentially harmful content from LLM prompts and responses. Llama Guard is an instruction-tuned model, trained on a data set collected by Meta, that identifies a specific set of safety risks in LLM prompts and classifies the responses generated to those prompts. On existing benchmarks, it matches or exceeds the performance of currently available content moderation tools. Combined with Snowflake Arctic, Llama Guard minimizes objectionable content in your gen AI applications, ensuring a safer experience for all users.
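Because Llama Guard runs natively in the service, no extra code is required when using Snowflake Arctic. For illustration only, here is a hedged sketch of invoking a completion with content filtering explicitly enabled; treat the guardrails option as an assumption about how Cortex exposes this control rather than as the mechanism described above:

```sql
-- Illustrative sketch: request a completion with safety filtering enabled.
-- The 'guardrails' option is an assumption, not confirmed by this announcement.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'snowflake-arctic',
    [{'role': 'user', 'content': 'Tell me about your data retention policy.'}],
    {'guardrails': TRUE}
) AS safe_response;
```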
Data security is key to building production-grade generative AI applications. Snowflake is committed to industry-leading standards of data security and privacy to enable enterprise customers to protect their most valuable asset — the data — throughout its journey in the AI lifecycle, from ingestion to inference. This high security bar applies to all of Cortex: whether you use one of the task-specific functions, such as SUMMARIZE, or a foundation model from Snowflake, Mistral AI, Meta or any other provider, the following is always true:
You can find more details in our AI Trust and Safety FAQ and our AI Security Framework white paper.
Snowflake Cortex LLM functions incur compute cost based on the number of tokens processed. Refer to the consumption table for each function’s cost in credits per million tokens. The capability is available in select regions; refer to the region feature matrix for more details.
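Since billing is per token, you may want to estimate a workload’s size before running it. One hedged sketch, assuming the COUNT_TOKENS helper is available in your region (the table and column are hypothetical):

```sql
-- Estimate total input tokens for a batch of prompts before calling COMPLETE.
SELECT SUM(SNOWFLAKE.CORTEX.COUNT_TOKENS('mistral-large', prompt_text)) AS total_input_tokens
FROM prompts_to_run;  -- hypothetical table of prompts
```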
Snowflake Cortex enables organizations to expedite delivery of generative AI applications with LLMs while keeping their data in the Snowflake security and governance perimeter. Try it out for yourself!
Want to network with peers and learn from other industry and Snowflake experts about how to use the latest generative AI features? Join us at Snowflake Data Cloud Summit in San Francisco from June 3–6.