Amazon Web Services (AWS), Amazon’s cloud computing division, is launching a new tool to combat hallucinations — that is, scenarios in which an AI model generates unreliable or false responses.
Announced at the AWS re:Invent 2024 conference in Las Vegas, the service, Automated Reasoning checks, validates a model’s responses by cross-referencing them against customer-supplied information for accuracy. (Yes, “checks” is lowercase.) AWS claims in a press release that Automated Reasoning checks is the “first” and “only” safeguard against hallucinations.
But that’s, well… a generous claim.
Automated Reasoning checks is nearly identical to a feature Microsoft rolled out this summer, which also flags AI-generated text that may be factually wrong. Google, too, offers a tool in Vertex AI, its AI development platform, that lets customers “ground” models using data from third-party providers, their own datasets, or Google Search.
Here’s how Automated Reasoning checks works: available through AWS’s Bedrock model hosting service (specifically the Guardrails tool), it attempts to figure out how a model arrived at an answer — and determine whether that answer is correct. Customers upload information to establish a ground truth of sorts, and Automated Reasoning checks creates rules that can then be refined and applied to a model.
As a model generates responses, Automated Reasoning checks verifies them and, in the event of a probable hallucination, draws on the ground truth to arrive at the correct answer. It presents this answer alongside the likely error so customers can see how far off base the model may have been.
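The cross-referencing pattern AWS describes can be illustrated with a minimal sketch. Everything below — the dictionary of facts, the `check_response` function, and the claim format — is hypothetical and is not the Bedrock Guardrails API; it only shows the general idea of validating a model’s claims against a customer-supplied ground truth.

```python
# Hypothetical sketch of a ground-truth check (NOT the actual AWS API):
# customer-supplied facts act as simple rules, and claims extracted from
# a model's response are validated against them.

GROUND_TRUTH = {
    "refund_window_days": 30,
    "support_email": "help@example.com",
}

def check_response(claimed: dict) -> list[str]:
    """Compare claims from a model response against the ground truth.

    Returns a list of discrepancies, each pairing the model's claim with
    the correct value, so a reviewer can see how far off the model was.
    """
    discrepancies = []
    for key, claimed_value in claimed.items():
        expected = GROUND_TRUTH.get(key)
        if expected is not None and claimed_value != expected:
            discrepancies.append(
                f"{key}: model said {claimed_value!r}, ground truth is {expected!r}"
            )
    return discrepancies

# A response claiming a 60-day refund window would be flagged:
issues = check_response({"refund_window_days": 60})
```

The real service reportedly does far more — deriving logical rules from the uploaded material rather than matching literal values — but the output shape is similar: the flagged claim next to the correct answer.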
AWS says PwC is already using Automated Reasoning checks to design AI assistants for its clients. And Swami Sivasubramanian, VP of AI and data at AWS, suggested that this type of tooling is exactly what’s drawing customers to Bedrock.
“With the launch of these new capabilities, we are innovating on behalf of customers to solve some of the most important challenges the entire industry faces when moving generative AI applications into production,” he said in a statement. Sivasubramanian added that Bedrock’s customer base has grown 4.7-fold in the past year to reach tens of thousands of customers.
But as one expert told me this summer, trying to eliminate hallucinations from generative AI is like trying to remove hydrogen from water.
AI models hallucinate because they don’t actually “know” anything. They are statistical systems that identify patterns in a series of data and predict which data comes next based on previously seen examples. It follows that a model’s responses are not answers, but predictions of how a question should be answered — within a margin of error.
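A toy model makes the point concrete. The bigram “language model” below (purely illustrative, nothing like a production LLM) only counts which word followed which in its training data and emits the most frequent continuation — a prediction about what usually comes next, with no notion of truth:

```python
# Toy illustration: a bigram "model" that predicts the next word purely
# from frequency counts in its training data. It has no concept of facts,
# only of what typically followed what -- which is why purely statistical
# prediction can produce confident but wrong output.

from collections import Counter, defaultdict

def train(corpus: list[str]) -> dict:
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for a, b in zip(words, words[1:]):
            follows[a][b] += 1
    return follows

def predict_next(follows: dict, word: str) -> str:
    # Returns the statistically most likely next word -- a prediction,
    # not a fact.
    return follows[word].most_common(1)[0][0]

model = train([
    "the capital of france is paris",
    "the capital of spain is madrid",
    "the capital of france is paris",
])
# Asked what follows "is", the model always says "paris" -- even in a
# sentence about Spain -- because that's the more frequent pattern.
```

Real models are incomparably more sophisticated, but the failure mode is the same in kind: the output is the most plausible continuation, not a verified claim.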
AWS claims that its automated inference checks use “logically rigorous” and “verifiable” reasoning to reach their conclusions. But the company did not volunteer any data proving the tool’s reliability.
In other Bedrock news, AWS this morning announced Model Distillation, a tool for transferring the capabilities of a large model (such as Llama 405B) to a small model (such as Llama 8B) that is cheaper and faster to run. An answer to Microsoft’s distillation tooling in Azure AI Foundry, Model Distillation, AWS says, provides a way to experiment with different models without spending a lot of money.
“After the customer provides sample prompts, Amazon Bedrock will do all the work to generate responses and fine-tune the smaller model, and it can even create more sample data, if necessary, to complete the distillation process,” AWS explained in a blog post.
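The workflow AWS describes — sample prompts in, teacher responses out, student fine-tuned on the pairs — can be sketched as follows. The function names here are hypothetical stand-ins, not Bedrock API calls; only the data flow matches the description above.

```python
# Hedged sketch of the distillation loop described in the blog post
# (function names are hypothetical, NOT Bedrock APIs): a large "teacher"
# model answers the customer's sample prompts, and those prompt/response
# pairs become the fine-tuning dataset for a smaller "student" model.

def teacher_answer(prompt: str) -> str:
    # Stand-in for a call to the large model via some inference API.
    return f"teacher answer to: {prompt}"

def build_distillation_dataset(sample_prompts: list[str]) -> list[tuple[str, str]]:
    """Generate (prompt, response) training pairs from the teacher."""
    return [(p, teacher_answer(p)) for p in sample_prompts]

dataset = build_distillation_dataset([
    "What is Amazon Bedrock?",
    "Summarize this quarterly report.",
])
# `dataset` would then be fed to a fine-tuning job for the smaller model;
# per AWS, the service can also synthesize extra sample data if the
# customer-provided prompts aren't enough.
```

The claimed benefit is that the student inherits most of the teacher’s behavior on the target task at a fraction of the inference cost.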
But there are some caveats.
For now, Model Distillation only works with Bedrock-hosted models from Anthropic and Meta. Customers must choose a large and a small model from the same model “family” — the models cannot come from different providers. And distilled models will lose some accuracy: “less than 2%,” AWS claims.
If none of that is a dealbreaker, Model Distillation is now available in preview, along with Automated Reasoning checks.
Also in preview is multi-agent collaboration, a new Bedrock feature that lets customers assign AIs to subtasks in a larger project. Part of Bedrock Agents, AWS’s contribution to the AI agent craze, multi-agent collaboration provides tools to create and tune AIs for things like reviewing financial records and assessing global trends.
Customers can also designate a “supervisor agent” that automatically breaks down tasks and routes them to the right AIs. The supervisor can “(give) specific agents access to the information they need to complete their work,” AWS says, and “(determine) which actions can be processed in parallel and which need details from other tasks before (an agent) can move forward.”
“Once all of the specialized AIs have completed their inputs, the supervisor agent can (pull) the information together (and) compile the results,” AWS wrote in the post.
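The supervisor pattern described above can be sketched in a few lines. The agent registry and routing logic here are invented for illustration — this is not the Bedrock Agents API, just the break-down/route/compile shape AWS is describing.

```python
# Illustrative sketch of the supervisor-agent pattern (agent names and
# routing are hypothetical, NOT the Bedrock Agents API): a supervisor
# routes each subtask to a specialist agent, then compiles the results.

SPECIALISTS = {
    "financials": lambda task: f"reviewed records for {task}",
    "trends": lambda task: f"assessed global trends for {task}",
}

def supervisor(project: str, subtasks: dict[str, str]) -> str:
    """Route each subtask to its specialist agent and compile the outputs."""
    results = []
    for specialty, task in subtasks.items():
        agent = SPECIALISTS[specialty]   # route to the right specialist
        results.append(agent(task))      # independent subtasks could run in parallel
    return f"{project}: " + "; ".join(results)

report = supervisor("Q3 review", {"financials": "Q3", "trends": "Q3"})
```

The real feature adds the parts that are hard — deciding which subtasks can run in parallel, which depend on each other’s outputs, and what information each agent is allowed to see.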
It all sounds elegant on paper. But as with each of these launches, we’ll have to see how well it works once deployed in the real world.