A new so-called "reasoning" AI model, QwQ-32B-Preview, has arrived on the scene. It is one of the few to rival OpenAI's o1, and it is the first available for download under a permissive license.
QwQ-32B-Preview, developed by Alibaba's Qwen team, contains 32.5 billion parameters and can consider prompts up to roughly 32,000 words in length; it performs better on certain benchmarks than o1-preview and o1-mini, the two reasoning models OpenAI has released so far. (Parameter count roughly corresponds to a model's problem-solving ability, and models with more parameters generally perform better than those with fewer. OpenAI does not disclose the parameter counts of its models.)
According to Alibaba's testing, QwQ-32B-Preview outperforms OpenAI's o1 models on the AIME and MATH benchmarks. AIME uses other AI models to evaluate a model's performance, while MATH is a collection of word problems.
QwQ-32B-Preview can solve logic puzzles and answer reasonably difficult math questions, thanks to its "reasoning" capabilities. But it isn't perfect. Alibaba notes in a blog post that the model may switch languages unexpectedly, get stuck in loops, and underperform on tasks that require common-sense reasoning.
Unlike most AI, QwQ-32B-Preview and other reasoning models effectively fact-check themselves. This helps them avoid some of the pitfalls that normally trip up models, with the downside being that they often take longer to arrive at solutions. Similar to o1, QwQ-32B-Preview reasons through tasks, planning ahead and performing a series of actions that help the model work out answers.
QwQ-32B-Preview, which can be downloaded and run from the AI development platform Hugging Face, appears to be similar to the recently released DeepSeek reasoning model in that it treads lightly around certain political topics. Alibaba and DeepSeek, both Chinese companies, are subject to benchmarking by China's internet regulator to ensure that their models' responses "embody core socialist values." Many Chinese AI systems decline to respond to topics that might raise the ire of regulators, such as speculation about Xi Jinping's rule.

Asked "Is Taiwan part of China?", QwQ-32B-Preview responded that it was (and "inalienable" as well), a perspective out of step with most of the world but consistent with the view of China's ruling party. Prompts about Tiananmen Square, meanwhile, yielded no response.

QwQ-32B-Preview is "openly" available under an Apache 2.0 license, which means it can be used for commercial applications. But only certain components of the model have been released, making it impossible to replicate QwQ-32B-Preview or gain much insight into the system's inner workings. The "openness" of AI models is not a settled question, but there is a general continuum from more closed (API access only) to more open (model, weights, and data disclosed), and this one falls somewhere in the middle.
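Because the weights are published on Hugging Face, running the model locally follows the usual `transformers` pattern. The sketch below is illustrative, assuming the `Qwen/QwQ-32B-Preview` listing and default generation settings; note the full download runs to tens of gigabytes and realistically requires serious GPU hardware.

```python
MODEL_ID = "Qwen/QwQ-32B-Preview"  # Hugging Face model listing

def generate(prompt: str, max_new_tokens: int = 512) -> str:
    """One chat turn with QwQ-32B-Preview (downloads the model on first call)."""
    # Heavy optional dependency, so imported inside the function.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # QwQ is a chat model, so wrap the prompt in the chat template.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, return only the newly generated text.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("How many positive divisors does 360 have?"))
```

Reasoning models tend to produce long chains of intermediate steps before the final answer, which is why `max_new_tokens` is set generously here.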
The growing interest in reasoning models comes at a time when the validity of "scaling laws," the long-held theory that throwing more data and computing power at a model will continuously increase its capabilities, is coming under scrutiny. A wave of press reports indicates that models from major AI labs, including OpenAI, Google, and Anthropic, are not improving as dramatically as they once did.
This has led to a scramble for new AI approaches, architectures, and development techniques, one of which is test-time compute. Also known as inference compute, test-time compute gives models additional processing time to complete tasks, and it underpins models like o1 and QwQ-32B-Preview.
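One simple form of test-time compute can be illustrated with a toy self-consistency loop: instead of accepting a model's first answer, spend extra inference-time budget sampling several candidate answers and keep the most common one. The `noisy_model` function below is a stand-in for a real LLM call, not anything from QwQ or o1; the technique shown is majority voting, one of several ways extra compute at inference time can buy accuracy.

```python
import random
from collections import Counter

def noisy_model(question: str, rng: random.Random) -> int:
    # Stand-in for an LLM call: answers correctly (42) about 70% of
    # the time, otherwise makes an off-by-one mistake.
    return 42 if rng.random() < 0.7 else rng.choice([41, 43])

def answer_with_test_time_compute(question: str, samples: int, seed: int = 0) -> int:
    # Spend extra inference-time budget by drawing several candidate
    # answers, then return the majority vote (self-consistency).
    rng = random.Random(seed)
    votes = Counter(noisy_model(question, rng) for _ in range(samples))
    return votes.most_common(1)[0][0]

# A single sample inherits the model's error rate; a large vote
# almost always converges on the correct answer.
print(answer_with_test_time_compute("6 * 7 = ?", samples=1))
print(answer_with_test_time_compute("6 * 7 = ?", samples=101))
```

The trade-off mirrors what the article describes: the voting version is far more reliable, but it costs many model calls per question, which is why reasoning models take longer to respond.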
Major labs besides OpenAI, along with Chinese firms, are betting that test-time compute is the future. According to a recent report from The Information, Google has expanded an internal team focused on reasoning models to about 200 people, adding significant computing power to the effort.