The Biggest AI Moment in the History of Cybercrime Just Happened

By Raef Meeuwisse, CISM, CISA, author of Artificial Intelligence for Beginners
Published: March 24, 2023

A monumental development has just taken place in the AI realm, and if you work in cybersecurity, you will quickly grasp its implications.

My own expertise stems primarily from cybersecurity, but I have spent the past few years buried in various strands of research, including tooling up to understand AI.

Many of you will also have a growing awareness of AI, and most have by now at least sampled ChatGPT, an awesome text-based chatbot AI capable of real-time interaction and boasting deep knowledge across more disciplines than any human in history. This is already creating problems for cybersecurity, as various functions of this AI can be called via APIs (Application Programming Interfaces) at very low cost.

The misuse of ChatGPT and other commercial AI platforms is at least restricted by policies – but that still requires any unethical or illegal use to be detected. These AIs will not intentionally set out to break any laws but those without any ethics are busy finding ways to circumvent the rules.

Let me cut to the endgame here before I explain how it happened. Imagine what would happen if cybercriminals could get an AI as capable as ChatGPT 3.5 or 4.0 and, instead of needing a massive data center, be able to run a wholly independent instance on a standalone machine – where they decide what rules or policies it abides by.

It is technically illegal for cybercriminals to reuse this work, but through the efforts of several parties, it has proven possible to take an AI model with the power of ChatGPT 3.5 (an AI that requires a massive data center just to run its basic functions) and create a much smaller and more efficient version that has been able (in the small number of tests conducted so far) to outperform it.

Here is how it happened:

We have long been warned that once AI arrived, its development would be exponential.

A Stanford-based research team was able to take just 175 manually created tasks (self-instruct seed tasks) and, using these in combination with an API connection to ChatGPT 3.5 (the davinci version, for those interested), enter a cycle of automated generation until they reached a sample size of 52,000 conversations.
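The bootstrapping loop described above can be sketched in a few lines. This is a toy illustration only: `generate_variant` is a hypothetical stand-in for the real API completion call, and the numbers are scaled down from 175 seeds and 52,000 outputs.

```python
import random

# Hypothetical stand-in for a completion request to the ChatGPT API; in the
# real pipeline, existing tasks are sent as prompts and new tasks come back.
def generate_variant(task: str, seed: int) -> str:
    random.seed(seed)
    return f"{task} (variant {random.randint(1000, 9999)})"

def self_instruct(seed_tasks: list[str], target_size: int) -> list[str]:
    """Grow a small pool of seed tasks into a large instruction dataset
    by repeatedly prompting a strong model with existing examples."""
    pool = list(seed_tasks)
    step = 0
    while len(pool) < target_size:
        prompt_task = pool[step % len(pool)]   # pick an existing task
        pool.append(generate_variant(prompt_task, step))
        step += 1
    return pool

# Toy scale: 5 seeds grown to 50 examples (the real run was 175 -> 52,000).
seeds = [f"seed task {i}" for i in range(5)]
dataset = self_instruct(seeds, 50)
print(len(dataset))  # 50
```

The key point is the flywheel: each generated example can itself seed further generations, which is why the cost per sample collapses.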

They then fed these samples into a separate AI (Meta's LLaMA 7B) and fine-tuned it. At this point, the model was able to compete effectively with the original, and the derivative AI still required some hefty cloud computing (but a fraction of what GPT would run on).
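Fine-tuning itself is conceptually simple: start from pretrained weights and take gradient steps on the new samples. A minimal sketch with a toy linear model follows – the dimensions are made up, standing in for LLaMA's billions of parameters, and the 32 samples stand in for the 52,000 conversations.

```python
import numpy as np

rng = np.random.default_rng(0)

W_pretrained = rng.normal(size=(4, 4))   # "pretrained" starting weights
X = rng.normal(size=(32, 4))             # the new training inputs, in miniature
Y = X @ rng.normal(size=(4, 4))          # the desired new behaviour

def mse(W):
    """Mean squared error of the model X @ W against the targets Y."""
    return float(np.mean((X @ W - Y) ** 2))

W = W_pretrained.copy()
lr = 0.05
for _ in range(200):                      # a short fine-tuning run
    grad = 2 * X.T @ (X @ W - Y) / len(X)  # gradient of the MSE loss
    W -= lr * grad

print(mse(W_pretrained) > mse(W))  # True: the tuned weights fit the new data better
```

The expensive part is not the algorithm but the scale; the Stanford team's insight was that with good enough data, surprisingly little compute is needed.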

The execution of the processes above was measured in hours.

It is worth noting that these tasks were only permitted for research purposes as various terms and conditions at OpenAI prohibit the use of outputs from GPT to create rival models.

With such an achievement out in the open, the researchers made all the key data available. They named the resulting chatbot model Alpaca 7B.

Excited by the possibilities of this result, further parties have worked to see just how much further the model could be compressed. The process used is called LoRA (which stands for Low-Rank Adaptation), and what it seeks to do is perform dimensionality reduction wherever it can: eliminating redundant features, simplifying identifying characteristics and, in many cases, reducing vast grids of multi-dimensional formulae to single numbers.
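The core low-rank trick can be shown in a few lines of NumPy. This is a generic sketch of the idea, not the actual Alpaca-LoRA code; the dimensions and names are illustrative. Instead of storing a full d × d update to a weight matrix, LoRA stores two thin factors B (d × r) and A (r × d) whose product approximates the update.

```python
import numpy as np

d, r = 1024, 8                       # full dimension vs. the chosen low rank
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))          # frozen pretrained weight matrix
A = rng.normal(size=(r, d)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                 # B starts at zero, so the update starts at zero

def adapted_forward(x):
    """Forward pass with the low-rank update folded in: (W + B @ A) @ x."""
    return W @ x + B @ (A @ x)

full_update_params = d * d           # parameters in a full update
lora_params = d * r + r * d          # parameters in the two thin factors
print(full_update_params // lora_params)  # 64x fewer trainable parameters
```

Only A and B are trained; W stays frozen, which is why the adapter is small enough to ship, and in this example the trainable parameter count drops by a factor of 64.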


What this compression has managed to do is get the model so small that it can reportedly run on something as small as a Raspberry Pi (as the disclaimer says, for research purposes only).

http://twitter.com/_akhaliq/status/1636421203626737686

Alpaca-LoRA 7B tweet

Although questions arise about just how far the compression can go, and what dependencies it may continue to have in the very short term, the implication of this overall event in the context of cybersecurity is huge.

It is evidence that the theft and repurposing of vastly powerful AI models is (as of right now) not only within the reach of cybercriminals, but able to work from very small and inexpensive hardware.

This means that, as an industry, we can forget about relying exclusively on the policies and controls of large AI companies to prevent the malicious use of AI. Savvy cybercriminals everywhere are now able to steal and repurpose AI in ways that, until a few weeks ago, we thought might have been prevented by the sheer scale and cost of the computational resources required.

Strap in and start locking down your systems, because in 2023 it is crucial to strengthen our digital defenses and prepare for the latest AI-driven challenges in cybersecurity.

Update: Since writing this blog post, further testing and use of Alpaca 7B have revealed that it did not continue to outperform ChatGPT and was prone to “hallucination” – the term affectionately applied by AI people to the feature (bug) where an AI may fill gaps in its knowledge by confidently making things up. That does not negate the importance of this moment, or the significant step forward it represents in the ability to create powerful AI on very small computing instances.

Author's note: “Artificial Intelligence for Beginners” is out in paperback, with hardcover and ebook editions on sale from May 22, 2023: http://www.amazon.com/dp/B0BZ58JHGD