Australia’s modern slavery laws present an opportunity to stop artificial intelligence models from spreading misinformation, according to a new report that warns state actors and cyber criminals could “poison” AI by manipulating datasets.
The report from the Cybersecurity Cooperative Research Centre (CSCRC), titled ‘Poison the Well: AI, Data Integrity, and Emerging Cyber Threats’, also urges the government to adopt better oversight of datasets used to train AI, similar to the European Union’s forthcoming AI Act.
In Australia, the Modern Slavery Act requires firms earning more than $100 million a year to produce annual reports scrutinising supply chains for modern slavery risks, a regime the CSCRC believes could be leveraged to monitor AI development. The objective is to mitigate the risk of data poisoning and label poisoning.
“For example, captured entities that are using AI systems in their operations may be required to provide details of third-parties through which AI technologies are procured and produced. And for businesses developing their own AI systems, details of training data sets and their origin could be provided,” the report reads.
Data poisoning refers to the creation of websites, or editing of existing websites, to contaminate a dataset with malicious data when the internet is scraped for AI training data.
Meanwhile, label poisoning refers to the coercion of workers, predominantly in developing countries, who are tasked with labelling large language model training data, into deliberately mislabelling harmful data so that it is included in the final AI model. This might include labelling abuse material as “something harmless, like a tree, cat or bag”.
Only 0.01 per cent of the data in a training set needs to be manipulated for “targeted mistakes to be introduced in model behaviour”, according to a study supported by Google and ETH Zurich and referenced in the CSCRC paper, which describes how such a poisoning attack might occur.
However, data and label poisoning are emerging threats and there is no evidence of these attacks having occurred.
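The mislabelling mechanism described above can be sketched in miniature. The toy example below is illustrative only and is not drawn from the CSCRC report or the Google/ETH Zurich study: it uses a one-dimensional dataset, a simple nearest-centroid classifier standing in for a real model, and flips 5 per cent of labels (far more than the study's 0.01 per cent) so the effect is visible at this tiny scale.

```python
import random

# Hypothetical sketch of label poisoning: two classes of points and a
# nearest-centroid classifier. All names and values are invented.
random.seed(0)

data = [(random.gauss(0.0, 0.5), "safe") for _ in range(500)]
data += [(random.gauss(5.0, 0.5), "harmful") for _ in range(500)]

def centroids(samples):
    """Mean feature value per label."""
    sums, counts = {}, {}
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def classify(x, cents):
    """Assign x to the label with the nearest centroid."""
    return min(cents, key=lambda y: abs(x - cents[y]))

clean = centroids(data)

# The attack: mislabel 5% of the "harmful" points as "safe", as a
# coerced annotator might.
poisoned = list(data)
flipped = [i for i, (_, y) in enumerate(poisoned) if y == "harmful"][:50]
for i in flipped:
    x, _ = poisoned[i]
    poisoned[i] = (x, "safe")

dirty = centroids(poisoned)

# The "safe" centroid is dragged toward the harmful cluster, so a
# borderline harmful point now passes as safe.
print(classify(2.6, clean))   # harmful
print(classify(2.6, dirty))   # safe
```

Even this crude setup shows why a small fraction of corrupted labels matters: the poisoned points shift the model's learned boundary, changing how borderline inputs are treated without any visible change to the bulk of the training data.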
When asked whether the details of a training dataset revealed through the Modern Slavery Act would be sufficient to meet the CSCRC’s transparency objectives, chief executive Rachael Falk said “it is up to government to work with industry to co-design guidance related to AI and the Modern Slavery Act”.
The report notes that the AI supply chain is currently opaque because “most of the companies developing LLMs do not release detailed information about the data sets that have been used to build their models”.
“As recently highlighted by the UK’s National Cyber Security Centre (NCSC), AI models are only as good as the data they are trained on. And this is where things get blurry, because there is often a lack of transparency as to where training data comes from and, therefore, its integrity is a key issue for consideration,” the report reads.
The report recommends adopting a broad provision from the EU AI Act as a “pragmatic first step in achieving regulatory oversight”.
“Article 10 of the EU’s proposed AI Act provides a clear guide in relation to data set oversight that is being considered. In this context, AI data set transparency would be required in relation to ‘high risk’ AI models and may offer a good starting point for Australian regulators to work from,” Ms Falk told InnovationAus.com.
High-risk AI systems are those that “negatively affect safety or fundamental rights”, are used in products covered by the EU’s product safety legislation, or fall into one of eight specifically named areas, including biometric identification and education and vocational training.
Under the EU Act, high-risk AI models will be broadly subject to “appropriate data governance and management practices, be relevant, representative, free of errors and complete, and take into account, to the extent required by the intended purpose, the characteristics or elements that are particular to the specific geographical, behavioural or functional setting”.
Industry and Science Minister Ed Husic departed for the United Kingdom on Monday, where he will attend an AI Safety Summit hosted by UK Prime Minister Rishi Sunak.
Overnight, the G7 reportedly signed an agreement to establish a voluntary code of conduct for companies developing AI, which the CSCRC would back if implemented in Australia.
Mr Husic joins the talks on the global stage as the Department of Industry, Science and Resources considers regulatory reforms to support responsible AI in Australia.
Do you know more? Contact James Riley via Email.