EU's AI Act: A Journey from Open Source Tech to High-Stakes Policy
Christopher Detzel: Well, my son wants to watch Mario Brothers, and I think we might go see that. He's 12. So, it's very cool.
Christopher Detzel: Earlier this week, you sent me an article on the EU AI Act targeting US open source software. It's intriguing, especially as it examines large language models. We could discuss the open source aspect in a bit.
Christopher Detzel: Maybe, Michael, you could talk a bit about the implications of this act for open source large language models.
Michael Burke: The reason the EU has put this act together is to regulate the use and deployment of artificial intelligence within the EU. This has become especially relevant with large language models, given the EU's emphasis on protecting privacy and transparency. This new wave of technology adds another layer of complexity that we must navigate globally.
Christopher Detzel: How does privacy factor in when it comes to large language models?
Michael Burke: When you're inputting information into these platforms, they're saving that information. Also, when the models are trained, they're trained off of the internet. This is where concerns arise for organizations and governments like the EU. There's worry about where the data is coming from, how accurate the information being fed into the machine is, and what the implications could be if left unmonitored. Citizens adopting and using these technologies may face serious ramifications.
Christopher Detzel: So, are open-source large language models exempt from this act?
Michael Burke: Essentially, the act says that these companies have to restrict access. They can't provide open access to their models; they have to register through a process, provide source information, and details about the use cases. This will significantly disrupt the current business models of these companies, particularly those heavily reliant on API-based interactions with AI systems. Open-source large language models aren't exempt from this. That's concerning because it means that if I open-source a large language model as a research project, I could be held liable if it's used within the EU.
Christopher Detzel: That does sound quite alarming, indeed. It seems like it could seriously hinder online publication and sharing of information and innovation.
Michael Burke: Take LoRA as an example. LoRA stands for low-rank adaptation of large language models, and if it were to be banned, the impact would be significant. LoRA is a technique used to gradually add new information and capabilities to an existing model. Consider some of these large language models, which cost over a billion dollars to train due to the sheer volume of information and billions of parameters involved.
For your average researcher, or even a university, undertaking that level of training on their own is unaffordable. Instead, they use techniques like LoRA to adapt these pre-existing models to new tasks and data, as sketched in code below.
If we were to ban the use of these types of open-source models, we would in turn be eliminating a lot of opportunities for individual learning and exploration of these new technologies. It would prevent people from running these kinds of experiments, which are significantly cheaper, safer, and easier to check than attempting to train entire models from scratch.
In my opinion, there's a significant misalignment between the proposed act's intent and its actual ramifications. It's not just LoRA; other similar tools and implementations will face substantial challenges as well, hindering innovation going forward.
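To make the LoRA idea concrete, here is a minimal sketch using Hugging Face's peft and transformers libraries, assuming those packages are installed. The base model name ("facebook/opt-350m") and the hyperparameters are illustrative assumptions rather than anything from the episode; the point is that only a small set of low-rank adapter weights is trained while the expensive pretrained weights stay frozen.

```python
# Minimal LoRA fine-tuning sketch with Hugging Face's peft library.
# Assumptions: transformers and peft are installed; "facebook/opt-350m"
# stands in for a much larger base model.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# LoRA freezes the original weights and learns small low-rank update
# matrices on selected layers, so only a tiny fraction of parameters trains.
config = LoraConfig(
    r=8,                                   # rank of the low-rank updates
    lora_alpha=16,                         # scaling applied to the updates
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

Because only the adapter weights are trained, an experiment like this can run on a single modest GPU, which is exactly the kind of low-cost exploration being described here.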
Christopher Detzel: Can you talk about what this ban would mean for companies like OpenAI, Amazon, Google, IBM, and the like? I presume it would affect them because people in Europe are likely using their technologies.
Michael Burke: Indeed, this could in some ways benefit these larger companies, as it would mean they would be the only ones with access to this technology. If they had to pull back their open-source technology in some way, I'm unsure how that would work. But if this ban were to be put in place, the use of these AI models would likely decrease significantly, thus slowing the pace of AI innovation. It could disrupt many business models for emerging tech companies that are building around large language models and generative AI, creating substantial barriers to entry for smaller development firms and startups.
Christopher Detzel: Do you see this affecting companies that have developed their own chat interfaces using APIs from entities like OpenAI and others to build their models and use them within their technologies?
Michael Burke: Yes, absolutely, it could stunt them entirely. There will likely be workarounds, as with every EU regulation, perhaps another clause in the terms of service that users have to accept, stating they're not based in the EU. But the potential for this law to lead us down a path of future issues is a significant concern.
When you start banning open-source technology because of the threat of misuse, and holding the open-source originator liable, it feels like a total misalignment. For example, if you wrote a paper about jumping off a bridge and someone decides to jump off a bridge, should the author be held responsible? That's the sort of analogy we're looking at here. And that's not even touching on future technologies that could also be treated in this manner.
Christopher Detzel: What do you think will be some of the penalties for these companies that make unlicensed models available in the EU?
Michael Burke: I know the fines are quite large: up to 20 million euros or 4% of worldwide annual revenue, whichever is higher. It's not just about where this actually gets enforced; it's about passing a law that targets the open-source community. There have to be better ways to enforce accountability, authenticity, and traceability with these large language models than simply going after the end user. It's a grossly oversimplified approach.
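As a rough illustration of how that penalty scales, here is a toy calculation assuming the "whichever is higher" structure used in the draft text (and in GDPR); the revenue figure is hypothetical.

```python
# Toy illustration: maximum fine is 20 million euros or 4% of worldwide
# annual revenue, whichever is higher (per the draft's penalty structure).
def max_fine_eur(worldwide_annual_revenue_eur: float) -> float:
    return max(20_000_000.0, 0.04 * worldwide_annual_revenue_eur)

# A hypothetical company with 10 billion euros in annual revenue:
print(f"{max_fine_eur(10_000_000_000):,.0f} EUR")  # 400,000,000 EUR
```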
Michael Burke: It's also interesting that the Act contains a lot of ambiguous wording, like classifying "high-risk" AI projects by their anticipated functionality. The ambiguity extends to what's required to comply, such as registration and expansive risk testing. There's also talk of third-party assessments by various EU member states, but the details are unclear.
Christopher Detzel: Is it worth it for organizations to invest in this type of process or should they keep these matters outside of the EU?
Michael Burke: There are decisions large organizations have to make, like whether or not to keep their operations outside of the EU. If they stay, every version of their product will need to undergo the same rigorous compliance process, which could slow releases given the traditionally slow pace of government processes.
Christopher Detzel: How do you think restrictions on API uses potentially limit innovation, particularly for startups and developers outside of the EU?
Michael Burke: Like with GDPR, some companies may choose to differentiate between EU and non-EU users. Others might decide to adhere strictly to the EU regulations, potentially ceasing to publish open source large language models. This could slow down innovation and make the ecosystem more difficult to navigate. Educational institutions will also be impacted, as they often involve international collaboration.
Christopher Detzel: What are the implications of the liability clauses in the Act for open source developers and distributors of AI software?
Michael Burke: Open source developers can be held liable even if they are not in the EU. If they release a model and somebody else misuses it, they can be held responsible.
Christopher Detzel: How do you feel about the Act and how do you think it should be approached?
Michael Burke: It's essential to focus on how the model is being used and restrict certain areas that could potentially cause bodily harm. Building better guidance for the use of such technologies is crucial. We should think about increasing transparency: documenting source systems, the data that models are trained on, and who's responsible for updates and changes. This is particularly relevant when critical decisions about people's lives are being made based on these models.
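As one concrete illustration of what documenting source systems and training data could look like in practice, here is a minimal sketch of a machine-readable model record. The field names, the model name, and every value are hypothetical; nothing here is prescribed by the Act or mentioned in the conversation.

```python
# A hypothetical, minimal "model record" capturing provenance and responsibility,
# meant to be published alongside the model weights so provenance travels with them.
import json
from dataclasses import dataclass, asdict
from typing import List

@dataclass
class ModelRecord:
    model_name: str
    version: str
    training_data_sources: List[str]   # where the training data came from
    intended_use: str                  # what the model is meant for
    prohibited_uses: List[str]         # uses the publisher disallows
    maintainer: str                    # who is responsible for updates and changes
    last_updated: str

record = ModelRecord(
    model_name="example-research-llm",
    version="0.3.1",
    training_data_sources=["CommonCrawl snapshot (date)", "internal FAQ corpus"],
    intended_use="Research on summarization; not for decisions about individuals",
    prohibited_uses=["credit scoring", "medical triage"],
    maintainer="research-team@example.org",
    last_updated="2023-06-01",
)

print(json.dumps(asdict(record), indent=2))
```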