Exploring the Frontier of Effective &
Trustworthy General-Purpose AI
Highlights from the European Big Data Value Forum 2023
Introduction
The European Big Data Value Forum (EBDVF) is an annual event that brings together professionals, researchers, and industry leaders in the field of data science and big data analytics.
One of this year's focal points was ADR (AI, Data, Robotics): its long-term goal of converging the three areas, the continuous analysis of trends, gaps and significant overlaps and interactions among these existing communities, and the considerable danger of favouring convergence at the expense of the individual areas' own development, where significant technological advances are still needed to bring novel products to market.
The key recurring theme was how we can, realistically and quickly, implement Effective and Trustworthy General-Purpose AI. Can we effectively balance the appetite for, and capacity of, innovation against ethical and legal regulatory requirements? This is too complicated a question to answer with a simple "yes" or "no"; what we can say with certainty is that when AI adoption rests on governance and compliance, with the former informing the latter, as two of its pillars, that balance becomes considerably easier to calibrate.
In terms of compliance, this is achieved by continuously building on trust, privacy and security, in a manner informed by both science and regulation. Given how fast progress is, and how vast the range of real-world scenarios impacted by it, multiple speakers clearly stated that we need to build security beyond compliance, sooner rather than later.
Let’s explore some of the focal points in a bit more detail below.
(Very) Wide Cross-Industry AI Adoption:
Public Administration, Smart Manufacturing, Cities, Healthcare, Agri-food and Energy Supply are just a few of the industries on the journey of adopting AI in their everyday business (or, in some cases, research) activities. AutoML, an area driven to a large degree by European researchers, focuses on continuously improving the efficiency of AI ecosystems by configuring machine learning pipelines, selecting appropriate model architectures and performing systematic hyper-parameter optimisation.
One of its "big picture" targets is enabling organizations that lack deep technical knowledge and significant hands-on experience to take advantage of the latest AI advances. Current work focuses on developing approaches flexible enough to scale up without requiring extensive resources, so as to cope with pipelines and models that keep growing in size and complexity.
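As a small illustration of the kind of systematic hyper-parameter optimisation that AutoML tools automate, here is a minimal sketch, assuming scikit-learn is available; the estimator, parameter grid and dataset are purely illustrative and not tied to any specific EBDVF project.

```python
# Minimal sketch of automated hyper-parameter search (one building block of AutoML).
# Assumes scikit-learn is available; the estimator and grid are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Candidate configurations: the search replaces manual trial-and-error tuning.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,                  # 5-fold cross-validation scores each configuration
    scoring="accuracy",
)
search.fit(X_train, y_train)

print("Best configuration:", search.best_params_)
print("Held-out accuracy:", search.best_estimator_.score(X_test, y_test))
```

Fully fledged AutoML systems go further, searching over model families and whole pipelines rather than a single fixed grid, but the principle of automated, cross-validated configuration search is the same.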
Sector-Based AI Approaches:
Most AI initiatives combine elements and best practices of both sector-based and cross-sector approaches to harness the benefits of specialization, while ensuring broader applicability. The choice of approach often depends on the specific goals and needs of a given project or organization.
The disadvantages of the sector-based approach are well understood, e.g. limited transferability and flexibility and a narrow focus, which may lead to duplication, siloed organizations and large implementations that become obsolete too quickly. However, many speakers argued that we should not easily disregard sector-based AI approaches, not just because tailored solutions can be more efficient but, equally importantly, because they facilitate regulatory compliance, especially in highly "sensitive" sectors such as healthcare, finance and defense, to name a few.
Data-Driven Innovation and Data-Informed Decision Making:
In a data-driven innovation process, organizations collect, analyze and interpret data to generate insights that drive the creation of new products, services or business models. Data-informed decision making, by contrast, refers to the practice of using data and analysis to support and guide decisions within an organization, with data being only one of several factors considered, albeit an important one.
Are Large Language Models (LLMs) the silver bullet for AI evolution and worldwide adoption?
Learning from human feedback, although not interchangeable with LLMs, may be efficient enough to compensate for one of their key drawbacks: an ever-increasing appetite for data, with the size of these models growing faster than the availability of data to train them on.
Seen as a few-shot learning problem, the human-feedback approach gives individuals the opportunity to personalize models or steer them towards styles and topics that are relevant to them, i.e. to extract more valuable content from the same amount of data, or ideally even less. The need for ever more data may therefore be significantly reduced. It may also be an extra tool in the toolkit for the much and urgently needed verification of original versus generative-AI-produced content.
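To make the idea of learning from human feedback a little more concrete, here is a minimal sketch of the pairwise preference signal often used to train reward models from human comparisons (a Bradley-Terry style objective). The scores are invented toy values, and this shows only the loss computation, not a full training loop.

```python
# Toy illustration of the pairwise preference signal used when learning from human feedback.
# The scores stand in for a reward model's outputs; the numbers are made up.
import numpy as np

def preference_loss(score_chosen: np.ndarray, score_rejected: np.ndarray) -> float:
    """Bradley-Terry style loss: -log(sigmoid(r_chosen - r_rejected)), averaged over pairs."""
    margin = score_chosen - score_rejected
    return float(np.mean(np.logaddexp(0.0, -margin)))  # log(1 + exp(-margin))

# Each pair: (reward for the response the human preferred, reward for the other response).
chosen = np.array([2.1, 0.3, 1.5])
rejected = np.array([1.0, 0.8, -0.2])

print("Preference loss:", preference_loss(chosen, rejected))
# A lower loss means the model already ranks the human-preferred responses higher;
# minimising it steers the model towards the styles and topics the feedback encodes.
```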
Regulated AI Sandboxes even across borders:
Europe needs a single, functioning, integrated ecosystem covering Data and AI. No single country, stakeholder, organization or community has the technology, skills, knowledge and physical assets required to undertake all innovative development and deployment, let alone to scale the deployment of these technologies in real-world applications. This ecosystem cannot be unregulated, not even its entirely experimental components; nor can it be regulated only by loose "suggestive" rather than "restrictive" acts.
Risk management or hazard prevention and damage control for well-regulated and innovative AI:
An innovative but well-regulated AI system requires an extensive understanding of the risks and challenges presented by the nature of such technologies. At the moment, most approaches, while intending to perform risk management, have somehow ended up being more about hazard prevention, unintentionally favoring compliance to such an extent that it has, occasionally but significantly, hindered innovation.
Continuous adaptation to evolving risks and challenges is crucial to ensuring efficient and valuable, as well as responsible and trustworthy, development and deployment of AI technologies. It is therefore essential to re-evaluate our AI development and deployment approaches and make sure they include an appropriately blended "mix" of risk management and hazard prevention, i.e. a holistic approach to risk management, hazard prevention and damage control that combines technical excellence measures, ethical principles and regulatory compliance.
Sustainability and Interoperability:
Sustainability was one of the key themes, given the severe challenges to Europe's future sustainable development and welfare. The consensus was that it has become extremely critical to tackle challenges related to access to essential resources, e.g. food, water and energy, as well as key technologies and materials.
Knowledge graphs have been used with considerable success to model and analyze complex systems, such as energy grids and transportation networks, and fit naturally with interoperable, integrated frameworks for data ecosystems. The latter are critical if end users are to quickly and efficiently integrate and test many different, sophisticated, cutting-edge techniques and solutions in complex systems. Such solutions may take advantage of pre-competitive datasets, natural-language models/LLMs, smart automation and current advances in (cyber-)security.
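As a deliberately small, hypothetical illustration of how a knowledge graph can model such a system, the sketch below uses networkx (an assumed dependency; production systems often use RDF/SPARQL stores instead) to represent a few energy-grid entities and their typed relations, and then runs a simple traversal query. The nodes, edges and attributes are invented.

```python
# Tiny, made-up knowledge graph of an energy grid, using typed nodes and labelled edges.
# networkx is an assumed dependency; real deployments often use RDF/SPARQL stores instead.
import networkx as nx

kg = nx.MultiDiGraph()

# Entities (nodes) with a type attribute.
kg.add_node("WindFarm_North", type="generator", capacity_mw=120)
kg.add_node("Substation_A", type="substation")
kg.add_node("City_Delta", type="consumer", peak_demand_mw=95)

# Typed relations (edges).
kg.add_edge("WindFarm_North", "Substation_A", relation="feeds")
kg.add_edge("Substation_A", "City_Delta", relation="supplies")

# Simple traversal: which generators ultimately supply City_Delta?
suppliers = [
    n for n in kg.nodes
    if kg.nodes[n]["type"] == "generator" and nx.has_path(kg, n, "City_Delta")
]
print("Generators supplying City_Delta:", suppliers)
```

The same pattern, entities plus typed relations plus graph queries, scales from this toy example to grid-wide or city-wide models, which is what makes knowledge graphs a natural backbone for interoperable data ecosystems.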
Cutting-Edge Emerging Research Topics:
The EBDVF is a hub for cutting-edge research in the field of data science where researchers from academia and industry present their latest findings, methodologies, and technologies. This year there were extensive discussions on groundbreaking ADR technological foundations, creative hybrid AI methods, quantum computing and metaverse enablers smoothing the transition to the virtual worlds, to name a few.
Cross-sectional ADR work is evolving in the direction of optimizing autonomy, high performance and predictability; such advancements then naturally extend into "next generation" smart embodied robotic systems (soft robotics), where the focus shifts towards manipulation, configurability and effective human-robot interaction.
Neuro-symbolic hybrid AI methods seek systematic and well-grounded ways to combine symbolic and neural representations, which often also entails a principled approach to combining reasoning and learning. These hybrid approaches are seen as one major avenue for living up to the more demanding upcoming EU AI regulations; it comes as no surprise that Europe is making significant contributions in this area.
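As a deliberately simplified sketch of the neuro-symbolic idea, the toy example below combines a mock neural scoring function with a symbolic rule that rules out predictions violating background knowledge; the labels, scores and rule are invented for illustration and stand in for a learned model and a proper knowledge base.

```python
# Deliberately simplified neuro-symbolic sketch: a neural component proposes scored
# hypotheses and a symbolic rule filters out those that violate background knowledge.
# The labels, scores and rule are made up for illustration.

def neural_scores(image_id: str) -> dict:
    """Stand-in for a neural classifier's softmax scores over candidate labels."""
    return {"cat": 0.48, "dog": 0.45, "airplane": 0.07}

def satisfies_rules(label: str, context: dict) -> bool:
    """Symbolic background knowledge: airplanes are not plausible indoor objects."""
    if context.get("indoor") and label == "airplane":
        return False
    return True

def neuro_symbolic_predict(image_id: str, context: dict) -> str:
    """Keep only rule-consistent hypotheses, then pick the highest-scoring one."""
    scores = neural_scores(image_id)
    admissible = {lbl: s for lbl, s in scores.items() if satisfies_rules(lbl, context)}
    return max(admissible, key=admissible.get)

print(neuro_symbolic_predict("img_001", {"indoor": True}))  # -> "cat"
```

Research systems integrate the two sides far more deeply, for instance by back-propagating through logical constraints, but the division of labour, learned scores constrained by explicit knowledge, is the same, and it is exactly this explicitness that helps with auditability and regulation.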
Quantum computing combines quantum physics and computer science and is based on two fundamental concepts: superposition and entanglement. Unlike a classical bit, which is in exactly one of two states, 0 or 1, a qubit can be in a superposition of both states at the same time. Hence, optimisation problems like route planning, supplier management and financial portfolio management, which involve searching huge, heterogeneous solution spaces, are areas where quantum computing's promise of finding optimal solutions quickly could work well.
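To ground the superposition idea, here is a minimal classical simulation of a single qubit with NumPy: applying a Hadamard gate to the |0⟩ state produces an equal superposition, and the squared amplitudes give the measurement probabilities. This is an illustration only, not code for quantum hardware.

```python
# Classical NumPy simulation of one qubit, to illustrate superposition.
import numpy as np

ket0 = np.array([1.0, 0.0])             # |0> state vector

H = np.array([[1, 1],
              [1, -1]]) / np.sqrt(2)    # Hadamard gate

state = H @ ket0                        # equal superposition (|0> + |1>) / sqrt(2)
probs = np.abs(state) ** 2              # Born rule: measurement probabilities

print("Amplitudes:", state)             # [0.7071, 0.7071]
print("P(0), P(1):", probs)             # [0.5, 0.5]
```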