The Balancing Act: Ensuring Ethical Data and Fair Use in AI

Kelsi Kruszewski

Generative AI offers the world boundless opportunities for creativity and productivity, but calls for more effective governance are growing louder. The need for AI regulation goes beyond policing misinformation and protecting human expression; user concerns point to a persistent lack of trust. At its core, the challenge of AI integration is making sure organizations are collecting, using, and regenerating the right data. It’s about practicing nuanced controls and applying ethical guidelines to prioritize fair use, accuracy, and integrity.

MITRE’s Recommendations for AI Governance

MITRE has recently proposed new AI recommendations for the incoming presidential administration, emphasizing the urgency of AI governance. They suggest issuing an executive order mandating AI governance measures within the first six months of the administration’s term. Additionally, MITRE recommends an executive order that mandates system auditability and the development of standards for audit trails. This would include requiring AI developers to disclose the data used to train their systems and the foundation models on which their systems were built. According to MITRE, the ability to audit systems is vital for tracking AI misuse and maintaining public trust in AI.

Corporate Efforts in AI Governance

Companies like IBM and Salesforce are now allocating resources to internal ethics committees dedicated to AI governance. Lenovo Group, Telefonica, Microsoft, and Mastercard, among others, have also signed a new agreement to build ethical AI by integrating UNESCO’s recommendations.

Around the world, organizations and institutions are striving to create trustworthy AI models by conducting regular bias audits and ensuring diverse, representative training data. More than ever, leaders are looking to integrate digital record-keeping solutions like blockchain to gain granular visibility into their data and hold models accountable to emerging ethical standards.

Evolving the Definition of Intellectual Property

The democratization of AI capabilities has opened up all kinds of opportunities for the average person. Today, anyone can use generative AI for a small subscription fee, if any. Up-and-coming artists can pay $20 per month to leverage Stability AI's open-source models for creating content and producing music. Developers can write code faster with tools like Tabnine and OpenAI's Codex. People with disabilities can use AI-powered speech-to-text and text-to-speech systems to communicate with others more easily and effectively.

But with broader access to AI comes the risk of misappropriation. Its widespread use has far-reaching implications for privacy and intellectual property, as it blurs the lines of ownership by generating content that may be derived from existing works.

Leveraging blockchain technology can address this. Smart contracts – self-executing contracts that trigger actions based on predetermined terms and conditions – can automate the licensing of creative works, distinguish AI-generated content from human content with digital tags or unique watermarks, and authenticate content by comparing it against a trusted registry of AI models. These contracts improve accountability by maintaining a verifiable record of a work's lifecycle and ownership.
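As a rough illustration of the registry idea only – not Prove AI's or any platform's actual implementation – the Python sketch below shows how a work's fingerprint, owner, license terms, and an AI-generated tag might be recorded and later checked against a trusted registry. All class and field names here are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class RegistryEntry:
    """One record in a hypothetical content registry."""
    content_hash: str     # fingerprint of the work
    owner: str            # rights holder
    license_terms: str    # e.g. "CC-BY-4.0" or a paid-license URI
    ai_generated: bool    # digital tag distinguishing AI from human content
    registered_at: str

class ContentRegistry:
    """Minimal stand-in for a smart-contract-backed registry of creative works."""

    def __init__(self):
        self._entries: dict[str, RegistryEntry] = {}

    @staticmethod
    def fingerprint(content: bytes) -> str:
        return hashlib.sha256(content).hexdigest()

    def register(self, content: bytes, owner: str, license_terms: str,
                 ai_generated: bool) -> RegistryEntry:
        """Record a work's hash, ownership, and license terms."""
        h = self.fingerprint(content)
        entry = RegistryEntry(h, owner, license_terms, ai_generated,
                              datetime.now(timezone.utc).isoformat())
        self._entries[h] = entry
        return entry

    def verify(self, content: bytes) -> RegistryEntry | None:
        """Check whether a work matches a previously registered, trusted entry."""
        return self._entries.get(self.fingerprint(content))

registry = ContentRegistry()
registry.register(b"original song stems", owner="artist-123",
                  license_terms="paid-license:v1", ai_generated=False)
match = registry.verify(b"original song stems")
print(match.owner if match else "no registered provenance")
```

In a real deployment the registry would live on-chain and the licensing terms would execute automatically; the sketch only captures the record-and-verify pattern the paragraph describes.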

With a decentralized blockchain network, the risk of unauthorized access or modification drops dramatically – ultimately helping establish a trusted, transparent ecosystem where data privacy and intellectual property rights are adequately protected.

Rooting Out Hallucinations

In addition to ensuring integrity and fair use, the democratization of AI has also raised questions about the accuracy of generated outputs. A recent study by startup Vectara shed light on a prevalent issue: the dangerously high rates of hallucination among open-source LLMs available to the public. Even the most reliable models have at least a small chance of fabricating a response to a query.

Neuroscientist Olivier Ouillier explained that AI solutions don't have any intention to deceive or manipulate – they're simply engineered to predict the most statistically likely string of words. LLMs compress massive volumes of training data for efficiency, which means fine details can get lost in their calculations. To fill the gaps, the AI system scrambles to make something up.
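To make that point concrete, here is a deliberately toy sketch – nothing like a production LLM – that picks the statistically most likely next word from counted examples. When the counts don't cover a prompt, it still returns its best-scoring guess, which is how a fluent but fabricated answer can emerge.

```python
from collections import Counter, defaultdict

# Toy bigram "language model": count which word tends to follow which.
corpus = "the study found the model was accurate . the model was fast .".split()
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def most_likely_next(word: str) -> str:
    """Return the statistically most likely continuation, true or not."""
    candidates = following.get(word)
    if not candidates:
        # No supporting data at all: the model still has to answer,
        # so it falls back to overall word statistics -- the seed of a hallucination.
        return Counter(corpus).most_common(1)[0][0]
    return candidates.most_common(1)[0][0]

print(most_likely_next("model"))      # "was" -- well supported by the data
print(most_likely_next("benchmark"))  # a confident guess with no support behind it
```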

Blockchain can help companies address this challenge. Using a decentralized record-keeping system, businesses can easily verify the provenance and quality of the data used to train a model and root out inaccuracies that could lead to hallucinations. Today, the Hedera blockchain, which Prove AI runs on, gives developers the ability to "go back in time" and revisit previous iterations of an AI system to understand how their model learns, operates, and adapts – in effect, version control.

Integrated within IBM’s watsonx platform, Prove AI will enable real-time monitoring of input and output data for training AI models. It will equip organizations with the power to vet their AI systems, ensure their reliability, uphold ethical standards, and scale their impact with trustworthy data.

Documenting a model's entire lifecycle in an immutable ledger – from data acquisition, to training and testing, to parameter tuning and refinement – empowers users to gain a clearer understanding of how their AI systems reached specific conclusions.
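A minimal sketch of that idea, assuming nothing about Hedera's or Prove AI's actual APIs: each lifecycle event is appended to a hash-chained log, so any later tampering with an earlier record is detectable and every model version can be traced back through its history.

```python
import hashlib
import json
from datetime import datetime, timezone

class LifecycleLedger:
    """Append-only, hash-chained log of model lifecycle events (illustrative only)."""

    def __init__(self):
        self.blocks = []

    def record(self, stage: str, details: dict) -> dict:
        """Append one event (data acquisition, training, parameter change, ...)."""
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        payload = {
            "stage": stage,
            "details": details,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "prev_hash": prev_hash,
        }
        payload["hash"] = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest()
        self.blocks.append(payload)
        return payload

    def verify_chain(self) -> bool:
        """Recompute every hash to confirm no record was altered after the fact."""
        for i, block in enumerate(self.blocks):
            expected_prev = self.blocks[i - 1]["hash"] if i else "0" * 64
            body = {k: v for k, v in block.items() if k != "hash"}
            recomputed = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if block["prev_hash"] != expected_prev or block["hash"] != recomputed:
                return False
        return True

ledger = LifecycleLedger()
ledger.record("data_acquisition", {"dataset": "curated-corpus-v1"})
ledger.record("training", {"epochs": 3, "learning_rate": 2e-5})
ledger.record("evaluation", {"notes": "see audit report"})
print(ledger.verify_chain())  # True until any historical record is edited
```

On a real blockchain the chain verification is handled by the network itself; the sketch simply shows why an immutable, ordered record makes a model's history auditable.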

To learn more about Prove AI, visit our Prove AI landing page or contact us directly.