Webinars
Continuous Testing and Monitoring of Large Language Models
Deploying large language models (LLMs) in healthcare requires more than high initial accuracy – it demands ongoing testing and monitoring to ensure safety, fairness, and compliance over time.
Pacific AI provides a comprehensive governance platform that supports both development and production needs. During development, test suites can be integrated into CI/CD pipelines so every model version is validated before release. Once live, continuous monitoring detects drift, performance degradation, or safety issues in production systems, helping organizations maintain trust throughout the full lifecycle of their AI.
To achieve this, Pacific AI combines three specialized test engines:
- MedHELM provides benchmarks designed by medical experts, grounded in real-world healthcare needs, and validated on real-world data. It focuses on whether LLMs deliver accurate, clinically useful answers when applied in practice.
- LangTest generates systematic variations of datasets to test dozens of bias and robustness dimensions. This ensures that models produce consistent and fair outputs across patient populations, edge cases, and wording changes.
- Red Teaming executes adversarial safety tests, covering 120+ categories of unsafe or undesirable behaviors. Using both semantic matching and LLM-as-a-judge techniques, it probes whether models comply with safety, policy, and compliance requirements.
Together, these engines provide comprehensive coverage of accuracy, robustness, and safety risks — supported by audit trails, role-based access, and versioned test suites.
Join us to see how Pacific AI helps organizations deploy and operate LLMs responsibly, with continuous assurance that models remain accurate, safe, and compliant.

Alex Thomas is a Principal Data Scientist at Pacific AI. He’s used natural language processing, machine learning, and knowledge graphs on clinical data, identity data, job data, biochemical data, and contract data. Now, he’s working on measuring Large Language Models and their applications.

Alin is an experienced big data engineer with a demonstrated history of working in the information technology and services industry. He has extensive expertise in applying petabyte-scale technologies to make sense of vast amounts of unstructured and structured data and extract valuable insights to drive strategic decision-making. Alin has hands-on experience with AWS, Google Cloud, Microsoft Azure, various open-source technologies, and best practices in aligning them with AI-driven strategies to optimize scalability and performance. Alin has a Master’s degree focused on Artificial Intelligence from West University of Timisoara.
Healthcare-Specific Red Teaming
Large language models (LLMs) hold immense promise for advancing clinical workflows, yet their deployment in healthcare raises critical safety, ethical, and bias-related concerns that exceed the scope of standard red‑teaming practices. In this talk, we first review the fundamentals of general‑purpose LLM red teaming—targeting misinformation, offensive speech, security exploits, private‑data leakage, discrimination, prompt injection, and jailbreaking vulnerabilities. Building on these foundations, we then describe two healthcare‑specific extensions developed by Pacific AI:
- Medical Ethics Red Teaming
We introduce novel test cases derived from core AMA medical‑ethics principles to probe LLM behaviors around physician misconduct, patient autonomy and consent, conflicts of interest, and stigmatizing language. Examples include attempts to coerce consent for unnecessary procedures, fabricate arguments for upcoding, and manipulate clinical documentation for financial gain. - Cognitive‑Bias Red Teaming
We demonstrate targeted benchmarks designed to elicit and measure clinically dangerous biases such as anchoring, confirmation, framing, primacy/recency effects, and ideological alignment, that can distort diagnostic reasoning and treatment recommendations. Through scenario‑based assessments (e.g., risk ‑communication framing, order‑set anchoring), we quantify model susceptibility to contextual and statistical framing errors in healthcare contexts.
This webinar is designed for healthcare technology leaders, clinical AI researchers, and compliance officers seeking practical guidance on evaluating and governing AI tools; attendees will learn actionable red‑teaming strategies and receive ready‑to‑implement test cases to bolster model safety, ethics compliance, and bias mitigation in clinical settings.

David Talby is a CEO at Pacific AI and John Snow Labs, helping healthcare & life science companies put AI to good use. He has extensive experience building and running web-scale software platforms and teams – in startups, for Microsoft’s Bing in the US and Europe, and to scale Amazon’s financial systems in Seattle and the UK.
David holds a PhD in computer science and master’s degrees in both computer science and business administration.
The State of AI Governance
This webinar presents key findings from the 2025 AI Governance Survey, conducted in April – May of 2025 by Gradient Flow to assess the priorities, practices, and concerns of professionals and technology leaders in this space. Topics covered:
- Stages of adoption by AI developers and deployers
- Adoption of formal AI Governance policies and roles
- Implementation of processes for AI literacy training and incident response
- Regulatory frameworks that are studied or adopted
- Implementation of best practices and what drive prioritization
- Use of tools such as red teaming, bias mitigation, and model cards ial, and reputation risks.

David Talby is a CEO at Pacific AI and John Snow Labs, helping healthcare & life science companies put AI to good use. He has extensive experience building and running web-scale software platforms and teams – in startups, for Microsoft’s Bing in the US and Europe, and to scale Amazon’s financial systems in Seattle and the UK.
David holds a PhD in computer science and master’s degrees in both computer science and business administration.

Ben Lorica is founder at Gradient Flow. He is a highly respected data scientist, having served leading roles at O’Reilly Media (Chief Data Scientist, Program Chair of the Strata Data Conference, O’Reilly Artificial Intelligence Conference, and TensorFlow World), at Databricks, and as an advisor to startups.
He serves as co-chair for several leading industry conferences: the AI Conference, the NLP Summit, the Data+AI Summit, Ray Summit, and K1st World. He is the host of the Data Exchange podcast and edits the Gradient Flow newsletter.
AI Governance Simplified: Unifying 70+ laws, regulations, and standards Into a Policy Suite
Organizations who are either AI developers or AI deployers are under growing legal liability risk from multiple sources:
- National laws like Title VII of the Civil Rights act and Titles I and V of the ADA
- State laws like Virginia HV 747, Colorado SB24-205, and California SB 942
- Local laws like NYC 144
- Regulatory rules like the ACA 1557 and HHS HTI-1
- Enforceable guidance from regulators like the FDA
- Diverse state legislation on privacy protections, deepfakes, and disallowed uses
- Industry standards like the NIST AI RMF and ISO 42011, which are beginning to be references in court proceedings as representing ‘commercially reasonable efforts’
- International laws like the EU AI Act or Canada’s AIDA which apply to their citizens
This webinar introduces the AI Policy Suite by Pacific AI, which is a unified set of actionable policies that organizations can adopt, which enable compliance with 70+ AI laws, regulations, and standards.
These policies are updated on a quarterly basis which:
- Eliminates the overhead of staying up to date with all legislative and regulatory changes
- Translates legal requirements into actionable controls and policies
- De-duplicates the often overlapping requirements from different sources
The policies are available for free, to accelerate adoption and community feedback. Join this webinar to understand the current landscape in AI governance and understand what steps you can take to ensure compliance avoid legal, financial, and reputation risks.

David Talby is a CEO at Pacific AI and John Snow Labs, helping healthcare & life science companies put AI to good use. He has extensive experience building and running web-scale software platforms and teams – in startups, for Microsoft’s Bing in the US and Europe, and to scale Amazon’s financial systems in Seattle and the UK.
David holds a PhD in computer science and master’s degrees in both computer science and business administration.

Maria is a Lead Legal Counsel at John Snow Labs and Pacific AI. She is an experienced IT Attorney specializing in Legal AI and AI Governance. Maria has advanced degrees in International Private Law and International Property Law, as well as certifications in Digital Transformation and LegalTech.
Automated Testing of Bias, Fairness, and Robustness of Generative AI Solutions
Current US legislation prohibits AI applications in recruiting, healthcare, and advertising from discrimination and bias.
This requires organizations who deploy such systems to test and prove that their solutions are robust and unbiased – in the same way that they’re required to comply with security and privacy regulations. This session introduces Pacific AI, a no-code tool built on top of the LangTest library, which applies Generative AI to:
- Automatically generate tests for accuracy, robustness, bias, and fairness for text classification and entity recognition tasks
- Automatically run test suite, create detailed model report cards, and compare different models against the same test suite
- Publish, share, and reuse AI test suites across teams and projects
- Automatically generate synthetic training data to augment model training and minimize common model bias and reliability issues
This session then presents how John Snow Labs uses Pacific AI to test and improve its own healthcare-specific language models.

Jessica is the product manager for generative AI lab, the no -code UI tool designed to allow subject matter experts to pre-annotate, train, and test their own AI models.
Jessica has been involved in AI for nearly three years, originally brought into space for annotation and quickly moving into product development. Coming from a small startup, she’s had plenty of experience working with sales, customer support, and marketing to create a product that works for as many customers as possible.
Jessica is passionate about AI and its many ethical obligations, including industries such as healthcare, finance, and engineering.