Google's open-source framework for ML, MediaPipe, launched a new LLM inference API yesterday (7 March).

The new experimental API allows LLMs to run entirely on-device across platforms, overcoming the memory and computing challenges associated with LLMs.

TensorFlow Lite and MediaPipe have streamlined on-device ML for web developers since 2017, with MediaPipe supporting complete ML pipelines since 2019, Google said.

The new API supports web, Android and iOS and initially includes four openly available LLMs: Gemma, Phi 2, Falcon and Stable LM.

Android developers can access the MediaPipe LLM inference API for experimental and research use, while production applications with LLMs on Android can use the Gemini API or Gemini Nano.

Google said the new API simplifies integration for web developers, allowing them to prototype and test openly available LLMs on-device. It will also provide metrics such as prefill and decode speed to measure an LLM's performance.

The API can be used via a web demo or by building sample apps with the provided SDKs for web, Android or iOS.
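
For web developers, getting started looks roughly like the sketch below. It is a minimal example assuming the @mediapipe/tasks-genai package and a locally hosted, converted Gemma model file; the model path, CDN URL and tuning values are illustrative assumptions rather than details from the announcement.

```typescript
// Minimal sketch: running an LLM on-device in the browser with the
// MediaPipe LLM Inference API. Model path and option values are assumptions.
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

async function runOnDeviceLlm(prompt: string): Promise<string> {
  // Load the WebAssembly assets that back the GenAI tasks.
  const genaiFileset = await FilesetResolver.forGenAiTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
  );

  // Create the inference task from a locally hosted model file
  // (assumed here to be a 4-bit quantised Gemma 2B checkpoint).
  const llmInference = await LlmInference.createFromOptions(genaiFileset, {
    baseOptions: { modelAssetPath: '/models/gemma-2b-it-gpu-int4.bin' },
    maxTokens: 512,
    topK: 40,
    temperature: 0.8,
  });

  // Generation runs entirely in the browser; no server round trip.
  return llmInference.generateResponse(prompt);
}

runOnDeviceLlm('Write a haiku about on-device inference.').then(console.log);
```

Once the task is created, the same LlmInference instance can be reused for further prompts, which keeps the (large) model load cost to a one-off step.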

MediaPipe plans to expand the LLM inference API, introducing more platforms, models, conversion tools, on-device components, high-level tasks and further optimisations in 2024.

In a survey conducted by GlobalData in Q4 2023, around 54% of businesses said that AI had already begun to tangibly disrupt their industry, while 56% of respondents believed that AI would live up to all of its promises.

GenAI is also predicted to be the fastest-growing segment of AI, according to GlobalData's research, with revenues expected to grow from $1.8bn in 2022 to $33bn by 2027.