Google's open-source framework for ML, MediaPipe, launched a new LLM inference API yesterday (7 March).

The new experimental API allows LLMs to run entirely on-device across platforms, overcoming the memory and computing challenges associated with LLMs.

TensorFlow Lite and MediaPipe have streamlined on-device ML for web developers since 2017, with MediaPipe supporting complete ML pipelines since 2019, Google said.

The new API supports web, Android and iOS and initially includes four openly available LLMs: Gemma, Phi 2, Falcon and Stable LM.

Android developers can access the MediaPipe LLM inference API for experimental and research use, while production applications with LLMs on Android can use the Gemini API or Gemini Nano.

Google said the new API simplifies integration for web developers, allowing them to prototype and test openly available LLMs on-device. It will also provide metrics such as prefill and decode speed to measure an LLM's performance.

The API can be used via a web demo or by building sample apps with the provided SDKs for web, Android or iOS.
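
For web developers, getting started looks roughly like the sketch below. It is a minimal example assuming the @mediapipe/tasks-genai package and a locally hosted, converted Gemma model file; the model path, CDN URL and tuning values are illustrative assumptions rather than details from the announcement.

```typescript
// Minimal sketch: running an LLM on-device in the browser with the
// MediaPipe LLM Inference API. Model path and option values are assumptions.
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

async function runOnDeviceLlm(prompt: string): Promise<string> {
  // Load the WebAssembly assets that back the GenAI tasks.
  const genaiFileset = await FilesetResolver.forGenAiTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
  );

  // Create the inference task from a locally hosted model file
  // (assumed here to be a 4-bit quantised Gemma 2B checkpoint).
  const llmInference = await LlmInference.createFromOptions(genaiFileset, {
    baseOptions: { modelAssetPath: '/models/gemma-2b-it-gpu-int4.bin' },
    maxTokens: 512,
    topK: 40,
    temperature: 0.8,
  });

  // Generation runs entirely in the browser; no server round trip.
  return llmInference.generateResponse(prompt);
}

runOnDeviceLlm('Write a haiku about on-device inference.').then(console.log);
```

Once the task is created, the same LlmInference instance can be reused for further prompts, which keeps the (large) model load cost to a one-off step.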

MediaPipe plans to expand the LLM inference API, introducing more platforms, models, conversion tools, on-device components, high-level tasks and further optimisations in 2024.

In a survey conducted by GlobalData in Q4 2023, around 54% of businesses said that AI had already begun to tangibly disrupt their industry, while 56% of respondents believed that AI would live up to all of its promises.

GenAI is also predicted to be the fastest-growing segment of AI, according to GlobalData's research, with revenues expected to grow from $1.8bn in 2022 to $33bn by 2027.