Table of contents
- What that means in practice
- The illusion of standardization
- Speech Synthesis: Shouting from clouds
- Speech Recognition: Guess where your voice goes
- Passkeys: a web API built on the browser password manager
- Payment Request API: Partnered wallets only
- Web Push: three different push networks
- Media Source API and Widevine: codec politics and DRM dependencies
- AI APIs
- Why This Matters
- What this means for developers
- What Should We Do?
- The web we think we have vs. The web we actually have
When we talk about "the web platform", we often treat it as a unified, standards-based system: browsers implement features from the same specifications, even if they do so in different time frames. That should mean that when they're implemented, we can rely on APIs to work the same across browsers, and definitely across browsers using the same rendering engine.
But dig a little deeper, and you'll find that many APIs we rely on aren't actually part of "the web" at all. They're standardized interfaces, but with browser-specific implementations that depend on third-party services, proprietary systems, or vendor-specific infrastructure.
What that means in practice
For example, to use the Geolocation API in Polypane you need to add a Google API key. The same API works fine in Chrome without any configuration, so are we just being annoying?
Actually, the Geolocation API in Chromium uses Google's online geolocation services. While Google Chrome obviously gets to use Google's services for free, other browsers don't.
When we tell people this, they're often surprised. Surely web standards wouldn't rely on proprietary services?
Yet they do, and when you think about it, that's even a little insidious. Nowhere does the browser tell you that by using the Geolocation API, you're potentially sending your users' location data to Google (or Apple, depending on the browser).
So let's go over these APIs that are available in (some) browsers, but aren't actually part of the web platform in the way you and I think about 'the web'.
The illusion of standardization
Let's zoom in on that Geolocation API. It looks like a clean, standardized web API:
navigator.geolocation.getCurrentPosition(
(position) => {
console.log(position.coords.latitude, position.coords.longitude);
},
(error) => {
console.error('Geolocation failed:', error);
}
);
The interface is standardized. The specification is maintained by the W3C. The docs on MDN are excellent. But the actual implementation? That's entirely up to the browser vendor.
Where does your location actually come from?
Browsers don't magically know where you are. Behind that clean API, they're making decisions about data sources:
- OS service/GPS: On mobile devices, accessing the operating system's location services (often using GPS).
- Wi-Fi positioning: Your browser sends nearby Wi-Fi access point MAC addresses to a location service provider. They're hashed first, but still.
- IP geolocation: Querying third-party databases that map IP addresses to approximate locations.
In the best case scenario, the browser calls out to the operating system's location services, which use GPS to determine your location. But not all devices have GPS and not all operating systems provide that API. So even when the OS has location services, it still might be using third-party services. All browsers fall back to sending data to a location service provider in the absence of that OS API.
Chrome uses Google Location Services. Safari uses Apple's servers. Firefox had something called Mozilla Location Service but they retired it in 2024 and now also use Google's services.
This means the "web standard" geolocation depends on:
- The browser's integration with, and access to, your device's OS location services
- The browser vendor's relationship with location service providers
- The quality and coverage of their Wi-Fi database
- Their privacy policies around location data
- Whether their servers are operational and accessible in your user's region
A PWA relying on precise geolocation might work flawlessly during Chrome testing but fail for users in regions where Google services are blocked.
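One defensive pattern, as a minimal sketch: treat a failure as a possible service outage rather than a user refusal, and set a timeout instead of waiting forever (usePosition and offerManualLocationInput are hypothetical app functions):
navigator.geolocation.getCurrentPosition(
  (position) => usePosition(position),
  (error) => {
    // POSITION_UNAVAILABLE can mean the vendor's location backend is
    // unreachable or blocked, not that the user denied permission
    if (error.code === error.POSITION_UNAVAILABLE) {
      offerManualLocationInput();
    }
  },
  { timeout: 10000, maximumAge: 60000 }
);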
Speech Synthesis: Shouting from clouds
The Web Speech API seems straightforward, and pretty cool:
const utterance = new SpeechSynthesisUtterance('Hello, world!');
utterance.voice = speechSynthesis.getVoices()[0];
speechSynthesis.speak(utterance);
But where do those voices come from? It varies wildly:
- Desktop browsers often use operating system TTS engines (which themselves vary by OS and version).
- Mobile browsers might use online services for better quality voices.
- Chrome can use both local and cloud-based voices depending on the selected voice and network availability.
- Safari primarily relies on system voices but quality varies significantly across iOS and macOS versions.
Which voices your user has available depends on the browser, the operating system, and the device, and you're not guaranteed that the voice you selected in testing will be available for all your users.
Some "good" voices only work when you're online because they're processed on vendor servers: your text is sent to a server, processed there, and the audio is sent back.
That should give you some pause about what kind of information you're sending to those servers. Are you reading aloud a user's private data or private messages? You might be sending explicitly private data to third-party servers without realizing it.
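If you're reading out anything sensitive, you can at least prefer on-device voices: each voice has a localService flag that's true when synthesis happens locally. A minimal sketch (note that browsers populate the voice list asynchronously, so you may need to wait for the voiceschanged event first):
function pickLocalVoice(lang = 'en-US') {
  const voices = speechSynthesis.getVoices();
  // prefer a local voice in the right language, then any local voice
  return (
    voices.find((v) => v.localService && v.lang === lang) ||
    voices.find((v) => v.localService) ||
    null // no local voice: decide whether cloud synthesis is acceptable
  );
}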
The API is standardized. The implementation is anything but.
Speech Recognition: Guess where your voice goes
The flip side of speech synthesis is speech recognition:
// Chrome still exposes this API with a webkit prefix
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new SpeechRecognition();
recognition.continuous = true;
recognition.interimResults = true;
recognition.onresult = (event) => {
const transcript = event.results[0][0].transcript;
console.log('You said:', transcript);
};
recognition.start();
The SpeechRecognition API is not available without an internet connection in Chrome, and that's a clue. Speech recognition in Chrome is actually powered by Google's cloud speech services. Your users' voice data is:
- Captured by the browser
- Sent to Google's servers
- Processed by Google's speech recognition models
- Returned as text
Safari uses Apple's speech recognition services. Edge might use Azure Cognitive Services. The API looks the same, but:
- Accuracy varies based on which vendor's models you're using.
- Language support differs dramatically between providers.
- Privacy implications: some vendors explicitly use your users' voice data to improve their models.
- Offline capability only exists where vendors provide on-device models (basically iOS/macOS).
This is streaming audio of your users' voices to vendor servers in real-time. The "web API" is really just a JavaScript wrapper around vendor speech services.
Passkeys: a web API built on the browser password manager
The WebAuthn API promises passwordless authentication:
const credential = await navigator.credentials.create({
publicKey: {
challenge: new Uint8Array(32),
rp: { name: 'Example Corp', id: 'example.com' },
user: {
id: new Uint8Array(16),
name: 'user@example.com',
displayName: 'User Name',
},
pubKeyCredParams: [{ alg: -7, type: 'public-key' }],
authenticatorSelection: {
authenticatorAttachment: 'platform',
requireResidentKey: true,
},
},
});
The WebAuthn API is a standard, but passkeys only work because browsers have invested heavily in their password management infrastructure. When you create a passkey:
- Chrome stores it in Google Password Manager (synced via Google's cloud).
- Safari stores it in iCloud Keychain (synced via Apple's infrastructure).
- Edge stores it in Microsoft Account (synced via Microsoft's cloud).
Passkeys are not part of the web stack, they're part of each browser's password manager, and they only work because browsers have hooked the APIs up to their own password managers. The specification defines the API surface, but the actual credential storage, synchronization, and recovery mechanisms are entirely vendor-specific.
You're not just using a web standard, you're integrating with your browser's password manager (and full authentication ecosystem).
WebAuthn being tied to that built-in password management means Polypane doesn't support passkeys natively. We could build a password manager, but Electron currently has no way of hooking the Passkey APIs into a password manager backend. The workaround here is to use a password manager extension with Passkeys support like 1Password.
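If you do offer passkeys, it's worth checking that the browser actually has an authenticator wired up before showing any "create a passkey" UI. The API includes a check for exactly this; a minimal sketch:
if (window.PublicKeyCredential) {
  const available =
    await PublicKeyCredential.isUserVerifyingPlatformAuthenticatorAvailable();
  if (!available) {
    // The API exists, but no platform authenticator is hooked up behind
    // it: offer passwords or external security keys instead
  }
}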
Payment Request API: Partnered wallets only
The Payment Request API looks like it standardizes online payments:
// set up a payment
const request = new PaymentRequest(
[
{
supportedMethods: 'https://google.com/pay',
data: { merchantId: '12345' },
},
],
{
total: {
label: 'Total',
amount: { currency: 'USD', value: '29.99' },
},
}
);
// show the payment UI
const response = await request.show();
But which payment methods work depends entirely on browser vendor partnerships:
- Chrome: Deep integration with Google Pay, supports other methods via partnerships.
- Safari: Primarily Apple Pay, with limited support for other methods.
- Edge: Microsoft integrations, but often defers to system capabilities.
- Firefox: no support.
The supportedMethods field looks like a standard way to declare payment options, but in practice:
- You need separate integration contracts with each payment provider.
- Browser support for payment methods is often region-specific.
- The UX varies wildly: Safari shows a native Apple Pay sheet, Chrome shows its own UI.
- Some browsers require additional user setup (like adding cards to Apple Wallet).
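Given all that variation, ask the browser whether a method is actually usable before rendering a pay button. A minimal sketch using canMakePayment() (the merchant data is a placeholder, and showNativePayButton and showRegularCheckoutForm are hypothetical helpers):
if (window.PaymentRequest) {
  const request = new PaymentRequest(
    [{ supportedMethods: 'https://google.com/pay', data: { merchantId: '12345' } }],
    { total: { label: 'Total', amount: { currency: 'USD', value: '29.99' } } }
  );
  // resolves to false when the user has no usable payment method set up
  if (await request.canMakePayment()) {
    showNativePayButton(request);
  } else {
    showRegularCheckoutForm();
  }
}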
You're not building on a payment standard. You're building adapters for each browser's preferred payment partners. Which, for the big three browsers, happen to be... themselves. That's convenient.
Web Push: three different push networks
The Push API promises a standard way to send notifications:
const registration = await navigator.serviceWorker.register('/sw.js');
const subscription = await registration.pushManager.subscribe({
userVisibleOnly: true,
applicationServerKey: urlBase64ToUint8Array(publicKey),
});
But when you send that push notification, where does it actually go?
- Chrome/Edge: Firebase Cloud Messaging (FCM) - Google's infrastructure.
- Safari: Apple Push Notification service (APNs) - Apple's infrastructure.
- Firefox: Mozilla Push Service - Mozilla's infrastructure (or FCM on Android).
Each service has different:
- Rate limits: FCM allows 10,000 notifications per second per sender, but others have different limits
- Message size limits: FCM allows 4KB, APNs allows 5KB for modern devices
- Delivery guarantees: Each service handles offline delivery differently
- Privacy models: Each service has different data retention and encryption practices
Your push notification doesn't go directly to the user. Instead, it routes through the browser vendor's servers first. If Google's FCM has an outage, Chrome push notifications stop working. If your server is in a region where Apple services are restricted, Safari push won't work for your users.
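You can see which network a subscription routes through by inspecting its endpoint URL. A minimal sketch (the hostnames in the comment are what these services use at the time of writing, and aren't guaranteed to stay stable):
const registration = await navigator.serviceWorker.ready;
const subscription = await registration.pushManager.getSubscription();
if (subscription) {
  // e.g. fcm.googleapis.com (FCM), web.push.apple.com (APNs),
  // updates.push.services.mozilla.com (Mozilla Push Service)
  console.log('Push routes through:', new URL(subscription.endpoint).host);
}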
You're not just implementing push notifications. You're integrating with cloud infrastructure.
Media Source API and Widevine: codec politics and DRM dependencies
MSE lets you build custom video players:
const video = document.getElementById('video');
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);
mediaSource.addEventListener('sourceopen', () => {
URL.revokeObjectURL(video.src);
const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E"');
// fetch and append video segments here (not shown for brevity)
});
But which codecs are actually supported? That depends on:
- Patent licensing: Browsers ship codecs they're willing to pay for or that are royalty-free
- Platform support: Some codecs only work if the OS provides hardware decoding
- Vendor strategy: Google pushes VP9/AV1, Apple prefers H.264/HEVC, Mozilla champions open codecs
Safari has limited VP9 support. Chrome's HEVC support depends on hardware availability. Your "standard" video playback depends on vendor decisions about codec licensing, hardware partnerships, and competitive strategy.
So the API is standard, but you're either going to use the lowest-common-denominator codec, or build a complex system that encodes videos in multiple formats and serves the right one to each browser.
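MSE at least lets you probe support up front with MediaSource.isTypeSupported(), so you can serve the best format the browser can actually play. A minimal sketch (the codec strings are examples):
const candidates = [
  'video/mp4; codecs="av01.0.05M.08"', // AV1
  'video/webm; codecs="vp9"', // VP9
  'video/mp4; codecs="avc1.42E01E"', // H.264 baseline, the usual fallback
];
const playable = candidates.find((type) => MediaSource.isTypeSupported(type));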
The Widevine Problem
Then there's DRM. The Encrypted Media Extensions (EME) API looks standard:
const config = [
{
initDataTypes: ['cenc'],
videoCapabilities: [
{
contentType: 'video/mp4; codecs="avc1.42E01E"',
},
],
},
];
navigator
.requestMediaKeySystemAccess('com.widevine.alpha', config)
.then((access) => access.createMediaKeys())
.then((mediaKeys) => video.setMediaKeys(mediaKeys));
// start a video playback session here (not shown for brevity)
But Widevine isn't a web standard, it's Google's proprietary DRM system. And it's not universally available:
- Chrome, Edge, Firefox: Include Widevine by default (through licensing agreements with Google).
- Safari: Uses Apple's FairPlay DRM instead, requiring a completely different integration.
- Smaller browsers: Often don't have Widevine at all. Google's licensing imposes technical, legal, and business requirements, and those barriers are often prohibitive.
- Polypane: Supports Widevine (mostly), but we too are at the mercy of whether a Widevine module is available for the OS and hardware you use, as well as it being available in the version of Electron we use.
Streaming services like Netflix, Disney+, and Spotify require not just codec support, but specific DRM and won't work without it.
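You can probe which DRM systems a browser will actually grant access to. A hedged sketch; the key system identifiers and init data types are vendor conventions rather than part of the EME spec, so treat them as assumptions:
const probes = {
  'com.widevine.alpha': ['cenc'], // Widevine
  'com.microsoft.playready': ['cenc'], // PlayReady
  'com.apple.fps.1_0': ['sinf'], // FairPlay uses its own init data format
};
for (const [keySystem, initDataTypes] of Object.entries(probes)) {
  try {
    await navigator.requestMediaKeySystemAccess(keySystem, [
      {
        initDataTypes,
        videoCapabilities: [{ contentType: 'video/mp4; codecs="avc1.42E01E"' }],
      },
    ]);
    console.log(keySystem, 'is available');
  } catch {
    console.log(keySystem, 'is not available');
  }
}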
The way this is currently set up strongly disadvantages smaller browsers and enforces vendor lock-in.
AI APIs
Recently, browsers (Chrome) have started exposing "AI" capabilities directly through web APIs. Google has a number of these AI-backed APIs in development, like summarization, translation, prompting, writing, and proofreading. They have some impressive demos, and clever people working on them.
// Summarization
const summarizer = await Summarizer.create({
type: 'key-points',
expectedInputLanguages: ['en'],
outputLanguage: 'en',
});
const longText = document.querySelector('article').innerHTML;
const summary = await summarizer.summarize(longText, {
context: 'This article is intended for front-end developers.',
});
// Translation
const translator = await Translator.create({
sourceLanguage: 'en',
targetLanguage: 'es',
});
const translated = await translator.translate('Hello world');
The above APIs are not 'web standards' like the previous examples: they're not standardized and they're not available in other browsers. They're an experimental proposal by the Chrome team with the intent of standardizing them later.
The AI features in Chrome are only available in Chrome. They rely on a "small language model" called Gemini Nano, which is downloaded to your device when you first use it (...provided you have enough disk space and a good enough GPU). The model then runs locally on your device.
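Because of that on-demand download, the proposal includes an availability check. A sketch based on the explainer at the time of writing (this API surface is experimental and may well change):
const availability = await Summarizer.availability();
// one of 'unavailable', 'downloadable', 'downloading' or 'available'
if (availability === 'downloadable') {
  // create() kicks off a multi-gigabyte model download, so only start it
  // after the user opts in
  const summarizer = await Summarizer.create({
    monitor(m) {
      m.addEventListener('downloadprogress', (e) => {
        console.log(`Model download: ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
}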
Running the model locally is pretty cool, because it means you're not sending your users' data to a server where it can be logged, stored, used for training or sold to third parties. Your users' data stays on their device.
Unfortunately, "small" is only in comparison to the terabyte-scale models that power services like ChatGPT or Claude. Gemini Nano is still a 4GB download (varies per language) that your user needs to fetch, and it requires significant compute resources to run effectively. It's also fully proprietary. Only Google gets to use it.
A small model running on a user's device means that the same API can be used with a different model in a different browser. Theoretically that's cool.
However, while Microsoft will have no problem training its own small Copilot model, smaller companies simply don't have the same resources to scrape, steal, build, train and maintain LLM models, pay for the infrastructure to host them, and have each of their users download them.
That's simply something smaller browsers and open source projects can't compete with. That's not the web, it's proprietary lock-in by another name.
Why This Matters
These APIs blur the line between web standards and browser features in ways that have real consequences:
Portability Illusions
Code that works perfectly in Chrome might fail silently in Safari, not because of a bug or because you failed to check for availability, but because the underlying service infrastructure is different. Your geolocation might be accurate in one browser and wildly off in another, even on the same device for the same user.
Privacy Surprises
Your users might not realize that the feature you offer is sending their data to vendor servers. The Speech Recognition API feels local, but you're streaming audio to Google or Apple. Geolocation phones home to different companies depending on the browser. Web Push routes through vendor networks.
Web standards become a way for big players to keep out smaller competitors
The web is a level playing field. The web is open. The web is for everyone.
It's a nice idea.
But with the above APIs, the web is not a level playing field. Big players have the resources to build and maintain the services these APIs depend on. They can afford to offer them for free to their users because they have other ways of monetizing them (ads, ecosystem lock-in, data collection).
Standardization should guard against this kind of vendor lock-in. But instead, web specifications are mostly written by people working for these big players, because these companies can afford to employ people to work on specs full-time and to implement the specifications. And when there are implementations, specifications become standards.
The W3C has safeguards to prevent abuse, but in practice the process is weighted towards the wishes of large companies. The specification process is a way for the big players to entrench their dominance.
Smaller browsers and open source projects either need to rent access to the services that Google, Apple and Microsoft can afford to offer for free or deal with explaining to users why their browser can't offer the same features. This creates feature parity gaps that appear as browser bugs to end users.
What this means for developers
The above APIs are really useful. I'm not saying browsers are bad for having more capabilities. These are features we want to offer to our users, and having them "built right into the browser" feels great.
But we need to understand that these APIs are not truly portable web standards, and understand the consequences of using them for our users.
They're abstractions over vendor services, and we should treat them as such. We sign data processing agreements with all our vendors to keep our users' data safe. Why does the browser get a pass?
What Should We Do?
As developers, we need to be aware of these realities: the APIs you choose to use come with consequences.
Treat these APIs as abstractions over vendor services, not as truly portable web standards. Document which browsers you've tested with and what the fallback behavior is. Consider feature detection not just for API availability, but for quality and behavior.
Design for graceful degradation. Don't assume geolocation will be accurate, passkeys will sync, speech will sound the same everywhere, or push notifications will be delivered reliably. Build fallbacks.
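For example, a geolocation result that "works" can still be an IP-level guess. Each position reports its own accuracy radius in meters, which you can use to decide whether to trust it. A minimal sketch (the threshold and helpers are assumptions for illustration):
navigator.geolocation.getCurrentPosition((position) => {
  // coords.accuracy is a radius in meters; IP-based fallbacks are often
  // tens of kilometers wide
  if (position.coords.accuracy > 5000) {
    askUserToConfirmLocation(position); // hypothetical UI helper
  } else {
    useLocation(position); // hypothetical
  }
});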
Be transparent about privacy. If an API might send data to third-party servers, your users deserve to know. Especially if the browser abstracts it away. Document which vendor services your app depends on.
Consider the vendor relationship. When you use these APIs, you're implicitly depending on each browser vendor's ongoing commitment to maintaining their backend infrastructure. What happens if a vendor decides to deprecate a service or changes their pricing model?
This isn't just theoretical: There are vast swaths of web pages with broken Google Maps embeds because Google changed the deal and started requiring API keys (with usage limits and billing) for them.
Test across browsers and regions. That speech recognition that's accurate for you might not support your users' languages.
Plan for vendor lock-in. If you build heavily on one of these features, you're making a strategic choice to depend on that vendor's platform. That might be fine, but make it a conscious decision.
The web we think we have vs. The web we actually have
The promise of web standards is that you write code once and it works everywhere. But the above APIs are really thin wrappers around vendor-specific services.
The interface is standardized, but the implementation, its dependencies, limitations, and privacy implications are not.
When you call navigator.geolocation.getCurrentPosition(), you're not just using a web API. You're using Google Location Services or Apple's location servers. When you send a push notification, you're routing through FCM or APNs or Mozilla's infrastructure. When you use speech recognition, you're streaming audio to vendor cloud services.
The specification says what the function call looks like. But specifications don't tell you:
- Where your data is sent,
- How accurate or reliable the results will be,
- Whether it works in your users' regions,
- What the privacy implications are,
- Whether it will still work for free (or at all) next year.
This doesn't make these APIs bad or unusable. It just means we need to design our applications with a clearer understanding of where the standardization ends and the vendor specifics begin.
The web platform is powerful, but it's also more fractured than we often acknowledge. And that's worth keeping in mind every time we reach for an API that looks standard but depends on infrastructure that isn't.

