Mobile Device APIs Security
From MemberWiki
This wiki page is one of several wiki pages for the Mobile TF work on Mobile Device APIs. Here are the pages:
- Mobile Device APIs initiative home page
- Mobile Device APIs Objectives
- Mobile Device APIs Use Cases
- Mobile Device APIs Requirements
- Mobile Device APIs Security
- Mobile Device APIs Due Diligence Industry Survey
- Mobile Device APIs initiative todo list
Proposed conceptual framework for Mobile Device API security
This section attempts to capture some of the early thoughts about how to address security issues when the Web Runtime is granted access to Mobile Device APIs.
Terminology
Here are definitions for terms used in this proposal:
- Web Browser - A web browser is an application that allows a user to access and view Web pages, and typically includes features such as bookmark management, backwards/forwards buttons, and history management. The web browser uses a large and complicated software module that we call the Web Runtime (see below) for parsing and rendering web pages.
- Web Runtime - The Web Runtime is a large and complicated software module that knows how to parse and execute browser formats (e.g., HTML and JavaScript), render Web content to a display surface, and process user interaction (e.g., via keypad or touch screen) with that content. The Web Runtime is obviously a major part of a Web Browser, but can be used by other applications. On mobile devices, the Web Runtime is often used as the presentation engine for Installed Widgets (see below) and Installed Applications (see below). On desktop computers, the Web Runtime is used for a variety of tasks where HTML rendering is required, such as for online help systems. Note that a given system might have multiple Web Runtimes installed on it, and that a Web Runtime might have extended functionality due to plugins, such as Google Gears or other plugins that render proprietary presentation formats.
- System - The term "system" represents the installed software that manages and controls the mobile device hardware, presents a user interface to the user, and has the ability to launch applications, such as a Web browser. In our conceptual model, the "system" includes the operating system, plus the Web Runtime as a software service that can be invoked by Installed Widgets and Installed Applications.
- Installed widgets - Installed widgets represent custom mini-applications that a user can add to their system. On a desktop Macintosh, installed widgets are installed into and launched from the Apple Dashboard; on Windows Vista, the comparable feature is the Vista Sidebar. On mobile devices, installed widgets are often installed onto and launched from the device home screen. Installed widgets typically communicate with a remote service, such as a news feed or stock service. Many installed widgets will use the Web Runtime for rendering their content.
- Installed Applications - Installed applications represent the conventional notion of an executable program that is installed onto a system. The most obvious difference between an installed widget and an installed application is how they are launched. An installed widget is launched from the widget manager (e.g., Apple Dashboard) whereas an installed application is launched by other means (e.g., Start Menu on Windows). Historically, most installed applications were written in languages such as C, C++ or Java, but there is a recent trend towards authoring installed applications using web technologies such as HTML and JavaScript.
- Platform API - An API exposed by the native platform of the device.
- Scriptable API - An API that exposes device functionality to scripts within web applications.
- Security framework - The set of security capabilities responsible for ensuring actions are performed in a controlled manner.
- Trusted subsystem - An implementation of a scriptable API, making use either of platform or other scriptable APIs, that can be relied upon to expose only the functionality defined in its own API, and no additional functionality that may be available in any of the underlying APIs.
Key principles and implications
Access control belongs in the "system"
In order to keep the conceptual framework workably simple and to enable rigorous security implementation, the strawman proposal for the conceptual framework is that access control logic belongs somewhere in the amorphous "system" black box. Ultimately, it is the responsibility of the "hardware provider" (which typically represents shared responsibility between a cellular network provider such as Vodafone, a device manufacturer such as Nokia, and a Web Runtime developer such as the WebKit open source project) to make sure that the phone includes appropriate system software such that the user has a secure platform for running applications and interacting with the Web. We expect that rigorous security implementation requires coordinated development efforts across all levels of the system, particularly the Web Runtime and the operating system.
INSERT A GRAPHIC THAT SHOWS:
    Web Browser      Installed Widgets    Installed Applications
|S|      |                   |                      |            |
|Y| ----------------------- Web Runtime ------------------------ |
|S|                          |                                   | SECURITY
|T| --------------------- Operating System --------------------- |
|E|                          |                                   |
|M|                      Hardware
JavaScript logic can simply call the APIs
Because the proposal places security enforcement within the system, the implication on the JavaScript side is that all APIs are directly callable by JavaScript logic. If a call is made in a scenario where permission is not granted, then the API call will not succeed. (Note: In some scenarios, the JavaScript will choose to query about the availability of an API, or invoke a setup API that requests permission to use an API, before actually invoking the API.)
To illustrate with an example, one possible way that the APIs could be designed is for asynchronous results, such as the following: (Note: don't pay too much attention to the exact way the APIs are shown below)
function ErrorCallback(...) { ...}
function SuccessCallback(...) { ...}
OpenAjax.device.loc.getCurrentLocation(ErrorCallback,SuccessCallback);
Or via anonymous functions:
OpenAjax.device.loc.getCurrentLocation(
  function(...) {...error handler logic...},
  function(...) {...success handler logic...}
);
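As a minimal sketch of the failure path described above, the following stand-in shows a call that does not succeed because permission has not been granted. The `OpenAjax.device.loc` object here is a hypothetical mock written only for this illustration, as are the `permissionGranted` flag and the shapes of the error and position objects; none of these are defined APIs. The mock also calls back synchronously for brevity, whereas a real API would likely deliver results asynchronously.

```javascript
// Hypothetical mock of the API shown above; not a real implementation.
// The "system" decides whether the call succeeds; the script simply calls it.
const OpenAjax = {
  device: {
    loc: {
      permissionGranted: false, // assumption: controlled by the system, not by the page
      getCurrentLocation(errorCallback, successCallback) {
        if (!this.permissionGranted) {
          errorCallback({ code: "PERMISSION_DENIED" }); // hypothetical error shape
          return;
        }
        successCallback({ latitude: 0, longitude: 0 }); // placeholder values
      },
    },
  },
};

// The page does not check permissions itself; it only supplies both handlers.
const results = [];
OpenAjax.device.loc.getCurrentLocation(
  (error) => results.push("error: " + error.code),
  (position) => results.push("ok")
);
```

The point of the sketch is that the calling script contains no access control logic of its own: it learns the outcome only through which handler is invoked.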
API extensibility
The framework supports the creation of APIs by any party, and does not depend on any specific central definition or standardisation process for the APIs themselves.
The implications of this are given below.
APIs, including both interfaces and implementations of those interfaces, are identified by URI. The usual URI namespacing conventions can therefore be used to manage the namespace of API identifiers. Any centrally defined APIs can be defined within a URI belonging to the defining organisation, such as openajax.org or omtp.org.
Any web application (whether a web page or installed widget) can explicitly declare the APIs it intends to use. This declaration can take the form of a static declaration (e.g., in a manifest for a widget package), or can be a programmatic declaration (e.g., openajax.loadAPI('http://api.openajax.org/mobile/contacts'); ). This declaration allows for:
- a check that the runtime supports the API in question;
- dynamic provisioning of the API, in implementations that support this;
- a permission check that the page in question is permitted to bind to that API. This delivers flexibility and usability benefits to the security framework as compared with simply enforcing access control at the point that APIs are called, or APIs attempt to perform security-relevant platform operations.
- the ability of a runtime to resolve, based on local circumstances or other specified parameters, which particular implementation or configuration of an API to load or bind to. (This also mirrors the approach increasingly followed by the toolkits and web API frameworks, wherein APIs are programmatically loaded rather than being explicitly and directly referenced by a SCRIPT tag.)
- in the case of statically declared dependencies, the ability of a local widget manager to determine prior to installation whether or not a widget is capable of running on the target device.
- dealing with trusted subsystems (in implementations that support this). It is possible to define separate APIs, offering differing levels of access to the same resource, and selectively grant access at the API level.
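The first and third items in the list above (the support check and the bind-time permission check) can be sketched as follows. This is only an illustration under stated assumptions: `loadAPI` is written here as a standalone runtime-side function rather than the `openajax.loadAPI` call mentioned above, and `supportedApis`, `bindPolicy`, the page origin and the result shape are all hypothetical.

```javascript
// Hypothetical runtime-side sketch of a programmatic API declaration.
// supportedApis and bindPolicy are illustrative data, not part of any real runtime.
const supportedApis = new Set(["http://api.openajax.org/mobile/contacts"]);

// Assumed policy shape: page origin -> API URIs that origin may bind to.
const bindPolicy = new Map([
  ["https://example.org", ["http://api.openajax.org/mobile/contacts"]],
]);

function loadAPI(pageOrigin, apiUri) {
  // 1. Check that the runtime supports the API in question.
  if (!supportedApis.has(apiUri)) {
    return { bound: false, reason: "unsupported" };
  }
  // 2. Check that the page in question is permitted to bind to that API.
  const allowed = bindPolicy.get(pageOrigin) || [];
  if (!allowed.includes(apiUri)) {
    return { bound: false, reason: "not-permitted" };
  }
  // 3. A real runtime would now resolve and load a suitable implementation.
  return { bound: true, reason: null };
}
```

Performing the permission check at bind time means a page can discover up front that an API is unavailable to it, rather than failing part-way through an operation.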
Support for all Application Deployment models
The framework supports all of the application deployment models listed. Concretely, this means that:
- within the security framework, policies can be configured that depend both on URI identities and Signer identities;
- within an implementation, the mechanism for binding a page to an API (and applying any configured policy to the bind action) must be able to deal with both statically and dynamically declared dependencies.
Authentication of content separated from assignment of rights
Many mobile application security frameworks assign each application to a trust domain based on a classification of the installed root certificates on the device. An application would, for example, be assigned to an "operator" trust domain if it is signed, and the validation chain for its signature ends with a certificate that is classified as an "operator" certificate.
The trust domain in turn determines the rights given to the application.
In the proposed architecture, this specific dependency between establishing the authenticity of a web application and determining its rights is avoided. Instead, the rights assigned to any web application can be determined based on a configured policy that depends on the application identity as well as the root certificate by which that identity was verified.
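To make the distinction concrete, here is a minimal sketch of a policy lookup in which rights depend on the application identity together with the root certificate that verified it. The policy entries, certificate names, right names and the `rightsFor` function are all hypothetical, chosen only for illustration.

```javascript
// Hypothetical policy table: rights are keyed on BOTH the verified application
// identity and the root certificate by which that identity was verified,
// rather than on a trust domain derived from the root certificate alone.
const policy = [
  { identity: "CN=Example Widgets Ltd", root: "Operator Root CA", rights: ["location", "contacts"] },
  { identity: "CN=Example Widgets Ltd", root: "Third Party Root CA", rights: ["location"] },
];

function rightsFor(identity, root) {
  const entry = policy.find((p) => p.identity === identity && p.root === root);
  return entry ? entry.rights : []; // no matching entry: no rights are assigned
}
```

With this arrangement, two applications verified by the same root certificate can be given different rights, which a pure trust-domain model does not allow.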
Support for trusted subsystems
The security framework, responsible for enforcement of a configured security policy, by default performs a check on every attempt to call a scriptable API, and on every attempt by the implementation of that API to call an underlying platform API.
However, when the implementation of a scriptable API is known to be a trusted subsystem, an attempted call to a platform API from the trusted subsystem is permitted provided that that trusted subsystem is both authorised to access that API and trusted to not allow untrusted content to access the trusted subsystem to bypass the platform's security policy. Using this, scriptable APIs and trusted implementations can be created that selectively grant access to the resources exposed by a platform API in circumstances where full access to the platform API is not allowable.
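A rough sketch of this idea follows. It assumes a hypothetical platform operation `readContacts` that returns full address book records, and a trusted subsystem `contactNames` that exposes only the names. How the security framework reliably identifies the caller is not addressed here; the caller is simply passed as a string, which is an assumption made for brevity.

```javascript
// Hypothetical platform API: full access to the address book.
const platform = {
  readContacts: () => [
    { name: "Alice", phone: "555-0100" },
    { name: "Bob", phone: "555-0101" },
  ],
};

// Assumed policy: which callers are authorised for which platform operation.
const platformPolicy = new Map([
  ["readContacts", new Set(["contactNamesSubsystem"])],
]);

// Stand-in for the security framework's check on platform API calls.
function callPlatform(caller, operation) {
  const allowedCallers = platformPolicy.get(operation);
  if (!allowedCallers || !allowedCallers.has(caller)) {
    throw new Error("denied: " + caller + " -> " + operation);
  }
  return platform[operation]();
}

// Trusted subsystem: a scriptable API that exposes only contact names,
// never the phone numbers available through the underlying platform API.
const contactNames = {
  list: () => callPlatform("contactNamesSubsystem", "readContacts").map((c) => c.name),
};
```

Access to `contactNames` can then be granted to content that would never be allowed to call `readContacts` directly.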
Allow the industry to innovate around security policies and user experience
Because the whole field of allowing the Web Runtime access to device APIs is new to the industry, and because security is a complex subject with user experience implications, this strawman proposes that we do not attempt to dictate any particular security policies nor attempt to dictate any particular user interface requirements. For example, we should not attempt to dictate things such as "If a Web page wants to access the phone's current geographic location, the user must be prompted to approve this action", because this quoted sentence contains both an implied security policy (i.e., all Web pages must be prompted before they can access current location) and an implied user experience (i.e., the user must respond to a prompt). Instead, the industry very well might discover that it is both desirable and appropriate to allow some Web pages to access current location without requiring explicit user approval.
Therefore, instead of attempting to dictate what the industry must do about security, our proposal is to provide educational materials that explain the key considerations and provide examples of techniques that we believe address those considerations.
Framework definitions
User
The person who is operating the device.
Agent
A web application, either remotely hosted on a web server or locally installed as a widget, is considered to be an agent.
Various scenarios might involve a set of agents, particularly in the case of a composite application, such as a mashup, where different components interact with different remote services. In that case there might be a single primary domain along with a set of other domains that are involved in the application. (Note: Often, in the desktop Web, the browser's security policies are centered only on the primary domain.)
Agent identity
Two main identity types appear to be relevant:
- the most familiar model, for packaged applications or widgets that are installable, is the Distinguished Name (DN) of the code signing certificate associated with the signature on the package. The formalised security frameworks for many mobile environments, including MIDP/JavaME, base an agent's rights on this identity. However, a DN identity may also be a relevant agent identity in other situations, such as where a signed script or web page is loaded from a website using the jar: protocol.
- the most natural model for remotely hosted websites viewed in a browser is to treat the URI, or part of the URI (e.g., the domain), as the agent's identity. This usage is probably most familiar from the security configuration in IE, where websites are assigned to security zones based on URI (in fact, on the domain part of the URI, together with knowledge as to whether or not that domain has been verified using HTTPS). Given the domain-level "sandboxing" that is applied in the browser security model, the domain part of the URI is the part that can most naturally and reliably be considered to be the agent identity in this case. Again, this URI/domain identity type might also be relevant in other situations, such as where an unsigned installable package is downloaded from a given site.
Identity in multiple-origin applications
In multiple-origin application models, it is pertinent to ask whether or not there are multiple agents, or multiple agent identities in play.
The simplest kind of multiple-origin application is where a web page refers to a third-party script from another domain, and that script executes within the context of the web page. If the script attempts an action (such as a call to a particular device API), it is necessary to decide whether it is the identity (i.e., domain) of the containing page, or the domain of origin of the script (or both) that is relevant to deciding whether or not to permit the action. The conventional browser security model makes no distinction between the rights or capabilities of scripts from different origins once they are in scope within a given page, so it is not possible in practice to rely on any identity in this case other than the identity (i.e., domain) of the containing page. However, where a page makes use of multiple frames, or some other technique that reliably separates code and markup from different origins, then it might be valid to treat the code directly attempting the action, and the referring page, as separate agents with independent identities for the purposes of a security policy.
Reliable separation of this kind might also be possible with other forms of multiple-origin applications (e.g., where code in a page calls a plugin, which in turn attempts a specific action).
Implied security requirements
A security system should include support for multiple systems of identity for agents, including at least Distinguished Name and URI identity types.
A security system should reliably determine the relevant agent and identity for any attempted action, including multiple-origin applications. (Add some comment here about trusted subsystems where there is a reliable way of separating the agent/identities involved in a sequence of events leading to an action.)
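As an illustration of these two requirements, the sketch below represents both identity types and applies the reasoning above about multiple-origin pages. The object shapes and the function names `uriIdentity`, `dnIdentity` and `relevantIdentities` are assumptions made for this example; `URL` is the standard global available in browsers and Node.js.

```javascript
// Hypothetical representation of a URI-based agent identity.
function uriIdentity(pageUri) {
  const url = new URL(pageUri);
  return {
    type: "uri",
    value: url.hostname,                 // the domain part is used as the identity
    verified: url.protocol === "https:", // assumption: HTTPS stands in for "domain verified"
  };
}

// Hypothetical representation of a Distinguished Name identity.
function dnIdentity(distinguishedName) {
  return { type: "dn", value: distinguishedName };
}

// For a third-party script running inside a page, only the containing page's
// identity is reliable, unless origins are reliably separated (e.g., by frames).
function relevantIdentities(containingPageUri, codeOriginUri, reliablySeparated) {
  const page = uriIdentity(containingPageUri);
  return reliablySeparated ? [uriIdentity(codeOriginUri), page] : [page];
}
```

Whether separation is in fact reliable is a judgment the runtime has to make; the boolean parameter here simply stands in for that decision.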
Action
Actions are uniquely identified by the combination of:
- a scheme identifier;
- an action identifier;
written as <scheme>:<action>
So, for example, a particular scheme might be used to define a series of actions corresponding to platform API calls, and another scheme might be used to define actions relating to scriptable API calls. In WebVM, where the platform APIs correspond to (already defined) Java platform API calls, we use the scheme midp: and action identifiers based on Java's own permission names. In this case, a fully qualified action identifier might be:
midp:javax.microedition.location.Location
Different platforms might have their own naming scheme for platform APIs and/or permissions. We might also define an ooa: scheme, and a series of specific actions within that scheme, together with a mapping to equivalent actions in other schemes.
Platform API
All low-level security-relevant platform operations, according to some naming scheme. In WebVM, these are named according to the Java package/class/API namespace, since these are the low-level platform operations that are mediated by the system.
The scheme identifier is midp.
An example of a fully qualified action identifier is midp:javax.microedition.location.read.
Other platforms may define their own platform API namespace and API names.
A standardised set of platform API names, using the openajax namespace id, could be defined, which simply become aliases for platform-specific names within given implementations.
Bind
This is the action of binding an agent to a scriptable API.
The fully qualified identifier is system:bind.
Scriptable API
This is the action of calling a function declared in a specific scriptable API.
The scheme identifier is script.
The general form of the action identifier is script:<scriptable API URI>!<function name>
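The identifier forms above can be split into their parts with a small helper. This is a sketch only: `parseAction` and the field names it returns are hypothetical, and it assumes that the first colon ends the scheme and that the last `!` separates a scriptable API URI from the function name.

```javascript
// Split a fully qualified action identifier of the form <scheme>:<action>.
function parseAction(identifier) {
  const colon = identifier.indexOf(":"); // the first colon ends the scheme
  if (colon <= 0) {
    throw new Error("missing scheme: " + identifier);
  }
  const parsed = {
    scheme: identifier.slice(0, colon),
    action: identifier.slice(colon + 1),
  };
  if (parsed.scheme === "script") {
    // script:<scriptable API URI>!<function name>
    // The URI may itself contain ':', so only the last '!' is treated as the separator.
    const bang = parsed.action.lastIndexOf("!");
    if (bang < 0) {
      throw new Error("missing function name: " + identifier);
    }
    parsed.apiUri = parsed.action.slice(0, bang);
    parsed.functionName = parsed.action.slice(bang + 1);
  }
  return parsed;
}
```

For example, `parseAction("system:bind")` yields the scheme `system` and the action `bind`, while a `script:` identifier additionally yields the API URI and the function name.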
Query
A security Query represents a specific attempt to perform an Action, and encapsulates the information on which the corresponding security decision is made. A Query refers to the (identifier of the) attempted Action.
Certain Queries, in addition to the Action, contain information relating to the context of the specific attempt. In the case of an attempt to open a file, for example, this context information might include the path and name of the file. The definition of each specific Action also defines whether or not there is any expected context information associated with that Action.
Ruleset
A Ruleset loosely corresponds to the idea of a security domain; that is, it can be used to model a collection of sites across which a particular set of permissions apply.
Each Ruleset includes at least one Rule and at least one Identity. A Ruleset exists to associate the Rules with their Identities.
Rule
A Rule indicates whether permission for a particular Action (such as binding to a library or sending an SMS) should be granted or denied. A Rule can have one of three outcomes:
- permission is Granted;
- permission is Denied;
- the check should be referred to the User. In this case, a security prompt should appear, allowing the user to make the decision.
In each Ruleset, a Rule can be created for any or all of the Actions. (The Rules will only apply to Agents whose URI/DN matches one of the Identities.)
In addition to referring to an Action, a Rule can also optionally include a Constraint, which is an additional qualifier that specifies whether or not a given Query matches the Rule based on the context information. For example, a Rule governing the File Open action can specify a constraint limiting permission to files in a given directory.
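The File Open example could be expressed along the following lines. Everything here is a hypothetical sketch: the action name `ooa:file.open`, the directory, the outcome strings and the representation of a Constraint as a function over the Query's context information are all assumptions.

```javascript
// Hypothetical Rule for a File Open action, constrained to one directory.
const rule = {
  action: "ooa:file.open", // assumed action name within an assumed ooa: scheme
  outcome: "granted",      // one of "granted", "denied", "prompt"
  constraint: (context) => context.path.startsWith("/widgets/data/"),
};

// A Rule matches a Query if the Actions match and any Constraint is satisfied.
function ruleMatches(rule, query) {
  if (rule.action !== query.action) {
    return false;
  }
  // A Rule without a Constraint matches on the Action alone.
  return rule.constraint ? rule.constraint(query.context) : true;
}
```

A real implementation would need to normalise paths before comparing them, since a simple prefix test like this one can be bypassed with `..` segments.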
Security configuration
A Security Configuration comprises a number of Rulesets.
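One possible data shape for a Security Configuration is sketched below, together with a check of the requirement that each Ruleset includes at least one Rule and at least one Identity. The field names `identities`, `rules`, `action` and `outcome`, and the sample identities, are assumptions made for illustration.

```javascript
// Hypothetical data shape: a Security Configuration is a list of Rulesets,
// and each Ruleset associates a set of Rules with a set of Identities.
const securityConfiguration = [
  {
    identities: ["maps.example.com", "CN=Example Widgets Ltd"],
    rules: [
      { action: "system:bind", outcome: "granted" },
      { action: "midp:javax.microedition.location.read", outcome: "prompt" },
    ],
  },
];

// Each Ruleset must include at least one Rule and at least one Identity.
function isValidRuleset(ruleset) {
  return Array.isArray(ruleset.identities) && ruleset.identities.length > 0 &&
         Array.isArray(ruleset.rules) && ruleset.rules.length > 0;
}
```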
Vulnerabilities
These are the set of known security vulnerabilities that might apply to various service requests. For example, allowing API access to the address book represents a privacy vulnerability. Allowing API access to the phone dialer represents a financial vulnerability because unapproved phone dialing might result in telephony charges to the user. Vulnerabilities might be assigned strength levels, such as distinguishing between a "privacy vulnerability" and a "serious privacy vulnerability".
Authorization methods
An operation must be authorized before it can occur. There are a variety of ways that authorization can be granted. The simplest is that the system grants universal access to particular operations. For example, perhaps universal access would be granted to access the current date and time. At the other extreme, other operations might require "root" access, where only the operating system itself and/or a system administrator (for desktop computers) can perform a particular operation. The more interesting area is between the two extremes, where authorization might be granted via a user prompt, or by virtue of a particular software program having a digital signature that can be verified by a particular authority, or because the software is known to come from a particular software supplier that is deemed to be trustworthy.
Permission check procedure
Each Security Context (loosely corresponding to a particular scope within the execution of an application) is associated with a Security Configuration - that is, a collection of Rulesets.
Any attempted Action that arises in that context, giving rise to a corresponding Query, is checked against that Security Configuration, and allowed or declined, on the basis of the following procedure:
- each Ruleset is checked to determine whether or not the Security Context's Identity belongs to that Ruleset. If it does, the Rules in that Ruleset are added to the set of Applicable Rules. If not, no Rules in that Ruleset are considered further.
- each Rule in the set of Applicable Rules is checked to determine whether or not the Rule matches the Query. A Rule is considered to match if the Rule's Action matches the Query's Action, and if any context information in the Query matches any Constraint in the Rule. If the Rule matches the Query, the Rule's outcome is added to the set of applicable outcomes. If not, the Rule is not considered further. If there are no matching Rules, the Query is denied.
- the most restrictive outcome of all of the applicable outcomes is determined.
- If that outcome is Granted, the Action is allowed to proceed. If the outcome is Denied, the Action is not allowed to proceed and an error is returned.
- If the outcome is that the check should be referred to the user, an interactive prompt is presented that shows relevant details relating to the Security Context and Query, to which the user can Grant or Deny permission. The Action is allowed to proceed or not depending on the user's response.
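The procedure above can be sketched as a single function. This is an illustration under stated assumptions, not a definitive implementation: the data shapes, the outcome strings and the `promptUser` callback are hypothetical, and the ordering used for "most restrictive" (Denied over a user prompt over Granted) is an assumption, since the text does not define it.

```javascript
// Assumed ordering of outcomes from least to most restrictive.
const RESTRICTIVENESS = { granted: 0, prompt: 1, denied: 2 };

function checkPermission(securityConfiguration, identity, query, promptUser) {
  // Step 1: collect the Rules of every Ruleset that the identity belongs to.
  const applicableRules = securityConfiguration
    .filter((ruleset) => ruleset.identities.includes(identity))
    .flatMap((ruleset) => ruleset.rules);

  // Step 2: keep the outcomes of the Rules that match the Query.
  const outcomes = applicableRules
    .filter((rule) => rule.action === query.action &&
      (!rule.constraint || rule.constraint(query.context)))
    .map((rule) => rule.outcome);

  // If there are no matching Rules, the Query is denied.
  if (outcomes.length === 0) {
    return false;
  }

  // Step 3: determine the most restrictive of the applicable outcomes.
  const outcome = outcomes.reduce((a, b) =>
    (RESTRICTIVENESS[a] >= RESTRICTIVENESS[b] ? a : b));

  // Steps 4 and 5: grant, deny, or refer the decision to the user.
  if (outcome === "granted") return true;
  if (outcome === "denied") return false;
  return promptUser(identity, query);
}
```

The function returns `true` when the Action may proceed and `false` otherwise; a caller would translate `false` into the error returned to the script.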
What OpenAjax Alliance should do
The strawman proposal proposes that OpenAjax Alliance produce a white paper that describes:
- Industry standard metadata for each of the above 4 categories:
- How to describe the user and various agents that might participate in an attempt to access device APIs, and in what ways their identities are established
- A breakdown of the kinds of actions (in particular, device APIs) that we can expect the industry will want to invoke from the Web Runtime
- A list of vulnerabilities that the industry needs to guard against
- A list of authorization methods
The white paper would then describe common scenarios in which particular sets of agents attempt to access particular APIs, list some of the vulnerabilities that arise, and suggest approaches that the system provider might take to address those vulnerabilities while still providing an intuitive and straightforward user experience. For example, where maps.google.com wants to access the current device location, our write-up might discuss the need for the user to establish trust with maps.google.com before it is informed of the user's current location. We might highlight that there are privacy concerns, but probably not serious privacy concerns, and therefore suggest that it would be reasonable for the system to require a one-time user approval (we would list some authorization methods that might be used) for the web site (either google.com or the more restrictive maps.google.com) to access that information.
The goal of the white paper would not be to dictate the one and only solution to security problems; instead, the goal is simply to provide information that the industry will find useful.
The white paper would cover all 3 common uses of the Web Runtime (i.e., browser, installed widgets, and installed applications).
Other considerations
- Need to think about whether the security architecture needs to have hooks such that commercial security products, such as from Norton or McAfee, can plug in and offer more robust security services.
