Optimizing Content for the Assistant in Android

Android 6.0 Marshmallow introduces a new way for users to engage with apps through the assistant.

Users summon the assistant with a long-press on the Home button or by saying the keyphrase. In response to the long-press, the system opens a top-level window that displays contextually relevant actions for the current activity. These potential actions might include deep links to other apps on the device.

This guide explains how Android apps use Android’s Assist API to improve the assistant user experience.

Using the Assist API

The example below shows how Google Now integrates with the Android assistant using a feature called Now on Tap.

The assistant overlay window in our example (Figure 1, steps 2 and 3) is implemented by Google Now through a feature called Now on Tap, which works in concert with platform-level Android functionality. The system lets the user select the assistant app (Figure 2), which obtains contextual information from the source app through the Assist API, a part of the platform.

Figure 1. Assistant interaction example with the Now on Tap feature of Google Now

An Android user first configures the assistant and can change system options, such as whether the assistant may use the text and view hierarchy and the screenshot of the current screen (Figure 2).

From there, the assistant receives the information only when the user activates assistance, for example by tapping and holding the Home button (shown in Figure 1, step 1).

Figure 2. Assist & voice input settings (Settings/Apps/Default Apps/Assist & voice input)

Assist API Lifecycle

Going back to our example from Figure 1, the Assist API callbacks are invoked in the source app after step 1 (user long-presses the Home button) and before step 2 (the assistant renders the overlay window). Once the user selects the action to perform (step 3), the assistant executes it, for example by firing an intent with a deep link to the (destination) restaurant app (step 4).

Source App

In most cases, your app does not need to do anything extra to integrate with the assistant if you already follow accessibility best practices. This section describes how to provide additional information to help improve the assistant user experience, as well as scenarios, such as custom Views, that need special handling.

Share Additional Information with the Assistant

In addition to the text and the screenshot, your app can share additional information with the assistant. For example, your music app can choose to pass current album information, so that the assistant can suggest smarter actions tailored to the current activity.

To provide additional information to the assistant, your app provides global application context by registering an app listener and supplies activity-specific information with activity callbacks as shown in Figure 3.

Figure 3. Assist API lifecycle sequence diagram.

To provide global application context, the app creates an implementation of Application.OnProvideAssistDataListener and registers it using registerOnProvideAssistDataListener(android.app.Application.OnProvideAssistDataListener). To provide activity-specific contextual information, the activity overrides onProvideAssistData(android.os.Bundle) and onProvideAssistContent(android.app.assist.AssistContent). The two activity methods are called after the optional global callback (registered with registerOnProvideAssistDataListener(android.app.Application.OnProvideAssistDataListener)) is invoked. Because the callbacks execute on the main thread, they should complete promptly. The callbacks are invoked only when the activity is running.
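As a rough sketch of the global registration described above, the snippet below registers a listener from a hypothetical Application subclass. The class name MusicApplication, the extra key, and its value are assumptions made for illustration; only the framework calls come from the platform. It is meant to be compiled inside an Android app project, with the class declared in the manifest through android:name.

```java
import android.app.Activity;
import android.app.Application;
import android.os.Bundle;

// Hypothetical Application subclass, used only for illustration.
public class MusicApplication extends Application {
    // Hypothetical extra key; an assistant would need to know it to read the value.
    private static final String EXTRA_APP_MODE = "com.example.music.APP_MODE";

    @Override
    public void onCreate() {
        super.onCreate();
        // Global callback: invoked before the activity-level callbacks.
        registerOnProvideAssistDataListener(new Application.OnProvideAssistDataListener() {
            @Override
            public void onProvideAssistData(Activity activity, Bundle data) {
                // Runs on the main thread, so keep the work trivial.
                data.putString(EXTRA_APP_MODE, "streaming"); // placeholder value
            }
        });
    }
}
```

Registering once in onCreate() keeps the listener alive for the lifetime of the process; a matching unregisterOnProvideAssistDataListener call exists for cases where the listener should be removed.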

Providing Context

onProvideAssistData(android.os.Bundle) is called when the user requests the assistant, so that the system can build a full ACTION_ASSIST Intent with all of the context of the current application, represented as an instance of AssistStructure. You can override this method to place into the bundle anything you would like to appear in the EXTRA_ASSIST_CONTEXT part of the assist Intent.
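A minimal sketch of such an override follows. The activity name AlbumActivity, the extra key, and the placeholder value are hypothetical; the snippet assumes it lives in an Android app project.

```java
import android.app.Activity;
import android.os.Bundle;

// Hypothetical activity, used only for illustration.
public class AlbumActivity extends Activity {
    // Hypothetical extra key; an assistant would need to know it to read the value.
    private static final String EXTRA_ALBUM_ID = "com.example.music.ALBUM_ID";

    @Override
    public void onProvideAssistData(Bundle data) {
        super.onProvideAssistData(data);
        // Values placed here end up in the EXTRA_ASSIST_CONTEXT part of the assist Intent.
        data.putString(EXTRA_ALBUM_ID, "album-123"); // placeholder value
    }
}
```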

Describing Content

Your app can implement onProvideAssistContent(android.app.assist.AssistContent) to improve the assistant user experience by providing references to content related to the current activity. You can describe the app content using the common vocabulary defined by Schema.org through a JSON-LD object. In the example below, a music app provides structured data to describe the music album the user is currently looking at.

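@Override
public void onProvideAssistContent(AssistContent assistContent) {
  super.onProvideAssistContent(assistContent);
  try {
    String structuredJson = new JSONObject()
         .put("@type", "MusicRecording")
         .put("@id", "https://example.com/music/recording")
         .put("name", "Album Title")
         .toString();
    assistContent.setStructuredData(structuredJson);
  } catch (JSONException e) {
    // put() declares JSONException; provide no structured data on failure.
  }
}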

Custom implementations of onProvideAssistContent(android.app.assist.AssistContent) may also adjust the provided content intent to better reflect the top-level context of the activity, supply the URI of the displayed content, and call setClipData(android.content.ClipData) with additional content of interest that the user is currently viewing.

Default Implementation

If neither the onProvideAssistData(android.os.Bundle) nor the onProvideAssistContent(android.app.assist.AssistContent) callback is implemented, the system still proceeds and passes the automatically collected information to the assistant, unless the current window is flagged as secure. As shown in Figure 3, the system uses the default implementations of onProvideStructure(android.view.ViewStructure) and onProvideVirtualStructure(android.view.ViewStructure) to collect text and view hierarchy information. If your view implements custom text drawing, you should override onProvideStructure(android.view.ViewStructure) to provide the assistant with the text shown to the user by calling setText(java.lang.CharSequence).
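The custom text drawing case can be sketched as follows. LyricsView and its setLyrics method are hypothetical names chosen for illustration, and the drawing code is deliberately simplistic; the snippet assumes an Android app project.

```java
import android.content.Context;
import android.graphics.Canvas;
import android.graphics.Paint;
import android.view.View;
import android.view.ViewStructure;

// Hypothetical view that draws its own text instead of using a TextView.
public class LyricsView extends View {
    private final Paint paint = new Paint(Paint.ANTI_ALIAS_FLAG);
    private CharSequence lyrics = "";

    public LyricsView(Context context) {
        super(context);
        paint.setTextSize(48f);
    }

    public void setLyrics(CharSequence text) {
        lyrics = text;
        invalidate(); // redraw with the new text
    }

    @Override
    protected void onDraw(Canvas canvas) {
        super.onDraw(canvas);
        canvas.drawText(lyrics, 0, lyrics.length(), 0f, paint.getTextSize(), paint);
    }

    @Override
    public void onProvideStructure(ViewStructure structure) {
        super.onProvideStructure(structure);
        // Report the manually drawn text so the assistant can read it.
        structure.setText(lyrics);
    }
}
```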

In most cases, implementing accessibility support will enable the assistant to obtain the information it needs. This includes providing android:contentDescription attributes, populating AccessibilityNodeInfo for custom views, making sure custom ViewGroups correctly expose their children, and following the best practices described in “Making Applications Accessible”.

Excluding views from the assistant

An activity can exclude the current view from the assistant by setting the FLAG_SECURE layout parameter of the WindowManager. This must be done explicitly for every window created by the activity, including Dialogs. Your app can also use SurfaceView.setSecure to exclude a surface from the assistant. There is no global (app-level) mechanism to exclude all views from the assistant. Note that FLAG_SECURE does not stop the Assist API callbacks from firing. An activity that uses FLAG_SECURE can still explicitly provide information to the assistant using the callbacks described earlier in this guide.
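A minimal sketch of the per-window rule, assuming a hypothetical PaymentActivity that also shows a dialog. The class name, the showConfirmation method, and the dialog text are illustrative only; the snippet assumes an Android app project.

```java
import android.app.Activity;
import android.app.AlertDialog;
import android.os.Bundle;
import android.view.WindowManager;

// Hypothetical activity that displays sensitive content.
public class PaymentActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // Mark the activity's own window as secure.
        getWindow().setFlags(WindowManager.LayoutParams.FLAG_SECURE,
                WindowManager.LayoutParams.FLAG_SECURE);
    }

    private void showConfirmation() {
        AlertDialog dialog = new AlertDialog.Builder(this)
                .setMessage("Confirm payment?") // placeholder text
                .create();
        // A dialog has its own window, so the flag must be set again.
        dialog.getWindow().setFlags(WindowManager.LayoutParams.FLAG_SECURE,
                WindowManager.LayoutParams.FLAG_SECURE);
        dialog.show();
    }
}
```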

Voice Interactions

Assist API callbacks are also invoked upon keyphrase detection. For more information, see the voice actions documentation.

Z-order considerations

The assistant uses a lightweight overlay window displayed on top of the current activity. The assistant can be summoned by the user at any time. Therefore, apps should not create permanent system alert windows interfering with the overlay window shown in Figure 4.

Figure 4. Assist layer Z-order.

If your app uses system alert windows, it must remove them promptly, because leaving them on the screen degrades the user experience.

Destination App

The matching between the current user context and potential actions displayed in the overlay window (shown in step 3 in Figure 1) is specific to the assistant’s implementation. However, consider adding deep linking support to your app. The assistant will typically take advantage of deep linking. For example, Google Now uses deep linking and App Indexing in order to drive traffic to destination apps.
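On the destination side, deep linking support comes down to mapping an incoming URI to a screen. The sketch below shows that mapping in plain Java using only java.net.URI. The link pattern (https://example.com/restaurant/<id>), the class name DeepLinkRouter, and the method parseRestaurantId are all hypothetical; in an Android app the link string would typically come from the data of the launching intent.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Optional;

// Hypothetical helper: maps links such as https://example.com/restaurant/42 to an id.
final class DeepLinkRouter {
    private static final String HOST = "example.com";    // assumed host
    private static final String PREFIX = "/restaurant/"; // assumed path prefix

    private DeepLinkRouter() {}

    /** Returns the restaurant id if the link matches the assumed pattern. */
    public static Optional<String> parseRestaurantId(String link) {
        if (link == null) {
            return Optional.empty();
        }
        final URI uri;
        try {
            uri = new URI(link);
        } catch (URISyntaxException e) {
            return Optional.empty(); // malformed link
        }
        String path = uri.getPath();
        if (!"https".equalsIgnoreCase(uri.getScheme())
                || !HOST.equalsIgnoreCase(uri.getHost())
                || path == null
                || !path.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String id = path.substring(PREFIX.length());
        // Reject empty ids and nested paths such as /restaurant/42/menu.
        if (id.isEmpty() || id.contains("/")) {
            return Optional.empty();
        }
        return Optional.of(id);
    }
}
```

Returning an empty Optional for anything that does not match lets the caller fall back to the app's default screen instead of failing on an unexpected link.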

Implementing your own assistant

Some developers may wish to implement their own assistant. As shown in Figure 2, the Android user can select the active assistant app. The assistant app must provide an implementation of VoiceInteractionSessionService and VoiceInteractionSession as shown in this example, and it requires the BIND_VOICE_INTERACTION permission. It can then receive the text and view hierarchy, represented as an instance of AssistStructure, in onHandleAssist(). The assistant receives the screenshot through onHandleScreenshot().
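The shape of those two classes might look like the sketch below. The class names MyAssistSessionService and MyAssistSession are hypothetical, and the manifest declarations an assistant needs are omitted. The three-argument onHandleAssist overload shown here is the Android 6.0 form; newer API levels are understood to deprecate it in favor of an overload that takes a single state object, so treat this as an outline rather than a complete assistant.

```java
import android.app.assist.AssistContent;
import android.app.assist.AssistStructure;
import android.content.Context;
import android.graphics.Bitmap;
import android.os.Bundle;
import android.service.voice.VoiceInteractionSession;
import android.service.voice.VoiceInteractionSessionService;

// Hypothetical session service; it must be declared in the manifest and
// protected with the BIND_VOICE_INTERACTION permission.
public class MyAssistSessionService extends VoiceInteractionSessionService {
    @Override
    public VoiceInteractionSession onNewSession(Bundle args) {
        return new MyAssistSession(this);
    }

    // Hypothetical session that receives the assist data.
    static class MyAssistSession extends VoiceInteractionSession {
        MyAssistSession(Context context) {
            super(context);
        }

        @Override
        public void onHandleAssist(Bundle data, AssistStructure structure,
                AssistContent content) {
            // structure carries the text and view hierarchy; content carries what the
            // source app supplied in onProvideAssistContent. Check each argument for
            // null, since the user or device policy may have disabled assist data.
        }

        @Override
        public void onHandleScreenshot(Bitmap screenshot) {
            // screenshot may be null when screenshots are disabled.
        }
    }
}
```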
