Device Power Management Model
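The following sections describe the device power management model in detail. The
model includes the following elements: components, busy and idle states, power levels,
dependencies, automatic power management policy, and the driver interfaces and entry points.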
Power Management Components
A device is power manageable if the power consumption of the device can
be reduced when the device is idle. Conceptually, a power-manageable device consists of
a number of power-manageable hardware units that are called components.
The device driver notifies the system about device components and their associated
power levels. Accordingly, the driver creates a pm-components(9P) property in the driver's
attach(9E) entry point as part of driver initialization.
Most devices that are power manageable implement only a single component. An example
of a single-component, power-manageable device is a disk whose spindle motor can
be stopped to save power when the disk is idle.
If a device has multiple power-manageable units that are separately controllable, the device
should implement multiple components.
An example of a two-component, power-manageable device is a frame buffer card with
a monitor. Frame buffer electronics is the first component [component 0]. The frame
buffer's power consumption can be reduced when not in use. The monitor is
the second component [component 1]. The monitor can also enter a lower power
mode when the monitor is not in use. The frame buffer electronics and
monitor are considered by the system as one device with two components.
Multiple Power Management Components
To the power management framework, all components are considered equal and completely independent
of each other. If the component states are not completely compatible, the device
driver must ensure that undesirable state combinations do not occur. For example,
a frame buffer/monitor card has the following possible states: D0, D1, D2,
and D3. The monitor attached to the card has the following potential states:
On, Standby, Suspend, and Off. These states are not necessarily compatible
with each other. For example, if the monitor is On, then the frame
buffer must be at D0, that is, full on. If the frame buffer
driver gets a request to power up the monitor to On while the
frame buffer is at D3, the driver must call pm_raise_power(9F) to
bring the frame buffer up before setting the monitor On. System requests to lower
the power of the frame buffer while the monitor is On must be
refused by the driver.
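The compatibility rule in this example can be expressed as a small predicate that a driver consults before it honors a request. The following is a minimal userland sketch, not driver code. The macro names and the functions xx_combination_allowed() and xx_fb_request_ok() are hypothetical, and the numeric mapping (0 for D3 or Off, 3 for D0 or On) is an assumption. Only the single rule stated above is encoded: a monitor that is On requires the frame buffer to be at D0.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed numeric mapping: 0 is the lowest level, 3 is full on. */
#define XX_FB_D3        0   /* frame buffer off */
#define XX_FB_D0        3   /* frame buffer full on */
#define XX_MON_OFF      0
#define XX_MON_ON       3

/*
 * Return true if the two levels may coexist. Only the rule from the
 * text is encoded: monitor On requires the frame buffer at D0.
 */
bool
xx_combination_allowed(int fb_level, int mon_level)
{
    assert(fb_level >= XX_FB_D3 && fb_level <= XX_FB_D0);
    assert(mon_level >= XX_MON_OFF && mon_level <= XX_MON_ON);

    if (mon_level == XX_MON_ON && fb_level != XX_FB_D0)
        return (false);
    return (true);
}

/*
 * Decide whether a request to move the frame buffer to 'wanted' can be
 * accepted while the monitor stays at 'mon_now'.
 */
bool
xx_fb_request_ok(int wanted, int mon_now)
{
    return (xx_combination_allowed(wanted, mon_now));
}
```

With a check of this kind, a request to lower the frame buffer while the monitor is On is refused, and a request to turn the monitor On while the frame buffer is below D0 tells the driver that the frame buffer must be raised first.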
Power Management States
Each component of a device can be in one of two states:
busy or idle. The device driver notifies the framework of changes in
the device state by calling pm_busy_component(9F) and pm_idle_component(9F). When components are
initially created, the components are considered idle.
Power Levels
From the pm-components property exported by the device, the Device Power Management framework
knows what power levels the device supports. Power-level values must be non-negative integers.
The interpretation of power levels is determined by the device driver writer.
Power levels must be listed in monotonically increasing order in the pm-components property. A
power level of 0 is interpreted by the framework to mean off.
When the framework must power up a device due to a dependency,
the framework sets each component at its highest power level.
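The ordering rule for power levels can be checked mechanically before a driver exports the property. The following is a minimal userland sketch. The function xx_check_pm_components() is hypothetical and is not part of the DDI. It assumes entries of the form "NAME=..." and "level=label", parses levels as decimal only, and treats a repeated level as an error. All three are simplifications made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Count the components in a pm-components style string array and check
 * that the levels of each component appear in strictly increasing order.
 * Returns the number of components, or -1 if the array is malformed.
 */
int
xx_check_pm_components(const char *const *strs, int nstrs)
{
    int ncomps = 0;     /* components seen so far */
    int nlevels = 0;    /* levels seen in the current component */
    long prev = -1;     /* last level seen in the current component */
    int i;

    assert(strs != NULL && nstrs >= 0);

    for (i = 0; i < nstrs; i++) {
        char *end;
        long level;

        if (strncmp(strs[i], "NAME=", 5) == 0) {
            /* The previous component needs at least one level. */
            if (ncomps > 0 && nlevels == 0)
                return (-1);
            ncomps++;
            nlevels = 0;
            prev = -1;
            continue;
        }
        /* A level entry before any NAME= entry is malformed. */
        if (ncomps == 0)
            return (-1);
        level = strtol(strs[i], &end, 10);
        /* Expect "<non-negative integer>=<label>". */
        if (end == strs[i] || *end != '=' || level < 0)
            return (-1);
        if (level <= prev)
            return (-1);    /* levels are not increasing */
        prev = level;
        nlevels++;
    }
    if (ncomps > 0 && nlevels == 0)
        return (-1);
    return (ncomps);
}
```

Applied to the string arrays in the examples that follow, a check like this would report one component for the spindle motor entry and two for the frame buffer entry.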
The following example shows a pm-components entry from the .conf file of a
driver that implements a single power-managed component consisting of a disk spindle motor.
The disk spindle motor is component 0. The spindle motor supports two power
levels. These levels represent “stopped” and “spinning at full speed.”
Example 12-1 Sample pm-components Entry
pm-components="NAME=Spindle Motor", "0=Stopped", "1=Full Speed";
The following example shows how Example 12-1 could be implemented in the attach()
routine of the driver.
Example 12-2 attach(9E) Routine With pm-components Property
static char *pmcomps[] = {
    "NAME=Spindle Motor",
    "0=Stopped",
    "1=Full Speed"
};
/* ... */
static int
xxattach(dev_info_t *dip, ddi_attach_cmd_t cmd)
{
    /* ... */
    /* Export the component list so that the framework can manage the device */
    if (ddi_prop_update_string_array(DDI_DEV_T_NONE, dip,
        "pm-components", &pmcomps[0],
        sizeof (pmcomps) / sizeof (char *)) != DDI_PROP_SUCCESS)
        goto failed;
    /* ... */
The following example shows a frame buffer that implements two components. Component 0
is the frame buffer electronics that support four different power levels. Component 1
represents the state of power management of the attached monitor.
Example 12-3 Multiple Component pm-components Entry
pm-components="NAME=Frame Buffer", "0=Off", "1=Suspend", \
"2=Standby", "3=On",
"NAME=Monitor", "0=Off", "1=Suspend", "2=Standby", "3=On";
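When a device driver is first attached, the framework does not know the
power level of the device. A power transition can occur when the driver calls
pm_raise_power(9F) or pm_lower_power(9F), when the framework lowers a component that
has exceeded its idle threshold, or when a dependency on another device requires a
change in power level.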
After a power transition, the framework begins tracking the power level of each
component of the device. Tracking also occurs if the driver has informed the
framework of the power level. The driver informs the framework of a
power level change by calling pm_power_has_changed(9F).
The system calculates a default threshold for each potential power transition. These thresholds
are based on the system idleness threshold. The default thresholds can be overridden
using pmconfig or power.conf(4). Another default threshold based on the system idleness threshold
is used when the component power level is unknown.
Power Management Dependencies
Some devices should be powered down only when other devices are also powered
down. For example, if a CD-ROM drive is allowed to power down,
necessary functions, such as the ability to eject a CD, might be lost.
To prevent a device from powering down independently, you can make that device
dependent on another device that is likely to remain powered on. Typically, a
device is made dependent upon a frame buffer, because a monitor is
generally on whenever a user is utilizing a system.
The power.conf(4) file specifies the dependencies among devices. (A parent node in the device
tree implicitly depends upon its children. This dependency is handled automatically by the
power management framework.) You can specify a particular dependency with a power.conf(4) entry
of this form:
device-dependency dependent-phys-path phys-path
Where dependent-phys-path is the device that is kept powered up, such as the
CD-ROM drive. phys-path represents the device whose power state is to be
depended on, such as the frame buffer.
Adding an entry to power.conf for every new device that is plugged
into the system would be burdensome. The following syntax enables you to indicate
dependency in a more general fashion:
device-dependency-property property phys-path
Such an entry mandates that any device that exports the property property must
be dependent upon the device named by phys-path. Because this dependency applies especially
to removable-media devices, /etc/power.conf includes the following line by default:
device-dependency-property removable-media /dev/fb
With this syntax, no device that exports the removable-media property can be
powered down unless the console frame buffer is also powered down.
For more information, see the power.conf(4) and removable-media(9P) man pages.
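The effect of a device dependency can be illustrated with a small model. This is a userland sketch and not the framework's implementation. The structure xx_dev and the function xx_may_power_down() are hypothetical, and each device is limited to at most one dependency to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct xx_dev {
    const char *path;                   /* physical path, for illustration */
    int level;                          /* current power level, 0 means off */
    const struct xx_dev *depends_on;    /* device that keeps this one up, or NULL */
};

/*
 * A device may be powered down only if it has no dependency, or if the
 * device that it depends on is already powered off.
 */
bool
xx_may_power_down(const struct xx_dev *dev)
{
    assert(dev != NULL);

    if (dev->depends_on == NULL)
        return (true);
    return (dev->depends_on->level == 0);
}
```

In this model, a CD-ROM drive that depends on the frame buffer stays eligible for use while the frame buffer is on, and becomes eligible for power-down only after the frame buffer level drops to 0.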
Automatic Power Management for Devices
If automatic power management is enabled by pmconfig or power.conf(4), then all devices
with a pm-components(9P) property are power managed automatically. After a component
has been idle for a default period, the component is automatically lowered to
the next lowest power level. The default period is calculated by the power
management framework to set the entire device to its lowest power state within
the system idleness threshold.
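The step-down behavior described here can be sketched as a function that picks the next lower supported level, plus a scan step that applies it once the idle period has elapsed. This is a userland illustration under stated assumptions. The function names, the busy count argument, and the use of seconds for the idle time are all hypothetical, and the sketch does not show how the framework computes its default period.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Given the supported levels in increasing order, return the next level
 * below 'current', or 'current' if the component is already at its
 * lowest level.
 */
int
xx_next_lower_level(const int *levels, int nlevels, int current)
{
    int next = current;
    int i;

    assert(levels != NULL && nlevels > 0);

    for (i = 0; i < nlevels; i++) {
        if (levels[i] >= current)
            break;
        next = levels[i];   /* largest level seen so far below current */
    }
    return (next);
}

/*
 * One pass of a hypothetical idle scan. The component is stepped down by
 * one level only if it is idle and has been idle for at least 'threshold'
 * seconds. Otherwise the level is left unchanged.
 */
int
xx_scan_step(const int *levels, int nlevels, int current,
    int busy_count, int idle_secs, int threshold)
{
    if (busy_count > 0 || idle_secs < threshold)
        return (current);
    return (xx_next_lower_level(levels, nlevels, current));
}
```

Repeated scan steps walk a component down one level at a time, which matches the description of a component being lowered to the next lowest level after each idle period.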
Note - By default, automatic power management is enabled on all SPARC desktop systems first
shipped after July 1, 1999. This feature is disabled by default for all
other systems. To determine whether automatic power management is enabled on your machine,
refer to the power.conf(4) man page for instructions.
power.conf(4) can be used to override the defaults calculated by the framework.
Device Power Management Interfaces
A device driver that supports a device with power-manageable components must create a
pm-components(9P) property. This property indicates to the system that the device has
power-manageable components. pm-components also tells the system which power levels are available. The
driver typically informs the system by calling ddi_prop_update_string_array(9F) from the driver's attach(9E) entry point. An
alternative means of informing the system is from a driver.conf(4) file.
See the pm-components(9P) man page for details.
Busy-Idle State Transitions
The driver must keep the framework informed of device state transitions from idle
to busy or busy to idle. Where these transitions happen is entirely device-specific.
The transitions between the busy and idle states depend on the nature of
the device and the abstraction represented by the specific component. For example, SCSI
disk target drivers typically export a single component, which represents whether the SCSI
target disk drive is spun up or not. The component is marked busy
whenever an outstanding request to the drive exists. The component is marked idle
when the last queued request finishes. Some components are created and never marked
busy. For example, components created by pm-components(9P) are created in an idle state.
The pm_busy_component(9F) and pm_idle_component(9F) interfaces notify the power management framework of busy-idle state
transitions. The pm_busy_component(9F) call has the following syntax:
int pm_busy_component(dev_info_t *dip, int component);
pm_busy_component(9F) marks component as busy. While the component is busy, that component should
not be powered off. If the component is already powered off, then marking
that component busy does not change the power level. The driver needs to
call pm_raise_power(9F) for this purpose. Calls to pm_busy_component(9F) are cumulative and
require a corresponding number of calls to pm_idle_component(9F) to idle the component.
The pm_idle_component(9F) routine has the following syntax:
int pm_idle_component(dev_info_t *dip, int component);
pm_idle_component(9F) marks component as idle. An idle component is subject to being powered
off. pm_idle_component(9F) must be called once for each call to pm_busy_component(9F) in order
to idle the component.
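Because busy calls are cumulative, the busy state behaves like a counter rather than a flag. The following userland model illustrates the bookkeeping. The structure and function names are hypothetical. Returning an error for an unbalanced idle call is a choice made for this model so that mismatched calls are easy to spot. It is not a statement about how pm_idle_component(9F) behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct xx_busy_model {
    int busy_count;     /* outstanding busy marks on the component */
};

/* Models one busy mark. Each call adds to the count. */
void
xx_model_busy(struct xx_busy_model *m)
{
    assert(m != NULL);
    m->busy_count++;
}

/*
 * Models one idle call. Returns 0 on success, or -1 if there is no
 * outstanding busy mark to balance (a model-only error check).
 */
int
xx_model_idle(struct xx_busy_model *m)
{
    assert(m != NULL);
    if (m->busy_count == 0)
        return (-1);
    m->busy_count--;
    return (0);
}

/* The component is idle only when every busy mark has been balanced. */
bool
xx_model_is_idle(const struct xx_busy_model *m)
{
    assert(m != NULL);
    return (m->busy_count == 0);
}
```

Two busy marks followed by a single idle call leave the component busy in this model. The component becomes idle only after the second idle call.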
Device Power State Transitions
A device driver can call pm_raise_power(9F) to request that a component be set
to at least a given power level. Setting the power level in this
manner is necessary before using a component that has been powered off. For
example, the read(9E) routine of a SCSI disk target driver might need to
spin up the disk, if the disk has been powered off. The
pm_raise_power(9F) function requests the power management framework to initiate a device power state
transition to a higher power level. Normally, reductions in component power levels are
initiated by the framework. However, a device driver should call pm_lower_power(9F) when detaching,
in order to reduce the power consumption of unused devices as much as
possible.
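The read(9E) case mentioned above follows a common ordering: mark the component busy, raise power if needed, perform the operation, then mark the component idle again. The following userland model sketches that ordering. All names are hypothetical. xx_model_raise() only stands in for the role of pm_raise_power(9F), and the raise_should_fail field is a test hook that has no counterpart in a real driver.

```c
#include <assert.h>
#include <stddef.h>

#define XX_OK           0
#define XX_FAIL         (-1)
#define XX_FULL_POWER   1

struct xx_unit {
    int busy_count;         /* models the outstanding busy marks */
    int level;              /* current power level, 0 means off */
    int raise_should_fail;  /* test hook: force the raise request to fail */
};

/* Stand-in for a raise request. It never lowers the current level. */
int
xx_model_raise(struct xx_unit *u, int level)
{
    assert(u != NULL);
    if (u->raise_should_fail)
        return (XX_FAIL);
    if (u->level < level)
        u->level = level;
    return (XX_OK);
}

/* Models the power-related steps of a read request. */
int
xx_model_read(struct xx_unit *u)
{
    int rv = XX_OK;

    assert(u != NULL);

    /* Mark busy first so that the unit is not powered off mid-operation. */
    u->busy_count++;

    /* Marking busy does not raise power, so request the level explicitly. */
    if (u->level < XX_FULL_POWER)
        rv = xx_model_raise(u, XX_FULL_POWER);

    if (rv == XX_OK) {
        /* The data transfer itself would happen here. */
    }

    /* Balance the busy mark on both the success and the failure path. */
    u->busy_count--;
    return (rv);
}
```

Taking the busy mark before the raise request is a design choice in this sketch: it closes the window in which the component could be lowered again between the raise and the start of the operation.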
Powering down can pose risks for some devices. For example, some tape drives
damage tapes when power is removed. Similarly, some disk drives have a limited
tolerance for power cycles, because each cycle results in a head landing. Use
the no-involuntary-power-cycles(9P) property to notify the system that the device driver should control
all power cycles for the device. This approach prevents power from being removed
from a device while the device driver is detached unless the device was
powered off by a driver's call to pm_lower_power(9F) from its detach(9E) entry point.
The pm_raise_power(9F) function is called when the driver discovers that a component needed
for some operation is at an insufficient power level. This interface causes the
driver to raise the current power level of the component to the needed
level. All the devices that depend on this device are also brought back
to full power by this call.
Call the pm_lower_power(9F) function when the device is detaching once access to the
device is no longer needed. Call pm_lower_power(9F) to set each component at the
lowest power so that the device uses as little power as possible
while not in use. The pm_lower_power() function must be called from the detach() entry
point. The pm_lower_power() function has no effect if it is called from any
other part of the driver.
The pm_power_has_changed(9F) function is called to notify the framework about a power transition.
The transition might be due to the device changing its own power level.
The transition might also be due to an operation such as suspend-resume. The
syntax for pm_power_has_changed(9F) is the same as the syntax for pm_raise_power(9F).
power() Entry Point
The power management framework uses the power(9E) entry point.
power() uses the following syntax:
int power(dev_info_t *dip, int component, int level);
When a component's power level needs to be changed, the system calls the
power(9E) entry point. The action taken by this entry point is device driver-specific.
In the example of the SCSI target disk driver mentioned previously, setting the power
level to 0 results in sending a SCSI command to spin down
the disk, while setting the power level to the full power level results
in sending a SCSI command to spin up the disk.
If a power transition can cause the device to lose state, the
driver must save any necessary state in memory for later restoration. If a
power transition requires the saved state to be restored before the device can
be used again, then the driver must restore that state. The framework makes
no assumptions about which power transitions cause the loss of state or require the
restoration of state for automatically power-managed devices. The following example shows a sample
power() routine.
Example 12-4 Using the power() Routine for a Single-Component Device
int
xxpower(dev_info_t *dip, int component, int level)
{
    struct xxstate *xsp;
    int instance;

    instance = ddi_get_instance(dip);
    xsp = ddi_get_soft_state(statep, instance);
    /*
     * Make sure the request is valid
     */
    if (!xx_valid_power_level(component, level))
        return (DDI_FAILURE);
    mutex_enter(&xsp->mu);
    /*
     * If the device is busy, don't lower its power level
     */
    if (xsp->xx_busy[component] &&
        xsp->xx_power_level[component] > level) {
        mutex_exit(&xsp->mu);
        return (DDI_FAILURE);
    }
    if (xsp->xx_power_level[component] != level) {
        /*
         * device- and component-specific setting of power level
         * goes here
         */
        xsp->xx_power_level[component] = level;
    }
    mutex_exit(&xsp->mu);
    return (DDI_SUCCESS);
}
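The state-saving requirement described before Example 12-4 can be illustrated with a small userland model. The structures, the register array, and xx_set_level() are hypothetical. The sketch assumes that only a transition to level 0 loses device state, which is a property of this model and not a rule of the framework.

```c
#include <assert.h>
#include <string.h>

#define XX_NREGS    4

struct xx_hw {
    unsigned regs[XX_NREGS];    /* stands in for device registers */
};

struct xx_soft {
    struct xx_hw hw;
    unsigned saved[XX_NREGS];   /* copy kept in memory while powered off */
    int saved_valid;            /* nonzero if 'saved' holds usable state */
    int level;                  /* current power level, 0 means off */
};

/* Change the power level, saving and restoring state around level 0. */
void
xx_set_level(struct xx_soft *s, int level)
{
    assert(s != NULL && level >= 0);

    if (level == s->level)
        return;

    if (level == 0) {
        /* Going off: save the registers before the hardware loses them. */
        memcpy(s->saved, s->hw.regs, sizeof (s->saved));
        s->saved_valid = 1;
        memset(s->hw.regs, 0, sizeof (s->hw.regs));    /* model the loss */
    } else if (s->level == 0 && s->saved_valid) {
        /* Coming back from off: restore before the device is used. */
        memcpy(s->hw.regs, s->saved, sizeof (s->hw.regs));
        s->saved_valid = 0;
    }
    s->level = level;
}
```

In a real driver, the equivalent save and restore steps would sit at the point marked "device- and component-specific setting of power level" in the power() routine.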
The following example is a power() routine for a device with two components,
where component 0 must be on when component 1 is on.
Example 12-5 power(9E) Routine for Multiple-Component Device
int
xxpower(dev_info_t *dip, int component, int level)
{
    struct xxstate *xsp;
    int instance;

    instance = ddi_get_instance(dip);
    xsp = ddi_get_soft_state(statep, instance);
    /*
     * Make sure the request is valid
     */
    if (!xx_valid_power_level(component, level))
        return (DDI_FAILURE);
    mutex_enter(&xsp->mu);
    /*
     * If the device is busy, don't lower its power level
     */
    if (xsp->xx_busy[component] &&
        xsp->xx_power_level[component] > level) {
        mutex_exit(&xsp->mu);
        return (DDI_FAILURE);
    }
    /*
     * This code implements inter-component dependencies:
     * If we are bringing up component 1 and component 0
     * is off, we must bring component 0 up first, and if
     * we are asked to shut down component 0 while component
     * 1 is up we must refuse
     */
    if (component == 1 && level > 0 && xsp->xx_power_level[0] == 0) {
        xsp->xx_busy[0]++;
        if (pm_busy_component(dip, 0) != DDI_SUCCESS) {
            /*
             * This can only happen if the args to
             * pm_busy_component()
             * are wrong, or pm-components property was not
             * exported by the driver.
             */
            xsp->xx_busy[0]--;
            mutex_exit(&xsp->mu);
            cmn_err(CE_WARN, "xxpower pm_busy_component() failed");
            return (DDI_FAILURE);
        }
        mutex_exit(&xsp->mu);
        if (pm_raise_power(dip, 0, XX_FULL_POWER_0) != DDI_SUCCESS) {
            /*
             * Undo the busy mark taken above so that
             * component 0 is not left busy after a failure
             */
            (void) pm_idle_component(dip, 0);
            mutex_enter(&xsp->mu);
            xsp->xx_busy[0]--;
            mutex_exit(&xsp->mu);
            return (DDI_FAILURE);
        }
        mutex_enter(&xsp->mu);
    }
    if (component == 0 && level == 0 && xsp->xx_power_level[1] != 0) {
        mutex_exit(&xsp->mu);
        return (DDI_FAILURE);
    }
    if (xsp->xx_power_level[component] != level) {
        /*
         * device- and component-specific setting of power level
         * goes here
         */
        xsp->xx_power_level[component] = level;
    }
    mutex_exit(&xsp->mu);
    return (DDI_SUCCESS);
}