Contributing content types
Providing a new content type
The platform defines some fundamental content types, such as plain text and
XML. These content types are defined the same way as those contributed by any
other plug-ins. We will look at how the platform defines some of its content
types in order to better understand the content type framework.
Plug-ins define content types by contributing an extension for the extension
point org.eclipse.core.contenttype.contentTypes.
In this extension, a plug-in specifies a simple id and name for the content
type (the full id is always the simple id prefixed by the current namespace).
The following snippet shows a trimmed down version of the
org.eclipse.core.runtime.text content type contribution:
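A minimal sketch of what such a contribution might look like in plugin.xml. The extension point id, the name key and the describer class name are assumptions made for illustration, not necessarily the platform's actual values.

```xml
<extension point="org.eclipse.core.contenttype.contentTypes">
   <!-- simple id "text"; the full id is the contributing plug-in's namespace plus ".text" -->
   <content-type
         id="text"
         name="%textContentTypeName"
         file-extensions="txt">
      <!-- hypothetical describer class, shown for illustration only -->
      <describer class="com.example.content.TextContentDescriber"/>
   </content-type>
</extension>
```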
The file-extensions attribute defines which file extensions are
associated with the content type (in this example, ".txt"). The file-names
attribute (not used in this case) allows associating full file names. Both attributes
are taken into account by the platform when performing content type detection
and description (if the client provides a file name).
The describer element is used to define a content describer
for the content type.
Detecting and describing content
A content type should provide a content describer if there are any identifiable
characteristics that allow automatic content type detection, or any interesting
properties in data belonging to the content type. In the case of
org.eclipse.core.runtime.text,
it is not possible to figure out the content type by just looking at the contents.
However, text streams might begin with a byte order mark, which
is a property clients might be interested in knowing about, so this warrants
a content describer.
The describer is an implementation of IContentDescriber or ITextContentDescriber.
The latter is a specialization of the former that must be implemented by describers
of text-oriented content types. Regardless of the nature of the content type, the
describer has two responsibilities: helping determine whether its content
type is appropriate for a given data stream, and extracting interesting properties
from a data stream that supposedly belongs to its content type.
The method describe(stream, description) is called whenever the platform
is trying to determine the content type for a particular data stream or describe
its contents. The description is
null when only detection is requested.
Otherwise, the describer should try to fill the content description with any
properties that could be found by reading the stream, and only those.
The content type markup should be used to declare any properties that have default
values (for example,
org.eclipse.core.runtime.xml declares UTF-8
as the default charset).
When performing its duty, the content describer is expected to execute as quickly
as possible. The less of the data stream it has to read, the better. Also, the
content describer implementation is expected to be declared in a package
that is exempt from plug-in activation (see the Eclipse-AutoStart
bundle manifest header). Since all describers are instantiated when the content
type framework is initialized, failure to comply with this requirement causes
premature activation, which must be avoided. Future implementations of the platform
might refuse to instantiate describers if doing so would trigger activation
of the corresponding plug-in.
Extending an existing content type
Content types are hierarchical in nature. This allows new content types to
leverage the attributes or behavior of more general content types. For example,
a content type for XML data is considered a child of the text content type:
<property name="charset" default="UTF-8"/>
An XML file is deemed a kind of text file, so any features applicable to the
latter should be applicable to the former as well.
Note that the XML content type overrides several content type attributes originally
defined in the Text content type, such as the file associations and the describer
implementation. Also, this content type declares a default property value for
the charset property. This means that during content description for
a data stream considered as belonging to the XML content type, if the describer
does not fill in the charset property, the platform will set it to "UTF-8".
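As a rough sketch, a child content type of this kind could be declared as follows. The base-type attribute names the parent by its full id; the name key and the describer class name are hypothetical and only illustrate where the overrides go.

```xml
<content-type
      id="xml"
      name="%xmlContentTypeName"
      base-type="org.eclipse.core.runtime.text"
      file-extensions="xml">
   <!-- hypothetical describer class that replaces the one inherited from the text type -->
   <describer class="com.example.content.XMLContentDescriber"/>
   <!-- used when the describer does not fill in the charset property -->
   <property name="charset" default="UTF-8"/>
</content-type>
```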
As another example, the org.eclipse.ant.core.antBuildFile content
type (for Ant Build Scripts) extends the XML content type:
Note that the default value for the charset property is inherited. It is possible
to cancel an inherited property or describer by redeclaring them with the empty
string as value.
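A sketch of how such a cancellation might be written, assuming a made-up child content type (the id, name and file extension below are hypothetical) that wants neither the describer nor the default charset of its XML parent:

```xml
<content-type
      id="myXmlDialect"
      name="My XML dialect"
      base-type="org.eclipse.core.runtime.xml"
      file-extensions="mxd">
   <!-- empty class: cancel the describer inherited from the XML content type -->
   <describer class=""/>
   <!-- empty default: cancel the inherited default charset -->
   <property name="charset" default=""/>
</content-type>
```

Leaving both elements out instead would keep the inherited describer and the inherited "UTF-8" default.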
Additional file associations
New file associations can be added to existing content types. For instance,
the Resources plug-in associates the org.eclipse.core.runtime.xml content type
to ".project" files:
<file-association content-type="org.eclipse.core.runtime.xml" file-names=".project"/>
Content type aliasing
Due to the extensible nature of Eclipse, a content type a plug-in relies on may
not be available in a given product configuration. This can be worked around
by using content type aliasing. A content type alias is a placeholder
for another preferred content type whose availability is not guaranteed. For
instance, the Runtime declares an alias (org.eclipse.core.runtime.properties)
for the Java properties content type provided by the Java development tools (JDT),
org.eclipse.jdt.core.javaProperties:
<!-- a placeholder for setups where JDT's official type is not available -->
<property name="charset" default="ISO-8859-1"/>
This provides plug-ins with a placeholder they can refer to regardless of whether the
preferred content type is available. If it is, the alias content type
is suppressed from the content type catalog and any references to it are interpreted
as references to the target content type. If it is not, the alias will be used
as an ordinary content type.
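A minimal sketch of an alias declaration along these lines, assuming the alias-for attribute of the content-type element carries the full id of the preferred content type. The name key and the file extension are assumptions for illustration.

```xml
<!-- alias: used as an ordinary content type only when the target is missing -->
<content-type
      id="properties"
      name="%propertiesContentTypeName"
      base-type="org.eclipse.core.runtime.text"
      alias-for="org.eclipse.jdt.core.javaProperties"
      file-extensions="properties">
   <property name="charset" default="ISO-8859-1"/>
</content-type>
```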