This section describes basic concepts, but does not explain the corresponding APIs - the APIs are described in How to use the RF's Client API. It explains the metaphors used within the RF when dealing with documents and business objects on an abstract level.
A lot of the concepts described here are similar to, but not the same as, those of [WebDAV].
Table of Contents
Basic Aspects - Namespace, Content, and Properties
Creating, Copying, and Deleting
Advanced Aspects - Ordering, Property Search, ID Mapping, Locking, Security, Events
Ordered Collections and Collators
Expert Aspects - Semantic Objects and Versioning
Advanced Versioning: Working Resources
Advanced Versioning: Workspaces
Advanced Versioning: Auto Versioning and Automatic Version Control
Advanced Versioning: Version Controlled Collections
API Layers, Current and New API
Aspects of RF objects in the API
Supported Options, Read-Only and "Mutable" Interfaces
The basic aspects concern objects handled by the RF, how they are accessed, and what data they contain or provide.
Although the RF's main task is now to provide unified access to all kinds of documents and business objects, it is worth looking at how the RF has evolved over time.
Originally, the RF was intended as a document access layer. The basic concepts were very similar to those of WebDAV, as described in the relevant RFCs (see
[RFC2291], [RFC2518], and [RFC3253] for more details about WebDAV). Its main task was to unify the attributes of documents, such as author or creation date.
Unlike JNDI, where only naming is unified and all objects are treated as java.lang.Objects, some basic attributes of the documents were considered
to be so essential that they became part of the RF's unified objects (for example, display name). The RF therefore requires and provides more knowledge about
its objects than JNDI does.
This introduced the need for a distinction between content (unstructured data, the internal structure of which is not known to the RF), and (meta-) attributes (which are structured data and can be inspected by the RF) such as content size, author, or creation date.
Furthermore, it turned out that even business objects modelled as plain java.lang.Objects are not sufficient for unified access, since too much
knowledge about those business objects had to be incorporated into the applications, thereby preventing generic functions from being implemented at
framework level.
The RF was therefore extended to support business objects as well as documents. To achieve this, "type casting" was introduced in order to convert RF objects into business objects and back again (see Type Handling for details).
The characteristics of the RF's unified objects (resources and collections) are:
The term "resource" is used here in two ways: It refers to an RF object and the most basic unified aspect of an RF object. Most of the time it should be obvious from the context whether "resource" refers to the entire object or just the unified aspect. Sometimes "plain resource" is used to distinguish a resource that is not a collection (a "leaf resource") from a collection (a "node resource").
A "collection" refers to the aspect of being able to have children of RF objects. Other unified aspects of RF objects will be introduced in the following sections.
On the other hand, "semantic object" refers to the non-unified aspect of an RF object.
For readers unfamiliar with WebDAV, the easiest approach to RF objects (resources) is to consider them as files and directories in a file system.
Resources are stored in and retrieved from the repository they belong to. A repository can be seen as a mount point in a Unix file system, but unlike a mount point, repositories cannot be "mounted" anywhere in the hierarchy - they can only be "mounted" beneath the RF's root node.
Like files, resources have a unique, hierarchical name (see RF Objects), and might have content and additional attributes. However, unlike a file system, nodes might also have content (in a file system, directories do not have content). Most applications and repositories are not able to deal with this feature anyway.
A special feature of resources in the RF that distinguishes them from WebDAV resources or files is that they can be "cast" onto other objects. This is where the "file system" view is a little bit too document-centric for some applications. It is convenient for WebDAV-like scenarios, but it might not be sufficient for business objects. It might be easier for a business-object scenario to consider RF objects (resources) as database entries: The RID is the primary key, and the property values are similar to the column values of a result table. While the property names are the column titles, the content can be regarded as one special BLOB column (Note: the RID is not really a primary key, since it might change, although it does identify the object uniquely - see RF Objects and Unique ID on resource IDs). Remember that as with the "file system" metaphor, the "database" metaphor highlights some features of the RF but disregards others.
See also: How to use the RF's Client API
As mentioned, each unified RF object (resource) is identified by its unique, hierarchical ID. This Resource IDentifier (or RID for short) is something like a hybrid between a relative URI (the Uniform Resource Identifier, as defined in [RFC2396], but without a "net_path" component and which is never escaped) and a file name (but with a URI-like "query" component).
More formally:
<RID> ::= empty | <pathsegment> [ '?' <query> ]
<pathsegment> ::= <name> [ '/' <pathsegment> ]
<name> ::= any text, not containing '/' and '?'
<query> ::= empty | <parameter> [ '&' <query> ]
<parameter> ::= <varname> [ '=' <value> ]
<varname> ::= any text, not containing '=' and '&'
<value> ::= any text, not containing '&'
The root resource (denoted by '/') is reserved for the RF itself. This folder contains all the repositories known in the RF, and all
these repositories have their own unique ID (called "repository prefix"). All the top-level resources are in turn root resources of the repositories,
for example, '/etc' is the RID of the top-level folder of the etc-repository (which contains some system data).
Beyond such a top-level resource (a repository's root resource), it is up to the particular repository to decide how the RIDs are built and hierarchically organized.
For example, if a file server share (\\server\share) is configured within the RF as a repository with the prefix fileserver, a file
on the file server share (for example, \\server\share\folder\myDoc.doc) might be mapped by the repository using the following RID:
'/fileserver/folder/myDoc.doc'.
The hierarchical namespace structure implies a parent-child relation between the resources identified by an RID's path segments. In the figure
above, /etc/oth is a child of /etc, while /etc is parent of /etc/oth.
The hierarchical namespace implies that the RIDs of the parents of a resource along the hierarchy's way up to the root resource can be derived from the
resource's RID, because they are separated by the '/'.
The part of the RID before the last '/' is therefore called the path of the resource, while the part after the last '/'
(without the optional query) is called the name of the resource.
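The grammar and the path/name split described above can be sketched in Java. The class RidParts and its fields are hypothetical and are not part of the RF's API; the sketch only mirrors the grammar as stated and assumes that the first '?' starts the query.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of the RF API: splits an RID as described above. */
final class RidParts {
    final String path;   // part before the last '/' (empty for top-level resources)
    final String name;   // part after the last '/', without the query
    final Map<String, String> query = new LinkedHashMap<>();

    RidParts(String rid) {
        // Assumption: the first '?' separates the path segments from the query.
        int q = rid.indexOf('?');
        String segments = (q < 0) ? rid : rid.substring(0, q);
        String queryPart = (q < 0) ? "" : rid.substring(q + 1);

        int slash = segments.lastIndexOf('/');
        path = (slash < 0) ? "" : segments.substring(0, slash);
        name = segments.substring(slash + 1);

        // <query> ::= <parameter> [ '&' <query> ]; <parameter> ::= <varname> [ '=' <value> ]
        if (!queryPart.isEmpty()) {
            for (String parameter : queryPart.split("&")) {
                int eq = parameter.indexOf('=');
                if (eq < 0) {
                    query.put(parameter, "");   // parameter without a value
                } else {
                    query.put(parameter.substring(0, eq), parameter.substring(eq + 1));
                }
            }
        }
    }
}
```

For '/fileserver/folder/myDoc.doc', this yields the path '/fileserver/folder' and the name 'myDoc.doc'.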
The namespace of a specific repository might be restricted in the sense that paths or names have a limited length. Therefore repositories have to provide name info including which restrictions apply to the RIDs and which characters are not allowed to be used in names.
RF objects that are able to have references to child resources (can be parents) are collections. Each "pathsegment" of an RID's path must identify a collection.
RIDs can either be absolute (start with a '/') or relative. The '.' denotes the current collection and '..' denotes the parent
collection, just like in a file system. Relative RIDs can only be used for certain operations (like copy or move). RIDs returned by the RF are usually
absolute, except for link targets (see Links).
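How a relative RID could be resolved against the current collection can be sketched as follows. RidResolver is a hypothetical helper, not an RF class; the sketch ignores the query component and assumes that '..' at the root stays at the root.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper, not part of the RF API. */
final class RidResolver {
    /** Resolves a relative RID against the absolute RID of the current collection. */
    static String resolve(String currentCollection, String rid) {
        if (rid.startsWith("/")) {
            return rid;                       // already absolute, returned unchanged
        }
        Deque<String> segments = new ArrayDeque<>();
        for (String segment : (currentCollection + "/" + rid).split("/")) {
            if (segment.isEmpty() || segment.equals(".")) {
                continue;                     // '.' denotes the current collection
            }
            if (segment.equals("..")) {
                if (!segments.isEmpty()) {
                    segments.removeLast();    // '..' denotes the parent collection
                }
            } else {
                segments.addLast(segment);
            }
        }
        return "/" + String.join("/", segments);
    }
}
```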
Each resource is identified by its "pathsegment" only, without the "query". The query part should only be used for adding context information (see below) or to select variants in the future. If an RID passed to the RF for retrieving a resource has query parameters, the retrieved resource's RID may differ from the RID that was initially passed to the RF.
The RID is only one part of information required for retrieving resources. The other part is the access context, which provides additional information: While the RID specifies which resource to retrieve, the access context contains further information about who wants to retrieve the resource (which in turn might affect the how).
The access context contains at least the user (or, more generally, a principal) on whose behalf the application wants to retrieve the resource. Based on this additional information, a repository might carry out additional selections. For example, a repository that supports access control might check the user against the access control lists for the requested resource in order to determine whether access is allowed. Alternatively, the access context might contain information about the preferred language for the user, so the repository can select the proper version of an object according to the preferred language.
The access context provided by the RF's client is also referred to as the resource context.
See also: How to use the RF's Client API
Like files and folders in a file system, resources and collections can be created, copied and deleted.
Resources can only be created in a collection, which then becomes the parent for the new resource, while the newly created resource becomes a child of the collection in which it was created.
Resources can be copied to another RID, which then duplicates the data of the copied resource to the new resource for the given RID.
Finally, resources can be deleted. Once a resource is deleted it is no longer accessible, and neither are its content and metadata. If code still holds a reference to the RF's resource object and tries to use the resource after it has been deleted, the RF raises an error.
See also: How to use the RF's Client API
Both renaming and moving a resource change the RID of the resource in question. From a client's point of view, both operations are identical, but from a repository's point of view, there is a slight difference between the two operations.
Renaming a resource only changes its name (the RID's "name" part). It does not change the parent-child relation - the resource remains in its parent collection.
Moving a resource can change any part of its RID.
Moving and renaming might be considered a bit like copying the resource to a new resource RID and then deleting the resource with the old RID. However, there are some major differences between moving or renaming, and copying/deleting: Copying/deleting creates a new resource with a new unique ID, while moving leaves the unique ID unchanged (see Unique ID for details about unique IDs). Copying/deleting also causes two events that are not related to each other to be sent. Renaming or moving produces only one event (see Events on events).
From an object-oriented point of view, moving only changes a business object's place in the hierarchy, while copying/deleting creates a new instance of a business object and releases the old business object instance.
Although moving a resource is usually faster than copying and deleting, it is usually slower than renaming, because additional steps have to be taken, especially if a resource is moved from one repository to another (which cannot happen when renaming a resource).
See also: How to use the RF's Client API
Like almost every file system, the RF has a concept of (soft) links or shortcuts.
A link is just a reference to a real resource, the link target. A link that refers to a target that no longer exists is called a broken link.
The graphic above shows one link /repository/link1 that points to the collection /repository/folder/subfolder, and another link
/repository/link2 that points to the resource /repository/folder/subfolder/file.
The third link /repository/link3 points to a resource in another repository - /other_repository/folder/other_file.
All these links point to resources within the RF, so their link type is "internal link".
The fourth link, /other_repository/link4, points to a URL that refers to an external (web) resource. Because URLs (which are not RIDs) are not
within the RF's namespace, this link's link type is "external link".
Note: Although URLs are not within the RF's namespace, they can be mapped by the KM's "Web Repository Manager" to the RF's namespace.
With EP 6.0 SP1, flexible links were introduced as a new link type. A flexible link is an internal link that "follows" its target dynamically. While a "normal" internal link becomes broken if its target is moved or renamed, a "flexible" link does not. Flexible links are also referred to as dynamic links, while normal links are called static links. A flexible link's target has to reside in the same repository as the link, as the flexible link is deleted if the resource is moved to another repository.
See also: How to use the RF's Client API
Just as a file might contain raw data, a resource can have content as its data. The content of a resource is represented as a
java.io.InputStream. The RF does not consider the content of this input stream - it is simply passed to the responsible repository. Because the
RF does not have any information about the content itself, the content is also referred to as unstructured data within the RF.
As with files, additional information about the content exists. This content metadata, which describes the content data, is similar to an HTTP MIME header:
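Content length: the size of the content, or -1 if it is not known by the repository (see [RFC2616], section 14.13).
Content type: the MIME type of the content (for example, 'text/html' or 'text/plain'; see [RFC2616], section 14.17).
Encoding: the character encoding of the content (for example, 'UTF8').
Language: the language of the content (for example, 'en').
Entity tag: a validator for the content; a weak entity tag (prefixed with 'W/') only changes when the resource's content changes in a "semantically significant" way (see [RFC2616], section 13.3.3).
Expiry date: for example, 'Thu, 10 Jul 2003 10:00:00' (see [RFC2616], section 14.21).
Last modified date: for example, 'Fri, 11 Jul 2003 11:00:00' (see [RFC2616], section 14.29).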
See also: How to use the RF's Client API
Unstructured content and its meta attributes, which describe only basic aspects of the content, are only useful for bulk data, as used in documents. Any application dealing with the content has to "know" about the content's structure itself, because the RF does not know about it and cannot provide any additional information.
To support generic applications that retrieve information about a business object's data structure dynamically and do not use "hard-coded" knowledge about these business objects, a mechanism for retrieving and storing structured data and some possibilities for introspection on these structures had to be introduced: The properties.
These properties should not be confused with the java.util.Properties. Properties in the RF are properties as defined by WebDAV (see
[RFC2518]). Properties are name/value pairs, where the name of a property defines its identity.
A property name must use the XML namespace mechanism, which provides a namespace and a local name. Thus, a property name is a tuple
(namespace, local name), where the namespace is optional and is similar to 'http://sapportals.com/xmlns/cm' or empty.
By convention, property names are usually displayed in the following form: {namespace}local name, for example,
'{http://sapportals.com/xmlns/cm}displayname'.
Property names are not hierarchical, so if there are two properties, for example, '{x}A' and '{x}A/B', no relationship between
these two properties is recognized.
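The (namespace, local name) tuple and its display form can be sketched as a small value class. PropertyName here is a hypothetical illustration, not the RF's own type; how a name without a namespace is displayed is an assumption.

```java
import java.util.Objects;

/** Hypothetical value class, not the RF's own type: a (namespace, local name) tuple. */
final class PropertyName {
    final String namespace;   // optional, empty if absent
    final String localName;

    PropertyName(String namespace, String localName) {
        this.namespace = Objects.requireNonNull(namespace);
        this.localName = Objects.requireNonNull(localName);
    }

    /** Parses the display form {namespace}local name. */
    static PropertyName parse(String display) {
        if (display.startsWith("{")) {
            int close = display.indexOf('}');
            if (close < 0) {
                throw new IllegalArgumentException("missing '}' in " + display);
            }
            return new PropertyName(display.substring(1, close), display.substring(close + 1));
        }
        return new PropertyName("", display);
    }

    // Identity is the whole tuple; '{x}A' and '{x}A/B' are simply two different names.
    @Override public boolean equals(Object o) {
        return o instanceof PropertyName
            && ((PropertyName) o).namespace.equals(namespace)
            && ((PropertyName) o).localName.equals(localName);
    }

    @Override public int hashCode() {
        return Objects.hash(namespace, localName);
    }

    // Assumption: a name without a namespace is displayed as the bare local name.
    @Override public String toString() {
        return namespace.isEmpty() ? localName : "{" + namespace + "}" + localName;
    }
}
```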
Some properties contain data about data. This is why properties sometimes are referred to as resource metadata. Within the RF, properties may be used not only to provide data about the content, but also to provide additional data about the object itself. This is especially relevant for business objects.
Because properties are typed (see below), they are also referred to as structured data within the RF.
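Properties are either "live" or "dead". As in WebDAV (see [RFC2518]), the syntax and semantics of a live property are enforced by the repository, whereas a dead property is only stored by the repository and its consistency is left to the client. For example, values of the live property '{http://sapportals.com/xmlns/cm}contentlength' for resources have to be long integers and contain the length of the resource's content in bytes.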
Most of the live properties are computed by the repository and cannot be changed by the client application. When a client changes a live property, there are
some side effects in the repository.
As mentioned, properties are typed as variables are typed in most computer languages. The following property types are defined:
java.sql.Date.
java.sql.Time.
java.sql.Timestamp.
boolean value (either false or true).
In the same way that content has metadata, a property might also have metadata: The property attributes. Like properties, some of these attributes are "live" and are maintained by the repositories. The live attributes known to the RF are the following flags:
true, if the property value is a list of values with the given type.
true, if the property value must remain once set. That is, if the property is set, it cannot be deleted,
although its value can be changed and new resources might be created without the property being set for them.
true, if the property value cannot be changed.
true, if the property is for system use only and not intended to be displayed to users.
Another attribute is the property description, which is a localized description of the property or at least the property's name, if no description exists.
Additional "dead" attributes can be added to a property (as additional attributes are added to an XML element). They are simple key/value pairs, where
both key and value are strings - like java.util.Properties.
There are two special properties for a resource within the RF:
One special property is the display name of a resource: It provides a "user friendly" name for the resource, whereas the RID's name part is a technical name. While the RID's main task is to identify the resource uniquely within the system, the purpose of the display name is to make a resource in a collection identifiable to users.
For example, a news item might have /news/20030708/110907003.1.xml as its RID, but "latest news from the RF (11:09:07)"
as its display name.
The resource's display name is optional and might depend on the language of the resource context. If the display name does not exist, the RID's name part is usually taken as a fallback option. Display names might not be unique, even among the children of a collection.
The other special property is the resource type: Similar to the content's MIME type, which specifies the content type, the resource type specifies the type of resource. It provides a string value that specifies the resource type.
This type is used, for example, in the type handling mechanism (see Type Handling) and the object type handler (see How to use the RF's Global Services).
The resource type is optional and might not be set.
See also: How to use the RF's Client API
When inserting or retrieving children to/from a collection, it is usually desirable to have them sorted in a certain order, especially when displaying them.
The RF offers methods for retrieving the order mechanism of a collection in order to enable client applications to find out if and how a collection's children will be sorted in the repository's persistency.
There are three different order mechanism types in the RF:
Collections that support this mechanism are called ordered collections. A collection's server order mechanism cannot be changed - the sequence of children is always defined by the order mechanism's specification. However, it might be possible to choose another server order mechanism for a collection.
Whenever a new child is created in the collection, its new position is automatically determined with respect to the ordered collection's order mechanism.
An ordered collection with a manual order mechanism preserves the position of its child resources as defined by an application. These positions can be changed by the client application as desired, and new child resources can be inserted anywhere in the sequence of children.
In addition to these order mechanisms, which have to be supported by the repository's persistency itself, another way of sorting has been introduced with EP
6.0 SP1: Collators allow the specification of an ordering clause, similar to an SQL ORDER BY clause on properties. Collators define a list
of collator entries that in turn specify the name of the property to order by and whether to sort in ascending or descending order.
Unlike an order mechanism for a collection, collators affect the order for a retrieved list of resources to which they are applied (for example, the children of a collection or the result of a property query). An order mechanism affects the order of children as they are created or changed in the repository's persistency.
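The idea of a collator as a list of entries, each naming a property and a sort direction, can be sketched with a plain java.util.Comparator. CollatorSketch and its Entry class are hypothetical and do not reflect the RF's collator API; resources are reduced to maps of property names to string values, and missing values are assumed to sort first.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not the RF's collator API. */
final class CollatorSketch {
    /** One collator entry: the property to order by and the direction. */
    static final class Entry {
        final String propertyName;
        final boolean ascending;

        Entry(String propertyName, boolean ascending) {
            this.propertyName = propertyName;
            this.ascending = ascending;
        }
    }

    /** Builds a comparator that applies the entries in order, like an ORDER BY clause. */
    static Comparator<Map<String, String>> comparator(List<Entry> entries) {
        return (a, b) -> {
            for (Entry entry : entries) {
                int c = compareNullable(a.get(entry.propertyName), b.get(entry.propertyName));
                if (c != 0) {
                    return entry.ascending ? c : -c;
                }
            }
            return 0;   // equal with respect to all entries
        };
    }

    // Assumption: a missing property value sorts before any existing value.
    private static int compareNullable(String x, String y) {
        if (x == null) {
            return (y == null) ? 0 : -1;
        }
        if (y == null) {
            return 1;
        }
        return x.compareTo(y);
    }
}
```

Such a comparator is applied to a list that has already been retrieved; it does not change the order in which the repository stores the children.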
Note: Although the RF is now able to deal with collators, their functionality has not yet been made available to client applications. This functionality will be released at a later date.
See also: How to use the RF's Client API
In RF Objects, the parent-child relation of collections and resources has been described as a way of navigating through the hierarchy of objects in a repository. For large hierarchies, especially when the client application is only interested in specific resources, it might not be very efficient to retrieve all these parent-child relations and then select the relevant resources on the client's side. Two mechanisms have therefore been introduced within the RF:
The first is the property search mechanism, which works on the properties of a resource and is similar to an SQL WHERE clause on
properties for resources.
This search mechanism must not be confused with the TREX search engine, which works on entire resources and even enables content to be searched for embedded words, and so on. The property search relies on mechanisms provided by the repositories that hold the resources. The result of a property search is always up-to-date, while the update intervals of TREX might vary depending on how TREX is configured. However, with a property search, only properties can be used as selectors.
As with SQL, where a SELECT expression returns a list of records that match the SELECT expression's WHERE clause, an
RF property query expression returns a list of resources that match the given query expression.
A query expression is constructed with the help of a query builder and can look like the following:
( '{http://sapportals.com/xmlns/cm}contenttype' like '%/jpg'
or '{http://sapportals.com/xmlns/cm}contenttype' like '%/gif'
) and '{http://sapportals.com/xmlns/cm}createdby' like 'admin'
The example above returns all resources whose content type ends with '/jpg' or '/gif' and that were created by the
admin user.
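What the example expression selects can be illustrated by evaluating it against the properties of a single resource. The following sketch is not the RF's query builder; PropertyQuerySketch is a hypothetical class, resources are reduced to maps of property names to string values, and only the '%' wildcard of the like operator is handled.

```java
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical in-memory evaluation of the example expression; not the RF's query builder. */
final class PropertyQuerySketch {
    static final String CONTENT_TYPE = "{http://sapportals.com/xmlns/cm}contenttype";
    static final String CREATED_BY = "{http://sapportals.com/xmlns/cm}createdby";

    /** SQL-style like: '%' matches any sequence of characters, everything else is literal. */
    static boolean like(String value, String pattern) {
        if (value == null) {
            return false;   // a missing property never matches
        }
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            regex.append(c == '%' ? ".*" : Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value).matches();
    }

    /** Mirrors: (contenttype like '%/jpg' or contenttype like '%/gif') and createdby like 'admin'. */
    static boolean matches(Map<String, String> properties) {
        String type = properties.get(CONTENT_TYPE);
        return (like(type, "%/jpg") || like(type, "%/gif"))
            && like(properties.get(CREATED_BY), "admin");
    }
}
```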
The sub-tree of a hierarchy to be searched in can be restricted by the following parameters:
A property search does not follow links. This ensures that only resources in the specified sub-tree and from the same repository are found.
See also: How to use the RF's Client API
As explained in RF Objects, RIDs belong to a hierarchical namespace. Due to this, some operations that change the hierarchy, especially moving and renaming a resource, also change the RID of a resource.
If an application needs to access resources regardless of their new location, there are two options:
The RF supports the second option in two ways:
See also: RF Extensions
To prevent sensitive information being shown to unauthorized users, some kind of security mechanism had to be made available.
When a user tries to carry out any action on a resource, the repository checks if he or she has the appropriate permissions for performing the requested action on the given resource.
The RF distinguishes the following permissions:
Usually, an application does not have to concern itself with the security of the RF, because each time it tries to access a resource (read or modify it), the repository manager or service in question checks the relevant permissions.
Only RF extensions, or those applications that offer users the possibility of changing the permissions, need further knowledge on how these permissions are checked by the relevant resource's repository.
A widely used mechanism for assigning permissions to resources and users is the access control list (ACL). An ACL is associated with a resource and has at least one owner. Because owners are allowed to modify the ACL, they are also granted full control of the resource, since they would be able to assign that permission to themselves anyway.
In addition, an ACL can contain one or more Access Control Entries (ACEs). Each ACE stores at least a tuple (principal, permission), where "principal" is, for example, a user, group, or role. ACEs can be positive or negative, where positive ACEs grant permissions to a user and negative ACEs deny permissions to a user. This extends the tuple to a quadruple (principal, permission, negativeflag, priority), since mixing negative and positive ACEs requires a priority to be defined, as several ACEs can match for any given user.
Although negative ACEs are supported within the RF, they are not currently used. This is because they proved rather confusing for users.
A permission is usually granted for a user on a resource if he or she is the owner of the ACL or if at least one (positive) ACE exists that matches the given user and permission for the given resource. If a resource has no ACL assigned to it, the resource's parent ACL is used. This process of inheritance continues until either an ACL is found or the root resource is reached. According to this mechanism, a resource inherits its ACL from its parent if it does not have an ACL on its own.
If an ACL exists, all permissions are implicitly denied to users who are not owners of the ACL. Only those users explicitly stated by an ACE are granted permission.
If no ACL exists, all permissions are implicitly granted to all users.
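The permission check with ACL inheritance described above can be sketched as follows. AclSketch, its nested classes, and the permission names used with it are hypothetical; the sketch covers only positive ACEs and direct principal matches (no groups or roles) and is not the RF's security implementation.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the check described above; not the RF's security API. */
final class AclSketch {
    static final class Acl {
        final Set<String> owners = new HashSet<>();
        final Set<String> entries = new HashSet<>();   // positive ACEs stored as "principal|permission"

        Acl(String owner) {
            owners.add(owner);                         // an ACL has at least one owner
        }

        Acl grant(String principal, String permission) {
            entries.add(principal + "|" + permission);
            return this;
        }
    }

    static final class Node {
        final Node parent;   // null for the root resource
        Acl acl;             // null if the resource has no ACL of its own

        Node(Node parent) {
            this.parent = parent;
        }
    }

    static boolean isGranted(Node resource, String principal, String permission) {
        // Walk up the hierarchy until an ACL is found or the root has been passed.
        Acl acl = null;
        for (Node n = resource; n != null && acl == null; n = n.parent) {
            acl = n.acl;
        }
        if (acl == null) {
            return true;     // no ACL anywhere: everything is implicitly granted
        }
        return acl.owners.contains(principal)
            || acl.entries.contains(principal + "|" + permission);
    }
}
```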
See also: How to use the RF's Client API
If two users were to modify the same resource at the same time, this would lead to the "lost update problem": User A edits resource R, while
user B does the same and saves R, then A saves R. The updates of user B for R are lost.
A mechanism for serializing access is therefore required:
The RF offers locks in order to achieve serialization: An application issues a lock request for a resource and user. If there is no other blocking lock (for another user) on this resource, the application can obtain the lock. If a blocking lock already exists for the resource (that is, for another user), the request is denied. Furthermore, access of a particular kind (write or read) is only allowed for users who obtain the relevant lock.
There are three important aspects that affect the behavior of locks:
Scope
Kind
Depth
For example, a shallow, exclusive write lock prevents any other modifying access to the resource and blocks any other write lock from being obtained while allowing further read locks.
If a collection is locked, namespace operations are also considered as locked operations. In other words, in the given example, creating, deleting, moving or renaming would also be blocked.
The scope of a lock controls how many users can request locks for a resource, while the kind determines the intended type of operation and specifies how many locks of this kind are permitted.
Currently, only write locks are used in the RF.
Locks can also have a lock timeout. When this timeout expires, the lock is automatically released. This can be very useful in a distributed Web scenario if a remote application has locked a resource but cannot release it because the remote application's connection to the server has become unavailable.
Once an application has obtained a lock for a resource, it must not lock the resource again. Therefore, methods are provided to check whether a lock already exists for a resource and the user to whom it is assigned. If it is a lock with a timeout, the application should refresh the lock instead of relocking the resource. This extends the lock's timeout.
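The interplay of lock requests, timeouts, and refreshing can be sketched with a minimal in-memory lock table. LockSketch is hypothetical and models only shallow, exclusive write locks; the current time is passed in explicitly so that the behavior is easy to follow, which is an assumption of this sketch and not how the RF's lock methods look.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lock table for exclusive write locks; not the RF's lock API. */
final class LockSketch {
    private static final class Lock {
        final String user;
        long expiresAt;

        Lock(String user, long expiresAt) {
            this.user = user;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Lock> locks = new HashMap<>();

    /** Returns the user holding an unexpired lock on the RID, or null if there is none. */
    String lockedBy(String rid, long now) {
        Lock lock = locks.get(rid);
        if (lock != null && lock.expiresAt <= now) {
            locks.remove(rid);   // timeout expired: the lock is released automatically
            lock = null;
        }
        return (lock == null) ? null : lock.user;
    }

    /** Requests a lock; the request is denied while any unexpired lock exists. */
    boolean lock(String rid, String user, long now, long timeoutMillis) {
        if (lockedBy(rid, now) != null) {
            return false;
        }
        locks.put(rid, new Lock(user, now + timeoutMillis));
        return true;
    }

    /** Extends the timeout of a lock that the user already holds, instead of relocking. */
    boolean refresh(String rid, String user, long now, long timeoutMillis) {
        if (!user.equals(lockedBy(rid, now))) {
            return false;
        }
        locks.get(rid).expiresAt = now + timeoutMillis;
        return true;
    }
}
```

In the lost update scenario, user B's lock request is denied while user A holds the lock, so B cannot save R until A has released the lock or the timeout has expired.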
See also: How to use the RF's Client API
As shown in Renaming and Moving, some applications might need to keep track of changes to 'their' resources.
To support this requirement, the RF offers several events for resources.
Although the RF generates these events, which it defines (see below), clients cannot rely on events being supported for every resource. This is because events might be disabled (in the configuration) or as a result of some operations in the backend system that are neither routed through the RF nor reported by the repository.
In order to receive events, a client must first register itself as an event receiver with the event broker of the relevant resource's repository. When a client registers, it can choose from the following two modes:
The following types of events are defined within the RF for listed operations:
The events are sent as pre-events before an operation starts and as post-events after the operation has been completed. If the operation fails, no post-event is sent. To identify events that belong together (for example, pairs of pre- and post-events), events can provide a correlation ID.
As well as its type, which defines the operation causing the event, the event contains the resource the operation was applied to.
A client that uses the resource as passed in the event must be aware that some operations invalidate the resource - namely a delete operation destroys the resource. If the resource has been invalidated, only its RID can be retrieved.
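The pairing of pre- and post-events through a correlation ID can be sketched as follows. EventSketch, its Event class, and the Receiver interface are hypothetical stand-ins for the RF's event broker and receivers, and the event type is reduced to a plain string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical broker sketch; not the RF's event API. */
final class EventSketch {
    static final class Event {
        final String type;            // the operation causing the event
        final String rid;             // the resource the operation is applied to
        final boolean pre;            // true for a pre-event, false for a post-event
        final String correlationId;   // shared by events that belong together

        Event(String type, String rid, boolean pre, String correlationId) {
            this.type = type;
            this.rid = rid;
            this.pre = pre;
            this.correlationId = correlationId;
        }
    }

    interface Receiver {
        void received(Event event);
    }

    private final List<Receiver> receivers = new ArrayList<>();

    void register(Receiver receiver) {
        receivers.add(receiver);
    }

    /** Sends a pre-event, runs the operation, and sends a post-event only on success. */
    void perform(String type, String rid, Runnable operation) {
        String correlationId = UUID.randomUUID().toString();
        send(new Event(type, rid, true, correlationId));
        operation.run();   // if this throws, the post-event below is never sent
        send(new Event(type, rid, false, correlationId));
    }

    private void send(Event event) {
        for (Receiver receiver : receivers) {
            receiver.received(event);
        }
    }
}
```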
See also: How to use the RF's Client API
When the RF matured from a document framework to a business object framework, a problem arose. Two aspects of business objects had to be covered:
Something like a type cast mechanism was therefore needed within the RF to "cast" a resource (which represents the unified aspects of a business object) to any type of semantic object (which represents the application-specific aspects of a business object), and vice versa.
It might be too expensive for a repository to expose its business objects as Java objects that implement all potentially relevant semantic objects they are able to represent. Because of this, an ordinary Java type cast cannot be used.
A piece of code is therefore needed to convert a resource into a specific semantic object (that is, to convert the internal handle of a repository to the real business object). This "converter" or object factory encapsulates the knowledge of how to convert a given resource into a specific semantic object. It simply de-serializes the specific semantic object from the resource (for example, using properties or the resource's content). The object factories have to register themselves with the RF during start-up in a semantic object factory registry, so that the RF is aware of all available object factories (and therefore of all semantic objects into which a resource can be converted).
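The role of the object factories and their registry can be sketched as follows. SemanticRegistrySketch, its Resource stand-in, and the NewsItem semantic object are hypothetical; the sketch assumes one factory per semantic object type and is not the RF's registry API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry sketch; not the RF's semantic object factory registry. */
final class SemanticRegistrySketch {
    /** Stand-in for the unified aspect of an RF object: an RID and string properties. */
    static final class Resource {
        final String rid;
        final Map<String, String> properties;

        Resource(String rid, Map<String, String> properties) {
            this.rid = rid;
            this.properties = properties;
        }
    }

    /** Example semantic object (the application-specific aspect). */
    static final class NewsItem {
        final String headline;

        NewsItem(String headline) {
            this.headline = headline;
        }
    }

    private final Map<Class<?>, Function<Resource, ?>> factories = new HashMap<>();

    /** Called by an object factory during start-up to make itself known. */
    <T> void register(Class<T> type, Function<Resource, T> factory) {
        factories.put(type, factory);
    }

    /** "Casts" a resource to a semantic object, or returns null if no factory is registered. */
    <T> T as(Resource resource, Class<T> type) {
        Function<Resource, ?> factory = factories.get(type);
        return (factory == null) ? null : type.cast(factory.apply(resource));
    }
}
```

A factory for NewsItem would, for example, read the headline from a property of the resource and construct the semantic object from it.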
The reverse method of casting a semantic object back to a resource is not yet available in the RF, but will be made available in the future. In a very similar way to JNDI, where object factories are used to retrieve objects from the persistency and state factories are used to store objects in the persistency, a state factory will be used in the RF to serialize a semantic object into the properties or content of a resource. The state factory encapsulates the knowledge of how to convert a specific semantic object into a resource.
See also: How to use the RF's Client API
This section describes the several versioning concepts within the RF. These are similar to those of WebDAV as described in [RFC3253].
When a resource is updated, either by editing its content or changing a "dead" property value, the old content or property value is lost.
When resources are put under version control, the old values are preserved when the resource is updated. Whereas namespace operations (see Creating, Copying, and Deleting) change the RID, changes to content or "dead" properties change the version-controlled state of the resource. The version-controlled state is therefore at least all content (structured and unstructured) of the resource. Versioning allows the preservation of snapshots of these states over time.
A versionable resource is a resource that can be put under version control. When the resource is placed under version control (when version control is enabled for this resource), it becomes a version-controlled resource (or VCR for short) with the initial check-out state "checked-in". In addition to this, an initial version resource (also called revision or version) is created with a new RID.
Only checked out VCRs can be changed. When a VCR is checked out, the VCR's check-out state is changed to "checked out" and the current version-controlled state of the VCR is marked as the checked out version.
After content or property changes have been applied, the checked out resource has to be checked back in again. This causes the VCR's check-out state to revert to "checked-in", and creates a new version from the version-controlled state of the checked in resource: Every time a checked out resource is checked in, version control creates a new version of the resource with its own RID - the behavior for generating these RIDs is repository specific.
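The check-out/check-in cycle described above can be illustrated with a small state model. This is a sketch under simplifying assumptions: SketchVcr is a hypothetical class, the version-controlled state is reduced to a single content string, and versions are numbered instead of receiving repository-specific RIDs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a version-controlled resource (VCR).
final class SketchVcr {
    private final List<String> versions = new ArrayList<String>(); // preserved snapshots
    private String content;
    private boolean checkedOut = false;

    // Enabling version control creates the initial version; the state is "checked in".
    SketchVcr(String initialContent) {
        this.content = initialContent;
        versions.add(initialContent);
    }

    void checkOut() {
        if (checkedOut) {
            throw new IllegalStateException("already checked out");
        }
        checkedOut = true;
    }

    // Only a checked out VCR can be changed.
    void setContent(String newContent) {
        if (!checkedOut) {
            throw new IllegalStateException("VCR is checked in");
        }
        content = newContent;
    }

    // Check-in snapshots the version-controlled state as a new version.
    int checkIn() {
        if (!checkedOut) {
            throw new IllegalStateException("VCR is not checked out");
        }
        versions.add(content);
        checkedOut = false;
        return versions.size(); // 1-based number of the new version
    }

    boolean isCheckedOut() { return checkedOut; }
    String getContent() { return content; }
    String getVersionContent(int versionNumber) { return versions.get(versionNumber - 1); }
    int getVersionCount() { return versions.size(); }
}
```

In this model the old content is never lost: after a check-out, an edit, and a check-in, getVersionContent(1) still returns the initial content while getVersionContent(2) returns the edited one.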
As each subsequent check-in generates a new version, a version history builds up over time. In the figure above, a linear version history is
shown, where v1 has been edited and became v2, then v3 and the current version is v4. The first version,
which started the version history (v1 in the example), and was created when version control was enabled for the resource, is also called the
root version. As well as the version history, the VCR also supplies the content and the properties of the current version, which is the version-controlled state of the latest checked in version (if not updated from another version).
When a VCR is checked in, this checked in VCR's version-controlled state becomes the successor of the checked out version. The previously
checked out version becomes the predecessor of the version created when the VCR was checked in (that is, v3 is the successor of v2
and v2 is the predecessor of v3). An ancestor of one version (for example, v3) is any version that is connected
to this one version by one or more successor relations (that is, v1 and v2 are ancestors of v3). Conversely,
descendants of one version are versions that are connected to this one version by one or more predecessor relations (that is, v3 and
v4 are descendants of v2).
A version history with each version having at most one successor (or no successor for the current version) is a linear version history, as shown in the example above.
When a second successor is added to a version, this creates a fork or branch in the version history -
v3 and v4 are both successors to v2 in the example below.
If a version is created with more than one predecessor, this creates a merge in the version history, that is, v7 is merged from
v5 and v6.
Both branching and merging are described in further detail below (see Advanced Versioning).
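The predecessor, successor, ancestor, and descendant relations, as well as forks and merges, can be expressed as a small graph. The following sketch is illustrative only: SketchVersionHistory and its methods are hypothetical, and versions are identified by plain names rather than RIDs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory version history.
final class SketchVersionHistory {
    private final Map<String, List<String>> predecessors = new HashMap<String, List<String>>();
    private final Map<String, List<String>> successors = new HashMap<String, List<String>>();

    // Adds a version. Without predecessors it is the root version.
    // Predecessors must have been added before their successors.
    void addVersion(String name, String... preds) {
        predecessors.put(name, new ArrayList<String>());
        successors.put(name, new ArrayList<String>());
        for (String p : preds) {
            predecessors.get(name).add(p);
            successors.get(p).add(name);
        }
    }

    Set<String> ancestors(String name) { return walk(name, predecessors); }
    Set<String> descendants(String name) { return walk(name, successors); }

    // A fork has more than one successor; a merge has more than one predecessor.
    boolean isFork(String name) { return successors.get(name).size() > 1; }
    boolean isMerge(String name) { return predecessors.get(name).size() > 1; }

    // Collects everything reachable over one or more edges of the given relation.
    private static Set<String> walk(String start, Map<String, List<String>> edges) {
        Set<String> seen = new LinkedHashSet<String>();
        List<String> todo = new ArrayList<String>(edges.get(start));
        while (!todo.isEmpty()) {
            String next = todo.remove(todo.size() - 1);
            if (seen.add(next)) {
                todo.addAll(edges.get(next));
            }
        }
        return seen;
    }
}
```

Adding v3 and v4 with v2 as their predecessor makes v2 a fork, and adding v7 with predecessors v5 and v6 makes v7 a merge, matching the examples in the text.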
Versions can be tagged with labels in order to organize them. Versions in different version histories of different VCRs can be tagged with the same identifier. This can be used to group sets of versions across different root versions (see Advanced Versioning).
RF basic versioning supports linear version histories with in-place check-out for non-collection resources only.
When the client requests an in-place check-out for the VCR, the VCR's check-out state simply changes to "checked out". The current checked in version becomes the checked out version of the VCR and might be updated by the client.
The repository might internally generate an RID for the version that would be created on check-in in advance. Some repositories might support this expected RID being returned to the client, but this function is not supported by all repositories.
The client then changes the content and/or the properties of the checked out VCR resource "in-place". When completed, the client requests the checking in of the VCR. This causes a new version to be created from the current version-controlled state of the edited resource (its content and dead properties).
If no other version has been created in the meantime (see below), the expected RID - if generated on check-out - is used for the new version, and the VCR's check-out state is set to "checked in" again.
If the user did not lock the VCR (see Locking), it might be possible for a second user to edit the resource, check in the changes, and then check the resource out again. If the (first) user then edits the resource and checks it in (after the second user has performed a check-in/check-out), the version is a successor of the version created at the second user's check-in. To avoid such a situation, the version's expected RID can be used as long as it is supported by the repository.
See also: How to use the RF's Client API
A working resource can be thought of as a temporary resource on the server's side (in the backend), whereas with in-place check-out, the client has to maintain the content and properties locally.
When the client requests the check-out of a working resource, an "intermediate" VCR for this version is created, which is marked as "checked out". The client then edits this "intermediate" VCR (with the "intermediate" VCR being the working resource - it does not edit the "original" VCR as with in-place check-out).
When the client requests the check-in for the checked out "intermediate" VCR, the new successor of this version is created from the version-controlled state of the "intermediate" VCR, and the "intermediate" VCR is then released.
Working resources support a non-linear version history and enable users to work on different versions in parallel.
A branching scenario might then look like this:
While the first check-out/check-in (check-out 1 and check-in 1) creates a new version (v21) as described above, the second check-out (check-out
2) also refers to v1. When VCR2 is checked in again, the new version (v22) is created as another successor of
v1.
Branching can also be achieved explicitly by checking out a specific version (not the current version as in the example above).
This feature is useful if a given version has to be maintained while regular changes continue. Consider some development source code: When a code version is released to the customer, one branch might be used for bug fixes only, while another branch might be used for normal development and code enhancements.
As explained above, the repository generates the RID of a checked out resource internally - the client is not able to control the namespace. This might be irrelevant for a single resource, but if a hierarchy of resources is to be checked out by the client, it is useful to preserve the namespace of such a hierarchy with the checked out resources, too.
The workspace concept was therefore introduced: A workspace is a collection, and each resource belonging to the workspace is a workspace-controlled resource (WCR for short). A workspace is a bit like a temporary folder on the server, in which the client can store resources.
When the client requests that a workspace be created, it supplies a name for the workspace - the workspace is then created on the server side with a server generated RID. The client can then create non-versioned resources and collections as in a normal collection and can also create VCRs for specific versions in that workspace. The client is therefore able to control the RID of the working resources when creating a version-controlled resource (VCR) as a working resource inside the workspace.
There is one major restriction for working resources within a workspace: There are no VCRs that share their version history. In other words, a workspace does not contain VCRs for different versions of the same resource. Each version referenced by a VCR inside a workspace belongs to a different version history.
This feature is useful - especially when used with labels - if a set of versioned resources that are related to each other has to be changed. A label can be used by the client to identify the versions that belong together, for example, by tagging a specific list of changes (change list).
The picture shows three resources (R1, R2 and R3) in the resource hierarchy, with their versions. When checking out
the versions of those resources, a client might use a workspace in order to preserve the hierarchy (by creating parent collections for the three versions to check out
in the workspace). A label (CL4711) identifies the versions that belong together on the time line (R1v5, R2v4,
R3v2).
Collections can support auto versioning: If enabled, any change to the children (which are VCRs) implies that a new version is created automatically for the VCR - without a check-out/check-in request issued by the client (so versioning-unaware clients also create new versions for versioned resources).
Automatic version control is different to auto versioning: A collection with this feature enabled automatically places each newly created plain child resource under version control.
While the state of a plain resource usually consists of its content and dead properties, the state of a collection also includes the local names of the collection's children (and their ordering, if it is an ordered collection as described in Ordered Collections).
Therefore, if version control is enabled on a collection, a version of a version-controlled collection (VCC, or versioned collection) also preserves its children's names (with ordering) - in addition to the dead properties and content. In other words: the version-controlled state of a VCC also contains its children's names and ordering information.
In the case of the list of children's names of a VCC's version, only the versioned children's names (which are VCRs) are saved. The non-versioned children are listed as children, but their names are not stored with each version of the VCC.
The graphic above shows a version-controlled collection C that initially contains R3 as a non-versioned child and
R2 as a versioned child.
A new version-controlled resource, R1, is then created. C has to be checked out first in order to create R1.
Then C is checked in again.
If a version for a VCC's child VCR (R1 or R2) is created, no version is created for the VCC. Also, if a new non-versioned resource
(R4) is created, no new version is created for the VCC.
This results in the following versions for C:
C version 1 contains R2 and R3 as children
C version 2 contains R1, R2 and R3 as children
C version 2 (Cv2´) contains R1, R2, R3 and R4 as children after R4 has been created, but it is not a new version.
Because the names of non-VCR children are not saved in versions of the VCC, the following situation can occur: If a versioned resource once existed for a given RID as a child of the collection but was then deleted, and a new resource without version control is later created with the same RID, the resource that was once versioned is eclipsed by the newer, non-versioned resource.
In the graphic above, assume that the last version of R1 (R1v2) has been deleted and a new resource R1', with the
same RID as R1, has been created. R1' now eclipses R1.
As depicted in the graphic above, only the references to the names of its child VCRs are saved as the VCC's version-controlled state. It is therefore not possible to preserve information on which versions of the VCC's child VCRs belong to a VCC's version: Assume that Cv2 originally referenced R1
with R1v1 as the current version, R2 with R2v1 as the current version, and R3, whereas Cv3
references R1 with R1v2 as the current version instead, since a new version for R1 has been checked in. If C's
current version state were now to be retrieved from Cv2, it would still reference R1 with R1v2 (as well as
R2v1 and R3, of course), because Cv2 does not contain the information that R1 referenced R1v1
when Cv2 was created.
Such a baseline feature is not yet supported by the RF.
This section focuses on the main RF building blocks.
From an application point of view, the RF offers access to the resources stored in the information sources by unified methods. Built-in or plugged-in extensions offer additional aspects of the unified RF resources. Some other plug-ins enable the casting of unified resources onto objects that expose semantically specialized aspects of an information object to the application.
Internally, the RF's core consists of various registries where the different extensions are plugged-in.
There are the following types of RF extensions:
text/html.
There are other types of extensions that do not belong to the RF itself, but are offered by the built-in extensions of the RF:
ISchedulerTasks as plug-ins that are run by the scheduler (see RF Services)
The RF is built in several layers and relies on other software layers, as shown in the graphic below:
For historical reasons, two flavors of the RF API layer exist - the old (current) API and the new API:
Current API: com.sapportals.wcm.repository.* and com.sapportals.wcm.service.*
New API: com.sap.netweaver.bc.rf.common.*, com.sap.netweaver.bc.rf.ci.*, and com.sap.netweaver.bc.rf.mi.*
The current API's client interfaces (on the left-hand side) specify the current interfaces as used by RF applications. These interfaces are described in How to use the RF's Client API in further detail. The new API's client interfaces (in the middle) will probably not be released before Netweaver'05, and possibly later, and are not yet covered by this document.
The current API is used for the development of applications and all RF extensions except repository managers.
The new API's repository manager interfaces (on the right-hand side), and the common interfaces used by them, are now released in EP 6.0 SP1 for restricted use. They are not yet finalized - small changes (such as adding methods for mass operations in order to increase performance) might still be applied to these APIs. The transition for the utility interfaces is also not yet completed. The new API will be released in SP2 for unrestricted availability.
The new API is currently (EP 6.0 SP1) used for the development of repository managers only.
The layer of the additional RF libraries includes some utility packages that are used by the RF. It also contains the component runtime. This layer does not deal with RF objects. However, since the RF uses this layer, and because it is developed by the RF team, it became an additional part of the RF API.
com.sapportals.wcm.util.*
New API: com.sap.netweaver.bc.rf.util.*
com.sapportals.wcm.crt.*
The layer with other frameworks and utilities contains:
com.sap.tc.logging.*
com.sapportals.config.*
com.sap.exception.*
com.sapportals.portal.security.usermanagement.* for the old user management and com.sap.security.api.* for the new user management
Documentation for these components is available in the JavaDoc for these packages.
Most of the concepts described in this section apply to the new API only. Since the new API's client interfaces are not released yet, this section might be most interesting for developers who want to develop a repository manager.
Section 1 described the various aspects of RF objects.
The new API reflects these aspects: The packages for the common (com.sap.netweaver.bc.rf.common), manager
(com.sap.netweaver.bc.rf.mi), and client (com.sap.netweaver.bc.rf.ci) interfaces are grouped accordingly:
namespace (includes property search)
content
properties
idmapper
lock
security
type
version
Not all the operations of an aspect are applicable for all RF objects. Some aspects or operations of an aspect might be only applicable for RF objects of a specific repository or only for specific RF objects.
For example, files on a CD-ROM drive might only be exposed as read-only by a repository manager. Or it might be that a file cannot be placed under version control because the repository manager does not support it. It is also possible that a file system repository might not support content operations for collections and links.
To reflect these restrictions, supported options are used: Repository managers report the options that they do support to the RF.
In the current API, the possible options are defined in a single SupportedOption enumeration in the package com.sapportals.wcm.framework.enum. With the new API, the SupportedOption classes are separated for each aspect and are
therefore located in the respective aspect's com.sap.netweaver.bc.rf.common package (for example,
com.sap.netweaver.bc.rf.common.namespace.SupportedOption for supported namespace operations).
If an operation that is not supported by the repository manager is requested on a resource, the repository manager throws a
NotSupportedException (in the current API) or an OperationNotSupportedException (in the new API).
Although clients can retrieve the list of a resource's supported options, this list does not necessarily reflect the list of unsupported operations for the client, because the RF might map certain operations that are not supported by a repository itself to its internal default implementation of the operation (for example, if a repository does not support the copy operation, the RF maps it to an internal implementation that uses the create operation instead).
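The interplay between supported options and the RF's default implementations can be sketched as follows. All names here (SketchOption, SketchRepository, SketchFramework) are hypothetical, and the JDK's UnsupportedOperationException merely stands in for NotSupportedException or OperationNotSupportedException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical option names; the real SupportedOption values are not listed in this text.
enum SketchOption { CREATE, COPY }

// Hypothetical repository manager that reports the options it supports.
final class SketchRepository {
    private final Set<SketchOption> options;
    final Map<String, String> store = new HashMap<String, String>();

    SketchRepository(Set<SketchOption> options) { this.options = options; }

    Set<SketchOption> getSupportedOptions() { return options; }

    void create(String rid, String content) {
        require(SketchOption.CREATE);
        store.put(rid, content);
    }

    void copy(String fromRid, String toRid) {
        require(SketchOption.COPY);
        store.put(toRid, store.get(fromRid));
    }

    private void require(SketchOption option) {
        if (!options.contains(option)) {
            // Stand-in for NotSupportedException / OperationNotSupportedException.
            throw new UnsupportedOperationException(option + " not supported");
        }
    }
}

// Hypothetical framework layer that falls back to a default implementation.
final class SketchFramework {
    static void copy(SketchRepository repo, String fromRid, String toRid) {
        if (repo.getSupportedOptions().contains(SketchOption.COPY)) {
            repo.copy(fromRid, toRid);
        } else {
            // Default implementation: emulate copy by means of create.
            repo.create(toRid, repo.store.get(fromRid));
        }
    }
}
```

A client that inspected only getSupportedOptions() would conclude that copying is impossible for a repository without COPY, although the framework-level copy still succeeds. This is why the list of supported options is not the same as the list of operations available to the client.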
Within the new API, all operations are additionally grouped as read-only or read-write in separate interfaces. If the read-only operations are
located in an interface, their corresponding read-write operations are located in a mutable interface (prefixed by
Mutable).
For example, operations for reading the content of a resource from a repository manager are located in the interface IContentManager, whereas
the operations for writing content are located in IMutableContentManager.
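The split into read-only and mutable interfaces might look like the following sketch. The interface names are deliberately different from the RF's IContentManager and IMutableContentManager, because the method signatures and the inheritance between the two interfaces are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Read-only operations. Names and signatures here are hypothetical.
interface ISketchContentManager {
    String getContent(String rid);
}

// Read-write operations live in the counterpart prefixed with "Mutable".
// That it extends the read-only interface is an assumption of this sketch.
interface IMutableSketchContentManager extends ISketchContentManager {
    void setContent(String rid, String content);
}

// A CD-ROM style repository exposes the read-only interface only.
final class CdRomContentManager implements ISketchContentManager {
    public String getContent(String rid) {
        return "static content of " + rid;
    }
}

// A writable repository implements the mutable interface as well.
final class FileContentManager implements IMutableSketchContentManager {
    private final Map<String, String> files = new HashMap<String, String>();

    public String getContent(String rid) { return files.get(rid); }
    public void setContent(String rid, String content) { files.put(rid, content); }
}

final class SketchContentClient {
    // Writes only if the manager offers the mutable interface.
    static boolean tryWrite(ISketchContentManager manager, String rid, String content) {
        if (manager instanceof IMutableSketchContentManager) {
            ((IMutableSketchContentManager) manager).setContent(rid, content);
            return true;
        }
        return false;
    }
}
```

With this layout, a repository that is read-only by nature never has to implement write methods that could only fail.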
The new API uses the SAP exception framework as the foundation for its exception handling.
The RF's new API offers one central exception:
com.sap.netweaver.bc.rf.common.exception.RepositoryException.
Two exceptions are derived from this exception:
OperationNotCompletedException, which is used for multi status results (see Mass Calls)
ResourceException, which is the base class of all RF exceptions in the current API.
All other exceptions used throughout the new API are derived from this ResourceException.
Exceptions that are not specialized for one aspect belong to the common exception package (com.sap.netweaver.bc.rf.common.exception) and
can be found there.
Exceptions specialized for one aspect (for example, a WrongVersionException is only used by the version package), are placed in
the aspect's package.
The most frequently used exceptions in the RF are:
NotSupportedException / new: OperationNotSupportedException: for example, enableVersioning() is called, but the repository does not support versioning.
MethodNotAllowedException: for example, setContent() is not supported for a collection, but it is supported for a plain resource.
IOErrorException / new: IOOperationFailedException
ResourceNotFoundException: methods such as getResource() do not throw this exception but return null instead (if the resource is not found) or throw an IOErrorException to indicate a problem with the backend system.
AccessDeniedException: updateContent() throws an AccessDeniedException.
When the RF's new API was designed, implementing the operations for all those aspects of RF objects would have led to multiple overloads of the RF
object's operations. For example, the create operation must be made available for plain resources, collections, links, versions, and so
on.
In order to avoid a large number of overloaded methods that would differ only in certain details, and in order to keep a lean API, descriptors were
used instead of overloading. Descriptors group coherent parameter blocks together in one reusable unit. In the example above for the create
operation, there are appropriate descriptors for creating a resource, collection, link or version - but only one create method.
The "overloading" is now carried out in the various constructors for the descriptors. This allows the existing API to remain unchanged when new aspects are introduced. Only the new aspect and its new descriptors have to be introduced in the API.
In the current API, descriptors are not used consistently; they are used for certain operations (for example, as ICopyParameter for the
copy operation).
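The descriptor approach can be illustrated with a single create method that accepts different descriptor types. This is a sketch only: the class names below are hypothetical and do not reproduce the RF's actual descriptor classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical descriptor base: each descriptor bundles the parameters of one kind of create.
abstract class CreateDescriptor {
    final String name;

    CreateDescriptor(String name) { this.name = name; }

    abstract String describe();
}

final class ResourceCreateDescriptor extends CreateDescriptor {
    final String content;

    ResourceCreateDescriptor(String name, String content) {
        super(name);
        this.content = content;
    }

    // The "overloading" happens in the constructors, not in the API method.
    ResourceCreateDescriptor(String name) { this(name, ""); }

    String describe() { return "resource " + name + " (" + content.length() + " chars)"; }
}

final class CollectionCreateDescriptor extends CreateDescriptor {
    final boolean ordered;

    CollectionCreateDescriptor(String name, boolean ordered) {
        super(name);
        this.ordered = ordered;
    }

    CollectionCreateDescriptor(String name) { this(name, false); }

    String describe() { return (ordered ? "ordered " : "") + "collection " + name; }
}

final class LinkCreateDescriptor extends CreateDescriptor {
    final String targetRid;

    LinkCreateDescriptor(String name, String targetRid) {
        super(name);
        this.targetRid = targetRid;
    }

    String describe() { return "link " + name + " -> " + targetRid; }
}

final class SketchNamespaceManager {
    final List<String> log = new ArrayList<String>();

    // The single create method; a new kind of object only needs a new descriptor class.
    void create(CreateDescriptor descriptor) {
        log.add(descriptor.describe());
    }
}
```

Supporting a further kind of object, such as a version, would mean adding one more descriptor class while the signature of create stays untouched.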
Despite its advantages, this solution has a drawback: A descriptor is an additional object that has to be allocated and, once it is no longer referenced, removed from memory by the garbage collector. To minimize housekeeping efforts, descriptor objects should be reused where possible.
Therefore, two simple rules should be adhered to when using descriptors:
Within the current API, typed collections and iterators have been used where possible. For example, the returned list of a collection's
getChildren() method call is an IResourceList. The corresponding Iterator for such an IResourceList is an
IResourceListIterator.
With the new API, un-typed collections and iterators from the standard JDK API are used instead in order to reduce the number of classes in
the RF API. For example, a collection's getChildren() method returns a java.util.List.
For performance reasons, mass calls are also included in the RF. Mass calls allow clients to apply an operation on several resources at once, instead of applying the operation in a loop. An example is retrieving a specific property for a list of resources.
Applying the same operation with one mass call might be more efficient than executing several calls one after the other, especially when sending a request to a remote backend system.
Apart from the improved performance, the only special feature of mass calls is error handling: When an error occurs (for example, if retrieving the requested property causes an error for one specific resource), the mass call continues trying to apply the operation on the remaining resources (that is, tries to retrieve the requested property for the remaining resources).
When the mass call completes without errors, it returns a list of results (usually a list of items or a map, using resource handles as keys and the corresponding return value of the (single call) operation for the key resource as values).
In the current API, mass calls were only used for setting properties (or for calls from the RF to the repository managers). If
an error occurs (when several properties were set), a special exception with a list of results is thrown
(SetPropertiesException in package com.sapportals.wcm.repository).
If an error occurs in the new API, an OperationNotCompletedException is thrown (package
com.sap.netweaver.bc.rf.common.exception). This exception includes a multi status error result that consists of:
ResourceExceptions for the failed resources to which the operation could not be applied (getThrowables()). Each
ResourceException contains the RID of the resource (getRID()) and the Throwable (getCause())
that caused the error.
The partially computed result (getPartiallyComputedResult()).
The RF defines all API calls as atomic, as long as it is not explicitly stated otherwise. The mass calls are not atomic, which is also explicitly stated in the Java documentation.
All API calls must be atomic on their work units, which are - by definition - atomic. A work unit is, for example, an operation that is applied only to one resource handle or property.
Therefore, all API calls that process several work units non-atomically (for example, mass calls) have to report both the committed work units and the rolled back work units (as, for example, mass calls do with the multi status exception).
All independently executable work units (where a previous work unit is not required for a following work unit), are processed. For example, when a mass call applies an operation to several resources, this results in independent work units (one for each resource), all of which are processed.
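The error handling of mass calls, with independent work units and a multi status result, can be sketched as follows. The classes are hypothetical models of the behavior described above; only the accessor names getThrowables(), getRID(), getCause(), and getPartiallyComputedResult() are taken from the text, and their signatures here are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-resource failure: carries the RID and the cause.
final class SketchResourceException extends Exception {
    private final String rid;

    SketchResourceException(String rid, Throwable cause) {
        super("operation failed for " + rid, cause);
        this.rid = rid;
    }

    String getRID() { return rid; }
}

// Hypothetical multi status exception, modelled on the description in the text.
final class SketchNotCompletedException extends Exception {
    private final List<SketchResourceException> failures;
    private final Map<String, String> partialResult;

    SketchNotCompletedException(List<SketchResourceException> failures,
                                Map<String, String> partialResult) {
        super(failures.size() + " work unit(s) failed");
        this.failures = failures;
        this.partialResult = partialResult;
    }

    List<SketchResourceException> getThrowables() { return failures; }
    Map<String, String> getPartiallyComputedResult() { return partialResult; }
}

final class SketchMassCall {
    // Reads one property for many resources; each resource is an independent work unit.
    static Map<String, String> getProperty(Map<String, Map<String, String>> repository,
                                           List<String> rids, String propertyName)
            throws SketchNotCompletedException {
        Map<String, String> results = new LinkedHashMap<String, String>();
        List<SketchResourceException> failures = new ArrayList<SketchResourceException>();
        for (String rid : rids) {
            try {
                Map<String, String> props = repository.get(rid);
                if (props == null) {
                    throw new IllegalArgumentException("no such resource");
                }
                results.put(rid, props.get(propertyName));
            } catch (RuntimeException e) {
                // Record the failure and continue with the remaining resources.
                failures.add(new SketchResourceException(rid, e));
            }
        }
        if (!failures.isEmpty()) {
            throw new SketchNotCompletedException(failures, results);
        }
        return results;
    }
}
```

A caller that catches the multi status exception can still use the values in getPartiallyComputedResult() and retry or report only the RIDs listed in getThrowables().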
As is usual in Java, most of the RF's definitions are interfaces. For most of the RF interfaces, a default implementation class is provided for the
corresponding interface, for example, Content for IContent in the com.sap.netweaver.bc.rf.common.content
package.
Only exceptions, enumerations, and some parameter structures (for example, Position in package
com.sap.netweaver.bc.rf.common.namespace for the definition of a resource's position in an ordered collection) have no corresponding
interfaces.
Neither RF applications nor RF extensions should depend on one specific implementation of an RF interface. For example, a repository's content submanager
(see RF Extensions) should not check for its own Content implementation on setContent(), because the client might have used the
default implementation or its own implementation.
Exceptions to this rule are:
IResourceHandle (see RF Extensions)
IQueryExpressions (see RF Extensions)