Working with Publishing Hubs
A Publishing Hub is one or more installations of Percussion that have been dedicated exclusively to publishing. Publishing Hubs are often used in deployments with a high volume of content contributors or publishing activity, to improve overall system performance. When a Publishing Hub is properly configured, all Scheduled Publishing activities are shifted from the primary editorial server (also known as the System Master) to the Publishing Hub server.
Publishing Hubs are available for 6.7 and were enhanced to enable multi-hub deployments in version 7 of the product. For best performance, it is recommended to upgrade to the current release before deploying new Publishing Hubs.
Example Deployment: In the diagram above, a single System Master server serves the content editors who are creating content and performing On Demand Publishing. All scheduled publishing and editions are offloaded to one of four dedicated Publishing Hub servers operating in a load-balanced cluster.
Understanding Publishing Performance
Many factors affect the publishing performance of a CM System implementation:
- Template Design.
- Edition Size.
- Parallel Editions.
- Database Performance.
- Disk I/O.
- Network Latency.
- CPU Performance.
When an edition runs, the content repository is queried for the items linked to any content lists associated with the edition, and this list is queued for Assembly into pages. As pages are assembled, the back-end content repository is queried for the content to assemble into Pages via Velocity code from the template's component items, slots, snippets, relationships, etc., and the assembled pages are written to temporary locations on the local disk. Once all pages associated with an Edition have been written to temporary files, the content is transferred to its final delivery location. Finally, any Post Edition tasks are run, often an rsync or other scripted network transfer that pushes content to its final location.
On a busy server with a heavy publishing schedule or a heavy volume of editorial activity, this publishing process can create contention for resources on the CM System server, resulting in slower overall publishing performance or a slower editorial experience for end users.
How Publishing Hubs Help
Publishing Hubs help overall system performance by offloading scheduled publishing editions to a separate server. This frees CPU and disk I/O on the System Master server for content editorial tasks and shifts publishing tasks to the separate Publishing Hub CPUs.
Multiple Publishing Hubs also enable parallel editions, so that Publishing Schedules can safely overlap and be published in parallel. One good method for estimating the number of Publishing Hubs needed in a deployment is to determine how many editions must run in parallel to deliver content simultaneously. This is an especially common requirement in multi-site deployments.
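As a rough illustration of that estimate, the sketch below counts the largest number of edition runs that overlap at any moment. It assumes you can list a start and end time for each scheduled edition; the function name and the sample schedule are hypothetical and are not part of CM System.

```python
from datetime import datetime


def peak_parallel_editions(windows):
    """Return the largest number of edition runs that overlap at any instant.

    `windows` is a list of (start, end) datetime pairs, one per edition run.
    """
    events = []
    for start, end in windows:
        events.append((start, 1))    # an edition starts
        events.append((end, -1))     # an edition finishes
    # Sort by time; at identical times, finishes (-1) sort before starts (+1),
    # so back-to-back editions are not counted as overlapping.
    events.sort(key=lambda event: (event[0], event[1]))

    running = 0
    peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Hypothetical nightly schedule for three sites.
schedule = [
    (datetime(2024, 1, 1, 1, 0), datetime(2024, 1, 1, 2, 30)),
    (datetime(2024, 1, 1, 1, 30), datetime(2024, 1, 1, 2, 0)),
    (datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 3, 0)),
    (datetime(2024, 1, 1, 4, 0), datetime(2024, 1, 1, 4, 30)),
]
print(peak_parallel_editions(schedule))  # prints 2
```

The peak is only a starting point: edition run times vary with content volume, so measured durations from publishing logs give a better estimate than scheduled ones.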
Publishing Hubs maintain their own schedules, isolating jobs from the System Master. Scheduled jobs may run on any Publishing Hub in the cluster that is idle when publishing is triggered.
Understanding the Role of the Database Server
The back-end database server plays an important role in overall CM System performance. A database server or database instance that is overloaded can result in poor overall system performance. Adding one or more Publishing Hubs to an already overloaded database server can actually decrease system performance, because of the increased load that all of the CM System instances place on the slower database server. CM System is a connection-oriented application, and because the vast majority of content is stored on the database server, database I/O performance is very important to CM System publishing. Before deploying a Publishing Hub, ensure that the Content Repository database instance has enough CPU, memory, storage, and connections available to handle current system loads.
Configuring a Publishing Hub
Properly configuring a Publishing Hub involves completing a series of steps.
1. Install CM System on the target server, selecting the Publishing Hub option. Use a new scratch database during the installation; the database can be dropped after step 2.
2. Set up file system synchronization from the System Master.
3. Set up time synchronization on both servers.
4. Add scheduled Editions to the Publishing Hubs.
5. Remove scheduled Editions from the System Master.
6. Disable Full Text Indexing and Aging Transitions.
Synchronizing System Master configuration to Publishing Hubs
CM System stores portions of the product's configuration data on the server's file system as well as in the Content Repository database. Adding Publishing Hubs to a CM System deployment architecture introduces an additional synchronization requirement: files must be synchronized between servers to keep the configuration data consistent across all servers in the deployment topology.
Microsoft Windows servers may use Distributed File System Replication or tools like RoboCopy or rsync. RedHat or Ubuntu users will typically use rsync to keep files synchronized.
Recommended Exclusion List
The following files and directories should be excluded from file replication or synchronization of the System Master installation tree:
- ObjectStore/.rxlocks
- ObjectStore/UserConfigurations
- sys_search/lucene
- dbg*.xml
- *.tmp
- *.jks
- console.log*
- server_run_lock
- velocity.log*
- temp
- rxconfig/Server/config.xml
- rxconfig/Server/server.properties
- rxconfig/Server/requestHandlers/agentmanager.xml
- AppServer/server/rx/deploy/rx-ds.xml
- AppServer/server/rx/tmp
- AppServer/server/rx/work
- AppServer/server/rx/log
- AppServer/server/rx/data
- AppServer/server/rx/deploy/rxapp.ear/rxapp.war/WEB-INF/config/spring/server-beans.xml
- AppServer/server/rx/deploy/jboss-web.deployer/server.xml
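For sites that synchronize with rsync, the exclusion list above might be turned into a command line along the following lines. This is a minimal sketch, not a supported Percussion tool: the helper name, the installation path, and the host name are assumptions, and the patterns are passed to rsync unchanged. The command defaults to rsync's `--dry-run` mode so the result can be reviewed before any files are copied.

```python
import shlex

# Exclusion patterns copied from the recommended list above.
EXCLUDES = [
    "ObjectStore/.rxlocks",
    "ObjectStore/UserConfigurations",
    "sys_search/lucene",
    "dbg*.xml",
    "*.tmp",
    "*.jks",
    "console.log*",
    "server_run_lock",
    "velocity.log*",
    "temp",
    "rxconfig/Server/config.xml",
    "rxconfig/Server/server.properties",
    "rxconfig/Server/requestHandlers/agentmanager.xml",
    "AppServer/server/rx/deploy/rx-ds.xml",
    "AppServer/server/rx/tmp",
    "AppServer/server/rx/work",
    "AppServer/server/rx/log",
    "AppServer/server/rx/data",
    "AppServer/server/rx/deploy/rxapp.ear/rxapp.war/WEB-INF/config/spring/server-beans.xml",
    "AppServer/server/rx/deploy/jboss-web.deployer/server.xml",
]


def build_rsync_command(source, destination, excludes=EXCLUDES, dry_run=True):
    """Build an rsync argument list that skips every excluded pattern."""
    cmd = ["rsync", "--archive"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without copying
    cmd.extend(f"--exclude={pattern}" for pattern in excludes)
    cmd.extend([source, destination])
    return cmd


# Hypothetical installation path and host name; substitute your own.
command = build_rsync_command("/opt/Percussion/", "pubhub1:/opt/Percussion/")
print(shlex.join(command))
```

Building the argument list in one place keeps the exclusions identical for every Publishing Hub; the list can be handed to `subprocess.run` once the dry-run output looks right.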
Exclusion List for 7.3.2
If JBoss has been removed (that is, the AppServer folder no longer exists), the exclusions starting with AppServer can also be removed from the list.
- ObjectStore/.rxlocks
- ObjectStore/UserConfigurations
- sys_search/lucene
- dbg*.xml
- *.tmp
- *.jks
- console.log*
- server_run_lock
- velocity.log*
- temp
- rxconfig/Server/config.xml
- rxconfig/Server/server.properties
- rxconfig/Server/requestHandlers/agentmanager.xml
- AppServer/server/rx/deploy/rx-ds.xml
- AppServer/server/rx/tmp
- AppServer/server/rx/work
- AppServer/server/rx/log
- AppServer/server/rx/data
- AppServer/server/rx/deploy/rxapp.ear/rxapp.war/WEB-INF/config/spring/server-beans.xml
- AppServer/server/rx/deploy/jboss-web.deployer/server.xml
- jetty/temp
- jetty/base/logs
- jetty/attachments
- jetty/base/etc/installation.properties (exclude only if the Publishing Hub needs a different port or SSL configuration)
- jetty/base/webapps/Rhythmyx/WEB-INF/config/spring/server-beans.xml (exclude only if the Quartz scheduler is configured differently for Publishing Hubs)
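The rule about dropping the AppServer entries once JBoss has been removed can be expressed as a small filter. The sketch below is illustrative only; the function name is hypothetical and the sample list is a shortened subset of the exclusions above.

```python
def effective_excludes(excludes, jboss_removed):
    """Return the exclusions to use, dropping AppServer entries if JBoss is gone."""
    if not jboss_removed:
        return list(excludes)
    return [pattern for pattern in excludes if not pattern.startswith("AppServer/")]


# Shortened sample of the 7.3.2 exclusion list.
sample = [
    "sys_search/lucene",
    "AppServer/server/rx/tmp",
    "AppServer/server/rx/log",
    "jetty/temp",
    "jetty/base/logs",
]
print(effective_excludes(sample, jboss_removed=True))
# prints ['sys_search/lucene', 'jetty/temp', 'jetty/base/logs']
```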
Synchronizing System Time
The Network Time Protocol (NTP) is a network protocol designed to keep the date and time synchronized between one or more servers on a network. All operating systems supported by Percussion CM System offer some form of NTP service. When configuring Publishing Hubs, it is important that the servers running all instances of Percussion are also configured to synchronize their clocks.
Microsoft Windows Server products ship with the Windows Time Service. In a Windows Domain Network this synchronization is performed by the Primary Domain Controller.
Most Linux servers support ntpd or ntpdate. Instructions for configuring synchronization for Ubuntu and RedHat servers can be found on their web sites.
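Once time synchronization is configured, a quick sanity check is to compare clock readings taken from every server at roughly the same moment. The sketch below assumes the readings have already been collected as epoch seconds (how they are gathered is outside its scope); the server names and the one-second tolerance are hypothetical values, not Percussion requirements.

```python
def max_clock_skew(readings):
    """Return the largest difference, in seconds, between any two clock readings.

    `readings` maps a server name to the epoch time reported by that server.
    """
    values = list(readings.values())
    if not values:
        return 0.0
    return max(values) - min(values)


def clocks_in_sync(readings, tolerance_seconds=1.0):
    """True when all servers agree to within the given tolerance (an assumed value)."""
    return max_clock_skew(readings) <= tolerance_seconds


# Hypothetical readings sampled at the same moment from each server.
readings = {
    "system-master": 1700000000.0,
    "pubhub-1": 1700000000.4,
    "pubhub-2": 1699999999.9,
}
print(clocks_in_sync(readings))
```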
Disable Full Text Indexing on the Publishing Hub Server
Full Text Indexing is a content editorial service that should only be performed on the System Master.
Disable Aging Transitions on the Publishing Hub Server
Aging Transitions is a content workflow service that should only be performed on the System Master. To disable it on the Publishing Hub:
- Open the file <PubhubServer_Home>\rxconfig\Server\requestHandlers\agentmanager.xml.
- Comment out the aging agent section in that file.
- Restart the Publishing Hub server if it was already running.
Coordinate Service Restarts
Publishing Hubs work in coordination with the System Master. To keep publishing operations synchronized and running smoothly, it is a best practice to stop all Publishing Hubs in the cluster before restarting the System Master, and then start the Publishing Hubs back up once the System Master has recovered and fully started.
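The restart order described above can be captured in a small orchestration routine. This is a sketch under stated assumptions: the four callables (`stop_hub`, `start_hub`, `restart_master`, `master_is_ready`) are hypothetical hooks that you would implement with your own service-control commands and health check, and the polling interval and timeout are arbitrary defaults.

```python
import time


def restart_cluster(hubs, stop_hub, start_hub, restart_master, master_is_ready,
                    poll_seconds=5.0, timeout_seconds=600.0):
    """Stop every hub, restart the System Master, then start the hubs again."""
    for hub in hubs:
        stop_hub(hub)                      # 1. stop all Publishing Hubs first

    restart_master()                       # 2. restart the System Master

    deadline = time.monotonic() + timeout_seconds
    while not master_is_ready():           # 3. wait until it has fully started
        if time.monotonic() >= deadline:
            raise TimeoutError("System Master did not become ready in time")
        time.sleep(poll_seconds)

    for hub in hubs:
        start_hub(hub)                     # 4. bring the Publishing Hubs back up


# Demonstration with stand-in hooks that only record the order of operations.
log = []
restart_cluster(
    hubs=["pubhub-1", "pubhub-2"],
    stop_hub=lambda hub: log.append(f"stop {hub}"),
    start_hub=lambda hub: log.append(f"start {hub}"),
    restart_master=lambda: log.append("restart master"),
    master_is_ready=lambda: True,
    poll_seconds=0.0,
)
print(log)
```

Passing the service operations in as callables keeps the ordering logic independent of the operating system, so the same routine can wrap Windows service commands or Linux init scripts.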
Advanced Configuration & How To's
How to Change the Scheduler Used by a Publishing Hub