RDM Discovery

From wiki.openlighting.org

Revision as of 20:42, 20 October 2011

This page documents various scenarios that should be considered when writing software that performs RDM Discovery. Unlike DMX, RDM is bi-directional, which means bugs in responders can cause a controller to crash or go into an infinite loop.

This page is not a substitute for the E1.20 standard, which takes precedence over all information presented here. It simply presents situations which designers of controllers should consider when writing their software.

Process Overview

Each responder has a unique identifier (UID) which consists of a 2 byte ESTA manufacturer ID, and a 4 byte device ID. Like a MAC address, no two responders should have the same UID. The 'all devices' UID (ffff:ffffffff) is a broadcast address, to which all devices will listen but not necessarily take action or respond.
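
The UID layout described above can be sketched as a small value type. This is only an illustration: the class name, its methods and the example UID are hypothetical and not taken from any particular library.

```python
from typing import NamedTuple


class UID(NamedTuple):
    """A 48-bit RDM UID: 2 byte manufacturer ID plus 4 byte device ID."""
    manufacturer_id: int
    device_id: int

    @classmethod
    def from_string(cls, text: str) -> "UID":
        # Parse the "mmmm:dddddddd" hex notation used on this page.
        manufacturer, device = text.split(":")
        return cls(int(manufacturer, 16), int(device, 16))

    @classmethod
    def from_int(cls, value: int) -> "UID":
        return cls((value >> 32) & 0xFFFF, value & 0xFFFFFFFF)

    def to_int(self) -> int:
        # Manufacturer ID occupies the top 16 bits.
        return (self.manufacturer_id << 32) | self.device_id

    def __str__(self) -> str:
        return f"{self.manufacturer_id:04x}:{self.device_id:08x}"


# The 'all devices' broadcast UID.
ALL_DEVICES = UID(0xFFFF, 0xFFFFFFFF)
```

Because a NamedTuple compares field by field, UIDs sort by manufacturer ID first and device ID second, which is the same ordering as comparing the 48-bit integers.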

Three RDM messages are used during the discovery process. All of them use the DISCOVERY_COMMAND Command Class.

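  • DISC_UNIQUE_BRANCH, which takes two parameters, a lower and an upper UID. Any un-muted responder whose UID falls within this range (inclusive) must respond.
  • DISC_MUTE, which tells a responder to stop replying to DISC_UNIQUE_BRANCH.
  • DISC_UN_MUTE, which tells a responder to start replying to DISC_UNIQUE_BRANCH again.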
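
As a rough sketch of the responder side, the decision to answer a DISC_UNIQUE_BRANCH can be written as a single inclusive range test. UIDs are treated as 48-bit integers here, and the function name is hypothetical. The numeric values are the ones E1.20 assigns to the command class and the three parameter IDs.

```python
DISCOVERY_COMMAND = 0x10    # Command Class
DISC_UNIQUE_BRANCH = 0x0001
DISC_MUTE = 0x0002
DISC_UN_MUTE = 0x0003


def should_respond(uid: int, lower: int, upper: int, muted: bool) -> bool:
    """Return True if a responder must answer this DISC_UNIQUE_BRANCH.

    Both endpoints are inclusive; a muted responder stays silent.
    """
    return not muted and lower <= uid <= upper
```

Getting the two comparisons wrong (for example using `<` instead of `<=`) produces exactly the kind of responder bug discussed under Failure Modes.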

There are many different ways to implement the discovery process. One (simplified) method is described below:

  1. Send a DISC_UN_MUTE to the 'all devices' UID.
  2. Send a DISC_MUTE to each previously discovered device. If a device doesn't respond after multiple attempts, remove it from the UID list.
  3. Send a DISC_UNIQUE_BRANCH with a range of (0000:00000000, ffff:ffffffff). One of three things may happen:
    • No response, which generally means there are no further responders to be muted in this range.
    • A single response with a valid checksum. In this case the controller should attempt to mute the responder (DISC_MUTE) and, if that succeeds, add the responder to the list of UIDs.
    • A collision, in which case the controller should divide the range in two, (0000:00000000, 7fff:ffffffff) and (8000:00000000, ffff:ffffffff), and repeat step 3 for each half.
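
To make "a single response with a valid checksum" concrete, here is a sketch of a decoder. It assumes the DISC_UNIQUE_BRANCH response layout as described in E1.20: up to seven 0xFE preamble bytes, a 0xAA separator, each UID byte sent twice (ORed with 0xAA, then with 0x55), and a 16-bit checksum over the 12 encoded UID bytes sent the same way. The function names are hypothetical, and `encode_response` exists only so the decoder can be exercised.

```python
from typing import Optional


def _encode_byte(value: int) -> bytes:
    # Each byte is sent as two bytes; ANDing the pair recovers it.
    return bytes([value | 0xAA, value | 0x55])


def encode_response(uid: int) -> bytes:
    """Build a discovery response for a 48-bit UID (test helper)."""
    euid = b"".join(_encode_byte(b) for b in uid.to_bytes(6, "big"))
    checksum = sum(euid) & 0xFFFF
    ecs = b"".join(_encode_byte(b) for b in checksum.to_bytes(2, "big"))
    return bytes([0xFE] * 7 + [0xAA]) + euid + ecs


def decode_response(data: bytes) -> Optional[int]:
    """Return the UID, or None if the data is a collision or noise."""
    i = 0
    while i < len(data) and i < 7 and data[i] == 0xFE:
        i += 1  # skip the (optional) preamble bytes
    if i >= len(data) or data[i] != 0xAA:
        return None
    body = data[i + 1:]
    if len(body) < 16:
        return None
    euid, ecs = body[:12], body[12:16]
    checksum = ((ecs[0] & ecs[1]) << 8) | (ecs[2] & ecs[3])
    if checksum != sum(euid) & 0xFFFF:
        return None  # treat as a collision
    uid_bytes = bytes(euid[j] & euid[j + 1] for j in range(0, 12, 2))
    return int.from_bytes(uid_bytes, "big")
```

Anything that fails to decode is best treated as a collision rather than as "no response", since overlapping replies from several responders usually corrupt the checksum.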

If everything goes well, eventually all devices will be muted, and sending a DISC_UNIQUE_BRANCH with a range of (0000:00000000, ffff:ffffffff) will result in no responses.
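
The steps above can be sketched as a recursive binary search. This is a minimal illustration, not the algorithm from E1.20 or from any particular controller: `FakeBus` is a hypothetical stand-in for the DMX line with well-behaved responders, UIDs are plain 48-bit integers, and step 2 (re-muting previously known devices) is omitted.

```python
MAX_UID = 0xFFFFFFFFFFFF
COLLISION = object()  # sentinel: more than one responder answered


class FakeBus:
    """Hypothetical simulated line with well-behaved responders."""

    def __init__(self, uids):
        self.all_uids = set(uids)
        self.unmuted = set(uids)

    def unmute_all(self):
        self.unmuted = set(self.all_uids)

    def mute(self, uid):
        if uid not in self.all_uids:
            return False  # nobody acknowledged the mute
        self.unmuted.discard(uid)
        return True

    def unique_branch(self, lower, upper):
        hits = [u for u in self.unmuted if lower <= u <= upper]
        if not hits:
            return None
        return hits[0] if len(hits) == 1 else COLLISION


def discover(bus, lower=0, upper=MAX_UID, found=None):
    """Return the list of UIDs located in [lower, upper]."""
    if found is None:
        found = []
        bus.unmute_all()  # step 1
    reply = bus.unique_branch(lower, upper)  # step 3
    if reply is None:
        return found
    if reply is COLLISION:
        if lower < upper:
            mid = (lower + upper) // 2
            discover(bus, lower, mid, found)
            discover(bus, mid + 1, upper, found)
        return found
    if bus.mute(reply):
        found.append(reply)
        discover(bus, lower, upper, found)  # confirm the range is now quiet
    return found
```

This version has no protection against misbehaving responders. Each of the failure modes below describes a way in which a loop like this one can fail to terminate or miss devices.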

Failure Modes

Responders that reply outside their range

Responders may have bugs in their UID comparison code, causing them to reply to DISC_UNIQUE_BRANCH requests whose range doesn't cover their UID. One example is a responder that replies whenever just the Manufacturer ID part of the UID matches. Depending on the type of bug, discovery can sometimes still proceed, and usually all other devices are found and muted before the bad responder is located.

The worst case is a responder that replies to every DISC_UNIQUE_BRANCH request, which usually prevents discovery of any other responders.
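
One possible defence on the controller side is to check that a decoded UID actually lies inside the range that was requested, and to remember offenders so they can be reported. This is a sketch under the assumption that UIDs are plain integers; the function name and the `suspects` set are hypothetical.

```python
def check_branch_reply(uid: int, lower: int, upper: int, suspects: set) -> bool:
    """Return True if `uid` is a plausible reply to a (lower, upper) request.

    A UID outside the requested range is recorded in `suspects`.
    """
    if lower <= uid <= upper:
        return True
    suspects.add(uid)
    return False
```

What the controller does with a suspect (skip it, still try to mute it, or report it to the user) is a design decision this sketch leaves open.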

Responders that don't reply when within range

The opposite of the case above is a responder that fails to respond to a DISC_UNIQUE_BRANCH request for a range which it is part of. A common example is the off-by-one case, where a responder doesn't reply if its UID is at either endpoint of the range. This can cause responders to 'disappear': a range that previously caused a collision produces no responses when split in two.

Without proper detection, this can cause controllers to loop indefinitely as they try to locate the missing responders.
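
One way to detect this, sketched below, is to bound how often a colliding range may be split into two silent halves before the controller gives up on it. The retry limit, the function name and the return values are all hypothetical; `query(lower, upper)` is assumed to return True if anything (a valid reply or a collision) was heard.

```python
MAX_EMPTY_SPLITS = 3  # hypothetical limit, not taken from E1.20


def split_with_limit(query, lower: int, upper: int) -> str:
    """Decide what to do after [lower, upper] produced a collision.

    Returns "descend" if a half answered, "empty" if the parent range
    has gone quiet as well, or "abandon" if the parent keeps colliding
    while both halves stay silent.
    """
    mid = (lower + upper) // 2
    for _ in range(MAX_EMPTY_SPLITS):
        heard_low = query(lower, mid)
        heard_high = query(mid + 1, upper)
        if heard_low or heard_high:
            return "descend"
        if not query(lower, upper):
            return "empty"
    return "abandon"
```

For example, two responders at UIDs 7 and 8 that wrongly use exclusive comparisons will collide on the range (0, 15) but stay silent for both (0, 7) and (8, 15), so this function returns "abandon" after three rounds instead of looping forever.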

Responders that don't reply to Mute

Some responders may not reply to the DISC_MUTE message. This case must be differentiated from the case where the DISC_MUTE message is corrupted or lost, so the responder never replies. The pseudo-code in the E1.20 standard attempts to mute each device up to 10 times before continuing.
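
A bounded retry loop in the spirit of that pseudo-code might look like the sketch below. `send_mute` is a hypothetical callable that sends one DISC_MUTE and returns True if a valid acknowledgement came back.

```python
MAX_MUTE_ATTEMPTS = 10  # the limit mentioned above


def mute_with_retries(send_mute, uid: int, attempts: int = MAX_MUTE_ATTEMPTS) -> bool:
    """Try to mute `uid`, giving up after `attempts` unanswered requests."""
    for _ in range(attempts):
        if send_mute(uid):
            return True
    return False
```

A False result does not tell the controller whether the responder is absent, broken, or muted but unable to acknowledge, so the caller still has to decide whether to keep searching that range.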

Responders that don't mute

Not to be confused with the scenario above, some responders may acknowledge the mute request but continue to respond to discovery requests. If there is only one responder that behaves this way, controllers should still be able to succeed.
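
A controller can notice this behaviour by remembering which UIDs it has already muted. The sketch below is one hypothetical way to do that: the function name, the two sets and the string results are illustrative only, and `send_mute` is assumed to return True when the mute is acknowledged.

```python
def handle_single_reply(uid: int, muted: set, never_mute: set, send_mute) -> str:
    """Process one valid DISC_UNIQUE_BRANCH reply.

    A reply from a UID that was already muted marks that responder as
    one that ignores mute requests, so it is not muted again.
    """
    if uid in muted:
        never_mute.add(uid)
        return "ignore"
    if send_mute(uid):
        muted.add(uid)
        return "muted"
    return "mute_failed"
```

Once a UID lands in `never_mute`, the controller knows that further replies from it carry no new information and can stop retrying that branch.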



Proxied Devices

The E1.20 standard states that a proxy shall not provide more than one DISC_UNIQUE_BRANCH response at once. This means that once the proxy has been located and muted, controllers must continue to


Example Implementation

The implementation of the RDM discovery algorithm as used in the Open Lighting Architecture can be seen here. The tests which cover the cases above are in DiscoveryAgentTest.cpp.