Open SLP Notes
From wiki.openlighting.org
Latest revision as of 16:37, 2 December 2012
When I started working on the E1.33 project (RDMNet), one of the first things I did was search for an open source implementation of SLP. I came across http://www.openslp.org, which looked promising since it has been ported to a variety of systems and is backed by the likes of Novell / SuSE and, of course, ETC. However, since testing it I've grown more disillusioned with the project, to the point where I honestly can't recommend it to anyone looking to implement E1.33.
This page has my notes on trying to get openslp working on a number of systems.
Contents
- 1 Precursor
- 2 Release vs Head
- 3 Registrations timing out too early
- 4 Race conditions during registration / deregistration
- 5 Lack of aging during retries
- 6 Default behavior is to run as root
- 7 Interface Selection on Mac OS X
- 8 Lack of Logging
- 9 Denial of Service against libslp
- 10 Lack of streaming new registrations to a process on the same machine
Precursor
For many of these you could argue "that's unlikely to happen and besides SLP doesn't guarantee consistency". While that's true, it's not the stance I want to take when discussing critical show control software. If RDMNet gets a reputation for being unreliable, the hard work of the task group over the last 3 years will have been wasted.
Release vs Head
The last stable release of Open SLP was 1.2.1 in 2006.
I wouldn't even bother trying to get the 1.2.1 release to work. It was only once I started running the version from HEAD that features like DAs worked correctly. Since 1.2.1 is the version in most package management systems, this means we (as in the OLA team) either need to statically link against a newer version, or take up the packaging role for Ubuntu, Debian, MacPorts etc.
Registrations timing out too early
slpd ages the registration database every 15 seconds (#define SLPD_AGE_INTERVAL 15) rather than tracking per-registration timeouts. This means that your entry can time out up to 15 seconds before it was supposed to.
Besides fixing the slpd code, the only way around this is to re-register at least 15 seconds before the lifetime expires. For this reason I recommend that the absolute minimum SLP lifetime used is 30s.
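To make the arithmetic concrete, here is a minimal sketch of how a client could pick its re-registration delay. The function name and the extra safety margin are my own inventions for illustration; only the 15 second value of SLPD_AGE_INTERVAL comes from slpd.

```c
#include <assert.h>

/* Value of SLPD_AGE_INTERVAL in slpd, as quoted above. */
#define AGE_INTERVAL_SECS 15

/*
 * Hypothetical helper: how many seconds to wait after a registration
 * before re-registering, so that the refresh lands before the earliest
 * moment slpd could age the entry out. Returns -1 if the lifetime is too
 * short to leave any room for a refresh.
 */
int reregister_delay(int lifetime_secs, int margin_secs) {
    assert(margin_secs >= 0);
    int delay = lifetime_secs - AGE_INTERVAL_SECS - margin_secs;
    return delay > 0 ? delay : -1;
}
```

With a 30s lifetime and a 5s margin this gives a delay of 10s. A 15s lifetime leaves no room at all, which is why 30s is about as low as I'd go.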
Race conditions during registration / deregistration
The way the SA code works is something like this:
- Receive SrvReg from client and process it (SLPDProcessMessage)
- Figure out which DAs to register with (SLPDKnownDAEcho)
- Enqueue to the DA socket's sendlist (SLPDKnownDAEcho)
- Write the message to the socket (SLPDKnownDAEcho)
- Call SLPDOutgoingRetry() periodically to resend any un-acked messages from the socket's send queue
- Abort the registration attempts if we've hit the MAX_RETRANSMITS limit
The problem here is that if the state of the registration changes while the registration procedure is in progress, you end up with a mixture of SrvReg and SrvDereg messages. A DA which has just come up from a reboot can end up with incorrect information about the state of a service.
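One way to avoid the mixture would be to keep at most one pending message per service URL and overwrite it when the state changes, so that a retry always carries the latest state. The sketch below only illustrates that idea; it is not how openslp is structured, and every type and function name in it is hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_PENDING 32
#define MAX_URL_LEN 128

typedef enum { MSG_SRV_REG, MSG_SRV_DEREG } msg_kind;

typedef struct {
    char url[MAX_URL_LEN];
    msg_kind kind;
    int in_use;
} pending_msg;

/* Zero-initialise before first use. */
typedef struct {
    pending_msg slots[MAX_PENDING];
} pending_queue;

/*
 * Queue a message for a service URL. If one is already pending for the
 * same URL it is overwritten rather than appended, so stale state is
 * never retransmitted. Returns 0 on success, -1 if the queue is full or
 * the URL does not fit.
 */
int pending_put(pending_queue *q, const char *url, msg_kind kind) {
    assert(q != NULL && url != NULL);
    if (strlen(url) >= MAX_URL_LEN) return -1;
    pending_msg *free_slot = NULL;
    for (int i = 0; i < MAX_PENDING; i++) {
        pending_msg *m = &q->slots[i];
        if (m->in_use && strcmp(m->url, url) == 0) {
            m->kind = kind;  /* latest state wins */
            return 0;
        }
        if (!m->in_use && free_slot == NULL) free_slot = m;
    }
    if (free_slot == NULL) return -1;
    strcpy(free_slot->url, url);
    free_slot->kind = kind;
    free_slot->in_use = 1;
    return 0;
}

/* Number of messages currently waiting to be sent or acknowledged. */
int pending_count(const pending_queue *q) {
    assert(q != NULL);
    int n = 0;
    for (int i = 0; i < MAX_PENDING; i++) n += q->slots[i].in_use;
    return n;
}
```

Registering and then deregistering the same URL before the first message is acked leaves a single pending SrvDereg, rather than one of each racing to the DA.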
Lack of aging during retries
Another problem with enqueuing the message buffer on to a socket-specific data structure is that if a retransmission event occurs, the message data isn't aged correctly. This generally works out since openslp retries quickly but it means lifetimes of services may vary across DAs.
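Aging the message would amount to recomputing the lifetime at the moment of each (re)transmission instead of reusing the value captured when the buffer was built. A minimal sketch, with a hypothetical function name and assuming the queue entry records when the registration was accepted:

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: the lifetime to write into a SrvReg that is being
 * sent at time `now`, for a registration accepted at `queued_at` with the
 * given original lifetime. Returns 0 once the registration has expired,
 * in which case there is no point sending it at all.
 */
unsigned remaining_lifetime(unsigned lifetime_secs, time_t queued_at, time_t now) {
    assert(now >= queued_at);
    unsigned long elapsed = (unsigned long)(now - queued_at);
    if (elapsed >= lifetime_secs) return 0;
    return lifetime_secs - (unsigned)elapsed;
}
```

For example, a 300s registration retransmitted 4 seconds after it was queued would go out with a lifetime of 296s, so every DA agrees on when the service expires.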
Default behavior is to run as root
Unless you're careful about how you build slpd it runs as root. Some distros (Debian for one) appear to have fixed this.
Interface Selection on Mac OS X
On Mac, slpd relies on reverse DNS for the machine's hostname returning an IP (stupid I know but that's how it is). Without reverse DNS the startup log will look like this:
    Sun Jun 19 16:59:45 2011
    SLPD daemon started
    ****************************************
    Command line = slpd
    Using configuration file = /opt/local/etc/slp.conf
    Using registration file = /opt/local/etc/slp.reg
    Listening on loopback...
    Multicast socket on 127.0.0.1 ready
    Unicast socket on 127.0.0.1 ready
    Agent Interfaces = 127.0.0.1
    Agent URL = service:service-agent://127.0.0.1
    Startup complete entering main run loop ...
If you don't have working reverse DNS for your domain, you can edit your /etc/hosts file. First get the full hostname & local address of the interface you want to use:
    $ hostname
    simonn-macbookpro.local
    $ ifconfig en1 | grep "inet " | awk '{print $2}'
    192.168.1.204
Then add a line like the following to /etc/hosts
192.168.1.204 simonn-macbookpro.local
Now SLP recognizes the interface correctly:
    Sun Jun 19 17:03:42 2011
    SLPD daemon started
    ****************************************
    Command line = slpd
    Using configuration file = /opt/local/etc/slp.conf
    Using registration file = /opt/local/etc/slp.reg
    Listening on loopback...
    Listening on 192.168.1.204 ...
    Multicast socket on 192.168.1.204 ready
    Unicast socket on 192.168.1.204 ready
    Agent Interfaces = 192.168.1.204
    Agent URL = service:service-agent://192.168.1.204
    Startup complete entering main run loop ...
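If you'd rather check the lookup before starting slpd, something like the following can be used. It is a sketch of a standalone check, not code from slpd; the function name is made up, and it assumes that an ordinary getaddrinfo() lookup of the hostname is a reasonable stand-in for whatever slpd does internally.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/*
 * Hypothetical check: resolve `name` to its first IPv4 address and write
 * it to `out` in dotted-quad form. Returns 0 on success, -1 if the name
 * does not resolve.
 */
int first_ipv4_for(const char *name, char *out, socklen_t out_len) {
    assert(name != NULL && out != NULL);
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_DGRAM;
    if (getaddrinfo(name, NULL, &hints, &res) != 0 || res == NULL) return -1;
    const struct sockaddr_in *sin = (const struct sockaddr_in *)res->ai_addr;
    const char *ok = inet_ntop(AF_INET, &sin->sin_addr, out, out_len);
    freeaddrinfo(res);
    return ok != NULL ? 0 : -1;
}
```

Pass it the output of hostname (or gethostname()). If it fails, or comes back with 127.0.0.1, I'd expect slpd to end up listening on loopback only, as in the first log above.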
Lack of Logging
The 'logging' in slpd mostly consists of printing out the sent and received SLP messages. This isn't helpful at all when trying to figure out why an SLP client isn't working correctly.
Denial of Service against libslp
libslp has code like this:
    if(FD_ISSET(sockets->sock[i],&readfds))
    {
        /* Peek at the first 16 bytes of the header */
        bytesread = recvfrom(sockets->sock[i], peek, 16, MSG_PEEK,
                             (struct sockaddr *)peeraddr, &peeraddrlen);
        printf(" read %d bytes\n", bytesread);
        if(bytesread == 16 || ...)
        {
        }
    }
This means that if you send a UDP packet shorter than 16 bytes, libslp spins in a loop trying to receive the rest of the data. The short datagram is only ever peeked at, never consumed, so it stays at the head of the socket queue.
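A defensive version of the peek would consume and drop any datagram that is too short to hold a header, so the next read sees fresh data. The following is a sketch of that idea rather than a patch against libslp; the function name and return codes are my own.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

#define SLP_PEEK_LEN 16

/*
 * Hypothetical helper: peek at the first 16 bytes of the next datagram.
 * Returns 1 if a full 16 bytes are available in `peek`, 0 if a short
 * datagram was found and discarded, -1 on a socket error.
 */
int peek_header_or_discard(int sock, unsigned char *peek) {
    assert(peek != NULL);
    ssize_t n = recvfrom(sock, peek, SLP_PEEK_LEN, MSG_PEEK, NULL, NULL);
    if (n < 0) return -1;
    if (n < SLP_PEEK_LEN) {
        /* Too short to be an SLP header: read it for real to remove it
           from the queue instead of peeking at it forever. */
        unsigned char scratch[SLP_PEEK_LEN];
        (void)recvfrom(sock, scratch, sizeof scratch, 0, NULL, NULL);
        return 0;
    }
    return 1;
}
```

The caller goes back to select() on a 0 return instead of retrying the peek.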
Lack of streaming new registrations to a process on the same machine
This was a must-have feature for me. I wanted a way to be updated immediately when new services appear on the network, instead of having to poll. Rather than calling SLPFindSrvs() it would be nice to have a mechanism to register interest in a particular service-type and then have a callback invoked whenever new SAs register with the DA. I could have added this functionality to openslp, but the lack of regular releases meant that it was unlikely to make it out to users' machines by the time E1.33 shipped.
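Without such a mechanism, the closest approximation is to poll and diff the results yourself. The sketch below shows the diffing half only: given the URLs from the previous poll and the current one, it invokes a callback for each new URL. None of these names exist in openslp, and it does nothing about the latency of polling, which was the real complaint.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback type: invoked once per newly seen service URL. */
typedef void (*new_service_cb)(const char *url, void *ctx);

/*
 * Compare two poll results and call `cb` for every URL present in
 * `current` but not in `previous`. Returns the number of new URLs.
 */
int report_new_services(const char *const *previous, int n_prev,
                        const char *const *current, int n_cur,
                        new_service_cb cb, void *ctx) {
    assert(cb != NULL);
    int added = 0;
    for (int i = 0; i < n_cur; i++) {
        int seen = 0;
        for (int j = 0; j < n_prev && !seen; j++)
            seen = strcmp(current[i], previous[j]) == 0;
        if (!seen) {
            cb(current[i], ctx);
            added++;
        }
    }
    return added;
}

/* Example callback: counts how many times it has been invoked. */
void count_new_service(const char *url, void *ctx) {
    (void)url;
    ++*(int *)ctx;
}
```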