This discussion was picked up by Gadi Evron's botnet list. The entire thread is available at http://archives.neohapsis.com/archives/sf/ids/2006-q3/0097.html
Not to steal your thunder, as you speak words of wisdom, I will mention only one thing:
Bots are very noisy and non-friendly entities online. Easy to detect. The same goes for C&C's.
The difference you notice is the mass "popular" attacks becoming less distinguishable as attacks and more hidden, transmuted to appear like ordinary users, which is what every attacker's goal is once he is past his kiddie days.
> Craig Chamberlain
> > -----Original Message-----
> > From: Jose Nazario
> > Sent: Tuesday, August 08, 2006 1:11 PM
> > To: mikeiscool
> > Cc: Ron Gula; email@example.com
> > Subject: Re: detecting network crowd surges
> > On Tue, 8 Aug 2006, mikeiscool wrote:
> > > I wonder, though, is this how real botnets are controlled?
> > based on our measurements and observations, IRC is the
> > dominant method for botnet control at this time. but HTTP
> > methods, similar to the ones you described, are coming on in
> > popularity. poll frequencies range from 5 seconds to 1 hour or more.
> > ________
> > jose nazario, ph.d.
> > http://monkey.org/~jose/ http://monkey.org/~jose/secnews.html
> > http://www.wormblog.com/
> > --------------------------------------------------------------
On Tue, 29 Aug 2006, Craig Chamberlain wrote:
I've seen use of HTTP by bots on the rise a bit and have seen two implementations in some detail. Much of it is fairly trivial to detect, like IRC protocol running on port 80. A couple of examples I've seen were harder to spot. One was a request for a page that looked like most any normal auth form for webmail services. It was hosted on a compromised box belonging to a major website, so the traffic we had looked mostly harmless. I showed it to Marty Roesch and his people and the consensus was that it was pretty tough to write a signature against; the traffic it produced was pretty small and what we had looked pretty normal.
We ended up detecting it by the user agent which was a bit different owing to the use of some HTTP library for
Another clever example was a bot which issued a GET for a normal looking page and parsed for base64 encoded commands contained in HTML comments. There were three commands: sleep, download & execute file, and reverse shell. This isn't hard to spot once you know the pattern but there's bound to be better stuff out there.
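To make the "once you know the pattern" point concrete, here is a rough sketch of a check for base64 payloads hidden in HTML comments. The keyword list and the function name find_encoded_commands are assumptions for illustration only; the actual command strings that bot used are not given above, and a real sensor would work on reassembled HTTP response bodies rather than a string in memory.

```python
import base64
import binascii
import re

# Hypothetical keywords, loosely based on the three commands described
# (sleep, download & execute, reverse shell). The real strings are unknown.
SUSPECT_KEYWORDS = ("sleep", "download", "exec", "shell")

# Body of every HTML comment, including multi-line ones.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
# Base64 alphabet only, at least 8 characters, optional '=' padding.
BASE64_RE = re.compile(r"^[A-Za-z0-9+/]{8,}={0,2}$")


def find_encoded_commands(html):
    """Return decoded comment payloads that contain a suspect keyword."""
    hits = []
    for body in COMMENT_RE.findall(html):
        # Drop whitespace so wrapped or padded comments still match.
        candidate = "".join(body.split())
        if len(candidate) % 4 != 0 or not BASE64_RE.match(candidate):
            continue
        try:
            decoded = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue
        text = decoded.decode("ascii", errors="ignore").lower()
        if any(keyword in text for keyword in SUSPECT_KEYWORDS):
            hits.append(text)
    return hits
```

Ordinary comments mostly fail the alphabet or length test and are skipped, so the keyword match only runs on things that actually decode. A bot author can defeat this by changing the encoding, which is the "better stuff out there" problem.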
Looking for misshapen traffic symmetry, like HTTP sessions with large outbound data streams, is one technique I've heard people have some success with. Regular expressions can spot data outbound if you're looking for structured data like account numbers. Some products also look for high outbound HTTP connection rates that are too fast to be human or HTTP sessions that cross a time threshold. Simple data volume thresholds are too easily triggered by streaming apps, in my experience, unless you consider the direction and traffic shape as in the misshapen symmetry example above.
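A minimal sketch of the direction-aware volume check and the "too fast to be human" rate check described above, assuming you already have per-session byte counts and request timestamps from flow records or proxy logs. The HttpSession fields, the function names, and every threshold here are made-up illustrations, not values from any product.

```python
from dataclasses import dataclass


@dataclass
class HttpSession:
    client: str
    bytes_out: int     # client -> server
    bytes_in: int      # server -> client
    duration_s: float


def is_misshapen(session, min_out=1_000_000, ratio=5.0):
    """Flag sessions that send far more than they receive.

    Streaming apps move a lot of data, but inbound, so they pass.
    Both thresholds are assumptions to be tuned per network.
    """
    if session.bytes_out < min_out:
        return False
    return session.bytes_out >= ratio * max(session.bytes_in, 1)


def too_fast_for_human(timestamps, window_s=10.0, max_requests=20):
    """True if more than max_requests fall inside any window_s span."""
    ordered = sorted(timestamps)
    start = 0
    for end, current in enumerate(ordered):
        # Slide the window start forward until it covers at most window_s.
        while current - ordered[start] > window_s:
            start += 1
        if end - start + 1 > max_requests:
            return True
    return False
```

Considering direction is what keeps the streaming case quiet: a 50 MB download with a few KB of requests never trips is_misshapen, while an 8 MB upload against a 40 KB response does.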