lotsa wasted space
Tue, 25 Mar 2008 21:37:14 -080
http://www.spamcop.net/sc?id=z1747196767zef0431747cadf8b06ab6a4bd75575aedz

5100 lines - just about choked AVG, 20 secs to scan - SC parser just burped.
never seen this before myself.
Re: lotsa wasted space
Wed, 26 Mar 2008 09:44:27 -080
On 3/26/2008 7:17 AM Sofa King Tyred of Lar Ting scribbled:

> jg wrote:
>> http://www.spamcop.net/sc?id=z1747196767zef0431747cadf8b06ab6a4bd75575aedz
>>
>> 5100 lines - just about choked AVG, 20 secs to scan - SC parser just burped.
>> never seen this before myself.
>> spammers are getting downright scary as well as flaky...
> 
> Yabbut - IE won't burp on it, and the spammers know that. This is clever 
> on the part of spammers.

clever just how?  I suspect some skulduggery but don't see what...

> 
> I advocate using IE and DOM (document object model) parsing instead of 
> writing code to spams at the HTML level - I have a couple of prototypes, 
> but still don't have a web service for this yet :-)

Re: lotsa wasted space
Wed, 26 Mar 2008 11:17:00 -040
jg wrote:
> http://www.spamcop.net/sc?id=z1747196767zef0431747cadf8b06ab6a4bd75575aedz
> 
> 5100 lines - just about choked AVG, 20 secs to scan - SC parser just burped.
> never seen this before myself.
> spammers are getting downright scary as well as flaky...

Yabbut - IE won't burp on it, and the spammers know that. This is clever 
on the part of spammers.

I advocate using IE and DOM (document object model) parsing instead of 
writing code to spams at the HTML level - I have a couple of prototypes, 
but still don't have a web service for this yet :-)
Re: lotsa wasted space
Wed, 26 Mar 2008 12:55:39 -040
jg wrote:
> 
> On 3/26/2008 7:17 AM Sofa King Tyred of Lar Ting scribbled:
> 
> > jg wrote:
> >> http://www.spamcop.net/sc?id=z1747196767zef0431747cadf8b06ab6a4bd75575aedz
> >>
> >> 5100 lines - just about choked AVG, 20 secs to scan - SC parser just burped.
> >> never seen this before myself.
> >> spammers are getting downright scary as well as flaky...
> >
> > Yabbut - IE won't burp on it, and the spammers know that. This is clever
> > on the part of spammers.
> 
> clever just how?  I suspect some skulduggery but don't see what...

Well, it could just be a simple case of "cut-and-paste" a bunch of
times, with each time double-spacing the previous version.

Or, it could be a case of someone figuring out that IE will display
such things the way the spammer wants, while spam/virus filters slow
to a crawl on them.  ("Gee, this AVG program is slowing down my system,
and turning it off speeds things up, so I think I'll just keep it
off.")
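
As a rough illustration of the first guess above (repeated double-spacing), here is a small sketch. Nothing in it was recovered from the actual spam: the helper names double_space, pad and line_count, and the 20-line starting size, are assumptions made up for the example.

```python
def double_space(text: str) -> str:
    """Insert one blank line after every existing line, blank or not."""
    return "".join(line + "\n\n" for line in text.splitlines())


def pad(text: str, rounds: int) -> str:
    """Double-space the text `rounds` times, each pass working on the last result."""
    for _ in range(rounds):
        text = double_space(text)
    return text


def line_count(text: str) -> int:
    """Count lines the way str.splitlines() sees them."""
    return len(text.splitlines())


if __name__ == "__main__":
    # Hypothetical 20-line message body, purely for illustration.
    message = "\n".join(f"line {i}" for i in range(20))
    for rounds in range(9):
        print(rounds, line_count(pad(message, rounds)))
```

Under these assumptions each pass doubles the line count, so a 20-line message grows to 20 * 2**8 = 5120 lines after eight passes. That is in the same ballpark as the 5100 lines reported, but it says nothing about how this particular spam was really built.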

> > I advocate using IE and DOM (document object model) parsing instead of
> > writing code to spams at the HTML level - I have a couple of prototypes,
> > but still don't have a web service for this yet :-)
> 
> ok, I'll pretend to understand that :-)

I think he may have left off a few words.

Perhaps it means:

    Rather than spam/virus filters parsing the raw HTML, allow IE to
    parse it, and then use DOM to scan the parsed document.
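
A minimal sketch of that reading, with loud caveats. Python's standard html.parser stands in for IE's rendering engine here, which is a swap: it is a tolerant tokenizer rather than a real DOM, and it does not execute scripts. The names LinkCollector and extract_links are hypothetical, and the sample message is invented.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, letting the parser cope with messy HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value.strip())


def extract_links(html_text):
    """Return every anchor href found in html_text, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


if __name__ == "__main__":
    # Invented sample: two links separated by thousands of blank lines.
    sample = (
        "<html><body>\n<A HREF='http://example.com/a'>one</a>"
        + "\n" * 5000
        + '<a href="http://example.com/b?x=1&amp;y=2">two</a>'
    )
    print(extract_links(sample))
```

The point of the design is that the scanner only sees the links the parser hands it, so thousands of blank lines between them never reach any hand-written HTML-level code.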

-- 
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody        | www.hvcomputer.com | #include              |
| kenbrody/at\spamcop.net | www.fptech.com     |    <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:ThisIsASpamTrap@gmail.com>

Re: lotsa wasted space
Wed, 26 Mar 2008 16:52:32 -040
jg wrote:
> On 3/26/2008 7:17 AM Sofa King Tyred of Lar Ting scribbled:
>> I advocate using IE and DOM (document object model) parsing instead of
>> writing code to spams at the HTML level - I have a couple of prototypes,
>> but still don't have a web service for this yet :-)
> 
> ok, I'll pretend to understand that :-)

My bad - I *did* leave out a word (as Kenneth said in his reply). I 
meant to write: "instead of writing code to _parse_ spams at the HTML 
level".

A recent thread mentioned that SC was truncating phish URLs, but that
the parser is likely too hard to fix because it's a monster.

Also, SC will bail entirely if there's JavaScript in the spam (which is
understandable: who wants to re-invent a JavaScript engine?).

Neither of these problems would exist if you let IE render the HTML
from a spam and then parse the DOM for link objects. Feed that to the
URL processor of SC... Like I said, I'm still working on the web service