[OT] Block multiple IP addresses; iptables or route...reject?

Walter Dnes
  I have some doubts about massive "hosts" files for adblocking.  I
downloaded one that listed 13,148 sites.  I fed them through a script
that called "host" for each entry, and saved the output to a text file.
The result was 1,059 addresses.  Note that some adservers have multiple
IP address entries for the same name.  A back-of-the-envelope analysis
is that close to 95% of the entries in the large host file are invalid,
and return "not found: 3(NXDOMAIN)".
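
A minimal sketch of that kind of lookup pass, written in Python rather than as a shell loop around "host". The function name `resolve_all` and its injectable `resolver` parameter are made up for illustration; by default it uses `socket.gethostbyname_ex`, which returns IPv4 addresses only.

```python
import socket

def resolve_all(names, resolver=socket.gethostbyname_ex):
    """Map each hostname to its sorted IPv4 addresses.

    Names that fail to resolve (e.g. NXDOMAIN) are only counted.
    """
    resolved = {}
    failed = 0
    for name in names:
        try:
            _, _, addrs = resolver(name)
        except OSError:  # socket.gaierror and socket.herror are OSError subclasses
            failed += 1
            continue
        resolved[name] = sorted(addrs)
    return resolved, failed
```

Comparing `failed` against the number of names gives the share of dead entries in a list.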

  I'm not here to trash the people compiling the lists; the problem is
that hosts files are the wrong tool for the job.  Advertisers know about
hosts files and deliberately generate random subdomain names with short
lifetimes to invalidate the hosts files.  Every week the sites are
probably mostly renamed.  Further analysis of the 1,059 addresses shows
810 unique entries, i.e. 249 duplicates.  It gets even better.  44
addresses show up in 52.84.146.xxx; I should probably block the entire
/24 with one entry.  There are multiple similar occurrences, which could
be aggregated into small CIDRs.  So the number of blocking rules is
greatly reduced.
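
The deduplicate-then-aggregate step could be sketched like this, using Python's standard `ipaddress` module. The function name `summarize` and the `threshold` parameter are assumptions for illustration, and the sketch handles IPv4 only. Note that promoting a busy /24 also blocks every unlisted address in it.

```python
import ipaddress
from collections import Counter

def summarize(addresses, threshold=8):
    """Deduplicate IPv4 addresses, then replace any /24 that holds at
    least `threshold` distinct addresses with the /24 itself."""
    unique = {ipaddress.ip_address(a) for a in addresses}
    # Count how many distinct addresses fall into each /24.
    per_block = Counter(
        ipaddress.ip_network(f"{ip}/24", strict=False) for ip in unique
    )
    busy = {net for net, count in per_block.items() if count >= threshold}
    rules = set(busy)
    for ip in unique:
        if ipaddress.ip_network(f"{ip}/24", strict=False) not in busy:
            rules.add(ipaddress.ip_network(f"{ip}/32"))
    return sorted(rules)
```

With ten addresses in 52.84.146.xxx and one stray host, this yields one /24 rule and one /32 rule.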

  I'm not a deep networking expert.  My question is whether I'm better
off adding iptables reject/drop rules or "reject routes", e.g...

route add -net 10.0.0.0 netmask 255.0.0.0 metric 1024 reject

(an example from the "route" man page).  iptables rules have to be
duplicated coming and going to catch inbound and outbound traffic.  A
reject route only needs to be entered once.  This exercise is intended
to block web adservers, so another question is how web browsers react to
route versus iptables blocking.

  While I'm at it (I did say I'm not an expert) is there another way to
handle this?  E.g. redirect "blocked CIDRs" via iptables or route to a
local pixel image?  Will that produce an immediate response by the web
browser, versus timing out with "regular blocking"?
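
To make the "local pixel image" idea concrete, here is a rough sketch of a tiny local HTTP server that answers every GET with a 1x1 GIF. The names `PixelHandler` and `start_pixel_server` are invented for this example, and the redirect rule that would steer blocked traffic to it is not shown. It only helps with plain HTTP; an HTTPS request redirected here would fail its TLS handshake instead of receiving the image.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Commonly used 43-byte transparent 1x1 GIF.
PIXEL = (b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
         b"!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01\x00\x00"
         b"\x02\x02D\x01\x00;")

class PixelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer any path with the same tiny image.
        self.send_response(200)
        self.send_header("Content-Type", "image/gif")
        self.send_header("Content-Length", str(len(PIXEL)))
        self.end_headers()
        self.wfile.write(PIXEL)

    def log_message(self, format, *args):
        pass  # suppress per-request logging

def start_pixel_server(host="127.0.0.1", port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), PixelHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The caller can read the chosen port from `server.server_address[1]` and stop the server with `server.shutdown()`.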

--
Walter Dnes <[hidden email]>
I don't run "desktop environments"; I run useful applications

Reply | Threaded
Open this post in threaded view
|

Re: [OT] Block multiple IP addresses; iptables or route...reject?

R0b0t1
Hello,

On Wed, Oct 4, 2017 at 12:28 AM, Walter Dnes <[hidden email]> wrote:

>   I have some doubts about massive "hosts" files for adblocking.  I
> downloaded one that listed 13,148 sites.  I fed them through a script
> that called "host" for each entry, and saved the output to a text file.
> The result was 1,059 addresses.  Note that some adservers have multiple
> IP address entries for the same name.  A back-of-the-envelope analysis
> is that close to 95% of the entries in the large host file are invalid,
> and return "not found: 3(NXDOMAIN)".
>
>   I'm not here to trash the people compiling the lists; the problem is
> that hosts files are the wrong tool for the job.  Advertisers know about
> hosts files and deliberately generate random subdomain names with short
> lifetimes to invalidate the hosts files.  Every week the sites are
> probably mostly renamed.  Further analysis of the 1,059 addresses shows
> 810 unique entries, i.e. 249 duplicates.  It gets even better.  44
> addresses show up in 52.84.146.xxx; I should probably block the entire
> /24 with one entry.  There are multiple similar occurrences, which could
> be aggregated into small CIDRs.  So the number of blocking rules is
> greatly reduced.
>
>   I'm not a deep networking expert.  My question is whether I'm better
> off adding iptables reject/drop rules or "reject routes", e.g...
>

If you want to filter connections based on IP, then use iptables or
the newer alternative, nftables. Nftables is faster and more
configurable.

I suggest the Wikipedia page before the documentation:
https://en.wikipedia.org/wiki/Nftables.
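
For a feel of what an nftables version might look like, here is a small Python sketch that writes out a ruleset file with one named set and one rule. The table name `adblock`, the set name `blocked4` and the function name are placeholders chosen for this example, and the input is assumed to be non-overlapping IPv4 addresses or CIDRs.

```python
def nft_ruleset(cidrs, table="adblock", set_name="blocked4"):
    """Return nftables ruleset text that rejects outbound traffic to `cidrs`."""
    if not cidrs:
        raise ValueError("need at least one address or CIDR")
    elements = ", ".join(cidrs)
    lines = [
        f"table inet {table} {{",
        f"    set {set_name} {{",
        "        type ipv4_addr",
        "        flags interval",  # needed so the set can hold CIDR ranges
        f"        elements = {{ {elements} }}",
        "    }",
        "    chain output {",
        "        type filter hook output priority 0; policy accept;",
        f"        ip daddr @{set_name} reject",
        "    }",
        "}",
    ]
    return "\n".join(lines) + "\n"
```

The resulting text would be saved to a file and loaded with `nft -f`; one set lookup replaces a long chain of per-address rules.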

If you want to block advertisements, you should use a content-aware
system that is integrated into a browser and that is maintained by
lots of people at the same time. You should also consider blocking
JavaScript.

Cheers,
     R0b0t1


Re: [OT] Block multiple IP addresses; iptables or route...reject?

Alan McKinnon-2
In reply to this post by Walter Dnes
On 04/10/2017 07:28, Walter Dnes wrote:

>   I have some doubts about massive "hosts" files for adblocking.  I
> downloaded one that listed 13,148 sites.  I fed them through a script
> that called "host" for each entry, and saved the output to a text file.
> The result was 1,059 addresses.  Note that some adservers have multiple
> IP address entries for the same name.  A back-of-the-envelope analysis
> is that close to 95% of the entries in the large host file are invalid,
> and return "not found: 3(NXDOMAIN)".
>
>   I'm not here to trash the people compiling the lists; the problem is
> that hosts files are the wrong tool for the job.  Advertisers know about
> hosts files and deliberately generate random subdomain names with short
> lifetimes to invalidate the hosts files.  Every week the sites are
> probably mostly renamed.  Further analysis of the 1,059 addresses shows
> 810 unique entries, i.e. 249 duplicates.  It gets even better.  44
> addresses show up in 52.84.146.xxx; I should probably block the entire
> /24 with one entry.  There are multiple similar occurrences, which could
> be aggregated into small CIDRs.  So the number of blocking rules is
> greatly reduced.
>
>   I'm not a deep networking expert.  My question is whether I'm better
> off adding iptables reject/drop rules or "reject routes", e.g...
>
> route add -net 10.0.0.0 netmask 255.0.0.0 metric 1024 reject
>
> (an example from the "route" man page).  iptables rules have to be
> duplicated coming and going to catch inbound and outbound traffic.  A
> reject route only needs to be entered once.  This exercise is intended
> to block web adservers, so another question is how web browsers react to
> route versus iptables blocking.
>
>   While I'm at it (I did say I'm not an expert) is there another way to
> handle this?  E.g. redirect "blocked CIDRs" via iptables or route to a
> local pixel image?  Will that produce an immediate response by the web
> browser, versus timing out with "regular blocking"?
>


This is a complex problem with no cut-and-dried solution. It's real life
and as you know real life is murky.

Let's define the real problem you are wanting to solve: there's a bunch
of ad servers out there, and you want them to disappear. Or more
accurately, you want their traffic to disappear from *your* wires.

There are really 3 approaches as you know:
- redefine the hostname to be a blackhole (e.g. 127.0.0.1)
- find the addresses or subnets and drop/reject the packets with iptables
- find the subnets (sometimes the individual hosts) and route them into a
  blackhole

Each has its strengths and weaknesses:
- packet filters work best at the TCP/UDP/ICMP layer where you have an
  address and often a port
- routing works best at the IP layer where you have whole chunks of
  subnets and tell the router what to do with all traffic from that entire
  subnet
- hosts files work best at the name layer where you have DNS names

Your problem seems to slot in somewhere between a firewall and a routing
solution, explaining why you can't decide. Hosts files suck major big
eggs for this, as you know; people still use them because the approach
seems legit (but isn't) and they understand it, whereas they don't
understand the other 2.

Ad providers are well aware of this. I was surprised to see
52.84.146.0/24 show up in your mail, as that is Amazon's AWS range. Yes,
you could null-route that subnet, but it's Amazon and maybe there are
hosts in there that you DO want to use.

I'd suggest you use a packet filter, but not on Linux and certainly not
iptables. That thing is a god-awful mess looking like it was built by
unsupervised schoolkids masquerading as interns. The best tool for this
is the pf packet filter, but it runs on FreeBSD. Get yourself a spare
machine, load pfsense on it (it's an appliance like wrt) and drop the
traffic from all offensive addresses. Drop, not reject.

You could in theory do the same thing with iptables, but the ruleset
will quickly drive you nuts. Perhaps the ipset plugin would help, I've
been meaning to check it out for ages and never got around to it.
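
For what it's worth, the ipset route could look roughly like the sketch below: a Python helper that writes input for `ipset restore`, plus the single iptables rule that matches against the set. The set name `adblock` and both function names are placeholders for this example, and only IPv4 is handled.

```python
import ipaddress

def ipset_restore_input(cidrs, set_name="adblock"):
    """Build text for `ipset restore`: one hash:net set holding all CIDRs."""
    # Normalise and deduplicate, e.g. 52.84.146.9/24 becomes 52.84.146.0/24.
    nets = sorted({ipaddress.ip_network(c, strict=False) for c in cidrs})
    lines = [f"create {set_name} hash:net family inet"]
    lines += [f"add {set_name} {net}" for net in nets]
    return "\n".join(lines) + "\n"

def iptables_set_rule(set_name="adblock", chain="OUTPUT"):
    """One rule that drops traffic to any destination in the set."""
    return f"iptables -I {chain} -m set --match-set {set_name} dst -j DROP"
```

The ruleset then stays at one line no matter how long the block list grows; only the set contents change.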


--
Alan McKinnon
[hidden email]



Re: Block multiple IP addresses; iptables or route...reject?

Ian Zimmerman-3
On 2017-10-04 17:21, Alan McKinnon wrote:

> I'd suggest you use a packet filter, but not on Linux and certainly not
> iptables. That thing is a god-awful mess looking like it was built by
> unsupervised schoolkids masquerading as interns. The best tool for this
> is the pf packet filter, but it runs on FreeBSD. Get yourself a spare
> machine, load pfsense on it (it's an appliance like wrt) and drop the
> traffic from all offensive addresses. Drop, not reject.

FWIW, I have considered doing what you suggest here, but the problem
with pfsense (and its fork opnsense as well) is it only runs on x86; I
think one of them won't even run on amd64, or perhaps the other way
around.  But definitely no ARM et cetera, so you can't install them on a
Pi or something.

--
Please don't Cc: me privately on mailing lists and Usenet,
if you also post the followup to the list or newsgroup.
Do obvious transformation on domain to reply privately _only_ on Usenet.


Re: [OT] Block multiple IP addresses; iptables or route...reject?

Lucas Ramage
In reply to this post by Alan McKinnon-2
> The best tool for this is the pf packet filter, but it runs on FreeBSD.

It's too bad this still isn't around..


--
Regards,

Lucas Ramage / Software Engineer
[hidden email] / (941) 404-6794

PGP Fingerprint
EAE7 45DF 818D 4948 DDA7 0F44 F52A 5A96 7B9B 6FB7

http://lramage94.github.io


Re: [OT] Block multiple IP addresses; iptables or route...reject?

Mike Gilbert-2
In reply to this post by Walter Dnes
On Wed, Oct 4, 2017 at 1:28 AM, Walter Dnes <[hidden email]> wrote:

>   I have some doubts about massive "hosts" files for adblocking.  I
> downloaded one that listed 13,148 sites.  I fed them through a script
> that called "host" for each entry, and saved the output to a text file.
> The result was 1,059 addresses.  Note that some adservers have multiple
> IP address entries for the same name.  A back-of-the-envelope analysis
> is that close to 95% of the entries in the large host file are invalid,
> and return "not found: 3(NXDOMAIN)".
>
>   I'm not here to trash the people compiling the lists; the problem is
> that hosts files are the wrong tool for the job.  Advertisers know about
> hosts files and deliberately generate random subdomain names with short
> lifetimes to invalidate the hosts files.  Every week the sites are
> probably mostly renamed.  Further analysis of the 1,059 addresses shows
> 810 unique entries, i.e. 249 duplicates.  It gets even better.  44
> addresses show up in 52.84.146.xxx; I should probably block the entire
> /24 with one entry.  There are multiple similar occurrences, which could
> be aggregated into small CIDRs.  So the number of blocking rules is
> greatly reduced.
>
>   I'm not a deep networking expert.  My question is whether I'm better
> off adding iptables reject/drop rules or "reject routes", e.g...
>
> route add -net 10.0.0.0 netmask 255.0.0.0 metric 1024 reject
>
> (an example from the "route" man page).  iptables rules have to be
> duplicated coming and going to catch inbound and outbound traffic.  A
> reject route only needs to be entered once.  This exercise is intended
> to block web adservers, so another question is how web browsers react to
> route versus iptables blocking.

Using the routing table feels dirty.

I don't see any reason to create "inbound" (INPUT) iptables rules. You
really only care about rejecting the initial outbound request to the
web server.

If this is for a single host with iptables running locally, add rules
to the OUTPUT chain. If this is on a router, add them to the FORWARD
chain.
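
As a rough illustration of that advice, the helper below turns a list of CIDRs into iptables commands for the right chain. The function name and the `role` parameter are made up for this sketch. On the REJECT versus DROP choice: REJECT sends an error back, so the browser's connection attempt fails quickly, while DROP discards the packet silently and the browser waits for its own timeout.

```python
def iptables_rules(cidrs, role="host", target="REJECT"):
    """Return one iptables command per CIDR.

    role="host"   -> OUTPUT chain (iptables runs on the browsing machine)
    role="router" -> FORWARD chain (iptables runs on a gateway)
    """
    chains = {"host": "OUTPUT", "router": "FORWARD"}
    if role not in chains:
        raise ValueError(f"unknown role: {role!r}")
    if target not in ("REJECT", "DROP"):
        raise ValueError(f"unsupported target: {target!r}")
    chain = chains[role]
    # -d matches the destination, so only outbound requests are affected.
    return [f"iptables -A {chain} -d {cidr} -j {target}" for cidr in cidrs]
```

For example, `iptables_rules(["52.84.146.0/24"])` produces a single OUTPUT rule for that /24.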


Re: [OT] Block multiple IP addresses; iptables or route...reject?

mad.scientist.at.large

I have to disagree with the last post.  You should most certainly block some inbound traffic.  You should block ports you aren't using.  If some IP address or particular provider has a customer trying to break your machine, you want to block the whole ISP unless you are serving pages etc.  You should block router solicitations and block any other routers advertising them.  I usually also block ping both ways.  Every major program is full of bugs; you want to limit the access of others to the least amount possible consistent with the net software you are running.

Long ago I had all of China blocked, because I wasn't visiting sites there and it was where most of the attacks came from.  When you have a "slow" or very busy connection to the net the incursion attempts.

While not directly security related, I also like to ban the IP addresses of ad bots; I suspect that when they change their domain name or buy a new one, the ad company doesn't get a new IP address range.  These are the servers that are most overloaded and slowest, slowing down page loads.  You could even consider that this slowness from ad servers produces a DoS, assuming you don't want the information and didn't ask for it.  Now I just try to block the obnoxious advertisers, the people who at 3 AM will shove audio at you that's louder than the music you were/are playing.
--
"Informed delivery" is just an excuse for the post office to compile databases for sale to marketing firms and those even less reputable; it is a gross abuse of the postal system's special access to our lives.



Re: [OT] Block multiple IP addresses; iptables or route...reject?

Mick-10
On Wednesday, 4 October 2017 23:49:30 BST [hidden email]
wrote:

> I have to disagree with the last post.  You should most certainly block some
> inbound traffic.  You should block ports you aren't using.  If some IP
> address or particular provider has a customer trying to break your machine,
> you want to block the whole ISP unless you are serving pages etc.  You
> should block router solicitations and block any other routers
> advertising them.  I usually also block ping both ways.  Every major
> program is full of bugs; you want to limit the access of others to
> the least amount possible consistent with the net software you are
> running.
>
> Long ago I had all of China blocked, because I wasn't visiting sites there
> and it was where most of the attacks came from.  When you have a "slow" or
> very busy connection to the net the incursion attempts.

There are a few problems with this approach:

As it has already been mentioned, the Chinese, Ukrainian, et al. IP address
blocks change on an hourly basis.

With spammers using DNS forwarding you will need to start blocking US,
Netherlands, etc. based ISPs, CDNs and cloud hosters.  However, you may still
want to receive some of these hosters' content - non-malicious and non-advert
related web pages.

Some web page scripts rely on acknowledgment/interaction with servers proxied
on some of the addresses you could have blocked.  As a result web pages hang
and never complete loading, forms are broken, clicking on buttons does not yield
a result.  In other words, you could break the interwebs and your browsing
experience along with it.


> While not directly security related, I also like to ban the IP addresses of
> ad bots; I suspect that when they change their domain name or buy a new one,
> the ad company doesn't get a new IP address range.

Nope, the IP addresses of these change too.  They are cloud hosted too,
geographically dispersed, load balanced and change all the time.


> These are the servers that are most overloaded and slowest, slowing down
> page loads.  You could even consider that this slowness from ad servers
> produces a DoS, assuming you don't want the information and didn't ask for
> it.  Now I just try to block the obnoxious advertisers, the people who at
> 3 AM will shove audio at you that's louder than the music you were/are
> playing.

If blocking this kind of content is for web browsing purposes only, blocking
adverts can be quite effectively achieved by using browser add-ons like
'uBlock Origin'.

--
Regards,
Mick


Re: [OT] Block multiple IP addresses; iptables or route...reject?

Walter Dnes
On Thu, Oct 05, 2017 at 10:35:43AM +0100, Mick wrote

> There are a few problems with this approach:
>
> As it has already been mentioned, the Chinese, Ukrainian, et al. IP
> address blocks change on an hourly basis.

  Huh?!?  The subdomain names, maybe; but not the country IP address
range.  The whole point of this thread is about blocking by IP address,
not by ineffective hosts files.

> With spammers using DNS forwarding you will need to start blocking
> US, Netherlands, etc. based ISPs, CDNs and cloud hosters.

  I'll start off with /32's.  Contiguous addresses will get aggregated
into /31 and larger blocks over time.
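
That kind of aggregation can be done mechanically. A minimal sketch using `ipaddress.collapse_addresses` from the Python standard library follows; the wrapper name `aggregate` is just for illustration and it assumes IPv4 input only. Only complete, aligned blocks merge, so .1 and .2 stay separate while .0 through .3 become a /30.

```python
import ipaddress

def aggregate(cidrs):
    """Merge adjacent or overlapping IPv4 networks into the fewest CIDRs."""
    nets = [ipaddress.ip_network(c, strict=False) for c in cidrs]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]
```

Re-running this over the full list each time new /32s are added keeps the rule count down without ever blocking an address that was not listed.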

> However, you may still want to receive some of these hosters' content -
> non-malicious and non-advert related web pages.
>
> Some web page scripts rely on acknowledgment/interaction with servers
> proxied on some of the addresses you could have blocked.  As a result
> web pages hang and never complete loading, forms are broken, clicking
> on buttons does not yield a result.  In other words, you could break
> the interwebs and your browsing experience along with it.

  This battle has already been fought on the spam email front.  Some
greedy ISPs decided to make extra money by taking on egregious spammers
and using legitimate customers as "human shields".  That didn't work.

--
Walter Dnes <[hidden email]>
I don't run "desktop environments"; I run useful applications