.. _searx filtron:

==========================
How to protect an instance
==========================

.. sidebar:: further reading

   - :ref:`filtron.sh`
   - :ref:`nginx searx site`

.. contents:: Contents
   :depth: 2
   :local:
   :backlinks: entry

.. _filtron: https://github.com/asciimoo/filtron

Searx depends on external search services.  To keep these services from being
abused, limit the number of requests that searx processes.

An application firewall such as filtron_ solves exactly this problem.  Filtron
is a middleware that sits between your web server (nginx, apache, ...) and
searx.  Such infrastructures are described in the chapter :ref:`architecture`.


filtron & go
============

.. _Go: https://golang.org/
.. _filtron README: https://github.com/asciimoo/filtron/blob/master/README.md

Filtron needs Go_ installed.  If Go_ is already installed, filtron_ can be
installed with ``go get`` (see `filtron README`_).  If you use filtron as
middleware, a more isolated setup is recommended.  To simplify such an
installation and its maintenance, use our script :ref:`filtron.sh`.


.. _Sample configuration of filtron:

Sample configuration of filtron
===============================

.. sidebar:: Tooling box

   - :origin:`/etc/filtron/rules.json <utils/templates/etc/filtron/rules.json>`

An example configuration can be found below.  This configuration limits access
by:

- scripts or applications (roboagent limit)
- webcrawlers (botlimit)
- IPs which send too many requests (IP limit)
- too many json, csv, etc. requests (rss/json limit)
- the same UserAgent if it sends too many requests (useragent limit)

.. code:: json

   [
       {
           "name": "search request",
           "filters": [
               "Param:q",
               "Path=^(/|/search)$"
           ],
           "interval": "<time-interval-in-sec (int)>",
           "limit": "<max-request-number-in-interval (int)>",
           "subrules": [
               {
                   "name": "missing Accept-Language",
                   "filters": ["!Header:Accept-Language"],
                   "limit": "<max-request-number-in-interval (int)>",
                   "stop": true,
                   "actions": [
                       {"name": "log"},
                       {"name": "block",
                        "params": {"message": "Rate limit exceeded"}}
                   ]
               },
               {
                   "name": "suspiciously Connection=close header",
                   "filters": ["Header:Connection=close"],
                   "limit": "<max-request-number-in-interval (int)>",
                   "stop": true,
                   "actions": [
                       {"name": "log"},
                       {"name": "block",
                        "params": {"message": "Rate limit exceeded"}}
                   ]
               },
               {
                   "name": "IP limit",
                   "interval": "<time-interval-in-sec (int)>",
                   "limit": "<max-request-number-in-interval (int)>",
                   "stop": true,
                   "aggregations": [
                       "Header:X-Forwarded-For"
                   ],
                   "actions": [
                       {"name": "log"},
                       {"name": "block",
                        "params": {
                            "message": "Rate limit exceeded"
                        }
                       }
                   ]
               },
               {
                   "name": "rss/json limit",
                   "filters": [
                       "Param:format=(csv|json|rss)"
                   ],
                   "interval": "<time-interval-in-sec (int)>",
                   "limit": "<max-request-number-in-interval (int)>",
                   "stop": true,
                   "actions": [
                       {"name": "log"},
                       {"name": "block",
                        "params": {
                            "message": "Rate limit exceeded"
                        }
                       }
                   ]
               },
               {
                   "name": "useragent limit",
                   "interval": "<time-interval-in-sec (int)>",
                   "limit": "<max-request-number-in-interval (int)>",
                   "aggregations": [
                       "Header:User-Agent"
                   ],
                   "actions": [
                       {"name": "log"},
                       {"name": "block",
                        "params": {
                            "message": "Rate limit exceeded"
                        }
                       }
                   ]
               }
           ]
       }
   ]

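
How ``interval``, ``limit`` and ``aggregations`` work together can be pictured
with a small model.  The following Python sketch is an illustration only and is
not filtron's implementation: it assumes a fixed counting window, and the class
name ``WindowLimiter`` as well as the values ``300`` and ``3`` are made up for
this example.  The ``key`` stands for the value of an aggregation such as
``Header:X-Forwarded-For``.

.. code:: python

   import time
   from collections import defaultdict


   class WindowLimiter:
       """Hypothetical model: at most `limit` hits per key in each `interval`."""

       def __init__(self, interval, limit, clock=time.monotonic):
           self.interval = interval      # window length in seconds
           self.limit = limit            # allowed hits per key and window
           self.clock = clock            # injectable so the model can be tested
           self.window_start = clock()
           self.counts = defaultdict(int)

       def allow(self, key):
           """Count one request for `key`; return False once the limit is exceeded."""
           now = self.clock()
           if now - self.window_start >= self.interval:
               # A new window begins: forget all counters.
               self.window_start = now
               self.counts.clear()
           self.counts[key] += 1
           return self.counts[key] <= self.limit


   # Illustrative values: 3 requests per client address within 300 seconds.
   ip_limit = WindowLimiter(interval=300, limit=3)
   print([ip_limit.allow("203.0.113.7") for _ in range(5)])
   # [True, True, True, False, False]

In this model a request that is not allowed corresponds to the point where the
``actions`` of a rule (``log``, ``block``) would be triggered.
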

.. _filtron route request:

Route request through filtron
=============================

.. sidebar:: further reading

   - :ref:`filtron.sh overview`
   - :ref:`installation nginx`
   - :ref:`installation apache`

Filtron can be started using the following command:

.. code:: sh

   $ filtron -rules rules.json

It listens on ``127.0.0.1:4004`` and forwards filtered requests to
``127.0.0.1:8888`` by default.

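
To check that the rules take effect, it can help to send a handful of search
requests to the listen address and tally the responses.  The following is a
minimal sketch using only the Python standard library.  The function name
``probe`` and the query ``test`` are assumptions made for this example, and
which status code a blocked request returns depends on the configured rules.

.. code:: python

   from collections import Counter
   from urllib.error import HTTPError
   from urllib.parse import urlencode
   from urllib.request import Request, urlopen


   def probe(base_url, count, query="test"):
       """Send `count` search requests and tally the HTTP status codes."""
       url = f"{base_url}/search?{urlencode({'q': query})}"
       statuses = Counter()
       for _ in range(count):
           # Send Accept-Language so the request looks like a browser request.
           request = Request(url, headers={"Accept-Language": "en"})
           try:
               with urlopen(request, timeout=5) as response:
                   statuses[response.status] += 1
           except HTTPError as error:
               # urllib raises for 4xx/5xx responses; count their status code.
               statuses[error.code] += 1
       return statuses

Calling ``probe("http://127.0.0.1:4004", 10)`` against a running instance
returns a counter of status codes; requests rejected by a rule appear under a
non-200 code.  Note that the default ``User-Agent`` of ``urllib`` identifies the
client as a script, so a rule aimed at scripts may already apply.
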

To use it together with ``nginx``, start from the following example
configuration.

.. code:: nginx

   # https://example.org/searx
   location /searx {
       # pass requests to filtron, which forwards them to searx
       proxy_pass         http://127.0.0.1:4004/;

       proxy_set_header   Host             $host;
       proxy_set_header   Connection       $http_connection;
       proxy_set_header   X-Real-IP        $remote_addr;
       proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
       proxy_set_header   X-Scheme         $scheme;
       proxy_set_header   X-Script-Name    /searx;
   }

   # serve static files directly, bypassing filtron
   location /searx/static {
       alias /usr/local/searx/searx-src/searx/static;
   }


Requests arrive at filtron on port 4004.  Those that pass the rules are
forwarded to port 8888, where searx is running.  For a complete setup see:
:ref:`nginx searx site`.