title: HTTrack
course: intro_pentest
section: Reconnaissance

layout: lesson

Typically, we begin the first step by closely reviewing the target’s website. In some cases, we may use a tool called HTTrack to make a page-by-page copy of the website. HTTrack is a free utility that creates an identical, off-line copy of the target website. The copied website will include all the pages, links, pictures and code from the original website; however, it’ll reside on your local computer. Using a website copying tool like HTTrack allows us to explore and thoroughly mine the website “off-line” without having to spend additional time traipsing around on the company’s web server.

It’s important to understand that the more time you spend navigating and exploring the target website, the more likely it is that your activity can be tracked or traced (even if you’re simply browsing the site). Remember, anytime you interact directly with a resource owned by the target, there’s a chance that you’ll leave a digital fingerprint behind.

Advanced penetration testers can also run automated tools to extract additional or hidden information from a local copy of a website.
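As one illustration of this kind of automation, the sketch below walks a local mirror and collects HTML comments, which sometimes contain notes that developers forgot to remove. It is a minimal example, not a description of any particular tool: the function names and the idea of matching files by the `*.htm*` pattern are assumptions made for this lesson.

```python
from __future__ import annotations

import re
from pathlib import Path

# Matches HTML comments, including ones that span several lines.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def find_html_comments(html: str) -> list[str]:
    """Return the text of each non-empty HTML comment in a page."""
    return [c.strip() for c in COMMENT_RE.findall(html) if c.strip()]


def scan_mirror_for_comments(mirror_dir: str) -> dict[str, list[str]]:
    """Walk a local mirror (hypothetical path) and group comments by file."""
    results: dict[str, list[str]] = {}
    for path in sorted(Path(mirror_dir).rglob("*.htm*")):
        if not path.is_file():
            continue
        # Mirrored pages may use mixed encodings, so undecodable bytes are skipped.
        text = path.read_text(encoding="utf-8", errors="ignore")
        comments = find_html_comments(text)
        if comments:
            results[str(path)] = comments
    return results
```

Because the script only reads files on your own disk, running it generates no additional traffic to the target.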

HTTrack can be downloaded directly from the company’s website at http://www.httrack.com. Installing it on Windows is as simple as downloading the installer .exe and clicking through the prompts. HTTrack comes preinstalled on BlackArch, and there are two ways to start it: a graphical interface and a terminal interface. To use the graphical interface, run “webhttrack” in a console, which opens the interface in a browser. To work entirely in the terminal, run “httrack” instead.
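For the terminal route, the sketch below shows how a mirror job might be assembled and launched from a script. It assumes the `httrack` binary is on your PATH and uses only the `-O` option (the output path) plus one `+` filter that keeps the crawl on the starting host; the URL, output directory and function names are placeholders invented for this example. Only run it against a site you are authorized to test.

```python
from __future__ import annotations

import shutil
import subprocess
from urllib.parse import urlparse


def build_httrack_command(url: str, output_dir: str) -> list[str]:
    """Build an httrack argument list for mirroring a single host."""
    host = urlparse(url).hostname
    if not host:
        raise ValueError(f"not an absolute URL: {url!r}")
    # -O sets where the mirror is stored; the +host/* filter is an
    # assumption used here to keep the copy on the starting host.
    return ["httrack", url, "-O", output_dir, f"+{host}/*"]


def mirror_site(url: str, output_dir: str) -> int:
    """Run httrack and return its exit code (authorized targets only)."""
    if shutil.which("httrack") is None:
        raise FileNotFoundError("httrack is not installed or not on PATH")
    return subprocess.run(build_httrack_command(url, output_dir), check=False).returncode
```

Calling `build_httrack_command("http://www.example.com/", "/tmp/example-mirror")` only builds the argument list, so you can review exactly what would be executed before calling `mirror_site`.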

After installing the program, we need to run it against our target. Please be aware that this activity is easy to trace and considered highly offensive. Never run this tool without prior authorization. Once HTTrack is started in graphical mode, we’re presented with several web pages that allow us to set up and customize the copy process. Each page lets us change various aspects of the program, including the language (English is the default), the project name, the location where the copied website will be stored and the web address of the site to copy. You can work your way through these pages by making the desired changes to each option and clicking the “Next” button. The final page includes a “Start” button; click it when you’re ready to begin copying the target’s website. The amount of time this process takes depends on the size of the target website. Once HTTrack has finished copying the target website, it presents a web page that lets you “Browse the Mirrored Website” in a browser or navigate to the path where the site was stored.

Whether you make a copy of the target website or you simply browse the target in real time, it’s important to pay attention to details. You should begin by closely reviewing and recording all the information you find on the target’s website. Oftentimes, with very little digging you’ll be able to make some significant findings, including physical addresses and locations, phone numbers, e-mail addresses, hours of operation, business relationships (partnerships), employee names, social media connections and other public tidbits.
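Some of this recording can be automated once a local copy exists. The sketch below pulls e-mail addresses and phone numbers out of mirrored pages with regular expressions. The patterns are deliberately simple assumptions (the phone pattern only covers a common North American layout), so treat the output as leads to verify by hand rather than a complete list.

```python
from __future__ import annotations

import re
from pathlib import Path

# Simplified patterns; real-world addresses and numbers vary widely.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[-. ]\d{3}[-. ]\d{4}\b")


def extract_contacts(text: str) -> dict[str, list[str]]:
    """Return de-duplicated, sorted e-mail addresses and phone numbers."""
    emails = sorted({m.lower() for m in EMAIL_RE.findall(text)})
    phones = sorted(set(PHONE_RE.findall(text)))
    return {"emails": emails, "phones": phones}


def scan_mirror(mirror_dir: str) -> dict[str, list[str]]:
    """Collect contacts from every HTML or text file under a local mirror."""
    emails: set[str] = set()
    phones: set[str] = set()
    for path in Path(mirror_dir).rglob("*"):
        if path.is_file() and path.suffix.lower() in {".html", ".htm", ".txt"}:
            found = extract_contacts(path.read_text(encoding="utf-8", errors="ignore"))
            emails.update(found["emails"])
            phones.update(found["phones"])
    return {"emails": sorted(emails), "phones": sorted(phones)}
```

Employee names, business relationships and similar findings do not follow a fixed pattern, so they still need to be recorded manually.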

When conducting a penetration test, it’s important to pay special attention to things like “News” or “Announcements”. Companies are often proud of their achievements and unintentionally leak useful information through these stories. Company mergers and acquisitions can also yield valuable data; this is especially important for expanding the scope and adding targets to our penetration test. Even the smoothest of acquisitions creates change and disarray in an organization. There’s always a transition period when companies merge, and this transition period provides value by giving us additional targets. Merged or sibling companies should be authorized and included in the original target list, as they provide a potential gateway into the organization.

Finally, it’s important to search for and review any open job postings for the target company. Job postings often reveal very detailed information about the technology being used by an organization. Many times you’ll find specific hardware and software listed in the job opening. Don’t forget to search for your target in the nationwide job banks as well. For example, assume you come across a job requisition looking for a Network Administrator with Cisco ASA experience. From this post, you can draw some immediate conclusions and make some educated guesses. First, you can be certain that the company either uses, or is about to use, a Cisco ASA firewall. Second, depending on the size of the organization, you may be able to infer that the company doesn’t have, or is about to lose, someone with knowledge of how to properly use and configure a Cisco ASA firewall. In either case, you’ve gained valuable knowledge about the technology in place.
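If you have collected several postings, a short script can flag the technologies they mention. The sketch below is only an illustration: the keyword list is a hypothetical watch list that you would replace with products relevant to your engagement, and the matching is a plain case-insensitive search.

```python
from __future__ import annotations

import re

# Hypothetical watch list; extend it for your own engagement.
TECH_KEYWORDS = ["Cisco ASA", "Windows Server", "Active Directory", "Oracle"]


def find_technologies(posting: str, keywords: list[str] = TECH_KEYWORDS) -> list[str]:
    """Return the keywords that appear in a job posting, in list order."""
    found = []
    for keyword in keywords:
        # Lookarounds stop a keyword from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, posting, re.IGNORECASE):
            found.append(keyword)
    return found
```

For the requisition described above, `find_technologies("Network Administrator with Cisco ASA experience")` would return `["Cisco ASA"]`, which you would then record alongside your other findings.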

In most cases, once we’ve thoroughly examined the target’s website, we should have a solid understanding of the target, including who they are, what they do and where they are located.

Armed with this basic information about the target, we move into passive reconnaissance. It’s very difficult, if not impossible, for a company to determine when a hacker or penetration tester is conducting passive reconnaissance. This activity offers a low-risk, high-reward situation for attackers. Recall that passive reconnaissance is conducted without ever sending a single packet to the target system. Our weapon of choice to perform this task is the Internet. We begin by performing exhaustive searches of our target in the various search engines available.

Although there are many great search engines available today, when covering the basics of hacking and penetration testing, we’ll focus on Google. Google is very, very good at its job. There’s a reason why the company’s stock trades for $1,384 a share (as of this writing). Spiders from the company aggressively and repeatedly scour all corners of the Internet, cataloguing information and sending it back to Google. The company is so efficient at its job that hackers can oftentimes perform an entire penetration test using nothing but Google.

Although we won’t dive into the specifics of Google Hacking, a solid understanding of how to properly use Google is vital to becoming a skilled penetration tester. If you ask people, “How do you use Google?” they typically respond by saying, “Well it’s simple… You just open a web browser, navigate to Google and type what you’re searching for in the box”.

Although this answer is fine for 99 per cent of the planet, it isn’t enough for aspiring hackers. You have to learn to search in a smarter way and maximize the returned results. Learning how to properly use a search engine like Google will save you time and allow you to find the hidden gems that are buried in the trillions of web pages on the Internet today.
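As a small taste of what “smarter” searching can look like, the sketch below composes a query using Google’s `site:`, `filetype:` and `intitle:` operators and turns it into a search URL. The function names and the example domain are assumptions for illustration; the code only builds strings and never contacts Google or the target.

```python
from __future__ import annotations

from urllib.parse import urlencode


def build_query(terms: str, site: str | None = None,
                filetype: str | None = None, intitle: str | None = None) -> str:
    """Combine free-text terms with optional Google search operators."""
    parts = []
    if site:
        parts.append(f"site:{site}")          # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict results to one file type
    if intitle:
        parts.append(f"intitle:{intitle}")    # require a word in the page title
    if terms:
        parts.append(terms)
    return " ".join(parts)


def search_url(query: str) -> str:
    """Return a Google search URL for the given query string."""
    return "https://www.google.com/search?" + urlencode({"q": query})
```

For example, `build_query("careers", site="example.com")` produces `site:example.com careers`, which limits results to pages on that one domain instead of the whole Internet.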