How do we scrape a SERP?
Scraping a SERP (search engine results page) means extracting the top results of a Google search for a given keyword, usually the first 10 or so, and saving them to a document.
A recent Flat 101 project called for us to repeatedly obtain the first 100 results of a Google search for a series of keywords.
What could we do to speed up the process as much as possible?
There are several perfectly valid approaches, although I should warn you before you try them yourself: be careful when scraping many result pages in a row. Google can detect you and bombard you with captchas for an indefinite period, so it is a good idea to buy proxies and use them for this process.
As I am not a technician, I only use external tools that can make my work easier without having to bother my colleagues in development.
Tool 1: Mozbar for your browser
Using a browser add-on can be a quick solution if you don’t need to handle too many keywords or a large number of results.
Mozbar offers a wealth of SEO information, and it also lets you export the search results to a CSV file.
If we click “Export to CSV”, a file with the first 10 results of our search is automatically downloaded to our computer in comma-separated format, ready to open in Excel.
Pro Tip: To obtain more than 10 results, open Google’s “search settings” and change the default number of results shown per page, which can be set to 10, 20, 30, 40, 50 or 100.
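For readers who do have a developer nearby, here is a minimal sketch of how an exported CSV could be read back to pull out just the result URLs. The column name "URL" and the sample rows are assumptions for illustration; check the header row of your own export and adjust accordingly.

```python
import csv
import io


def read_serp_csv(handle, url_column="URL"):
    """Return the non-empty values of url_column from an exported SERP CSV.

    url_column is an assumed header name, not a confirmed part of any tool's export.
    """
    reader = csv.DictReader(handle)
    urls = []
    for row in reader:
        value = (row.get(url_column) or "").strip()
        if value:  # skip rows where the URL cell is empty or missing
            urls.append(value)
    return urls


# Made-up sample standing in for an exported file.
sample = io.StringIO(
    "Position,URL,Title\n"
    "1,https://example.com/a,Page A\n"
    "2,https://example.org/b,Page B\n"
)
print(read_serp_csv(sample))
```

With a real file, the same function could be called as read_serp_csv(open("export.csv", newline="", encoding="utf-8")), where export.csv is a placeholder name.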
Tool 2: Oscraper for Chrome
Oscraper is another browser extension whose free version offers the same function as Mozbar. It also has a paid version, costing 17 dollars, which gives you a series of editable options to fine-tune the resulting document:
Extract the URL with or without http
Extract the domain with or without www
Extract only the domain without the internal folders
Extract the URLs of the AdWords ads as well
Exclude domains that you are not interested in
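If you already have a plain list of result URLs, clean-ups like the ones listed above can be approximated with a few lines of Python. This is only a sketch of the idea using the standard library, not how Oscraper itself works; the function names are made up for illustration, and it assumes every URL includes its scheme (http or https).

```python
from urllib.parse import urlsplit


def strip_scheme(url):
    """Return the URL without its leading http:// or https://."""
    parts = urlsplit(url)
    rest = parts.netloc + parts.path
    if parts.query:
        rest += "?" + parts.query
    return rest


def domain_of(url, keep_www=True):
    """Return only the domain, dropping folders; optionally drop a leading www."""
    host = urlsplit(url).hostname or ""
    if not keep_www and host.startswith("www."):
        host = host[4:]
    return host


def filter_results(urls, excluded_domains):
    """Drop URLs whose domain (ignoring www) is in excluded_domains."""
    excluded = {d.lower() for d in excluded_domains}
    return [u for u in urls if domain_of(u, keep_www=False) not in excluded]


urls = ["https://www.example.com/blog/post?id=1", "http://other.org/page"]
print([strip_scheme(u) for u in urls])
print([domain_of(u, keep_www=False) for u in urls])
print(filter_results(urls, ["example.com"]))
```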
Tool 3: Simple SERP Scraper
Once we have installed the tool on our machine, either Mac or Windows, we can see the main options:
Which version of Google we want to extract results from
How many results
How much delay we want between searches
Finally, we can enter the keywords we want to measure and we are ready to roll. The option to use our proxies to avoid potential problems is clearly displayed.
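To make the delay and proxy options more concrete, here is a rough sketch of the logic such a tool might follow: go through the keywords one by one, wait a random time between searches and rotate through the available proxies. Everything here is hypothetical, including the run_keywords name, the default delays and the fetch function, which you would have to supply yourself. The example below uses a fake fetch that never touches the network.

```python
import itertools
import random
import time


def run_keywords(keywords, fetch, proxies=(), min_delay=5.0, max_delay=15.0,
                 sleep=time.sleep):
    """Call fetch(keyword, proxy) for each keyword, pausing between calls.

    fetch is supplied by the caller; proxies are used in round-robin order,
    and None is passed as the proxy when no proxies are given.
    """
    proxy_cycle = itertools.cycle(proxies) if proxies else itertools.repeat(None)
    results = {}
    for index, keyword in enumerate(keywords):
        if index > 0:
            # Random pause so the searches do not arrive at a fixed rhythm.
            sleep(random.uniform(min_delay, max_delay))
        results[keyword] = fetch(keyword, next(proxy_cycle))
    return results


def fake_fetch(keyword, proxy):
    """Stand-in for a real search; only reports what it would have done."""
    return f"would search '{keyword}' via {proxy}"


# Dry run: the sleep function is replaced so nothing actually waits.
demo = run_keywords(["seo tools", "serp scraper"], fake_fetch,
                    proxies=["proxy-a", "proxy-b"], sleep=lambda seconds: None)
for keyword, outcome in demo.items():
    print(keyword, "->", outcome)
```

Passing the sleep function as a parameter is just a convenience for trying the loop without waiting; in real use you would leave the default so the pauses actually happen.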
Do you know any other way? Maybe with an import from Excel? Why not make your contribution…