Yesterday, I talked about how to create a google parsing script. If you missed it, you can find the previous article here.
This was the initial plan for developing my script for scraping google:
search google for ‘moving company name’
find the first 100 google result links for that query.
go to each website, get all of its links, and store them in the collection.
go through the collection, and for each link search for a specific regex, then extract the matches to a file.
//list of links.
save each link to the DB.
on each run:
check whether the link was updated; if not, record the change.
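The steps above could be sketched roughly like this. This is a minimal, illustrative version: the actual fetch against google is left out (I'm working on a sample page instead), and all function and table names here are my own invention, not the real script's.

```python
import sqlite3
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    """Return all link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links

def save_links(conn, links):
    """Store links in the DB, skipping ones already saved."""
    conn.executemany(
        "INSERT OR IGNORE INTO links(url) VALUES (?)",
        [(link,) for link in links],
    )

# sample "result page" standing in for a real fetched page
SAMPLE = '<a href="https://example.com/a">A</a> <a href="https://example.com/b">B</a>'

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links(url TEXT PRIMARY KEY)")
save_links(conn, extract_links(SAMPLE))
stored = [row[0] for row in conn.execute("SELECT url FROM links ORDER BY url")]
print(stored)
```

Using `INSERT OR IGNORE` on a primary-key column means re-running the scrape doesn't duplicate links, which is the part the "check if the link was updated" step would build on.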
I have some ideas on what I could use the script for:
maybe: create a DB entry for each searched keyword, so I can re-check the status of what was previously searched for.
maybe: create a chrome plugin that grabs what you’re searching for and stores it in a local DB, along with the search results.
maybe: have a database full of sites, go through each one, get all of its links, then search them for a regex (e.g. email addresses)
maybe: do a live stream while I’m working?
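The regex idea from the list (e.g. email addresses) could look something like this. Note the pattern is a deliberately simplified email matcher, not a full RFC 5322 one:

```python
import re

# simplified email pattern: local part @ domain . tld
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def find_emails(page_text):
    """Return the unique email-looking strings in a page, sorted."""
    return sorted(set(EMAIL_RE.findall(page_text)))

page = "Contact us at sales@example.com or support@example.com."
print(find_emails(page))
```

Deduplicating with `set` matters here, because the same contact address tends to appear in a site's header and footer on every page.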
from the first part as well:
For messing around purposes, I have a couple of ideas on what to do with this:
-use it as a google incognito custom results searcher (I can customize this as I wish; however, I do like how google does it)
-use it for storing data. I should also store the result position in google and the timestamp. Adding this as a todo.
(adding the timestamp might help me find out which position a specific keyword was at in google.)
-make it run in less than 2 seconds (currently about 4.7 seconds total: ~1.7 seconds for google’s response + ~3 seconds for my own run)
-make the results available through an API method and a public URL
-a google API
-a keyword ranking/indexing tracker (that could be linked to my WordPress, maybe?)
-a backend replacement for google, so I can track what I search for and maybe analyze it later.
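The position + timestamp todo from the list could be as small as one table. This is a sketch with made-up table and column names, just to show the shape of the data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE rankings(
        keyword    TEXT NOT NULL,
        url        TEXT NOT NULL,
        position   INTEGER NOT NULL,  -- position in the google results
        checked_at TEXT NOT NULL      -- ISO timestamp of the run
    )
""")

def record_position(conn, keyword, url, position, checked_at):
    """Append one observed ranking for a keyword/url pair."""
    conn.execute(
        "INSERT INTO rankings(keyword, url, position, checked_at) "
        "VALUES (?, ?, ?, ?)",
        (keyword, url, position, checked_at),
    )

record_position(conn, "moving company", "https://example.com", 3, "2021-04-20T10:00:00")
record_position(conn, "moving company", "https://example.com", 1, "2021-04-21T10:00:00")

# latest known position for the keyword/url pair
row = conn.execute(
    "SELECT position FROM rankings WHERE keyword=? AND url=? "
    "ORDER BY checked_at DESC LIMIT 1",
    ("moving company", "https://example.com"),
).fetchone()
print(row[0])
```

Keeping every run as a separate row (rather than updating in place) is what makes the "which position was this keyword at on date X" question answerable later.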
Later edit: 21/04/2021
It’s difficult to group links by section.
How about storing all the links now, and scraping them for the other data later?
So far, I have a script that looks for a specific sentence in the first x pages of google results, and stores each website to a DB for later use/parsing.
I’m thinking that I could build a database of email addresses with company details, and reach out about various service implementations.
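The store-now, parse-later idea could be a simple flag on the links table: one pass stores URLs, a separate pass picks up unparsed rows whenever it runs. Again, a sketch with illustrative names, not the real script:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links(url TEXT PRIMARY KEY, parsed INTEGER DEFAULT 0)")

def store_link(conn, url):
    """Queue a link for later parsing; duplicates are ignored."""
    conn.execute("INSERT OR IGNORE INTO links(url) VALUES (?)", (url,))

def next_unparsed(conn):
    """Fetch one link that hasn't been parsed yet, or None."""
    row = conn.execute("SELECT url FROM links WHERE parsed = 0 LIMIT 1").fetchone()
    return row[0] if row else None

def mark_parsed(conn, url):
    """Flag a link as done so the next pass skips it."""
    conn.execute("UPDATE links SET parsed = 1 WHERE url = ?", (url,))

store_link(conn, "https://example.com/a")
store_link(conn, "https://example.com/b")

url = next_unparsed(conn)
mark_parsed(conn, url)
remaining = conn.execute("SELECT COUNT(*) FROM links WHERE parsed = 0").fetchone()[0]
print(remaining)
```

The nice part of this split is that the scraper and the parser can run on completely different schedules, which sidesteps the grouping problem above.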
What if I ask for a need?