Google doesn’t find every page on the web. Search engines start at some popular spots and repeatedly follow links from there, “crawling” their way across the web. A new website that nobody links to will not be found unless its admins explicitly submit it to Google, asking for it to be crawled.
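The crawling idea above can be sketched as a breadth-first traversal of a link graph. This is only a toy illustration, not how Google's crawler is implemented: the `LINKS` graph, the `.example` hostnames, and the `crawl` function are all hypothetical, and a real crawler would fetch pages over HTTP instead of reading a dictionary.

```python
from collections import deque

# Hypothetical toy link graph: each page maps to the pages it links to.
LINKS = {
    "popular.example": ["news.example", "blog.example"],
    "news.example": ["blog.example", "shop.example"],
    "blog.example": ["popular.example"],
    "shop.example": [],
    "new-site.example": ["popular.example"],  # links out, but nobody links to it
}


def crawl(seeds, links):
    """Breadth-first crawl: return every page reachable from the seed pages."""
    discovered = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in discovered:
                discovered.add(target)
                queue.append(target)
    return discovered


# Starting only from a popular page, the unlinked site is never reached.
found = crawl(["popular.example"], LINKS)

# Explicitly submitting the new site amounts to adding it as a seed.
found_after_submit = crawl(["popular.example", "new-site.example"], LINKS)

print(sorted(found))
print("new-site.example" in found, "new-site.example" in found_after_submit)
```

In this sketch, submitting a site simply adds it to the seed list, which is why an unlinked site becomes discoverable once its admins ask for it to be crawled.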
How Search Engines Work: A Simple and Powerful Explanation
Google gets to know about any new website through these 3 sources:
Domain Discovery: Google DNS: Almost every time you visit a website, your browser needs to look up the IP address for that website. Google DNS is a very popular DNS service around the world, and its DNS logs are very useful for discovering domains.
Domain Registrars: HostGator, Hostinger, InMotion and other domain registrars
Web Page Discovery through Google Toolbar / Google Omnibox / Mozilla Suggestions / IE Suggestions:
Google/Bing make very heavy use of toolbar/omnibox data. Whenever a user visits a page, the request is logged by the browser/toolbar.
Browser/toolbar logs are a very rich source of signals for URL discovery and ranking. As long as a page is visited by at least one person, even if only its creator, Google can discover it from the logs.
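As a rough illustration of log-based discovery, the sketch below compares visited URLs against a set of already-known URLs and counts visits as a crude popularity signal. The log format, the `KNOWN_URLS` set, and the `discover` function are hypothetical; nothing here reflects the actual format or processing of any browser or toolbar logs.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical set of URLs the search engine already knows about.
KNOWN_URLS = {"https://popular.example/", "https://news.example/today"}

# Hypothetical visit log: one visited URL per entry.
VISIT_LOG = [
    "https://popular.example/",
    "https://new-site.example/hello",
    "https://news.example/today",
    "https://new-site.example/hello",
    "https://popular.example/",
    "https://popular.example/",
]


def discover(visit_log, known_urls):
    """Return unseen URLs, the hosts they belong to, and per-URL visit counts."""
    visits = Counter(visit_log)
    new_urls = sorted(url for url in visits if url not in known_urls)
    hosts_of_new_urls = sorted({urlsplit(url).hostname for url in new_urls})
    return new_urls, hosts_of_new_urls, visits


new_urls, hosts_of_new_urls, visits = discover(VISIT_LOG, KNOWN_URLS)
print(new_urls)
print(hosts_of_new_urls)
print(visits.most_common(1))
```

A single visit is enough for a URL to show up in `new_urls`, which matches the point that a page visited only by its creator can still be discovered.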
Partners: Sitemap.xml / RSS feed
Website owners can communicate the structure of the website, including orphan pages (pages that no other page links to), to search engines using sitemap.xml.
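A minimal sketch of generating a sitemap.xml with Python's standard library is shown here. The `urlset`, `url`, and `loc` elements and the namespace come from the sitemaps.org protocol; the `build_sitemap` helper and the `www.example.com` URLs are made up for illustration, and optional elements such as `lastmod` are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Build a minimal sitemap document listing each URL in a <loc> element."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in urls:
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = page_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; the last one is an orphan page that nothing links to.
pages = [
    "https://www.example.com/",
    "https://www.example.com/about",
    "https://www.example.com/landing/orphan-page",
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Because the orphan page is listed explicitly, a crawler that reads the sitemap can find it even though no link on the site leads to it.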
How does the Google search engine work? Explanation by Matt Cutts