I had a page with an obscure URL that wasn't linked from any other page on my site or anywhere else on the web. My sitemap file didn't include this page either.
Yet I just saw that Google indexed the page and it shows up in search results.
How can that happen?
The URL for this page was completely obscure, like:
Google's job is to index everything it can, so it's going to take advantage of every discovery method that isn't illegal or ethically questionable. There are about four ways that I can think of, and I'm sure Google uses them all:
1. Links between pages. The most obvious. Even though you don't think there are any pages that link to yours, it's possible that you haven't accounted for something. I had pages getting unexpectedly indexed, and later discovered that there was a special system page that listed all pages on my site.
2. Pages submitted to them via https://www.google.com/webmasters/tools/submit-url
3. Pages visited through Google tools and utilities, possibly including things like Chrome and the Google Toolbar. (Yes, there's a cost associated with free things. There's no such thing as a free lunch.)
4. Information it can glean from requests made to Google itself (like pulling the HTTP referer out of the request).
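To make point 4 concrete, here is a minimal Python sketch of how any server receiving a request could read the Referer header and learn the URL of the page the visitor came from. The header dictionary, the `referer_from_headers` helper, and the example.com URL are all hypothetical, and nothing here is a claim about what Google actually does with the header.

```python
from typing import Mapping, Optional


def referer_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the Referer value from a header mapping, or None if absent.

    HTTP header names are case-insensitive, so compare in lowercase.
    """
    for name, value in headers.items():
        if name.lower() == "referer":
            return value
    return None


# Hypothetical headers of a request that arrives after someone clicks an
# outbound link on the "hidden" page.
incoming = {
    "Host": "www.google.com",
    "Referer": "https://example.com/obscure-page-x7f3",
}
leaked_url = referer_from_headers(incoming)
```

If the obscure page contains any outbound link, or loads any third-party resource, the receiving server may see that page's URL this way.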
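If you don't want a page indexed by Google and other search engines, the officially sanctioned way to keep crawlers away from it is the robots.txt file. Keep in mind that robots.txt only blocks crawling: a disallowed URL can still show up in results if Google learns of it some other way. For a page that must stay out of the index entirely, a noindex robots meta tag (or an X-Robots-Tag response header) on a crawlable page, or password protection, is more reliable.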
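As a rough illustration of how a robots.txt rule is interpreted, the sketch below uses Python's standard `urllib.robotparser` to check whether a crawler would be allowed to fetch a URL. The `/private/` path, the example.com URLs, and the `is_crawl_allowed` helper are assumptions made up for this example, not anything from the original site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block every crawler from /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "https://example.com/private/page.html"
)
allowed = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "https://example.com/index.html"
)
```

Here `blocked` is False and `allowed` is True. A check like this only tells you what a well-behaved crawler should do; it does not enforce anything on its own.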