1. If Google detects something as malware, i.e. Google's software knows that it can be dangerous to users, then why can't it prevent itself from acting as an intermediary? Also, why doesn't it stop hosting malware?
2. >>> Malicious software is hosted on 279 domain(s), including 24corp-shop.com/, abu-farhan.com/, soaksoak.ru/.
These web domains do not belong to Google. It seems Google is downloading several pages onto its servers for various purposes. Is that legal in all countries?
From an architecture point of view, is it difficult to sandbox/protect the user-facing google.com search engine from the above websites at all times, so that if malware is there, it cannot affect the search engine or other major parts? Users are not security-literate.
3. What should I do as a user? Just ignore this, assuming that it is for webmasters and not for ordinary users?
Honestly, for me personally, malware on google is unimaginable, since we consider it as gold standard on the web.
It's important for us (I work at Google on web-search) to be transparent about these reports, and we use them to remove / block content that is malicious too (just like other sites can use the Safe-Browsing API to get information about sites they host). With regards to where it's hosted, there are two main elements involved: a site that actually hosts the exploit (which could be a Windows EXE file, etc), and a site that sends the user to that exploit. Often these are separate. Sometimes it's not even a direct embedding of a known malicious site. For example, a counter/analytics-tracking site could be hacked, which could result in all other sites that use those counters/scripts unknowingly sending users to malicious content.
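For anyone wondering what "use the Safe-Browsing API" might look like from the outside, here is a minimal sketch. It assumes the v4 Lookup API (`threatMatches:find`), which is newer than this thread; the client ID, client version and choice of threat types are placeholders of mine, and you would need your own API key.

```python
import json
import urllib.request

# Assumption: the v4 Lookup API endpoint; the comment above names no endpoint.
LOOKUP_URL = "https://safebrowsing.googleapis.com/v4/threatMatches:find"


def build_lookup_body(urls, client_id="example-client", client_version="0.1"):
    """Build the JSON body for a threatMatches:find request.

    client_id and client_version are hypothetical placeholder values.
    """
    return {
        "client": {"clientId": client_id, "clientVersion": client_version},
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": u} for u in urls],
        },
    }


def lookup(urls, api_key):
    """POST the body and return the list of matches (empty if none)."""
    request = urllib.request.Request(
        f"{LOOKUP_URL}?key={api_key}",
        data=json.dumps(build_lookup_body(urls)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response).get("matches", [])
```

A site that hosts user content could call something like `lookup([...], api_key)` on submitted links and hold back anything that returns a match.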
From talking with webmasters, I have seen almost no false-positives in this flagging, but it's sometimes very hard to find the actual exploit. It sometimes hides from some visitors (direct visitors - like the webmaster - might not see it, it might only be visible for those coming from search), sometimes is limited to geographies or devices. This makes finding the exploit hard sometimes, and fixing the website so that it's no longer vulnerable to the attack that dropped the exploit isn't easy in many cases either.
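To illustrate the "hides from some visitors" point, here is a rough sketch of how one might compare what a page serves to different kinds of visitors. The header sets are hypothetical, and the response bodies are assumed to have been fetched elsewhere, in an isolated environment, by a tool that does not execute scripts. A difference is only a hint, since dynamic pages differ between requests anyway.

```python
import hashlib

# Hypothetical header sets: each imitates one kind of visitor.
VISITOR_PROFILES = {
    "direct": {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"},
    "from_search": {
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
        "Referer": "https://www.google.com/",
    },
    "mobile_from_search": {
        "User-Agent": "Mozilla/5.0 (Linux; Android 5.0) Mobile",
        "Referer": "https://www.google.com/",
    },
}


def fingerprint(body: bytes) -> str:
    """Hash a response body so that bodies can be compared cheaply."""
    return hashlib.sha256(body).hexdigest()


def profiles_that_differ(bodies, baseline="direct"):
    """Return the names of profiles whose body differs from the baseline.

    bodies maps a profile name to the raw bytes served to that profile.
    """
    base = fingerprint(bodies[baseline])
    return sorted(
        name
        for name, body in bodies.items()
        if name != baseline and fingerprint(body) != base
    )
```

If only the `from_search` body differs, that is the pattern described above: the webmaster visiting directly sees a clean page while visitors arriving from search get something else.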
I take these warnings very seriously when I see them in the browser, even when accessing a site with a fairly locked-down & up-to-date browser. I would recommend never skipping them, even to diagnose an issue (use other tools for that).
My theory: People search for stuff on Google. The search results page has a result with a download from abu-farhan.com. People click that link on the search results page, the download starts. Now google.com has "hosted" a malware download.
Keep in mind that this is not reporting that malware is currently present. It is reporting that malware was found the last time it checked, and that malware may have been taken down since then. It doesn't tell you anything about how long it stayed up.
> These web domains do not belong to Google. It seems google is downloading several pages onto its server for various purposes.
I have no specific knowledge of this, but my guess would be that these are just the targets of links.
Would some kind soul please describe to me what this does? My corporate eager-beaver network admins seem to consider this some kind of problem site, and its URL is blocked by our gateway proxy.
Indented and hard-wrapped, for your viewing pleasure:
Safe Browsing Diagnostic page for google.com
What is the current listing status for google.com?
This site is not currently listed as suspicious.
Part of this site was listed for suspicious activity 12 time(s) over the
past 90 days.
What happened when Google visited this site?
Of the 6815255 pages we tested on the site over the past 90 days, 1686
page(s) resulted in malicious software being downloaded and installed
without user consent. The last time Google visited this site was on
2015-01-22, and the last time suspicious content was found on this site was
on 2015-01-22.
Malicious software includes 139894 exploit(s), 2748 trojan(s), 502 virus.
Successful infection resulted in an average of 5 new process(es) on the
target machine.
Malicious software is hosted on 275 domain(s), including 24corp-shop.com/,
abu-farhan.com/, soaksoak.ru/.
296 domain(s) appear to be functioning as intermediaries for distributing
malware to visitors of this site, including southeastasianarchaeology.com/,
thesmallbusinessplaybook.com/, impots-economie.com/.
This site was hosted on 3 network(s) including AS36040 (YOUTUBE), AS43515
(YOUTUBE), AS15169 (GOOGLE).
Has this site acted as an intermediary resulting in further distribution of malware?
Over the past 90 days, google.com appeared to function as an intermediary
for the infection of 528 site(s) including s3.amazonaws.com/lowlordyok/,
s3.amazonaws.com/fann21ahsdc/, s3.amazonaws.com/skcfb01kpl/.
Has this site hosted malware?
Yes, this site has hosted malicious software over the past 90 days. It
infected 22 domain(s), including burguscircus.free.fr/,
plus.google.com/112502198606472559837/, beljews.info/.
Next steps:
Return to the previous page.
If you are the owner of this web site, you can request a review of your
site using Google Webmaster Tools. More information about the review
process is available in Google's Webmaster Help Center.
"Would some kind soul please describe to me what this does, my corporate eager beaver network admins seem to consider this some kind of problem site and its URL is BLOCKED by our gateway proxy."
AS numbers are part of BGP. When you are a large organization with multiple points of presence on the Internet, you need to advertise your prefixes (routes), i.e. the IP blocks that you host behind your routers. To do that you need to be an "Autonomous System", and to be one you need to register with a Regional Internet Registry such as ARIN or RIPE NCC, which hand out the numbers on IANA's behalf (it costs $500 I think, and you need to prove that you actually need one). You then get an AS number. The technical details are here: https://tools.ietf.org/html/rfc4271
AS numbers are how you announce the prefixes where content C resides, e.g. 1.0.0.0/24, to the Internet. Once your provider has announced your prefix to the Internet, other networks can find a path to content C and the services behind it.
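As a toy illustration of the prefix-to-AS idea, here is a sketch that maps an address to the AS announcing the most specific matching prefix. The table is invented: AS15169 and AS36040 are the numbers named in the report, but the prefixes paired with them are documentation ranges, and 64500 is a documentation-only AS number standing in for whoever announces 1.0.0.0/24. Real routers do this with BGP-learned routes, not a dict.

```python
import ipaddress

# Hypothetical announcements: the prefix-to-AS pairings are made up.
ANNOUNCEMENTS = {
    "1.0.0.0/24": 64500,      # prefix from the comment, placeholder AS
    "192.0.2.0/24": 15169,    # documentation prefix, AS number from the report
    "192.0.2.128/25": 36040,  # more specific prefix inside the one above
}


def origin_as(address, announcements=ANNOUNCEMENTS):
    """Return the AS number of the longest matching prefix, or None."""
    ip = ipaddress.ip_address(address)
    matches = [
        (network.prefixlen, asn)
        for network, asn in (
            (ipaddress.ip_network(prefix), asn)
            for prefix, asn in announcements.items()
        )
        if ip in network
    ]
    # The longest prefix (largest prefixlen) wins, as in normal IP routing.
    return max(matches)[1] if matches else None
```

With this table, `origin_as("192.0.2.200")` resolves to the more specific /25 rather than the covering /24, which is why the "hosted on 3 network(s)" line in the report can list several AS numbers for one site.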
For example, Gmail is on google.com, and so are Google Drive, Google Translate (I think this might be a big one), and various other services that host user content.
For a couple of years now, Google has redirected you via google.com/something when you click on a result URL (you only notice this on a slow connection like an EDGE/2G network).
It might very well be that the malware scanner picked up a link to such a "redirector" which leads to malware and then took the whole google.com domain for malicious.
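If that theory is right, a scanner would want to attribute a redirector link to the site it lands on, not to the redirector. A minimal sketch, assuming the redirector lives at the `/url` path and carries the destination in a `q` or `url` query parameter (my assumption; the comment only says "google.com/something"):

```python
from urllib.parse import parse_qs, urlsplit

# Assumptions: which hosts and path act as the redirector, and which
# query parameters carry the destination.
REDIRECTOR_HOSTS = {"google.com", "www.google.com"}
REDIRECTOR_PATH = "/url"
TARGET_PARAMS = ("q", "url")


def effective_host(link):
    """Return the host a link really leads to, unwrapping one redirector hop."""
    parts = urlsplit(link)
    if parts.hostname in REDIRECTOR_HOSTS and parts.path == REDIRECTOR_PATH:
        query = parse_qs(parts.query)
        for name in TARGET_PARAMS:
            if name in query:
                return urlsplit(query[name][0]).hostname
    return parts.hostname
```

A scanner that skips this step would file the hit under `www.google.com`; with it, the hit is filed under the destination host.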
Another reason why one should never ever host user-generated files (or links/redirects) on the primary domain. GitHub moved user pages to github.io for the same reason.
I wonder if malicious plugins are modifying the page and injecting things? I've encountered ones that physically change Google results pages: it looks like the page came from Google, but the results are from some scam network.
The diagnostic page doesn't appear to always be strictly accurate. For instance, it says "Google has not visited this site within the past 90 days." for many of my sites which it has crawled daily for years.
I assume there are different levels of data collected for different types of visits. For example, Google may just collect data for the PageRank algorithm (i.e. your pages have been visited by google.com) or they may also collect safebrowsing/diagnostic data (i.e. your pages have been visited by google.com and http://www.google.com/safebrowsing/diagnostic).
"Of the 10 pages we tested on the site over the past 90 days, 0 page(s) resulted in malicious software being downloaded and installed without user consent."
On the one hand, I feel smugly better about using another browser[1]. But how could I feel this without google (i.e. google.com/safebrowsing/diagnostic) to provide the ammunition? I'm so confused now.
It says they tested 6,815,255 Google pages and 1,686 of them resulted in malware. It only visited 152 DuckDuckGo pages. At Google's rate, it would have to visit more than an order of magnitude more than 152 pages to expect even one of them to contain malware.
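The back-of-the-envelope arithmetic, spelled out. This assumes (unrealistically) that every page is independently malicious at the same rate as the google.com sample; the variable names are mine.

```python
tested_pages = 6_815_255   # google.com pages tested, from the report
malicious_pages = 1_686    # pages that resulted in malware, from the report
ddg_pages = 152            # DuckDuckGo pages tested, per the comment above

# Fraction of tested google.com pages that were flagged.
rate = malicious_pages / tested_pages

# How many pages you would test, on average, per flagged page.
pages_per_hit = tested_pages / malicious_pages

# Expected number of flagged pages in a 152-page sample at the same rate.
expected_hits = ddg_pages * rate

# Chance of seeing at least one flagged page in that sample.
p_at_least_one = 1 - (1 - rate) ** ddg_pages

print(f"one flagged page per {pages_per_hit:.0f} tested")
print(f"expected hits in {ddg_pages} pages: {expected_hits:.3f}")
print(f"chance of at least one hit: {p_at_least_one:.1%}")
```

That works out to roughly one flagged page per 4,000 tested, so a 152-page sample is about 26 times too small to expect a single hit.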
As far as I know, DDG does just browser/search. By extension they shouldn't be able to do as much harm as a company that provides many more services (e.g., safebrowsing/diagnostic). [edited the following 2 sentences for readability.] For example, one possible threat area could be code.google.com. DDG doesn't have a counterpart for hosted code, so it can't possibly be a threat in that way.
Hence, my confusion. Yes, part of me was just trying to be humorously sarcastic. But part of me really enjoys some of the innovative ways that Google leverages data. And yet, another part of me thinks that they have stepped over important privacy and security lines in other areas. Hence, I prefer DDG for the vast majority of my search needs.
Is it the ads? Most malware is distributed through ads. I don't think there's a risk in AdWords text-based ads, but the display ads frequently include malicious software. That and download.com. I'm glad I use an ad blocker.
I'm OCD and a slight perfectionist, so when I use this on my site I spend hours making sure I have not a single issue. After seeing that! I would go insane.
Makes you wonder how come M$ doesn't make their site more compatible? As a multi-billion dollar company, they should have higher standards and meet the W3 standards.