i can't run 2.2 correctly in china #11

Open
GoogleCodeExporter opened this issue Aug 13, 2015 · 0 comments
What steps will reproduce the problem?
1. I ran this command in China:
> metagoofil.py -d swu.edu.cn -t doc -l 20 -n 20 -o test -f test.html
Output:

******************************************************
*     /\/\   ___| |_ __ _  __ _  ___   ___  / _(_) | *
*    /    \ / _ \ __/ _` |/ _` |/ _ \ / _ \| |_| | | *
*   / /\/\ \  __/ || (_| | (_| | (_) | (_) |  _| | | *
*   \/    \/\___|\__\__,_|\__, |\___/ \___/|_| |_|_| *
*                         |___/                      *
* Metagoofil Ver 2.2                                 *
* Christian Martorella                               *
* Edge-Security.com                                  *
* cmartorella_at_edge-security.com                   *
******************************************************
['doc']

[-] Starting online search...

[-] Searching for doc files, with a limit of 20
        Searching 100 results...
Results: 0 files found
Starting to download 20 of them:
----------------------------------------

processing
user
email

[+] List of users found:
--------------------------

[+] List of software found:
-----------------------------

[+] List of paths and servers found:
---------------------------------------

[+] List of e-mails found:
----------------------------


2. I tried modifying discovery/googlesearch.py, changing:
self.server="www.google.com"
self.hostname="www.google.com"
to:
self.server="www.google.com.hk"
self.hostname="www.google.com.hk"

Re-running step 1 gives this output:
....
['doc']

[-] Starting online search...

[-] Searching for doc files, with a limit of 20
_

This time the output stops here and the script never continues.

I debugged the code and found that execution blocks at the line below, but I
don't know why:
discovery/googlesearch.py:27  self.results = h.getfile().read()

It looks as if Google returns too many results.
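
One way to keep a stalled read like the one above from blocking forever is to give the connection a socket timeout. The sketch below is only an illustration in Python 3 using `http.client`; it is not a patch for metagoofil, which targets Python 2.6, and the function name `fetch_page` and its parameters are made up for this example.

```python
import http.client
import socket


def fetch_page(host, path, port=80, timeout=10.0):
    """Return the response body as bytes, or None if the server stalls.

    Hypothetical helper: the timeout applies to the connect and to each
    blocking socket operation, so a silent server cannot hang the caller.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.read()
    except socket.timeout:
        # The server accepted the connection but did not answer in time.
        return None
    finally:
        # Always release the socket, whether the read succeeded or not.
        conn.close()
```

A caller can treat `None` as "retry or skip this page" instead of waiting indefinitely.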


3. So I reduced the page size:
 discovery/googlesearch.py:16 
self.quantity="100"  ===> self.quantity="10" 
 discovery/googlesearch.py:46 
self.counter+=100   ===>  self.counter+=10

I also changed this spot:
discovery/googlesearch.py:27 
     self.results = h.getfile().read()
     h.close()  # adding this line seems to help

After re-running step 1, it sometimes works and sometimes hangs as before.
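
The change in step 3 edits the page size and the counter increment in two separate places, and the two values have to agree. As a rough sketch only (the name `page_offsets` is hypothetical and this is not how metagoofil structures its loop), both can be derived from a single page size:

```python
def page_offsets(limit, page_size=10):
    """Yield the start offset of each result page needed to cover `limit` results."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    counter = 0
    while counter < limit:
        yield counter
        # The increment always equals the page size, so the two cannot drift apart.
        counter += page_size


# With the reporter's limit of 20 and a page size of 10, two pages are requested.
print(list(page_offsets(20, 10)))
```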

What is the expected output? What do you see instead?

The script does not run reliably: it either finds 0 files or hangs, as described above.

What version of the product are you using? On what operating system?
metagoofil 2.2, Windows 7, Python 2.6

Please provide any additional information below.

When the script does run successfully, the list of result file paths contains
some bogus entries, e.g.:
[1/20] /webhp?hl=en-HK
         [x] Error downloading /webhp?hl=en-HK
[12/20] /support/websearch/bin/answer.py?answer=134479
        [x] Error downloading /support/websearch/bin/answer.py?answer=134479
....

My fix is in myparser.py:43:
#reg_urls = re.compile('<a href="(.*?)"')
reg_urls = re.compile('<a href="[^">]*?/url\?q=([^">]*?)&amp;sa=U.*?"')

With this change the results look correct. I don't know of another way, and I would rather not change it again.
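
To show what the replacement pattern does, here is a small Python 3 sketch that applies it to a made-up HTML fragment. The sample markup, the file URL, and the helper name `extract_result_urls` are assumptions for illustration; only the regular expression and the two bogus paths come from the report above.

```python
import re

# The reporter's pattern, written as a raw string: it captures only the
# target of links of the form /url?q=<target>&amp;sa=U...
REDIRECT_LINK = re.compile(r'<a href="[^">]*?/url\?q=([^">]*?)&amp;sa=U.*?"')


def extract_result_urls(html):
    """Return the targets of redirect-style result links found in `html`."""
    return REDIRECT_LINK.findall(html)


# Hypothetical sample: one result link between two site-navigation links.
SAMPLE_HTML = (
    '<a href="/webhp?hl=en-HK">Home</a>'
    '<a href="/url?q=http://swu.edu.cn/files/report.doc&amp;sa=U&amp;ei=abc">Report</a>'
    '<a href="/support/websearch/bin/answer.py?answer=134479">Help</a>'
)

print(extract_result_urls(SAMPLE_HTML))
```

On this sample, the navigation links that caused the download errors are skipped and only the wrapped document URL is returned.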


Original issue reported on code.google.com by [email protected] on 23 Sep 2013 at 2:45
