Browsing projects by Tag(s)

Check websites and HTML documents for broken links.
* recursive and multithreaded checking
* output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats
* HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links support
* restriction of link checking with regular expression filters for URLs
* proxy support
* username/password authorization for HTTP and FTP and Telnet

3.0
   
  0 reviews  |  2 users  |  101 lines of code  |  1 current contributor  |  Analyzed 2 days ago
 
 

OpenWebSpider - The Open Source Web Spider And Search Engine. The OpenWebSpider project was born from the idea that the Internet is free and all information must be freely available to all users. Built entirely with free software and itself Open Source, OpenWebSpider aims to be the base for a new search engine developed by a community of open source developers.

4.0
   
  0 reviews  |  2 users  |  0 current contributors
 
 

PUMz is a web clipping service project and an upgrade of pumware (pumware.sf.net).

0
 
  0 reviews  |  1 user  |  43,670 lines of code  |  0 current contributors  |  Analyzed 1 day ago
 
 

WebChuan is a set of open source libraries and tools for fetching and parsing the web pages of a website. It is written in Python, based on Twisted and lxml, and inspired by GStreamer. WebChuan is designed to be the back end of a web bot: easy to use, powerful, flexible, reusable and efficient.

0
 
  0 reviews  |  1 user  |  628 lines of code  |  0 current contributors  |  Analyzed 9 months ago
 
 

Spidr is a versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

0
 
  0 reviews  |  1 user  |  2,065 lines of code  |  2 current contributors  |  Analyzed over 1 year ago
 
 

Aranya is a spider with a distributed architecture. The project aims to build a safe, efficient, and configurable Internet information collection system; through the profile, it can provide useful data (pages, photos, etc.) for many kinds of search engines.

0
 
  0 reviews  |  0 users  |  0 current contributors  |  Analyzed 9 days ago
 
 

Because of the constraints of some database providers, it is sometimes hard to download a book. With this framework you can build a "private local library" that is also very convenient to search. http://code.google.com/p/harvestman-crawler/

0
 
  0 reviews  |  0 users  |  0 current contributors
 
 

Web robot (spider) written in PHP with cURL. The goal is to index some product websites to find the cheapest.

0
 
  0 reviews  |  0 users  |  91 lines of code  |  0 current contributors  |  Analyzed 1 day ago
 
 

This is a spider used to crawl web pages from the Internet.
urls.py, used on the server side:
* collects URLs sent from the client
* avoids overloading the web pages
* sends URLs to the client
spider.py, used on the client side:
* gets the URLs sent by the server
* crawls web pages
* analyzes the web pages and extracts the URLs
* sends the URLs to the server; the server judges whether the URLs have been downloaded before

0
 
  0 reviews  |  0 users  |  182 lines of code  |  0 current contributors  |  Analyzed 5 days ago
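
The server-side role described for this spider (collect URLs from clients, reject ones already downloaded, hand URLs back out) could be sketched roughly as follows. This is only an illustration, not the project's actual code: the class `UrlFrontier` and its methods `submit` and `next_batch` are hypothetical names, and networking between server and client is left out entirely.

```python
from collections import deque


class UrlFrontier:
    """Hypothetical server-side bookkeeping for a client/server spider."""

    def __init__(self, seeds):
        self.seen = set()       # every URL ever accepted
        self.pending = deque()  # URLs waiting to be handed to a client
        self.submit(seeds)

    def submit(self, urls):
        """Accept URLs reported by a client; return only the new ones."""
        accepted = []
        for url in urls:
            if url not in self.seen:
                self.seen.add(url)
                self.pending.append(url)
                accepted.append(url)
        return accepted

    def next_batch(self, size=10):
        """Pop up to `size` pending URLs for a client to crawl."""
        batch = []
        while self.pending and len(batch) < size:
            batch.append(self.pending.popleft())
        return batch


frontier = UrlFrontier(["http://example.com/"])
to_crawl = frontier.next_batch()
# A client would fetch `to_crawl`, extract links, and report them back:
new_urls = frontier.submit(["http://example.com/", "http://example.com/a"])
```

Keeping the `seen` set on the server means clients never need to coordinate with each other about duplicates; a real deployment would likely also limit requests per host, which this sketch omits.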
 
 

Main project: mapping, discovering relations, and mining conclusive data from social networks. There are various other projects meant as utilities or as code that can be reused in other projects.

0
 
  0 reviews  |  0 users  |  30,586 lines of code  |  1 current contributor  |  Analyzed about 1 year ago
 
 
 
 

Creative Commons License Copyright © 2013 Black Duck Software, Inc. and its contributors, Some Rights Reserved. Unless otherwise marked, this work is licensed under a Creative Commons Attribution 3.0 Unported License. Ohloh ® and the Ohloh logo are trademarks of Black Duck Software, Inc. in the United States and/or other jurisdictions. All other trademarks are the property of their respective holders.