Projects tagged ‘searchengine’


[22 total]

13USERS
   

The ht://Dig system is a complete WWW indexing and searching system for a domain or intranet. This system is not meant to replace the need for internet-wide search systems like Lycos, Infoseek, Google, and AltaVista. Instead, it is meant to cover the search needs for a single company, campus, or even a particular sub-section of a Web site.

13USERS
   

GeoNetwork opensource is a standardized and decentralized spatial information management environment, designed to enable access to geo-referenced databases, cartographic products and related metadata from a variety of sources, enhancing the exchange and sharing of spatial information between organizations and their audience, using the capacities of the internet. This approach to geographic information management aims to give a wide community of spatial information users easy and timely access to available spatial data and to existing thematic maps that might support informed decision making.

10USERS
   

Xapian is an Open Source Search Engine Library, released under the GPL. It's written in C++, with bindings to allow use from Perl, Python, PHP, Java, Tcl, C#, and Ruby (so far!). Xapian is a highly adaptable toolkit which allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also supports a rich set of boolean query operators. Note that Ohloh's automated summary below is confused by some ".st" data files which it assumes are uncommented Smalltalk. Xapian isn't written in Smalltalk at all (it's mostly C++), and is actually quite well commented (27.6% comments compared to Ohloh's 22% average for C++ projects).

2USERS
   

OpenWebSpider - The Open Source Web Spider and Search Engine. The OpenWebSpider project was born from the idea that the internet is free and all information must be freely available to all users! Built entirely with free software and itself open source, OpenWebSpider aims to be the base for a new search engine developed by a community of open source developers!

2USERS

TouchGraph is a set of interfaces for graph visualization using spring-layout and focus+context techniques. Current applications include a utility for organizing links, a visual Wiki Browser, and a Google Graph Browser which uses the Google API.

2USERS
   

Compass is an open source project built on top of Lucene aiming at simplifying the integration of search into any Java application.

2USERS

Metacat is a flexible metadata database that uses XML as a common syntax for representing the large number of metadata content standards relevant to ecology. It is thus a generic XML database that allows storage, query, and retrieval of arbitrary XML documents without prior knowledge of the XML schema.

2USERS

Kiwix is free software developed to release an offline version of Wikipedia. It is also a software platform to help the community build selections.

2USERS

The "xappy" Python module is an easy-to-use interface to the Xapian search engine. Xapian provides a low-level interface, dealing with terms and documents, but not really worrying about where terms come from, or how to build searches to match the way in which data has been indexed. In contrast, "xappy" allows you to design a field structure, specifying what kind of information is held in particular fields, and then uses this field structure to index data appropriately, and to build and perform searches.

1USERS

eZ Find is an enterprise-ready search plugin for eZ Publish, making it possible to search multiple eZ Publish installations simultaneously.

1USERS

mnoGoSearch is a full-featured SQL based web search engine.

1USERS
 

Flax is a project to develop an open source enterprise search engine application based on the Xapian search engine library. It also contains a clean-and-simple Python interface suitable for many users of Xapian, built on the standard Xapian Python interface, together with various other add-ons such as performance testing utilities.

1USERS
   

Kifkif is a tool for finding and removing duplicate files and folders in your path. It is intended to provide both an extensible API/library and an end-user GUI (Swing) application for non-programmers. File similarity (which defines duplicates) can be based on a large and extensible set of criteria.

1USERS
   

Sphinx is a standalone full-text search engine, meant to provide fast, size-efficient and relevant full-text search functions to other applications. Sphinx was specially designed to integrate well with SQL databases and scripting languages.

1USERS

Pyndexter provides a uniform API for accessing a variety of full-text search and indexing engines. It aims to be to full-text indexing systems what the Python DB API is to databases. It presents a uniform query syntax to the user, with support for quoted search terms, boolean operations, sub-expressions and attribute (metadata) querying. Supported indexers are a basic but functional pure-Python indexer and adapters for Hype, Hyperestraier, Lucene, Lupy, Pyndex, Swish-e and Xapian.

0USERS

OpenFTS (Open Source Full Text Search engine) is an advanced PostgreSQL-based search engine that provides online indexing of data and relevance ranking for database searching. Close integration with the database allows metadata to be used to restrict search results. OpenFTS is based on PostgreSQL's GiST support and is distributed both as a collection of Perl scripts and as a collection of Tcl scripts.

0USERS

CS2 stands for C# Code Search. It's an academic project developed for the course of Information Retrieval at Università di Modena e Reggio Emilia, Italy.

0USERS

Sphider is a lightweight web spider and search engine written in PHP, using MySQL as its back-end database. It is a great tool for adding search functionality to your web site or building your custom search engine. Sphider is small, easy to set up and modify, and is used in thousands of websites across the world. Sphider supports all standard search options, but also includes a plethora of advanced features such as word autocompletion and spelling suggestions. The sophisticated administration interface makes administering the system easy.

0USERS

Key features:
- Supports the http, https, ftp, nntp and news URL schemes.
- htdb virtual URL scheme for indexing SQL databases.
- Natively indexes the text/html, text/xml, text/plain, audio/mpeg (MP3) and image/gif MIME types.
- External parser support for other document types, including Microsoft Word, Excel, RTF, PowerPoint, Adobe Acrobat PDF and Flash.
- Can index multilingual sites using content negotiation.
- Searches all word forms using ispell affixes and dictionaries.
- Synonym, acronym and abbreviation query expansion based on editable dictionaries, specified by language and charset.
- Stop-word, synonym and acronym lists.
- Options to query with all words, all words near each other, any words, or Boolean queries.
- Supports a subset of VQL (Verity Query Language).
- Popularity rank based on a neural network model.
- Results can be sorted by relevancy (using vector calculation), by popularity rank (either "Goo", which adds weight for incoming links, or "Neo", a neural network model), by last modified time, or by "importance" (a combination of relevancy and popularity rank).
- Supports a wide range of character sets, with automated character set and language detection.
- Offers an accent-insensitive search option.
- Provides phrase segmenting (tokenizing) for Chinese, Japanese, Korean and Thai.
- Includes an indexer and a web CGI front-end, as well as a search module for the Apache web server (mod_dpsearch).
- Handles Internationalized Domain Names (IDN).
- A summary extraction algorithm automatically sums up each document in several sentences.
- Uses If-Modified-Since to transfer only changed files.
- Can tweak URLs with session IDs and other unusual formats, including some JavaScript link decoding.
- Can perform parallel and multi-threaded indexing for faster updating.
- Flexible update scheduling, including options for checking some sections of a site more frequently.
- Handles basic authentication (user name and password) and cookies.
- Stores a compressed text version of the documents for extracting and viewing.
- Can specify a default character set and language for a server or subdirectory, or a list of possible languages.
- Supports noindex tags, including Google's special comments, to include or exclude content from indexing.
- Can specify a content body tag.
- Spellchecking for query words with aspell.
- Flexible options and commands to customize search result pages.
- Effective caching significantly reduces search times.
- Query logging stores the query, the query parameters and the number of results found.

0USERS

Libibase develops a library for a search engine index base. hindexd.ini must be configured with the hidoc location, that is, the path to hispider's hispider.doc. The dictionary dict.tar.gz can be downloaded from http://code.google.com/p/libibase/downloads/list, and you can also add your own dictionary; the dictionary format is one word per line. To run the spider, from the src directory: ./hispiderd -c ../doc/rc.hispiderd.ini && ./hispider -c ../doc/rc.hispider.ini. The crawl status can be viewed at http://127.0.0.1:3721/. To run hibased and hindexd, from the src directory: ./hibased -c ../doc/rc.hibased.ini && ./hindexd -c ../doc/rc.hindexd.ini. Queries can then be made at http://127.0.0.1:8081/. If you have problems, contact me by mail/MSN: sounos@gmail.com

0USERS

PAIS: Peter-Anna Internet Searcher. The search site is currently up and running at http://pais.kr. For now it just shows web pages that contain the given keyword; there is no scheme yet to show important pages first. Only 10189 web pages are indexed, gathered from http://www.google.com by pais.www.DocRobot. Tip: try typing the following hidden command: show-me-the-stat. Description: a pure Java, efficient WWW search engine. Basic idea: do not store the whole web, but store only keywords and URLs, to reduce the storage needed for WWW search. Users should use the PAIS client application to search the web. The PAIS client processes data as much as possible to maximize the throughput of the PAIS server. Peter says: To my lovely wife, Anna ~!
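
The "keywords and URLs only" idea in the PAIS description can be illustrated with a small keyword-to-URL index. This is a minimal Python sketch under stated assumptions, not PAIS's actual code (PAIS itself is described as pure Java): the names KeywordIndex, add_page, search and tokenize, and the example.com URLs, are hypothetical and chosen only for illustration. As in the description, lookups are unranked.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


class KeywordIndex:
    """Hypothetical index mapping each keyword to the URLs of pages containing it."""

    def __init__(self):
        self._urls_by_keyword = defaultdict(set)

    def add_page(self, url, text):
        # Keep only the keywords and the URL; the page text itself is not stored.
        for keyword in tokenize(text):
            self._urls_by_keyword[keyword].add(url)

    def search(self, keyword):
        # Unranked lookup: every URL whose page contained the keyword, sorted by URL.
        return sorted(self._urls_by_keyword.get(keyword.lower(), set()))


index = KeywordIndex()
index.add_page("http://example.com/a", "Open source search engines")
index.add_page("http://example.com/b", "A search library in C++")
print(index.search("search"))
```

Because only keyword sets and URLs are retained, storage grows with the vocabulary of each page rather than with its full content, at the cost of not being able to show snippets or cached copies.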

0USERS

Namazu is a full-text search engine intended for easy use. Not only does it work as a small or medium scale Web search engine, but also as a personal search system for email or other files.