Browsing projects by Tag(s)



An open source full-text search engine and crawler built on established open source technologies: Lucene, zkoss, Tomcat, POI, and PDFBox. Features include multilingual lemmatization, spellcheck, stop words, synonyms, facets, filters, a web crawler, a database crawler, local and remote file system crawlers, document indexing with OCR, and REST (XML or JSON) and SOAP APIs. The project describes itself as stable, high-performance software: a modern search engine combined with a suite of full-text search algorithms.

5.0
 
  0 reviews  |  3 users  |  69,029 lines of code  |  2 current contributors  |  Analyzed 7 days ago
 
 

If you are using Djapian, please tell us about your project in reply to this post.

Use this package to add full-text search to your Django project.

Versions compatibility matrix:

    Djapian    Django    Xapian and Python bindings
    = 2.3      1.1       1.0.7

Notice: there is an old issue with Xapian (< 1.0.13) in a mod_python environment, so be careful.

Notice: the 2.2.2 release introduced a backward-incompatible database schema bug fix: the Change model switched its object_id field type from integer to string.

Features

Most of these features are provided by Xapian itself; Djapian only acts as a Django-compatible adaptation layer.

- High-level DSL for indexer declaration
- Result filtering with a Django ORM-like API
- Result set compatible with the standard Django Paginator
- Indexing of fields, method results, and related model attributes
- Entry filtering before indexing (by trigger function)
- Result filtering with boolean lookup support
- Term tagging
- Spelling corrections
- Stemming
- Result ordering by fields
- Indexer auto-discovery
- Index shell
- Automatic tracking of model changes
- Support for different index spaces

Usage example

Assume that we have these models in our imaginary application:

    from datetime import datetime

    from django.db import models

    class Person(models.Model):
        name = models.CharField(max_length=150)

        def __unicode__(self):
            return self.name

    class Entry(models.Model):
        author = models.ForeignKey(Person, related_name="entries")
        title = models.CharField(max_length=250)
        created_on = models.DateTimeField(default=datetime.now)
        is_active = models.BooleanField(default=True)
        text = models.TextField()
        editors = models.ManyToManyField(Person, related_name="edited_entries")

        def headline(self):
            return "%s - %s" % (self.author, self.title)

        def __unicode__(self):
            return self.title

We want to add indexing functionality to the Entry model. The next step is to create an Indexer with the proper settings. The indexer may look like this:

    import djapian

    class EntryIndexer(djapian.Indexer):
        # Model fields indexed as free text
        fields = ["text"]
        # (tag name, attribute path[, weight]) entries searchable as "tag:value"
        tags = [
            ("author", "author.name"),
            ("title", "title", 3),
            ("date", "created_on"),
            ("active", "is_active"),
            ("editors", "editors"),
        ]
        # Only index entries for which the trigger returns True
        trigger = lambda indexer, obj: obj.is_active

    djapian.space.add_index(Entry, EntryIndexer, attach_as="indexer")

In the Django shell, create some model instances:

    >>> p = Person.objects.create(name="Alex")
    >>> Entry.objects.create(author=p, title="Test entry", text="Not large text field")
    >>> Entry.objects.create(author=p, title="Another test entry", is_active=False)
    >>> Entry.objects.create(author=p, title="Third small entry", text="Some another text")
    >>> Entry.indexer.update()

That's all! Each Entry instance that passes the trigger (is_active) has been indexed and is now ready for search. Let's try:

    >>> result = Entry.indexer.search('title:entry')
    >>> len(result), result.count()
    2, 2
    >>> for row in result:
    ...     row.percent, row.instance.headline()
    ...
    99 Alex - Test entry
    98 Alex - Third small entry

You can follow the complete Tutorial to learn the Djapian basics.

5.0
 
  0 reviews  |  2 users  |  1,963 lines of code  |  1 current contributor  |  Analyzed 4 days ago
 
 

Gisgraphy is a free and open source framework. Its goal is to provide tools for using free GIS data on the Web. It currently manages Geonames and OpenStreetMap data (34 million entries). It provides an importer that loads the data into a strongly typed Postgres / PostGIS database and exposes it via web services: worldwide geocoding, worldwide reverse geocoding, full-text search, and find nearby. Results can be output in XML, Atom, RSS, JSON, PHP, Ruby, and Python.

Here are the main functionalities:

- Importers for OpenStreetMap data in CSV format
- Importers for Geonames CSV files, with reporting of inconsistencies and a report when the import is complete. Just give the countries and/or the place types you wish to import, and Gisgraphy downloads the files, imports them with all the alternateNames (optional), and syncs the database with the full-text search engine
- REST web services
- Full-text search (based on Lucene / Solr, with default filters optimized for city search: case insensitivity, separator character stripping, ...) via a Java API or a web service
- An admin interface
- Fully replicated / scalable / high-performance / cached services
- Find-nearby function (with limits, pagination, restriction to a specific country and/or language, and other useful options) via a Java API or a web service
- Search by zip code or name
- Several output formats supported: XML, JSON, PHP, Ruby, Python, Atom, GeoRSS...
- Dojo widgets / Prototype / Ajax to ease search, though it can be used even if JavaScript is not enabled on the client side
- Platform and language independent thanks to web services
- Provides all the country flags in SVG and PNG format
- more...

Note that Gisgraphy is under the LGPL license v3 and the Geonames data is under the Creative Commons Attribution license.
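The find-nearby function listed above can be illustrated with a minimal Python sketch. This is not Gisgraphy's implementation (which, per the description, runs against a Postgres / PostGIS database); it assumes a small in-memory list of places and the haversine great-circle formula. The names `haversine_km` and `find_nearby` and the sample data are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def find_nearby(places, lat, lon, radius_km, limit=10, country=None):
    """Return (distance_km, name) pairs within radius_km, nearest first.

    `places` is a list of (name, country_code, lat, lon) tuples.
    `country` optionally restricts results to one country code.
    """
    hits = []
    for name, code, plat, plon in places:
        if country is not None and code != country:
            continue
        distance = haversine_km(lat, lon, plat, plon)
        if distance <= radius_km:
            hits.append((distance, name))
    hits.sort()          # nearest first
    return hits[:limit]  # simple limit, standing in for pagination


# Hypothetical sample data (approximate coordinates).
PLACES = [
    ("Paris", "FR", 48.8566, 2.3522),
    ("Versailles", "FR", 48.8049, 2.1204),
    ("Brussels", "BE", 50.8503, 4.3517),
]

nearby = find_nearby(PLACES, 48.8566, 2.3522, radius_km=50, country="FR")
```

A linear scan like this only suits small data sets; for millions of entries, a spatial index in the database does the equivalent filtering.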

5.0
 
  0 reviews  |  1 user  |  339,949 lines of code  |  0 current contributors  |  Analyzed 6 days ago
 
 

This full-text search system is based on open libraries (POI, PDFBox, etc.) and is the product of its author's hobby. The system's base name is ftspc. The system is free of charge and is provided AS IS, without any warranty. The author accepts no responsibility for your use of the system: you use it at your own risk. The author also does not guarantee timely technical support.

5.0
 
  0 reviews  |  1 user  |  10,584 lines of code  |  0 current contributors  |  Analyzed 5 days ago
 
 

php_solr is a lightweight PHP client library for Apache Solr, the Lucene-based enterprise search server.

0
 
  0 reviews  |  1 user  |  1,775 lines of code  |  0 current contributors  |  Analyzed 9 days ago
 
 

Marjory is a web service for indexing and searching documents using a full-text search engine. It is somewhat similar to Solr, but is written in PHP, and the underlying architecture allows for search engines other than Lucene (though no other adaptor is implemented yet). Marjory is based on the Zend Framework and uses Zend_Search_Lucene as the default search engine. Initial development was sponsored by Jimdo, a company offering web-based tools to create free websites.

5.0
 
  0 reviews  |  1 user  |  1,185 lines of code  |  0 current contributors  |  Analyzed 6 days ago
 
 

A Python-based desktop search engine that runs on Windows. The current SVN version is updated to 0.19.

0
 
  0 reviews  |  0 users  |  2,163 lines of code  |  0 current contributors  |  Analyzed 1 day ago
 
 

This project aims to provide a series of utilities for indexing Subversion repositories for the purpose of providing full-text searching.

0
 
  0 reviews  |  0 users  |  179 lines of code  |  0 current contributors  |  Analyzed 7 days ago
 
 

I am curious about Google's new code hosting service and am using this project to explore it. It includes an engine for crawling a Subversion repository and full-text indexing the content with Lucene, along with a sample web application for searching the index. It could be a great deal better: I have spent most of my energy on the crawling/indexing pipeline, as my target repository is very, very large. Possibilities for improvement: support for different source-code repositories; syntax-highlighted display of source; pluggable parsers to index source as an AST; etc.
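The indexing half of such a crawl/index pipeline can be sketched in Python as an inverted index, the general data structure behind full-text engines such as Lucene. This is an illustration only, not this project's code and not Lucene's API: it assumes the crawler has already produced a mapping of repository paths to file contents, and `tokenize`, `build_index`, `search`, and the sample files are hypothetical.

```python
import re
from collections import defaultdict

# Identifier-like tokens; a deliberately simple stand-in for a real analyzer.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(text):
    """Split text into lowercase identifier-like terms."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def build_index(documents):
    """Map each term to the set of paths whose content contains it."""
    index = defaultdict(set)
    for path, text in documents.items():
        for term in tokenize(text):
            index[term].add(path)
    return index


def search(index, query):
    """Return the sorted paths that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


# Hypothetical crawl output: repository path -> file content.
DOCS = {
    "trunk/src/Crawler.java": "public class Crawler { void crawl(Repository repo) {} }",
    "trunk/src/Indexer.java": "public class Indexer { void index(Document doc) {} }",
    "trunk/README": "Crawler and indexer for a Subversion repository",
}

index = build_index(DOCS)
matches = search(index, "indexer document")
```

Here `matches` contains only the path of `Indexer.java`, because the README mentions "indexer" but not "document".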

0
 
  0 reviews  |  0 users  |  3,163 lines of code  |  0 current contributors  |  Analyzed almost 2 years ago
 
 

uffts (ultra-fast fulltext search) first indexes all text files (and eventually doc, OpenDocument, etc.) and then allows you to perform near-instant searches within all of them. A special tree designed around ASCII text, which compresses large amounts of data, should make it possible to instantly search gigabytes of source code, all indexed in memory.
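The description does not say which tree uffts uses. As a rough illustration of how a character tree can both index text and save space, here is a minimal Python sketch of a prefix tree (trie) that maps words to the files containing them; words with a common prefix share nodes. The class names and sample data are hypothetical and not taken from uffts.

```python
class TrieNode:
    """One character position in the tree."""
    __slots__ = ("children", "files")

    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.files = set()  # files containing the word that ends here


class Trie:
    def __init__(self):
        self.root = TrieNode()
        self.node_count = 1  # counts the root

    def insert(self, word, filename):
        """Record that `word` occurs in `filename`."""
        node = self.root
        for ch in word:
            nxt = node.children.get(ch)
            if nxt is None:
                nxt = TrieNode()
                node.children[ch] = nxt
                self.node_count += 1
            node = nxt
        node.files.add(filename)

    def lookup(self, word):
        """Return the set of files containing exactly `word`."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return set()
        return set(node.files)


trie = Trie()
for word, filename in [("index", "a.c"), ("indexer", "b.c"),
                       ("indent", "a.c"), ("index", "b.c")]:
    trie.insert(word, filename)
```

The three distinct words total 18 characters but need only 9 nodes besides the root, because "index", "indexer", and "indent" share the prefix "inde". A lookup costs one step per character of the query, regardless of how many files are indexed.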

0
 
  0 reviews  |  0 users  |  837 lines of code  |  0 current contributors  |  Analyzed 4 days ago
 
 
 
 

Copyright © 2013 Black Duck Software, Inc. and its contributors, Some Rights Reserved. Unless otherwise marked, this work is licensed under a Creative Commons Attribution 3.0 Unported License. Ohloh ® and the Ohloh logo are trademarks of Black Duck Software, Inc. in the United States and/or other jurisdictions. All other trademarks are the property of their respective holders.