A Java library for parsing, validating and manipulating XML documents. The latest version released, 2.9.1, provides support for the following standards and APIs:
* XML 1.0 (4th Edition)
* Namespaces in XML 1.0 (2nd Edition)
* XML 1.1 (2nd Edition)
* Namespaces in XML 1.1 (2nd Edition)
* W3C XML Schema 1.0 (2nd Edition)
* XInclude 1.0 (2nd Edition)
* OASIS XML Catalogs 1.0
* SAX 2.0.2
* DOM Level 3 Core, Load and Save
* DOM Level 2 Core, Events, Traversal and Range
* JAXP 1.3
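Because the library is reached through the standard JAXP and DOM APIs listed above, a minimal parsing sketch needs only those standard interfaces. The class and method names below (JaxpDomSketch, parse, countElements) are illustrative and not part of any library; which parser implementation actually serves the call depends on what is on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Illustrative helper; the names are assumptions, only the JAXP/DOM calls are standard.
final class JaxpDomSketch {

    /** Parses an XML string into a DOM tree with namespace processing enabled. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrap checked exceptions so callers get a single unchecked failure type.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Counts elements with the given local name, regardless of namespace. */
    static int countElements(Document doc, String localName) {
        return doc.getElementsByTagNameNS("*", localName).getLength();
    }
}
```

Calling JaxpDomSketch.parse on a small document and then countElements gives the number of matching elements without any implementation-specific code.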
Rome is a set of Atom/RSS Java utilities that make it easy to work in Java with most syndication formats. Today it accepts all flavors of RSS (0.90, 0.91, 0.92, 0.93, 0.94, 1.0 and 2.0) and Atom 0.3 feeds. Rome includes a set of parsers and generators for the various flavors of feeds, as well as converters to convert from one format to another. The parsers can give you back Java objects that are either specific to the format you want to work with, or a generic normalized SyndFeed object that lets you work with the data without bothering about the underlying format.
ANother Tool for Language Recognition (ANTLR) is a parser generator that uses LL(k) parsing. ANTLR is the successor to the Purdue Compiler Construction Tool Set (PCCTS), first developed in 1989, and is under active development. Its maintainer is Professor Terence Parr of the University of San Francisco.
Java Compiler Compiler (JavaCC) is the most popular parser generator for use with Java applications. A parser generator is a tool that reads a grammar specification and converts it to a Java program that can recognize matches to the grammar. In addition to the parser generator itself, JavaCC provides other standard capabilities related to parser generation such as tree building (via a tool called JJTree included with JavaCC), actions, debugging, etc.
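To make "a Java program that can recognize matches to the grammar" concrete, here is a hand-written sketch of the kind of top-down (LL) parser such a tool automates. It is not JavaCC or ANTLR output, and the grammar and the class name ExprParser are assumptions chosen for illustration: expr := term (('+' | '-') term)*, term := factor (('*' | '/') factor)*, factor := NUMBER | '(' expr ')'.

```java
// Hand-written recursive-descent parser with one method per grammar rule.
// A parser generator would emit comparable code from a grammar file.
final class ExprParser {
    private final String input;
    private int pos = 0;

    private ExprParser(String input) {
        this.input = input;
    }

    /** Parses and evaluates an integer arithmetic expression. */
    static int evaluate(String text) {
        ExprParser parser = new ExprParser(text);
        int value = parser.expr();
        if (parser.peek() != '\0') {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return value;
    }

    // expr := term (('+' | '-') term)*
    private int expr() {
        int value = term();
        while (true) {
            char c = peek();
            if (c == '+') { pos++; value += term(); }
            else if (c == '-') { pos++; value -= term(); }
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    private int term() {
        int value = factor();
        while (true) {
            char c = peek();
            if (c == '*') { pos++; value *= factor(); }
            else if (c == '/') { pos++; value /= factor(); }
            else return value;
        }
    }

    // factor := NUMBER | '(' expr ')'
    private int factor() {
        if (peek() == '(') {
            pos++;
            int value = expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            pos++;
            return value;
        }
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at position " + pos);
        }
        return Integer.parseInt(input.substring(start, pos));
    }

    /** Skips spaces and returns the next character, or '\0' at end of input. */
    private char peek() {
        while (pos < input.length() && input.charAt(pos) == ' ') pos++;
        return pos < input.length() ? input.charAt(pos) : '\0';
    }
}
```

Each rule looks at a single upcoming character to decide what to do, which is the one-token-lookahead case of LL(k) parsing.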
Apache PDFBox is an open source Java PDF library for working with PDF documents. This project allows creation of new PDF documents, manipulation of existing documents and the ability to extract content from documents. Apache PDFBox also includes several command line utilities.
* PDF to text extraction
* Merge PDF Documents
* PDF Document Encryption/Decryption
* Lucene Search Engine Integration
* Fill in form data FDF and XFDF
* Create a PDF from a text file
* Create image
Apache PDFBox is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator PMC. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consist...
A runtime for VRML and X3D virtual worlds and models. OpenVRML includes a core runtime library, parsers for VRML97 and VRML-format X3D, an OpenGL renderer, and a Mozilla Web browser plug-in.
XMLBeans is a technology for accessing XML by binding it to Java types.
XMLBeans provides several ways to get at the XML, including:
* Through XML schema that has been compiled to generate Java types that represent schema types. In this way, you can access instances of the schema through JavaBeans-style accessors after the fashion of "getFoo" and "setFoo". The XMLBeans API also allows you to reflect into the XML schema itself through an XML Schema Object model.
* A cursor model through which you can traverse the full XML infoset.
* Support for XML DOM.
Woodstox is a high-performance validating namespace-aware StAX-compliant (JSR-173) Open Source XML-processor written in Java.
"XML processor" means that it handles both input (parsing) and output (writing, serialization), as well as supporting tasks such as validation.
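Since the processor is StAX-compliant, both directions can be sketched against the standard javax.xml.stream API alone. The class and method names below (StaxSketch, elementNames, writeNote) are illustrative assumptions; the factories return whichever StAX implementation is available at runtime.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

// Illustrative helper; only the javax.xml.stream calls are standard API.
final class StaxSketch {

    /** Input side: pulls events and collects the local name of every start tag. */
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not read XML", e);
        }
        return names;
    }

    /** Output side: serializes a single <note> element; the writer escapes the text. */
    static String writeNote(String text) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("note");
            writer.writeCharacters(text);
            writer.writeEndElement();
            writer.writeEndDocument();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }
}
```

The reader never builds a tree: the caller pulls one event at a time, which is what keeps streaming parsers light on memory.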
Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
Apache Tika is a subproject of Apache Lucene.
args4j is a small Java class library that makes it easy to parse command line options/arguments in your CUI application.
Advanced server-side Javascript.
EvaScript is a Javascript implementation based on the ECMA Script standard.
Remote-control Javascript object via Java.
Unit test your Javascript code.
Browser independent scripting.
Works with almost every browser, including the iPhone.
FlatPack came out of the frustration of having to mix file parsing logic with business logic.
FlatPack on SourceForge: a Java (1.4+) flat file parser that handles CSV, fixed length and custom delimiters. The formats are configured in XML, it is fast and released under Apache license 2.0.
Substrings in a fixed width parse can be daunting to deal with when trying to analyze what existing code is doing, and what about when you have no comments...
We also provide delimited file parsing; works with any delimiter / qualifier, multiline records, delimiter or qualifier allowed in column value.
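The two parsing styles described above can be sketched in plain Java. This is not FlatPack's API: the class DelimitedLineSketch and its methods are hypothetical, the sketch handles a single line only (no multiline records), and it assumes a qualifier inside a qualified value is escaped by doubling it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing the technique, not any library's implementation.
final class DelimitedLineSketch {

    /** Splits one line on the delimiter, honoring qualified (quoted) column values. */
    static List<String> split(String line, char delimiter, char qualifier) {
        List<String> columns = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQualified = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQualified) {
                if (c != qualifier) {
                    current.append(c); // delimiters are plain text inside a qualified value
                } else if (i + 1 < line.length() && line.charAt(i + 1) == qualifier) {
                    current.append(qualifier); // doubled qualifier means a literal one
                    i++;
                } else {
                    inQualified = false; // closing qualifier
                }
            } else if (c == qualifier) {
                inQualified = true;
            } else if (c == delimiter) {
                columns.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        columns.add(current.toString());
        return columns;
    }

    /** Extracts one fixed-width column and trims its padding. */
    static String fixedWidth(String record, int start, int length) {
        return record.substring(start, start + length).trim();
    }
}
```

Keeping the column offsets in a configuration file rather than in scattered substring calls is the design point the description makes; the fixedWidth helper only shows the underlying operation.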
Jackson is a Streaming, Light-weight, Fast, Fully-conformant, Open Source JSON processor (parser + generator).
SableCC is an object-oriented framework that generates compilers (and interpreters) in the Java programming language.
An open source Java library that includes various lightweight XML processing tools.
Major features are:
Generating POJO by DTD;
XML-POJO mapping via Java5 annotations or DTD;
XML manipulations using POJO without SAX/DOM;
Preprocessing of XML documents using expression language;
Binary XML;
RMI friendly XML;
Exporting to JSON;
XML marshall/unmarshall;
ZFileReader on SourceForge: a Java (1.4+) flat file parser that handles CSV, fixed length and custom delimiters. The formats are configured in XML, it is fast and released under Apache license 2.0.
Substrings in a fixed width parse can be daunting to deal with when trying to analyze what existing code is doing, and what about when you have no comments...
We also provide delimited file parsing; works with any delimiter / qualifier, multiline records, delimiter or qualifier allowed in column value.
This project provides Java libraries and tools to work with FIXatdl files. FIXatdl stands for FIX Algorithmic Trading Definition Language. It is an effort by the FIX Algorithm Working Group to define a language for algorithm descriptions.
Vaniglia is a Java library composed of a number of lightweight, very specific, and performance-oriented Java components.
Currently the following components are implemented:
- Command Protocol
- Crypto
- Extensions Framework
- RollingFileDailyFolderAppender for Log4J
- RollingFileFoldersBackupAppender for Log4J
- Parser
- Performance Monitor
- Polling
- Objects Pool
- Socket Communication Framework
- State Machine
- Template Engine
- Text Table
- Time Utilities
- Vaniglia Message Queue
ShaniXmlParser is a small and fast XML/HTML DOM/SAX non-validating parser written in Java. It can parse XML files that are not well formed. It also parses DTDs, entities and CSS.
It can be used through the DOM, SAX and JAXP interfaces.
It passes the DOM 1/2/3 validation suites.
Textile-J is a Java library that provides a simple parser for multiple wiki markup languages (Textile, MediaWiki / WikiMedia, Confluence, and TracWiki), an Eclipse editor for editing Textile markup, and a simple JFace text viewer that can be used to display the markup in an SWT or Eclipse environment. The Java library may be used standalone or as an Eclipse plugin.
The parser can be used on its own to convert markup to XHTML or DocBook, or the parser can be used with the provided JFace viewer to display the Textile in a UI such as eclipse.
This project has been contributed to Eclipse Mylyn as WikiText. Find out more here: http://greensopinion.blogspot.com/2008/08/textile-j-is-moving-to-mylyn-wikitext.html
A Java-based generic scripting engine with dynamic language features. Its syntax is also based on the Java language, and it works transparently with the Java VM in that it locates classes on the fly and can also compile .java files at runtime.
Recently added: prototype functionality (as known from ECMA-based scripting languages such as Javascript/Actionscript).
A Java XML parser derived from the Sun Project X Parser.
eXtensible Binary Universal Protocol (XBUP) is an attempt to create a universal, platform-independent protocol for general usage. It should use the best unary-binary encoding and the most logical tree structure, based on strong arguments.
The initial purpose of this mini project was to parse the Spanish subtitles that Google Video provides for Randy Pausch's great "Last Lecture", and convert them to other subtitle file formats (if you take a look at Google, many people are looking for them).
This can provide a source of open/closed captions that can be placed or embedded in alternative movie file formats like MP4, 3GP, AVI, or others; or simply used at play time using a video filter like VSFilter or FFDShow.
The project is simply a utility that provides an XML parser for Google Video transcriptions, and provides translators for some of the most common subtitle file formats.
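As a rough illustration of what one such translator has to produce, the sketch below formats a single caption as a SubRip (SRT) cue: a sequence number, a start and end time written as HH:MM:SS,mmm, the text, and a blank line. The description does not say which output formats the project supports, so SRT is an assumed target, and SrtSketch with its methods is hypothetical rather than the project's code.

```java
import java.util.Locale;

// Hypothetical formatter for the SubRip (SRT) cue layout.
final class SrtSketch {

    /** Formats a millisecond offset as an SRT timestamp, e.g. 01:02:03,004. */
    static String timestamp(long millis) {
        long hours = millis / 3_600_000;
        long minutes = (millis / 60_000) % 60;
        long seconds = (millis / 1_000) % 60;
        long ms = millis % 1_000;
        return String.format(Locale.ROOT, "%02d:%02d:%02d,%03d", hours, minutes, seconds, ms);
    }

    /** Builds one cue: index line, time range line, text line, blank separator line. */
    static String cue(int index, long startMillis, long endMillis, String text) {
        return index + "\n"
                + timestamp(startMillis) + " --> " + timestamp(endMillis) + "\n"
                + text + "\n\n";
    }
}
```

A converter would call cue once per parsed caption, numbering the cues from 1 and concatenating the results.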
JewelCli uses an annotated Java interface definition to automatically parse and present command line arguments.
The annotated interface definition is an intuitive means of specifying command line arguments. It also provides clean separation of concerns suitable for test driven development.
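The idea of describing arguments with an annotated interface can be sketched with a dynamic proxy from the JDK. This is not JewelCli's API: the @Opt annotation, the CopyOptions interface and the parse method are hypothetical names, and the sketch only understands "--name value" pairs with String or int results.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the annotated-interface technique.
final class InterfaceCliSketch {

    /** Hypothetical annotation naming the command line option behind a getter. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Opt {
        String value();
    }

    /** Example options interface; no implementation class is ever written. */
    interface CopyOptions {
        @Opt("input")
        String getInput();

        @Opt("count")
        int getCount();
    }

    /** Parses "--name value" pairs and returns a proxy that answers the getters. */
    @SuppressWarnings("unchecked")
    static <T> T parse(Class<T> type, String... args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("Options must come as --name value pairs");
        }
        Map<String, String> values = new HashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("--")) {
                throw new IllegalArgumentException("Expected an option but got " + args[i]);
            }
            values.put(args[i].substring(2), args[i + 1]);
        }
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            Opt opt = method.getAnnotation(Opt.class);
            if (opt == null) {
                throw new UnsupportedOperationException(method.getName());
            }
            String raw = values.get(opt.value());
            if (raw == null) {
                throw new IllegalStateException("Missing option --" + opt.value());
            }
            // Convert to the getter's declared return type (only int and String here).
            if (method.getReturnType() == int.class) {
                return Integer.parseInt(raw);
            }
            return raw;
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    }
}
```

Because application code depends only on the interface, a test can supply its own implementation of CopyOptions without touching the command line, which is the separation of concerns the description refers to.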
HTML Parser is a Java library used to parse HTML in either a linear or nested fashion. Primarily used for transformation or extraction, it features filters, visitors, custom tags and easy to use JavaBeans. It is a fast, robust and well tested package.
MarkdownJ is the pure Java port of Markdown (a text-to-HTML conversion tool written by John Gruber).
A visual IDE-style LL(k) parser generator that uses an editable tree with icons for terminal and non-terminal symbols to represent the grammar rules.
A Java library to ease creation and manipulation of KML files. A KML file can be loaded as simply as this: Kml kmlRoot = new KMLParser().parse(new File(filename)); Alternatively a Kml document can be built up programmatically, e.g. by creating new Placemark(), new Folder() etc. and adding them to the root, or other Folders.
Kml.toKML() produces the KML document.
KML.toUpdateKML() will produce a KML document representing the updates since toUpdateKML() was last called.
Using annotations you can make very succinct main methods that don't need to know how to parse command line arguments, with field, property, or method based injection.
Very simple to use, very small dependency.
LOC
Example from the tests:

    public static class TestCommand {
        // Field injection: each annotated field maps to a command line option.
        @Argument(value = "input", description = "This is the input file", required = true)
        private String inputFilename;

        @Argument(value = "output", alias = "o", description = "This is the output file", required = true)
        private File outputFile;

        @Argument(description = "This flag can optionally be set")
        private boolean someflag;

        @Argument(description = "Minimum", alias = "m")
        private Integer minimum;

        // Method injection: the annotated setter receives the delimited values.
        @Argument(description = "List of values", delimiter = ":")
        public void setValues(Integer[] values) {
            this.values = values;
        }

        public Integer[] getValues() {
            return values;
        }

        private Integer[] values;

        @Argument(description = "List of strings", delimiter = ";")
        private String[] strings;

        @Argument(description = "not required")
        private boolean notRequired;
    }

    public void testArgsParse() {
        TestCommand tc = new TestCommand();
        Args.usage(tc);
        String[] args = {"-input", "inputfile", "-o", "outputfile", "extra1", "-someflag", "extra2", "-m", "10", "-values", "1:2:3", "-strings", "sam;dave;jolly"};
        // Arguments that do not belong to any option are returned to the caller.
        List extra = Args.parse(tc, args);
        assertEquals("inputfile", tc.inputFilename);
        assertEquals(new File("outputfile"), tc.outputFile);
        assertEquals(true, tc.someflag);
        assertEquals(10, tc.minimum.intValue());
        assertEquals(3, tc.values.length);
        assertEquals(2, tc.values[1].intValue());
        assertEquals("dave", tc.strings[1]);
        assertEquals(2, extra.size());
    }