Projects tagged ‘c’ and ‘lexer’


[10 total]

9 Users
   

re2c is a tool for writing very fast and very flexible scanners. Unlike any other such tool, re2c focuses on generating highly efficient code for regular expression matching. This allows a much broader range of use than any traditional lexer offers. Last but not least, re2c generates warning-free code that is equal to hand-written code in terms of size, speed and quality.
Created over 2 years ago.

1 Users
 

Project Lestes is an ongoing effort to make a C++ compiler with comprehensive C++ source code, using the most advanced algorithms seen. Other goals include an easily retargetable compiler, a compiler that can be used successfully in teaching compiler construction, and a compiler that can compile a multitude of languages.
Created about 1 year ago.

1 Users

AST builder for the Objective-C++ language.

Features:
-- High-performance hand-crafted lexer
-- About 5 times faster than NRefactory's C# lexer
-- Easy-to-use parser
-- Able to parse complex constructs like int (*(int*foo())() )()
-- Able to parse Objective-C classes
-- Easy-to-use AST
-- Able to produce source from the AST
-- Open source, free software

NOTE: Source files must be preprocessed beforehand.

Useful and bug-fixing patches are welcome! The sources currently reside in NObjective, so you should check them out from:

    svn checkout http://objcmapper.googlecode.com/svn/trunk/NObjectiveAST/ NObjectiveAST

Usage:

    // use the lexer to produce lexemes,
    // then use the parser to produce the AST
    var translationUnit = new Parser( new Lexer( File.ReadAllText( "..\\..\\test.cpp" ) ) ).TranslationUnit;
Created 11 months ago.

0 Users

This is a compiler for a small part of the C language. I'm trying to get it to compile to a very simple intermediate language, adding small features step by step; once I have enough features I'll write the real assembly generation (instead of aiming too high and ending up with a perfect parser and, say, no code generation).

Update: June 16, 2009
During the past few days I've been working heavily on the IML generation. I've implemented a conservative register allocation algorithm. The IML generation is still very limited, though: there is no generation of function calls yet. Notice that the IML is still fairly high level (it accesses local variables by name rather than by stack offset).

Input:

    int main(int testParameter)
    {
        int a = 3;
        int b = 4;
        int c = a + b;
    }

Output:

    Globals:
    Text:
    main:
        push ebp
        mov esp, ebp
        sub 12, esp
        mov 3, r0
        mov 4, r1
        mov r1, b
        add r0, r1
        mov r0, a
        mov r1, c
        mov ebp, esp
        pop ebp
        ret
Created 4 months ago.

0 Users

A C preprocessor is the part of a C compiler responsible for macro replacement, conditional compilation and inclusion of header files. It is often found as a stand-alone program on Unix systems. ucpp is such a preprocessor; it is designed to be quick and light, yet fully compliant with ISO standard 9899:1999, also known as C99. ucpp can be compiled as a stand-alone program or linked to other code; in the latter case, ucpp outputs tokens one at a time, on demand, as an integrated lexer.

ucpp operates in two modes:
-- lexer mode: ucpp is linked to some other code and outputs a stream of tokens (each call to the lex() function yields one token)
-- non-lexer mode: ucpp preprocesses text and writes the resulting text to a file descriptor; if linked to some other code, the cpp() function must be called repeatedly, otherwise ucpp is a stand-alone binary

ucpp was written by Thomas Pornin. It is maintained here by Louis P. Santillan, starting from a copy of version 1.3.
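The "one token per call" idea behind lexer mode can be illustrated with a small pull-style scanner. This is only a sketch of the general pattern, not ucpp's interface: the names `Lexer`, `Token`, `TokenKind` and `next_token` are hypothetical, and the code does no preprocessing at all.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PUNCT } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;  /* first byte of the token inside the source buffer */
    size_t length;      /* number of bytes in the token */
} Token;

typedef struct {
    const char *cursor; /* next unread byte of a NUL-terminated buffer */
} Lexer;

/* Hypothetical pull interface: each call consumes and returns one token. */
static Token next_token(Lexer *lx) {
    /* Skip whitespace between tokens. */
    while (isspace((unsigned char)*lx->cursor)) ++lx->cursor;

    Token tok = { TOK_EOF, lx->cursor, 0 };
    unsigned char c = (unsigned char)*lx->cursor;
    if (c == '\0') return tok;

    if (isalpha(c) || c == '_') {
        tok.kind = TOK_IDENT;
        while (isalnum((unsigned char)*lx->cursor) || *lx->cursor == '_') ++lx->cursor;
    } else if (isdigit(c)) {
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*lx->cursor)) ++lx->cursor;
    } else {
        /* Any other byte is returned as a one-character punctuator. */
        tok.kind = TOK_PUNCT;
        ++lx->cursor;
    }
    tok.length = (size_t)(lx->cursor - tok.start);
    return tok;
}
```

A caller would typically initialize a `Lexer` with a pointer to the source text and call `next_token` in a loop until it returns `TOK_EOF`, so the consumer decides when each token is produced.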
Created 12 months ago.

0 Users

A compiler for a completely artificial C-like language.
Created 4 months ago.

0 Users

A parser builder for converting text files to XML.
Created about 1 year ago.

0 Users

Fast lexer for programming languages such as C/C++/C#/Java. It is based on a hand-written table-driven automaton and can be used to tokenize any text in single-byte or fixed-width multibyte encodings. The main optimization is a tree of automaton tables with 256 entries each, indexed by the code of the current character in the text. A second optimization will be added in the near future and should significantly reduce memory use: each byte is split into two 4-bit parts, the upper 4 bits are recognized first through a 16-entry table, and the lower 4 bits are then recognized through another 16-entry table.
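The planned nibble split can be sketched as a tree of 16-entry tables in which every input byte takes two steps, upper 4 bits first and lower 4 bits second. This is a minimal illustration of the idea as described, not the project's code; `NibbleNode`, `node_new`, `trie_add` and `trie_match` are hypothetical names, and freeing the tree is omitted for brevity.

```c
#include <stdlib.h>

/* One table of the automaton: 16 transitions instead of 256. */
typedef struct NibbleNode {
    struct NibbleNode *next[16];
    int token;                  /* 0 = not accepting, otherwise a token id */
} NibbleNode;

static NibbleNode *node_new(void) {
    return calloc(1, sizeof(NibbleNode));
}

/* Add a word: each byte walks two tables, high nibble then low nibble.
   Returns 0 on success, -1 if memory allocation fails. */
static int trie_add(NibbleNode *root, const char *word, int token) {
    NibbleNode *n = root;
    for (const unsigned char *p = (const unsigned char *)word; *p; ++p) {
        unsigned nib[2] = { (unsigned)(*p >> 4), (unsigned)(*p & 0x0F) };
        for (int i = 0; i < 2; ++i) {
            if (!n->next[nib[i]]) {
                n->next[nib[i]] = node_new();
                if (!n->next[nib[i]]) return -1;
            }
            n = n->next[nib[i]];
        }
    }
    n->token = token;
    return 0;
}

/* Longest match at the start of text. Returns the token id (0 if none)
   and stores the number of matched bytes in *len. */
static int trie_match(const NibbleNode *root, const char *text, size_t *len) {
    const NibbleNode *n = root;
    int best = 0;
    size_t consumed = 0;
    *len = 0;
    for (const unsigned char *p = (const unsigned char *)text; *p; ++p) {
        n = n->next[*p >> 4];       /* upper 4 bits */
        if (!n) break;
        n = n->next[*p & 0x0F];     /* lower 4 bits */
        if (!n) break;
        ++consumed;
        if (n->token) { best = n->token; *len = consumed; }
    }
    return best;
}
```

The trade-off is two table lookups per byte instead of one. In exchange, assuming 8-byte pointers, the two 16-entry tables visited for a byte hold 256 bytes of transitions, where a single 256-entry table holds 2048.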
Created 12 months ago.

0 Users

Scanner generator written in C#.
Created 12 months ago.

0 Users

News
09 Sep 2008: New source tarball and Windows pre-compiled executables. Checked with valgrind for memory leaks. Tests are still to be fixed!
25 Aug 2008: Back to the NFA approach! Also, every modifier is now greedy. Expressions like ".+z" or "\d+3" will never succeed, because the first term of the expression eats all the available characters. The current version under SVN supports:
-- Captures (up to 8)
-- Back references to captures (e.g. "(a+)b\1" matches aba, aabaa, etc.)
-- Negated expressions that match 0 characters of the input
07 Feb 2008: Stuck on an issue in the determinization step. The algorithm described in the PDF and implemented in the code succeeds on (a*)(a)a but fails on ([ab]*)(a)([ab]*). I devised a different algorithm, closely following the standard subset construction, that succeeds on ([ab]*)(a)([ab]*) but, alas, fails on (a*)(a)a! I guess I have to take a step back and rethink the whole thing. For those interested, the problem is to determine when a set of states needs to be merged again, even if the states have already been merged, because of a different tag assignment. Any suggestions are very welcome!
04 Feb 2008: New tarball and Windows binary. This version produces C code! Use "sh mktest 'expr1' 'expr2' ..." to create a test program named x (or x.exe). Requires gcc.
31 Jan 2008: New tarball and Windows binary. Able to generate ASM-like code. Also performs some optimization.
25 Jan 2008: New tarball and Windows binary.
21 Jan 2008: Discussion group and SVN commits group created.
20 Jan 2008: Posted a request for comment about the YRX approach on comp.compilers: http://compilers.iecc.com/comparch/article/08-01-051

About YRX
YRX is a tool to ease the creation of lexical scanners, similar to re2c. It is at an early stage; any comment or feedback is appreciated.
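The 25 Aug 2008 note about greedy modifiers can be illustrated with a toy matcher in which `+` consumes as much as it can and never gives characters back. This is not YRX code: `atom_matches` and `greedy_match` are hypothetical names, and the sketch assumes a tiny pattern language with only literal characters, `.` and a postfix `+`.

```c
#include <stddef.h>

/* Does pattern atom c match input byte t? '.' matches any byte. */
static int atom_matches(char c, char t) {
    return c == '.' || c == t;
}

/* Match pat at the start of text. Every '+' is greedy and there is no
   backtracking. Returns the number of bytes matched, or -1 on failure. */
static long greedy_match(const char *pat, const char *text) {
    const char *t = text;
    while (*pat) {
        char atom = *pat++;
        /* The atom must match at least once. */
        if (*t == '\0' || !atom_matches(atom, *t)) return -1;
        ++t;
        if (*pat == '+') {
            ++pat;
            /* Greedy repetition: take every byte that still matches. */
            while (*t != '\0' && atom_matches(atom, *t)) ++t;
        }
    }
    return (long)(t - text);
}
```

With this behavior, `a+b` matches `aab`, while `.+z` fails on `abcz`: the `.+` term has already consumed the final `z`, and nothing is handed back for the literal that follows.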
Created 12 months ago.