Archive for May, 2010

GSoC Work

Thursday, May 20th, 2010

I would like to outline my GSoC goals and objectives:

My Google Summer of Code (2010) objectives are to integrate libffi into the NCI framework and to build a new stack frame builder that takes advantage of LLVM.

NCI Framework – The current NCI system has a few limitations that I am going to try to alleviate. I don’t know if I will be able to remove all of them, but I will try to bring all of the capabilities of the libffi library into the core of Parrot. This includes defining structures as data types for function calls, adding new data types on systems that support them (for example, 64-bit integers), and improving how functions in foreign libraries are called. The current NCI system in Parrot cannot describe a structure, for instance, or an int64 type. I plan to implement every data type that libffi supports as part of the modified NCI system, while retaining the current functionality for people without libffi. I don’t think Parrot currently bundles any third-party libraries, but I do know that Python, for example, bundles libffi with its source code and builds it by default unless you tell it to use your system libffi.
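To give a rough idea of the kind of call libffi makes possible, here is a minimal Python sketch using ctypes, which is itself built on libffi. This is not Parrot code. It only shows a structure used as a return type and a long long (64-bit on common platforms) used as an argument. The names DivT, load_libc, call_div and call_llabs are made up for this example, and it assumes a POSIX system whose C library exports div and llabs.

```python
import ctypes
import ctypes.util


class DivT(ctypes.Structure):
    """Mirrors C's div_t. Field order (quot, rem) is assumed, as in glibc."""
    _fields_ = [("quot", ctypes.c_int), ("rem", ctypes.c_int)]


def load_libc():
    # find_library may return None; CDLL(None) then falls back to the
    # symbols already loaded into the running process (POSIX only).
    return ctypes.CDLL(ctypes.util.find_library("c"))


def call_div(numer, denom):
    """Call C's div(), which returns a struct by value."""
    libc = load_libc()
    libc.div.argtypes = [ctypes.c_int, ctypes.c_int]
    libc.div.restype = DivT
    result = libc.div(numer, denom)
    return result.quot, result.rem


def call_llabs(value):
    """Call C's llabs(), which takes and returns a long long."""
    libc = load_libc()
    libc.llabs.argtypes = [ctypes.c_longlong]
    libc.llabs.restype = ctypes.c_longlong
    return libc.llabs(value)


if __name__ == "__main__":
    print(call_div(17, 5))
    print(call_llabs(-(2 ** 40)))
```

The point of the sketch is that the caller describes the argument and return types as data at run time, which is roughly what a libffi-backed NCI would need to do.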

Stack Frame Builder – There are a number of places where LLVM could be integrated into Parrot. One obvious place to start is the stack frame builder. Translating the stack frame into LLVM IR and running some of the LLVM optimization passes over the result could give Parrot both a JIT system and faster generated code. It could also be possible to dump the LLVM IR to a file, so you end up with a sort of PBC to LLVM IR translation. The resulting LLVM IR could also be compiled into a native binary or dynamic library, which would be useful because it could cut out some of the overhead for libraries.
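As a very rough sketch of what "translating a call into LLVM IR" could look like, the Python snippet below emits the IR text for a thunk that simply forwards its arguments to a native function. Everything here is an assumption made for illustration: the one-letter type codes, the LLVM_TYPES table and the emit_thunk helper are hypothetical and are not Parrot's actual signature table or frame builder. A real frame builder would also have to unpack the arguments from Parrot's calling context, which this skips.

```python
# Hypothetical mapping from one-letter signature codes to LLVM IR types.
# The letters are illustrative only, not Parrot's actual NCI table.
LLVM_TYPES = {
    "v": "void",
    "i": "i32",
    "l": "i64",   # assumes a 64-bit long
    "f": "float",
    "d": "double",
}


def emit_thunk(name, signature):
    """Return LLVM IR text for a thunk that forwards its arguments to `name`.

    signature[0] is the return type code; the rest are argument codes.
    """
    ret = LLVM_TYPES[signature[0]]
    args = [LLVM_TYPES[code] for code in signature[1:]]
    if "void" in args:
        raise ValueError("void is only valid as a return type")

    params = ", ".join(f"{t} %a{i}" for i, t in enumerate(args))
    lines = [
        f"declare {ret} @{name}({', '.join(args)})",
        "",
        f"define {ret} @thunk_{name}({params}) {{",
        "entry:",
    ]
    if ret == "void":
        lines.append(f"  call void @{name}({params})")
        lines.append("  ret void")
    else:
        lines.append(f"  %r = call {ret} @{name}({params})")
        lines.append(f"  ret {ret} %r")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # A thunk for a function taking two doubles and returning a double.
    print(emit_thunk("pow", "ddd"))
```

Writing the text out to a file is the "dump the IR" idea in miniature; the LLVM tools could then optimize or compile it.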

The plans for the stack frame builder are still being mapped out. For now I am focusing on the NCI system, and I am hoping to have that knocked out relatively quickly so I can move on to the LLVM work.

nq-nqp: Lessons Learned

Tuesday, May 4th, 2010

NQ-NQP’s Status:

  • A fairly well-defined grammar for parsing nqp exists. It generates AST nodes for most of the grammar, but not all of it.
    • In the future, the AST would be useful for optimizations such as constant folding.
  • Code generation supports integers and subs, but it does not yet support other data types such as floats, strings, and bools, or features such as regular expressions.
    • Built in basic types should eventually include:
      • Packages
      • Subs
      • Classes
      • Roles
      • Arrays
      • Hashes
      • Ints
      • Nums
      • Strings
      • Bools
  • Execution of the generated code works.
    • Debugging is surprisingly easy with gdb, but I will have to see how hard it is in the future. I know that if I start working with a JIT it becomes increasingly difficult to see what is being run on the processor.
  • A limited number of binary ops, like add and subtract, are working.
  • The VM supports method calls, but the code generator does not yet emit them correctly.
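Constant folding, mentioned in the status list above, is easy to sketch on a small AST. The Python below is only an illustration: the Int, Var and BinOp node classes and the fold function are invented for this example and are not the node types nq-nqp actually uses. It handles just add and subtract, the two binary ops that currently work.

```python
from dataclasses import dataclass


@dataclass
class Int:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str        # "+" or "-"
    left: object
    right: object


def fold(node):
    """Return an equivalent tree with constant subtrees pre-computed."""
    if isinstance(node, BinOp):
        left, right = fold(node.left), fold(node.right)
        # Only fold when both operands are known at compile time.
        if isinstance(left, Int) and isinstance(right, Int):
            if node.op == "+":
                return Int(left.value + right.value)
            if node.op == "-":
                return Int(left.value - right.value)
        return BinOp(node.op, left, right)
    return node


if __name__ == "__main__":
    tree = BinOp("+", Var("x"), BinOp("-", Int(10), Int(3)))
    print(fold(tree))
```

Subtrees that contain a variable are left alone, so the pass never changes what the program computes.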

Lessons learned:

  • LLVM is neat but advanced and complicated
  • Grammars are fun, we should do more of those
    • Defining a grammar is currently difficult, though. In C there is yacc/bison, but that is a bit limited compared to other systems.
  • Grammars are hard, parsing is not easy
    • Tokenizers and lexers are difficult if you want to call things contextually, based on where they should be in your grammar.
  • Focusing on constructing a real “Compiler” is probably a mistake. Start small: an AST that you can walk and execute each step of the way is better for learning than building a code generation system. Code generation is useful but hard, and can be deferred until later.
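To show what I mean by an AST you can walk and execute, here is a minimal tree-walking evaluator in Python. It is a sketch, not nq-nqp code: the node classes (Int, Var, BinOp, Assign) and the evaluate and run helpers are made up for this example, and it covers only integers, variables, add and subtract.

```python
from dataclasses import dataclass


@dataclass
class Int:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str        # "+" or "-"
    left: object
    right: object


@dataclass
class Assign:
    name: str
    expr: object


def evaluate(node, env):
    """Execute one AST node directly, using env as the variable store."""
    if isinstance(node, Int):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    if isinstance(node, BinOp):
        left = evaluate(node.left, env)
        right = evaluate(node.right, env)
        if node.op == "+":
            return left + right
        if node.op == "-":
            return left - right
        raise ValueError(f"unknown operator {node.op!r}")
    if isinstance(node, Assign):
        env[node.name] = evaluate(node.expr, env)
        return env[node.name]
    raise TypeError(f"cannot evaluate {node!r}")


def run(program):
    """Evaluate a list of statements; return the last value and the env."""
    env = {}
    result = None
    for stmt in program:
        result = evaluate(stmt, env)
    return result, env


if __name__ == "__main__":
    program = [Assign("x", Int(40)), BinOp("+", Var("x"), Int(2))]
    print(run(program))
```

Each new node type only needs one more branch in evaluate, so the grammar and the evaluator can grow together without any code generation in the way.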

Things I would do differently if I could start over:

  • Focus on the grammar and on making an executable AST, rather than on translating the AST into code.
  • I have not found an easy-to-use VM that I could link to for a runtime. It would have been nice to have, because then I could have focused more on other parts of the language, but VM support is not that hard, just lots of typing.