Build Your Own Programming Language: A Developer's Comprehensive Guide to Crafting, Compiling, and Implementing Programming Languages
There are many reasons to build a programming language: out of necessity, as a learning exercise, or just for fun. Whatever your reasons, this book gives you the tools to succeed. You’ll build the frontend of a compiler for your language and generate a lexical analyzer and parser using Lex and YACC...
Format: | eBook |
---|---|
Language: | English |
Published: | Birmingham, England : Packt Publishing [2024] |
Edition: | Second edition |
Series: | Routledge revivals. |
See on Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009803405606719 |
Table of Contents:
- Cover
- Copyright
- Foreword
- Contributors
- Table of Contents
- Preface
- Section 1: Programming Language Frontends
- Chapter 1: Why Build Another Programming Language?
- Motivations for writing your own programming language
- Types of programming language implementations
- Organizing a bytecode language implementation
- Languages used in the examples
- The difference between programming languages and libraries
- Applicability to other software engineering tasks
- Establishing the requirements for your language
- Case study - requirements that inspired the Unicon language
- Unicon requirement #1 - preserve what people love about Icon
- Unicon requirement #2 - support large-scale programs working on big data
- Unicon requirement #3 - high-level input/output for modern applications
- Unicon requirement #4 - provide universally implementable system interfaces
- Summary
- Questions
- Chapter 2: Programming Language Design
- Determining the kinds of words and punctuation to provide in your language
- Specifying the control flow
- Deciding on what kinds of data to support
- Atomic types
- Composite types
- Domain-specific types
- Overall program structure
- Completing the Jzero language definition
- Case study - designing graphics facilities in Unicon
- Language support for 2D graphics
- Adding support for 3D graphics
- Summary
- Questions
- Chapter 3: Scanning Source Code
- Technical requirements
- Lexemes, lexical categories, and tokens
- Regular expressions
- Regular expression rules
- Regular expression examples
- Using UFlex and JFlex
- Header section
- Regular expressions section
- Writing a simple source code scanner
- Running your scanner
- Tokens and lexical attributes
- Expanding our example to construct tokens
- Writing a scanner for Jzero
- The Jzero flex specification
- Unicon Jzero code
- Java Jzero code
- Running the Jzero scanner
- Regular expressions are not always enough
- Summary
- Questions
- Chapter 4: Parsing
- Technical requirements
- Syntax analysis
- Context-free grammars
- Writing context-free grammar rules
- Writing rules for programming constructs
- Using iyacc and BYACC/J
- Declaring symbols in the header section
- Advanced yacc declarations
- Putting together the yacc context-free grammar section
- Understanding yacc parsers
- Fixing conflicts in yacc parsers
- Syntax error recovery
- Putting together a toy example
- Writing a parser for Jzero
- The Jzero lex specification
- The Jzero yacc specification
- Unicon Jzero code
- Java Jzero parser code
- Running the Jzero parser
- Improving syntax error messages
- Adding detail to Unicon syntax error messages
- Adding detail to Java syntax error messages
- Using Merr to generate better syntax error messages
- Summary
- Questions
- Chapter 5: Syntax Trees
- Technical requirements
- Using GNU Make
- Learning about trees
- Defining a syntax tree type
- Parse trees versus syntax trees
- Creating leaves from terminal symbols
- Wrapping tokens in leaves
- Working with YACC's value stack
- Wrapping leaves for the parser's value stack
- Determining which leaves you need
- Building internal nodes from production rules
- Accessing tree nodes on the value stack
- Using the tree node factory method
- Forming syntax trees for the Jzero language
- Debugging and testing your syntax tree
- Avoiding common syntax tree bugs
- Printing your tree in a text format
- Printing your tree using dot
- Summary
- Questions
- Section 2: Syntax Tree Traversals
- Chapter 6: Symbol Tables
- Technical requirements
- Establishing the groundwork for symbol tables
- Declarations and scopes
- Assigning and dereferencing variables
- Choosing the right tree traversal for the job
- Creating and populating symbol tables for each scope
- Adding semantic attributes to syntax trees
- Defining classes for symbol tables and symbol table entries
- Creating symbol tables
- Populating symbol tables
- Synthesizing the isConst attribute
- Checking for undeclared variables
- Identifying the bodies of methods
- Spotting uses of variables within method bodies
- Finding redeclared variables
- Inserting symbols into the symbol table
- Reporting semantic errors
- Handling package and class scopes in Unicon
- Mangling names
- Inserting self for member variable references
- Inserting self as the first parameter in method calls
- Testing and debugging symbol tables
- Summary
- Questions
- Chapter 7: Checking Base Types
- Technical requirements
- Type representation in the compiler
- Defining a base class for representing types
- Subclassing the base class for complex types
- Assigning type information to declared variables
- Synthesizing types from reserved words
- Inheriting types into a list of variables
- Determining the type at each syntax tree node
- Determining the type at the leaves
- Calculating and checking the types at internal nodes
- Runtime type checks and type inference in Unicon
- Summary
- Questions
- Chapter 8: Checking Types on Arrays, Method Calls, and Structure Accesses
- Technical requirements
- Checking operations on array types
- Handling array variable declarations
- Checking types during array creation
- Checking types during array accesses
- Checking method calls
- Calculating the parameters and return type information
- Checking the types at each method call site
- Checking the type at return statements
- Checking structured type accesses
- Handling instance variable declarations
- Checking types at instance creation
- Checking types of instance accesses
- Summary
- Questions
- Chapter 9: Intermediate Code Generation
- Technical requirements
- What is intermediate code?
- Why generate intermediate code?
- Learning about the memory regions in the generated program
- Introducing data types for intermediate code
- Adding the intermediate code attributes to the tree
- Generating labels and temporary variables
- An intermediate code instruction set
- Instructions
- Declarations
- Annotating syntax trees with labels for control flow
- Generating code for expressions
- Generating code for control flow
- Generating label targets for condition expressions
- Generating code for loops
- Generating intermediate code for method calls
- Reviewing the generated intermediate code
- Summary
- Questions
- Chapter 10: Syntax Coloring in an IDE
- Writing your own IDE versus supporting an existing one
- Downloading the software used in this chapter
- Adding support for your language to Visual Studio Code
- Configuring Visual Studio Code to do syntax highlighting for Jzero
- Visual Studio Code extensions using the JSON format
- JSON atomic types
- JSON collections
- File organization for Visual Studio Code extensions
- The extensions file
- The extension manifest
- Writing IDE tokenization rules using TextMate grammars
- Integrating a compiler into a programmer's editor
- Analyzing source code from within the IDE
- Sending compiler output to the IDE
- Avoiding reparsing the entire file on every change
- Using lexical information to colorize tokens
- Extending the EditableTextList component to support color
- Coloring individual tokens as they are drawn
- Highlighting errors using parse results
- Summary
- Questions
- Section 3: Code Generation and Runtime Systems
- Chapter 11: Preprocessors and Transpilers
- Understanding preprocessors
- A preprocessing example
- Identity preprocessors and pretty printers
- The preprocessor within the Unicon preprocessor
- Code generation in the Unicon preprocessor
- Transforming objects into classes
- Generating source code from the syntax tree
- Closure-based inheritance in Unicon
- The difference between preprocessors and transpilers
- Transpiling Jzero code to Unicon
- Semantic attributes for transpiling to Unicon
- A code generation model for Jzero
- The Jzero to Unicon transpiler code generation method
- Transpiling the base cases: names and literals
- Handling the dot operator
- Mapping Java expressions to Unicon
- Transpiler code for method calls
- Assignments
- Transpiler code for control structures
- Transpiling Jzero declarations
- Transpiling Jzero block statements
- Transpiling a Jzero class into a Unicon package that contains a class
- Summary
- Questions
- Chapter 12: Bytecode Interpreters
- Technical requirements
- Understanding what bytecode is
- Comparing bytecode with intermediate code
- Building a bytecode instruction set for Jzero
- Defining the Jzero bytecode file format
- Understanding the basics of stack machine operation
- Implementing a bytecode interpreter
- Loading bytecode into memory
- Initializing the interpreter state
- Fetching instructions and advancing the instruction pointer
- Instruction decoding
- Executing instructions
- Starting up the Jzero interpreter
- Writing a runtime system for Jzero
- Running a Jzero program
- Examining iconx, the Unicon bytecode interpreter
- Understanding goal-directed bytecode
- Leaving type information in at runtime
- Fetching, decoding, and executing instructions
- Crafting the rest of the runtime system
- Summary
- Questions
- Chapter 13: Generating Bytecode
- Technical requirements
- Converting intermediate code to Jzero bytecode
- Adding a class for bytecode instructions