Petal compiler, specification, and tools


How do you design a programming language?

Problem domains

Principles and theoretical backing

Programming paradigms

Inspiration languages

Look and feel

  • Curly brace: Java, C, JavaScript
    • Blocks are bookended by {}
    • Semicolons commonly end or separate statements
    • Also tend to be heavy on parentheses
  • Begin/end: Ruby and Lua
    • Blocks start implicitly or with begin, and end with end
    • Statement delimiters are rare
  • Indentation: Python, CoffeeScript
    • Blocks are marked by indentation, which must be consistent
    • Sometimes start blocks with :
  • Single expression: Elm, Haskell
    • Usually function-oriented
    • Preamble of helper declarations followed by one expression as the return value
    • Pattern matching as major flow control mechanism
    • Found almost exclusively in functional languages
  • Parentheses: Lisp
    • Everything uses parentheses.

Delimiters

Delimiters make parsing easier on you, the language designer.

Let’s say you’re trying to make a low-delimiter language. You don’t want parentheses for function calls. You don’t want statement delimiters. You don’t want commas separating function parameters.

This means you write code like:

type1 var1
func1 var1
type2 var2

The names make it clear what’s happening. But without those names, you just have:

a b
c b
d e

How do you parse that? Is it supposed to be one function call with five arguments? Three declarations? A mix of calls and declarations?
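To make the ambiguity concrete, here is a minimal Python sketch that enumerates every way those six tokens could be grouped into statements. It assumes, purely for illustration, that newlines carry no meaning and that every statement has at least two tokens (a declaration, or a call with at least one argument). The name `segmentations` is hypothetical and not part of any real parser.

```python
def segmentations(tokens, min_len=2):
    """Yield every way to split tokens into consecutive statements
    of at least min_len tokens each."""
    if not tokens:
        yield []
        return
    for end in range(min_len, len(tokens) + 1):
        head = tokens[:end]
        for rest in segmentations(tokens[end:], min_len):
            yield [head] + rest


tokens = "a b c b d e".split()
parses = list(segmentations(tokens))

# Print one candidate grouping per line, statements separated by "|".
for parse in parses:
    print(" | ".join(" ".join(stmt) for stmt in parse))
```

Under these assumptions there are five groupings, and every two-token statement inside them could still be either a call or a declaration, so the number of distinct readings is larger still.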

If you add a statement delimiter, you might encounter:

a b;
c b
d e;

So the first statement, a b, is either a function call or a declaration, and the second, c b d e, is a function call with three arguments.
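Once a delimiter is required, finding statement boundaries becomes mechanical. The following Python sketch splits the snippet on `;` and describes each statement. The helper names `split_statements` and `describe` are made up for this example, and the rule that anything longer than two tokens is a call is an assumption taken from the reading above.

```python
def split_statements(source, delimiter=";"):
    """Split source text into statements, each a list of tokens.
    Newlines are treated as ordinary whitespace."""
    statements = []
    for chunk in source.split(delimiter):
        tokens = chunk.split()
        if tokens:
            statements.append(tokens)
    return statements


def describe(tokens):
    """Classify a statement by token count alone (an assumed rule)."""
    if len(tokens) == 2:
        return "call or declaration"
    return f"call with {len(tokens) - 1} arguments"


source = "a b;\nc b\nd e;"
statements = split_statements(source)
for stmt in statements:
    print(" ".join(stmt), "->", describe(stmt))
```

The delimiter settles where statements end, but it does not settle whether a two-token statement is a call or a declaration. That still needs names, keywords, or some other syntax.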

Or let’s say you’re using Dart-style lists, where square brackets are used both for list literals and for indexing. What does this mean?

print myList [12]

Is it printing myList and then [12]? Or is it printing element 12 of myList?

Requiring a delimiter between arguments disambiguates:

# prints two lists
print myList, [12]
# just prints one element of myList
print myList [12]

Otherwise, you have to be very careful to make your language unambiguous. For instance, you might use a different syntax for indexing:

var myList = [1, 2, 3]
# Unambiguous indexing: prints one element of myList
print myList@1
# Odd spacing, but still two arguments: myList and the list [1]
print myList[1]
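As a rough illustration of why the `@` form removes the ambiguity, here is a small Python sketch of a parser for a single call. It assumes `[` always opens a list literal, `@` followed by an integer is the only indexing form, and lists hold only integers. The function names and the tuple-based tree shape are invented for this example. Malformed input is not handled gracefully.

```python
import re

# Identifiers, integers, and the four punctuation tokens we need.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[\[\],@])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_arg(tokens, pos):
    """Parse one argument starting at pos; return (node, next_pos)."""
    if tokens[pos] == "[":
        # '[' can only mean a list literal, never indexing.
        items, pos = [], pos + 1
        while tokens[pos] != "]":
            if tokens[pos] != ",":
                items.append(int(tokens[pos]))
            pos += 1
        arg, pos = ("list", items), pos + 1
    else:
        arg, pos = ("name", tokens[pos]), pos + 1
    # Postfix '@' indexing binds to the argument just parsed.
    while pos < len(tokens) and tokens[pos] == "@":
        arg, pos = ("index", arg, int(tokens[pos + 1])), pos + 2
    return arg, pos


def parse_call(text):
    """Parse 'func arg arg ...' with no parentheses or commas."""
    tokens = tokenize(text)
    func, pos, args = tokens[0], 1, []
    while pos < len(tokens):
        arg, pos = parse_arg(tokens, pos)
        args.append(arg)
    return (func, args)


indexed = parse_call("print myList@1")
two_args = parse_call("print myList[1]")
```

Because the parser never has to decide what `[` means, `print myList[1]` and `print myList [1]` produce the same tree with two arguments, while `print myList@1` produces a single indexed argument.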

Personal value features

What bothers you about existing languages? Fix that.