Classifying Programming Languages

What are some of the different ways to categorize programming languages?

CONTENTS

Overview • Machine Code • Assembly Language • High-Level Languages • System Languages • Scripting Languages • Esoteric Languages • Ousterhout’s Dichotomy

Overview

Different languages have different purposes, so it makes sense to talk about different kinds, or types, of languages. Some types are:

Machine languages, which are interpreted directly in hardware
Assembly languages, which are thin wrappers over a corresponding machine language
High-level languages, which are anything machine-independent
System languages, which are designed for low-level tasks such as memory and process management
Scripting languages, which are generally extremely high-level and powerful
Domain-specific languages, which are used only in highly special-purpose areas
Visual languages, which are not text-based
Esoteric languages, which are not really intended to be used, but are interesting, funny, or educational in some way

These types are not mutually exclusive: Perl is both high-level and scripting; C is considered both high-level and system.

Other types people have identified: Toy, Educational, Very High-Level, Compiled, Interpreted, Free-Form, Curly Brace, Applicative, Von Neumann, Expression-Oriented, Persistent, Concurrent, Glue, Intermediate, Quantum, Hybrid. See Wikipedia’s category page on programming language classification.

Machine Code

Most computers work by executing stored programs in a fetch-execute cycle. Machine code generally features:

Registers to store values and intermediate results
Very low-level machine instructions (add, sub, div, sqrt)
Labels and conditional jumps to express control flow
A lack of memory management support — programmers do that themselves
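
To make the fetch-execute cycle concrete, here is a minimal sketch of a toy machine in Python. The instruction names (add, subi, jnz, halt), the register names, and the tuple encoding are all invented for illustration and do not correspond to any real architecture. The sample program assumes r0 starts at 1 or more.

```python
def run(program, registers):
    """Run a toy program until it halts and return the final registers."""
    pc = 0  # program counter: index of the next instruction to fetch
    while True:
        op, *args = program[pc]  # fetch the instruction at pc
        pc += 1                  # by default, continue with the next one
        if op == "add":          # add dst, src: dst = dst + src
            dst, src = args
            registers[dst] += registers[src]
        elif op == "subi":       # subi dst, k: dst = dst - k
            dst, k = args
            registers[dst] -= k
        elif op == "jnz":        # jnz reg, target: jump if reg is not zero
            reg, target = args
            if registers[reg] != 0:
                pc = target
        elif op == "halt":
            return registers
        else:
            raise ValueError(f"unknown instruction: {op}")


# Hypothetical program: add r0 + (r0 - 1) + ... + 1 into r1.
SUM_TO_N = [
    ("add", "r1", "r0"),  # address 0: r1 += r0
    ("subi", "r0", 1),    # address 1: r0 -= 1
    ("jnz", "r0", 0),     # address 2: loop to address 0 while r0 != 0
    ("halt",),            # address 3
]

result = run(SUM_TO_N, {"r0": 4, "r1": 0})
print(result["r1"])  # 4 + 3 + 2 + 1 = 10
```

Note that there are no loops or variables in the toy program itself, only registers, simple arithmetic, and a conditional jump to a numeric address.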

Machine code is usually written in hex. Here’s an example for the Intel 64 architecture:

89 F8 A9 01 00 00 00 75 06 6B C0 03 FF C0 C3 C1 E0 02 83 E8 03 C3

Can you tell what it does?
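
The hex listing is only a compact notation for a sequence of bytes. As a small sketch (the variable names are made up for this example), Python can turn the text back into the raw bytes a processor would actually fetch:

```python
# The hex listing from above, as text.
hex_listing = "89 F8 A9 01 00 00 00 75 06 6B C0 03 FF C0 C3 C1 E0 02 83 E8 03 C3"

# bytes.fromhex ignores the spaces and yields one byte per pair of hex digits.
machine_code = bytes.fromhex(hex_listing)

print(len(machine_code))  # number of bytes in the function
print(list(machine_code[:2]))  # the first instruction's bytes as integers
```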

Assembly Language

An assembly language is an encoding of machine code into something more readable. It assigns human-readable labels (or names) to storage locations, jump targets, and subroutine starting addresses, but doesn’t really go too far beyond that. Here’s the function from above on the Intel 64 architecture using the GAS assembly language:

        .globl  f

        .text

f:

        mov     %edi, %eax      # Put first parameter into eax register

        test    $1, %eax        # Examine least significant bit

        jnz     odd             # If it's not a zero, jump to odd

        imul    $3, %eax        # It's even, so multiply it by 3

        inc     %eax            # and add 1

        ret                     # and return it

odd:

        shl    $2, %eax         # It's odd, so multiply by 4

        sub    $3, %eax         # and subtract 3

        ret                     # and return it

And here’s the same function, written for the SPARC:

        .global f

f:

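        andcc   %o0, 1, %g0     ! Test least significant bit of first parameter

        bne     .L1             ! If it's not zero (odd), branch to .L1

        sll     %o0, 2, %g2     ! Delay slot, always runs: g2 = 4 * n

        sll     %o0, 1, %g2     ! It's even, so g2 = 2 * n

        add     %g2, %o0, %g2   ! g2 = 2 * n + n = 3 * n

        b       .L2             ! Skip over the odd case

        add     %g2, 1, %o0     ! Delay slot: result = 3 * n + 1

.L1:

        add     %g2, -3, %o0    ! It's odd, so result = 4 * n - 3

.L2:

        retl                    ! Return to caller

        nop                     ! Delay slot: nothing to do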
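
The way an assembler assigns names to jump targets can be sketched as a two-pass process: first record the address of each label, then replace label operands with those addresses. The sketch below is in Python and uses an invented, space-separated toy syntax rather than GAS or SPARC syntax. It also counts addresses in instructions, whereas a real assembler works with byte offsets.

```python
def resolve_labels(lines):
    """Replace label operands with instruction addresses (toy assembler)."""
    labels = {}
    instructions = []

    # Pass 1: a line ending in ':' names the address of the next instruction.
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        else:
            instructions.append(line.split())

    # Pass 2: keep each mnemonic, substitute addresses for label operands.
    return [
        [ins[0]] + [labels.get(operand, operand) for operand in ins[1:]]
        for ins in instructions
    ]


# Hypothetical source text in the toy syntax.
SOURCE = """
start:
    add r1 r0
    subi r0 1
    jnz r0 start
    halt
"""

for address, instruction in enumerate(resolve_labels(SOURCE.splitlines())):
    print(address, instruction)
```

After resolution, the jnz instruction refers to address 0 instead of the name start, which is the form the machine actually executes.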

High-Level Languages

A high-level language gets away from all the constraints of a particular machine. HLLs have features such as:

Names for almost everything: variables, types, subroutines, constants, modules
Complex expressions (e.g. 2 * (y^5) >= 88 && sqrt(4.8) / 2 % 3 == 9)
Control structures (conditionals, switches, loops)
Composite types (arrays, structs)
Type declarations
Type checking
Easy, often implicit, ways to manage global, local and heap storage
Subroutines with their own private scope
Abstract data types, modules, packages, classes
Exceptions

The previous example looks like this in Fortran 77 (note how the code begins in column 7 or beyond):

       INTEGER FUNCTION F(N)

       INTEGER N

       IF (MOD(N, 2) .EQ. 0) THEN

           F = 3 * N + 1

       ELSE

           F = 4 * N - 3

       END IF

       RETURN

       END

and like this in Fortran 90 (where the column requirements were finally removed):

integer function f (n)

    implicit none

    integer, intent(in) :: n

    if (mod(n, 2) == 0) then

        f = 3 * n + 1

    else

        f = 4 * n - 3

    end if

end function f

and like this in Ada:

function F (N: Integer) return Integer is

begin

    if N mod 2 = 0 then

        return 3 * N + 1;

    else

        return 4 * N - 3;

    end if;

end F;

and like this in C and C++:

int f(const int n) {

    return (n % 2 == 0) ? 3 * n + 1 : 4 * n - 3;

}

and like this in Java and C#:

class ThingThatHoldsTheFunctionUsedInTheExampleOnThisPage {

    public static int f(int n) {

        return (n % 2 == 0) ? 3 * n + 1 : 4 * n - 3;

    }

}

and like this in Scala:

def f(n: Int) = if (n % 2 == 0) 3 * n + 1 else 4 * n - 3

and like this in Kotlin:

fun f(n: Int) = if (n % 2 == 0) 3 * n + 1 else 4 * n - 3

and like this in JavaScript:

function f(n) {

  return (n % 2 === 0) ? 3 * n + 1 : 4 * n - 3;

}

and like this in CoffeeScript:

f = (n) -> if n % 2 == 0 then 3 * n + 1 else 4 * n - 3

and like this in Smalltalk:

f

  ^self \\ 2 = 0 ifTrue:[3 * self + 1] ifFalse:[4 * self - 3]

and like this in Standard ML:

fun f n = if n mod 2 = 0 then 3 * n + 1 else 4 * n - 3

and like this in Elm:

f n = if modBy 2 n == 0 then 3 * n + 1 else 4 * n - 3

and like this in Haskell (thanks @kaftoot):

f n | even n = 3 * n + 1 | otherwise = 4 * n - 3

and like this in Julia (yes, 3n is “three times n”):

f(n) = iseven(n) ? 3n+1 : 4n-3

and like this in Lisp:

(defun f (n)

  (if (= (mod n 2) 0)

    (+ (* 3 n) 1)

    (- (* 4 n) 3)))

and like this in Clojure:

(defn f [n]

  (if (= (mod n 2) 0)

    (+ (* 3 n) 1)

    (- (* 4 n) 3)))

and like this in Prolog:

f(N, X) :- 0 is mod(N, 2), X is 3 * N + 1.

f(N, X) :- 1 is mod(N, 2), X is 4 * N - 3.

and like this in Erlang:

f(N) when (N band 1) == 0 -> 3 * N + 1;

f(N) -> 4 * N - 3.

and like this in Perl:

sub f {

    my $n = shift;

    $n % 2 == 0 ? 3 * $n + 1 : 4 * $n - 3;

}

and like this in Python:

def f(n):

    return 3 * n + 1 if n % 2 == 0 else 4 * n - 3

and like this in Ruby:

def f(n)

  n % 2 == 0 ? 3 * n + 1 : 4 * n - 3

end

and like this in Go:

func f(n int) int {

    if n % 2 == 0 {

        return 3 * n + 1

    } else {

        return 4 * n - 3

    }

}

and like this in Rust:

fn f(n: i32) -> i32 {

    if n % 2 == 0 { 3 * n + 1 } else { 4 * n - 3 }

}

and like this in Swift:

func f(n: Int) -> Int {

    return n % 2 == 0 ? 3 * n + 1 : 4 * n - 3

}

and like this in K:

f:{:[x!2;(4*x)-3;1+3*x]}

Exercise: Which of these languages required that variables or functions be declared with types and which did not?

Exercise: Implement this function in PHP, Objective C, Ceylon, D, and Mercury.

System Languages

System programming languages differ from application programming languages in that they are more concerned with managing a computer system than with solving general problems in health care, game playing, or finance. In a system language, the programmer, not the runtime system, is generally responsible for:

Memory management
Process management
Data transfer
Caches
Device drivers
Directly interfacing with the operating system

Scripting Languages

Scripting languages are used for wiring together systems and applications at a very high level. They are almost always extremely expressive (they do a lot with very little code) and usually dynamic (meaning the compiler does very little, while the run-time system does almost everything).
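
As a rough illustration of this "wiring together" role, the Python sketch below launches another program, captures its output, and post-processes it in a few lines. The helper name count_lines_printed is made up for this example, and the child program is just a second Python interpreter so that the example is self-contained. In practice it could be any command-line tool.

```python
import subprocess
import sys


def count_lines_printed(code):
    """Run a child Python process and count the lines it writes to stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", code],  # the external program to run
        capture_output=True,           # collect stdout and stderr
        text=True,                     # decode the output as text
        check=True,                    # raise if the child exits with an error
    )
    return len(completed.stdout.splitlines())


print(count_lines_printed("for i in range(3): print(i)"))  # prints 3
```

Process creation, pipes, decoding, and error checking are all handled by the run-time library, which is what lets the script itself stay so short.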

Esoteric Languages

An esoteric language is one not intended to be taken seriously. Such languages can be jokes, near-minimalistic, or despotic (purposely obfuscated or non-deterministic).

See Wikipedia’s article on esoteric languages.

Exercise: Implement the function above in False, Brainfuck, Befunge, Malbolge, Kipple, and reMorse.

Ousterhout’s Dichotomy

John Ousterhout once claimed that programming languages roughly fall into two types, which he called scripting and system languages. You can read about this idea at Wikipedia. Then read this two-part article (Part 1, Part 2) on the dichotomy and on languages that seem to reject it.
