This is the tenth installment in a series of articles introducing the Ceylon language. Note that some features of the language may change before the final release.

An overview of the language module

The module ceylon.language contains classes and interfaces that are referred to in the language specification, other declarations they refer to, and a number of related declarations. Let's meet the main characters.

Just like Java, Ceylon has a class named Object.

shared abstract class Object() 
        extends Void() {
        
    doc "A developer-friendly string representing the instance."
    shared formal String string;
    
    doc "Determine if this object belongs to the given Category
         or is produced by the iterator of the given Iterable
         object."
    shared Boolean element(Category|Iterable<Equality> category) {
        switch (category)
        case (is Category) {
            return category.contains(this);
        }
        case (is Iterable<Equality>) {
            if (is Equality self = this) {
                for (Equality x in category) {
                    if (x==self) {
                        return true;
                    }
                }
                fail {
                    return false;
                }
            }
            else {
                return false;
            }
        }
    }
}

In Ceylon, Object isn't the root of the type system. An expression of type Object has a definite, well-defined, non-null value. As we've seen, the Ceylon type system can also represent some more exotic types, for example Nothing, which is the type of null.

Therefore, Ceylon's Object has a superclass, named Void, which we already met in Part 1. All Ceylon types are assignable to Void. Expressions of type Void aren't useful for very much, since Void has no members or operations. You can't even narrow an expression of type Void to a different type. The one useful thing you can do with Void is use it to represent the signature of a method when you don't care about the return type, since a method declared void is considered to have return type Void, as we saw in Part 8.

As we also saw in Part 1, the type Nothing directly extends Void. All types that represent well-defined values extend Object, including:

  • user-written classes,
  • all interfaces, and
  • the types that are considered primitive in Java, such as Integer, Float and Character.

Since an expression of type Object always evaluates to a definite, well-defined value, it's possible to obtain the runtime type of an Object, or narrow an expression of type Object to a more specific type.

Equality and identity

On the other hand, since Object is a supertype of types like Float which are passed by value at the level of the Java Virtual Machine, you can't use the === operator to test the identity of two values of type Object. Instead, there is a subclass of Object, named IdentifiableObject, which represents a type which is always passed by reference. The === operator accepts expressions of type IdentifiableObject. It's possible for a user-written class to directly extend Object, but most of the classes you write will be subclasses of IdentifiableObject. All classes with variable attributes must extend IdentifiableObject.

shared abstract class IdentifiableObject() 
        extends Object() 
        satisfies Equality {

    shared default actual Boolean equals(Equality that) {
        if (is IdentifiableObject that) {
            return this===that;
        }
        else {
            return false;
        }
    }
    
    shared default actual Integer hash {
        return identityHash(this);
    }
    
    shared default actual String string {
        ...
    }
        
}

IdentifiableObject defines a default implementation of the interface Equality, which is very similar to the equals() and hashCode() methods defined by java.lang.Object.

shared interface Equality {
    
    shared formal Boolean equals(Equality that);
    
    shared formal Integer hash;
    
}

Just like in Java, you can refine this default implementation in your own classes. This is the normal way to customize the behavior of the == operator. The only constraint is that, for subtypes of IdentifiableObject, x===y should imply x==y: equality should be consistent with identity.
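
As an analogy outside Ceylon, here is a minimal Python sketch of the same contract. Python's is operator stands in for ===, and == calls the __eq__ method. The classes Handle and Point are made up for illustration; they are not part of any Ceylon or Python library.

```python
class Handle:
    """Keeps the inherited default: two handles are equal only if identical."""

    def __init__(self, name):
        self.name = name


class Point:
    """Refines equality to compare by value, with a hash to match."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Equal points must produce equal hashes
        return hash((self.x, self.y))


a, b = Handle("h"), Handle("h")
p, q = Point(1, 2), Point(1, 2)

# Default equality is identity: a equals itself but not its look-alike
default_is_identity = (a == a) and not (a == b)

# Refined equality: distinct instances with the same coordinates are equal
value_equality = (p == q) and (p is not q)

# The constraint from the text: identity implies equality
identity_implies_equality = (p is p) and (p == p)
```

The refined class still satisfies the constraint, because comparing an instance with itself compares identical coordinates.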

Occasionally that's not what we want. For example, for numeric types, I don't care whether a value is of class Natural, Integer, or Whole when comparing it to 0. Fortunately, numeric types extend Object directly, and are not subject to the additional constraints defined by IdentifiableObject.
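
Python happens to treat its own numeric classes in a comparable way, which makes for a quick illustration. This sketch uses only Python's standard int, float, Fraction and Decimal types; it says nothing about how Natural, Integer or Whole are implemented.

```python
from decimal import Decimal
from fractions import Fraction

zeros = [0, 0.0, Fraction(0), Decimal(0)]

# Every zero equals every other zero, whatever its class
all_equal = all(x == y for x in zeros for y in zeros)

# Equal numbers also hash alike, so they collapse to a single set element
one_key = len(set(zeros)) == 1

# Yet the int and the float are clearly different objects
distinct_objects = zeros[0] is not zeros[1]
```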

Thus, Ceylon is able to capture within the type system much of the behavior that Java introduces by fiat, through special-case rules in the language definition.

Operator polymorphism

Ceylon discourages the creation of intriguing executable ASCII art. Therefore, true operator overloading is not supported by the language. Instead, almost every operator (every one except the primitive ., (), is, and := operators) is considered a shortcut way of writing some more complex expression involving other operators and ordinary method calls. For example, the < operator is defined in terms of the interface Comparable<Other>, which we met in Part 5, and which has a method named smallerThan(), which is in turn defined in terms of another method named compare().

x<y

means, by definition,

x.smallerThan(y)

The equality operator == is defined in terms of the interface Equality, which has a method named equals().

x==y

means, by definition,

x.equals(y)

Therefore, it's easy to customize operators like < and == with specific behavior for our own classes, just by implementing or refining methods like compare() and equals(). Thus, we say that operators are polymorphic in Ceylon.
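
The same idea can be sketched in Python, where x<y is defined as a call to x.__lt__(y). The Version class below is hypothetical, and its compare() and smaller_than() methods merely mimic the roles of the Ceylon methods named above.

```python
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def compare(self, other):
        """Return -1, 0 or 1, playing the role of compare()."""
        mine = (self.major, self.minor)
        theirs = (other.major, other.minor)
        return (mine > theirs) - (mine < theirs)

    def smaller_than(self, other):
        # Defined in terms of compare(), as in the text
        return self.compare(other) < 0

    # Python's hooks for < and == simply delegate to the named methods
    def __lt__(self, other):
        return self.smaller_than(other)

    def __eq__(self, other):
        return isinstance(other, Version) and self.compare(other) == 0

    def __hash__(self):
        return hash((self.major, self.minor))


older = Version(1, 2)
newer = Version(1, 10)
is_older = older < newer  # evaluates older.smaller_than(newer)
```

Only compare() knows how versions are ordered; the operators pick up that behavior without any further work.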

Apart from Comparable and Equality, which provide the underlying definition of comparison and equality operators, the following interfaces are also important in the definition of Ceylon's polymorphic operators:

  • Summable supports the infix + operator,
  • Invertable supports the prefix + and - operators,
  • Numeric supports the other basic arithmetic operators,
  • Slots supports bitwise operators,
  • Comparable supports the comparison operators,
  • Correspondence and Sequence support indexing and subrange operators, and
  • Boolean is the basis of the logical operators.

Operator polymorphism is a little more flexible than you might imagine. Here's a quick example of this.

The Slots interface

The interface Slots is an abstraction of the idea of a set of slots which may each hold true or false. The bitwise operators &, |, and ~ are defined in terms of this interface. The most obvious subtype of Slots would be a Byte class, where the slots are the eight binary digits.

But the interface Set from the collections module also extends Slots. The slots of a Set are values which may or may not belong to the set. A slot holds true if the value it represents belongs to the Set. The practical value of this is to allow the use of the operator | for set union, the operator & for set intersection, and the infix ~ operator for set complement.

Set<Person> children = males|females ~ adults;

Yes, I realize that these aren't the traditional symbols representing these operations. But if you think carefully about the definition of these operations, I'm pretty sure you'll agree that these symbols are reasonable.

We could even define a Permission class that implements Slots, allowing us to write things like permissions&(read|execute).
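
Python takes a similar view of sets and flags, so a rough sketch is possible there. Note the assumptions: Python writes set difference as - rather than ~, the children example is read as "males or females, minus adults", and the Permission class is a made-up enum.Flag, not the Ceylon Slots interface.

```python
from enum import Flag, auto

males = {"Tom", "Bob"}
females = {"Ann", "Sue"}
adults = {"Bob", "Sue"}

# Union with |, then remove the adults with -
children = (males | females) - adults


class Permission(Flag):
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()


permissions = Permission.READ | Permission.WRITE

# Which of read/execute does this permission set actually include?
granted = permissions & (Permission.READ | Permission.EXECUTE)
```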

Numeric types

As we've mentioned several times before, Ceylon doesn't have anything like Java's primitive types. The types that represent numeric values are just ordinary classes. Ceylon has fewer built-in numeric types than other C-like languages:

  • Natural represents the unsigned integers, including zero,
  • Integer represents signed integers,
  • Float represents floating point approximations to the real numbers,
  • Whole represents arbitrary-precision signed integers, and
  • Decimal represents arbitrary-precision and arbitrary-scale decimals.

Natural, Integer and Float have 64-bit precision by default. Eventually, you'll be able to specify that a value has 32-bit precision by annotating it small. But note that this annotation is really just a hint that the compiler is free to ignore (and it currently does).

Numeric literals

There are only two kinds of numeric literals: literals for Naturals, and literals for Floats:

Natural one = 1;
Float oneHundredth = 0.01;
Float oneMillion = 1.0E+6;

The digits of a numeric literal may be grouped using underscores. If the digits are grouped, then groups must contain exactly three digits.

Natural twoMillionAndOne = 2_000_001;
Float pi = 3.141_592_654;
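
The grouping rule for whole-number literals can be expressed as a small check. This Python sketch assumes, based on the 2_000_001 example, that the leading group may have one to three digits while every later group has exactly three; the function name is hypothetical and fractional digits are not handled.

```python
import re

# Assumption: leading group of 1-3 digits, then groups of exactly 3
GROUPED = re.compile(r"[0-9]{1,3}(_[0-9]{3})+")
PLAIN = re.compile(r"[0-9]+")


def valid_natural_literal(text):
    """Return True if text is an ungrouped or correctly grouped literal."""
    return bool(PLAIN.fullmatch(text) or GROUPED.fullmatch(text))
```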

Very large or small numeric literals may be qualified by one of the standard SI unit prefixes: m, u, n, p, f, k, M, G, T, P.

Float red = 390.0n; //n (nano) means E-9
Float galaxyDiameter = 900.0P; //P (peta) means E15
Float hydrogenRadius = 25.0p; //p (pico) means E-12
Float usGovDebt = 14.33T; //T (tera) means E12
Float brainCellSize = 4.0u; //u (micro) means E-6
Natural deathsUnderCommunism = 94M; //M (mega) means E6
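
To see what these suffixes amount to, here is a rough Python sketch that turns such a literal into its exact decimal value. The function literal_value is hypothetical and is not how the Ceylon compiler works; the exponents for m, f, k and G are the standard SI meanings, since the examples above only spell out the others.

```python
from decimal import Decimal

# Suffix -> power of ten
SI_EXPONENTS = {
    "m": -3, "u": -6, "n": -9, "p": -12, "f": -15,
    "k": 3, "M": 6, "G": 9, "T": 12, "P": 15,
}


def literal_value(text):
    """Return the exact decimal value of a literal such as '94M' or '390.0n'."""
    digits = text.replace("_", "")  # digit grouping carries no meaning
    exponent = 0
    if digits[-1] in SI_EXPONENTS:
        exponent = SI_EXPONENTS[digits[-1]]
        digits = digits[:-1]
    return Decimal(digits) * Decimal(10) ** exponent
```

For example, literal_value("94M") yields 94000000, and literal_value("390.0n") yields the same value as 390.0E-9.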

Numeric widening

I mentioned earlier that Ceylon doesn't have implicit type conversions, not even built-in conversions for numeric types. Assignment does not automatically widen (or narrow) numeric values. Instead, we need to call one of the operations (well, attributes, actually) defined by the interface Number.

Whole zero = 0.whole; //explicitly widen from Natural
Decimal half = 0.5.decimal; //explicitly widen from Float
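
Python's decimal module behaves in a loosely similar way, which gives a feel for what explicit widening looks like in practice. This is only an analogy: Python has no static assignment types, so the sketch shows an arithmetic expression refusing to mix a Decimal with a float until the conversion is written out.

```python
from decimal import Decimal

approx = 0.5  # a binary float

try:
    Decimal(1) + approx
    implicit_mixing_allowed = True
except TypeError:
    # Decimal will not silently combine with a float
    implicit_mixing_allowed = False

# The conversion has to be spelled out, much like 0.5.decimal
half = Decimal(approx)
total = Decimal(1) + half
```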

Usefully, the unary prefix operators + and - always widen Natural to Integer:

Integer negativeOne = -1;
Integer three = +3;

You can use all the operators you're used to from other C-style languages with the numeric types. You can also use the ** operator to raise a number to a power:

Float diagonal = (length**2.0+width**2.0)**0.5;

Of course, if you want to use the increment ++ operator, decrement -- operator, or one of the compound assignment operators such as +=, you'll have to declare the value variable.

Since it's quite noisy to explicitly perform numeric widening in numeric expressions, the numeric operators automatically widen their operands, so we could write the expression above like this:

Float diagonal = (length**2+width**2)**(1.0/2);

The built-in widening conversions are the following:

  • Natural to Integer, Float, Whole, or Decimal
  • Integer to Float, Whole, or Decimal
  • Float to Decimal
  • Whole to Decimal

But these conversions aren't defined by special-case rules in the language specification.

Numeric operator semantics

Operators in Ceylon are, in principle, just abbreviations for some expression involving a method call. So the numeric types all implement the Numeric interface, refining the methods plus(), minus(), times(), divided() and power(), and the Invertable interface, refining inverse. The numeric operators are defined in terms of these methods of Numeric. The numeric types also implement the interface Castable, which enables the widening conversions we just mentioned.

shared interface Castable<in Types> {
    shared formal CastValue as<CastValue>()
        given CastValue satisfies Types;
}

The type parameter Types uses a special trick. The argument to Types should be the union of all types to which the implementing type is castable.

For example, simplifying slightly the definitions in the language module:

shared class Natural(...)
        extends Object()
        satisfies Castable<Natural|Integer|Float|Whole|Decimal> &
                  Numeric<Natural> &
                  Invertable<Integer> {
    ...
}
shared class Integer(...)
        extends Object()
        satisfies Castable<Integer|Float|Whole|Decimal> &
                  Numeric<Integer> & 
                  Invertable<Integer> {
    ...
}
shared class Float(...)
        extends Object()
        satisfies Castable<Float|Decimal> &
                  Numeric<Float> & 
                  Invertable<Float> {
    ...
}

These declarations tell us that Integer can be widened to Float, Whole, or Decimal, but that Float can only be widened to Decimal. So we can infer that the expression -1 * 0.4 is of type Float.
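
That inference can be mimicked with a small lookup table. The Python sketch below encodes the widening conversions listed earlier and picks whichever operand type the other operand can be cast to. The names WIDENS_TO, castable and result_type are hypothetical, and returning None for an impossible combination is a choice made for this sketch.

```python
# Direct widening conversions, as listed in the text
WIDENS_TO = {
    "Natural": {"Integer", "Float", "Whole", "Decimal"},
    "Integer": {"Float", "Whole", "Decimal"},
    "Float": {"Decimal"},
    "Whole": {"Decimal"},
    "Decimal": set(),
}


def castable(source, target):
    """Every type is castable to itself and to the types it widens to."""
    return source == target or target in WIDENS_TO[source]


def result_type(left, right):
    """Type of a binary numeric expression, or None if neither side fits."""
    if castable(left, right):
        return right
    if castable(right, left):
        return left
    return None


# -1 is an Integer and 0.4 is a Float, so the product is a Float
inferred = result_type("Integer", "Float")
```

Because the result must be one of the two operand types, result_type("Float", "Whole") gives None: neither type can be widened to the other.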

Therefore, the definition of a numeric operator like * can be represented, completely within the type system, in terms of Numeric and Castable:

Result product<Left,Right,Result>(Left x, Right y)
        given Result of Left|Right satisfies Numeric<Result>
        given Left satisfies Castable<Result> & Numeric<Left>
        given Right satisfies Castable<Result> & Numeric<Right> {
    return x.as<Result>().times(y.as<Result>());
}
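
Here is a loose runtime imitation of product() in Python, for readers who want to trace the steps. Python has no compile-time type parameters, so the sketch tries each operand type in turn as the result type. The Integer and Float classes are toy stand-ins with only the members needed here, and cast() plays the role of as(), which is a reserved word in Python.

```python
class Float:
    def __init__(self, value):
        self.value = float(value)

    def cast(self, target):
        if target is Float:
            return self
        raise TypeError("Float is not castable to " + target.__name__)

    def times(self, other):
        return Float(self.value * other.value)


class Integer:
    def __init__(self, value):
        self.value = int(value)

    def cast(self, target):
        if target is Integer:
            return self
        if target is Float:
            return Float(self.value)
        raise TypeError("Integer is not castable to " + target.__name__)

    def times(self, other):
        return Integer(self.value * other.value)


def product(x, y):
    """Cast both operands to one of their two types, then multiply."""
    for result in (type(x), type(y)):
        try:
            return x.cast(result).times(y.cast(result))
        except TypeError:
            continue  # this candidate result type does not fit both operands
    raise TypeError("operands have no common numeric type")


mixed = product(Integer(-1), Float(0.4))
```

Here mixed is a Float, since Integer can be cast to Float but not the other way around.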

Don't worry too much about the performance implications of all this — in practice, the compiler is permitted to optimize the types Natural, Integer, and Float down to the virtual machine's native numeric types.

The value of all this — apart from eliminating special cases in the language definition and type checker — is that a library can define its own specialized numeric types, without losing any of the nice language-level syntax support for numeric arithmetic and numeric widening conversions.

There's more...

If you're interested, you can check out a complete list of Ceylon's operators along with a discussion of their precedence.

In Part 11 we're going to come back to the subject of object initialization, and deal with a subtle problem affecting languages like Java and C#.

26 comments:
 
05. May 2011, 05:51 CET | Link

Here's a quote from the language specification that explains the design constraints we've operated under with respect to numeric types:

Ceylon's numeric type system is designed to accommodate the following considerations:
  • The system is extensible and new numeric types may be defined by libraries. Numeric types defined by libraries should work the same as built-in numeric types.
  • Two values of different numeric types should be combinable using operators without the need for explicit type conversions, but
  • implicit type conversions should never lose information and
  • binary operators must be symmetric.
  • Furthermore, there should be no special cases in the language definition or type checker to accommodate numeric types (other than the definitions of the numeric operators).
  • However, it must be possible for the compiler to optimize certain of the built-in numeric types to take advantage of the VM's native support for numeric types.
  • It is not necessary to encode information about the limits of the machine's representation of a numeric value into the type system. A single Integer class is sufficient to represent 32, 64, or 128 bit integer values. A single Float class is sufficient to represent single and double precision floating point values. It is simply not possible for the compiler to detect numeric overflow or loss of significant precision.

Of all the problems we've come across in designing the language, this was for me personally one of the most interesting.

 
05. May 2011, 08:57 CET | Link
Thorsten

Thanks for your series of articles about Ceylon. It is much appreciated!

A couple of comments:

for subtypes of IdentifiableObject, x==y should imply x===y

Shouldn't this be the other way around?

----

Very large or small numeric literals may be qualified by one of the standard SI unit prefixes: m, u, n, p, f, k, M, G, T, P.

Nice!

----

Integer vs. Whole: I'd prefer Integer to be of arbitrary-precision and have Int64 and Int32 for restricted integers (making clear their ranges).

----

Furthermore I'd suggest to drop the prefix small, because it is just a hint which can be ignored. That is not really useful (and has haunted C for a long time).

----

N product<X,Y,N>(X x, Y y)
        given N of X|Y satisfies Numeric<N>
        given X satisfies Castable<N> & Numeric<X>
        given Y satisfies Castable<N> & Numeric<Y> {
    return x.as<N>().times(y.as<N>());
}

That's a good idea which I first saw nicely done in the programming language Cecil by Craig Chambers (very nice type system with multiple dispatch, Cecil Publications).

But shouldn't this read

N abstracts X|Y
instead of
N of X|Y
?

 
05. May 2011, 10:34 CET | Link
Mark
Whole zero = 0.whole; //explicitly widen from Natural

Integer three = +3;

I'd prefer a C like default for signed (they are the most usable default, as in Java) and use 0U (suffix) for Natural / unsigned.

But this syntax kinda conflicts (u/U confusion) with ISO prefixes which are IMO pointless and I'd just remove them (especially since the common ones will result in the 1000/1024 confusion)

 
05. May 2011, 12:26 CET | Link
for subtypes of IdentifiableObject, x==y should imply x===y
Shouldn't this be the other way around?

Yes, of course, thanks. Fixed.

Furthermore I'd suggest to drop the prefix small, because it is just a hint which can be ignored. That is not really useful (and has haunted C for a long time).

The thing is, I really think optimizations like precision shouldn't be modeled within the numeric types. It's just not an issue of typing.

That's a good idea which I first saw nicely done in the programming language Cecil by Craig Chambers (very nice type system with multiple dispatch, Cecil Publications).

I will check it out, I had never seen this approach taken elsewhere.

But shouldn't this read
N abstracts X|Y
instead of
N of X|Y
?

It's an enumerated type bound. The expression type N has to be exactly one of the operand types X, Y.

 
05. May 2011, 12:32 CET | Link
I'd prefer a C like default for signed (they are the most usable default, as in Java) and use 0U (suffix) for Natural / unsigned.

The default of Natural is definitely going to cause some controversy. My take is that use of unsigned values in Ceylon is a lot more common than in most other languages because the sequence index type is Natural.

This is something we will need a bit more practical experience with to know if it's truly the right decision or not.

(One possibility would be to try making Natural a subtype of Integer.)

 
05. May 2011, 12:40 CET | Link
Gavin King wrote on May 05, 2011 06:32:
I'd prefer a C like default for signed (they are the most usable default, as in Java) and use 0U (suffix) for Natural / unsigned.
The default of Natural is definitely going to cause some controversy. My take is that use of unsigned values in Ceylon is a lot more common than in most other languages because the sequence index type is Natural.

And unsigned values simply occur very often in the real world. Including anytime you count stuff.

 
05. May 2011, 13:39 CET | Link
Thorsten
Gavin King wrote on May 05, 2011 06:26:
Furthermore I'd suggest to drop the prefix small, because it is just a hint which can be ignored. That is not really useful (and has haunted C for a long time).
The thing is, I really think optimizations like precision shouldn't be modeled within the numeric types. It's just not an issue of typing.

You are right. Instead there should probably be specialized types outside the numeric hierarchy like Byte, Word, Word32 or Word64 (or whatever) for working with the bits in a low-level fashion (where I need to specify the word width).

But shouldn't this read
N abstracts X|Y
instead of
N of X|Y
?
It's an enumerated type bound. The expression type N has to be exactly one of the operand types X, Y.

Ah, I see. Of course. Thanks for clearing that up.

 
05. May 2011, 15:47 CET | Link
Sergej Koščejev

Are hexadecimal literals allowed?

 
05. May 2011, 16:31 CET | Link
Sergej Koščejev wrote on May 05, 2011 09:47:
Are hexadecimal literals allowed?

Not natively, but we're experimenting with the idea of a generic quoted literal syntax for handling things like dates, times, URLs, email addresses, hexadecimal numbers, regular expressions, etc. i.e. a generic way of embedding micro-languages. I'm not totally certain what's the best path to make this work yet, but there are a couple of possibilities. I'll discuss it in a future blog.

 
17. May 2011, 21:20 CET | Link
Gavin King wrote on May 05, 2011 10:31:
Not natively, but we're experimenting with the idea of a generic quoted literal syntax for handling things like dates, times, URLs, email addresses, hexadecimal numbers, regular expressions, etc. i.e. a generic way of embedding micro-languages. I'm not totally certain what's the best path to make this work yet, but there are a couple of possibilities. I'll discuss it in a future blog.

I would love to see support for more literals. One niche language that does this is REBOL, it has literals for times, dates, urls, file paths, binary data, and more: REBOL datatypes

Interestingly, Ceylon reminds me of REBOL a bit already, they also have the aim of being able to write code in a declarative style for GUIs and structured data, while remaining in a general purpose language.

 
18. May 2011, 20:52 CET | Link
And unsigned values simply occur very often in the real world. Including anytime you count stuff.

Yes, but as soon as you subtract (compare) your count, you can easily get a signed number.

So, unless the subtraction on unsigned returns signed, it's better to have signed as default.

 
19. May 2011, 17:06 CET | Link
Mark wrote on May 18, 2011 14:52:
And unsigned values simply occur very often in the real world. Including anytime you count stuff.
Yes, but as soon as you subtract (compare) your count, you can easily get a signed number.

Sure, but if you're going to be dealing with exceptions, wouldn't it be much better to get those exceptions earlier (i.e. where subtraction results in a negative integer) rather than later (i.e. where you try to index a sequence by a negative index)?

 
19. May 2011, 17:09 CET | Link
I would love to see support for more literals. One niche language that does this is REBOL

I have not looked at REBOL yet, but will do. What is definitely on the cards is a plugin literal syntax, where you can define your own literal format that will be parsed at compile time. But we haven't really properly figured out the details of how this would work yet, so I have not talked about it here.

Interestingly, Ceylon reminds me of REBOL a bit already, they also have the aim of being able to write code in a declarative style for GUIs and structured data, while remaining in a general purpose language.

Interesting. Yes, in this context, support for more kinds of literals definitely comes up.

 
18. Jul 2011, 13:33 CET | Link
Ivo
Thorsten wrote on May 05, 2011 07:39:
You are right. Instead there should probably be specialized types outside the numeric hierarchy like Byte, Word, Word32 or Word64 (or whatever) for working with the bits in a low-level fashion (where I need to specify the word width).

Yes, I think that is exactly right. There are always low-level applications where you need an efficient way to represent binary data. But it is a good idea to have separate data types for that, simply because the requirements are different (namely arithmetic on the one hand, and representing a specific binary pattern on the other hand).

 
31. Jul 2011, 08:27 CET | Link
woods

How does operator polymorphism work for non-numeric types? Mainly I'm thinking of types from linear algebra (matrices, vectors). A vector has additive (interface Summable) operations (over vectors) and multiplicative (interface Numeric, I guess?) operations (over numbers), but can this be expressed in Ceylon so that things like scaling a vector work (s*V = V*s, where s is a number and V a vector)?

 
15. Aug 2011, 14:47 CET | Link

How would you iterate over a complement of a Set? :D

Adam

 
19. Aug 2011, 14:49 CET | Link
Anon

I assume that the type Decimal will be implemented using something similar to java.math.BigDecimal. It would be better to delay the implementation of Decimal and wait for the JVM to support 128-bit values. It would not allow arbitrary precision, but at least it would be efficient on modern CPUs. If anyone needs arbitrary precision, then he or she can rely on BigDecimal.

 
21. Aug 2011, 22:39 CET | Link

I propose making the numeric type hierarchy more math-specific, using mathematical terms for algebraic structures such as groups, rings, fields, modules, vector spaces, and algebras.

It would make math packages more Ceylon-friendly and logical.

 
06. Sep 2011, 19:27 CET | Link
ГОСТ wrote on Aug 21, 2011 16:39:
I propose making the numeric type hierarchy more math-specific, using mathematical terms for algebraic structures such as groups, rings, fields, modules, vector spaces, and algebras.

That's certainly something to look into for the future.

 
08. Sep 2011, 02:33 CET | Link
Olivier Pernet

Is there precedent for the name Whole? It sounds a bit unclear to me. How about BigInteger?

 
08. Sep 2011, 18:57 CET | Link
Olivier Pernet wrote on Sep 07, 2011 20:33:
Is there precedent for the name Whole? It sounds a bit unclear to me. How about BigInteger?

As in whole number. The problem with BigInteger, apart from verbosity, is it's actually incorrect. A BigInteger may be big or small, depending.

 
09. Sep 2011, 00:04 CET | Link
Olivier Pernet
Gavin King wrote on Sep 08, 2011 12:57:
Olivier Pernet wrote on Sep 07, 2011 20:33:
Is there precedent for the name Whole? It sounds a bit unclear to me. How about BigInteger?
As in whole number. The problem with BigInteger, apart from verbosity, is it's actually incorrect. A BigInteger may be big or small, depending.

Ah, but according to the Wikipedia article, whole number can ambiguously refer to naturals, naturals without 0, or integers. I would venture to say that BigInteger, if not always accurate, will confuse people less.

Another possibility would be to make both Natural and Integer unbounded, and let the compiler and VM optimize them down to 32 or 64 bits. But the speed cost might be too high.

 
04. Oct 2011, 00:20 CET | Link
Gavin King wrote on Sep 06, 2011 13:27:
ГОСТ wrote on Aug 21, 2011 16:39:
I propose making the numeric type hierarchy more math-specific, using mathematical terms for algebraic structures such as groups, rings, fields, modules, vector spaces, and algebras.
That's certainly something to look into for the future.

This lib can help

 
04. Oct 2011, 00:25 CET | Link
ГОСТ wrote on Oct 03, 2011 18:20:
Gavin King wrote on Sep 06, 2011 13:27:
ГОСТ wrote on Aug 21, 2011 16:39:
I propose making the numeric type hierarchy more math-specific, using mathematical terms for algebraic structures such as groups, rings, fields, modules, vector spaces, and algebras.
That's certainly something to look into for the future.
This lib can help

This is much better

13. Nov 2011, 03:16 CET | Link

Hello,

One important property of datatypes is their immutability, which makes them suitable for reuse (low memory usage), sharing across threads (multithreading), use as keys in HashMaps (datatype consistency), etc. I have not seen anything about immutable types on your blog just yet. Do you have datatype immutability on the cards?

Also, another immutable type is important with interned Strings which can shrink memory usage quite a lot. Ruby and other languages have a nice syntax for interning. Do you plan to also have literal support for interned Strings? Or will you handle this with the pluggable literals API?

Thanks,

Djano

13. Nov 2011, 11:57 CET | Link
Quintesse

@Djano: I can tell you that immutable types are supported. Whenever a type inherits directly from Object instead of IdentifiableObject, it won't be allowed to have variable attributes. This is used, for example, in the numeric types and enumerations.
