Arrays, and other compound data types

In this section, we will compare C compound data types, like arrays, structs, and unions, with the Scheme compound types.

The creation and access of compound data in C is basically limited to arrays and structs. Arrays are one-dimensional sets where each element has the same data type. structs are groupings of simple data types and compound data types that have a pre-defined structure.

Since the C compiler must always know what type of data is in each variable, structs are always declared before use, such as in the following typedef.

typedef struct
  {
    int quot;                   /* Quotient.  */
    int rem;                    /* Remainder.  */
  } div_t;
  

In Scheme, the compound data types need not have any set structure. They can contain any combination of data in any order.

Because of the versatility of Scheme, one must be careful when passing data back and forth from C to Guile. Guile can not pass compound data structures to C without Guile and C agreeing on their format beforehand.

This chapter will have a discussion of the Guile compound data types, and then show how data can be passed back and forth between C and Guile.

Note that while passing small structs or vectors back and forth between the two is fairly efficient, it is poor form to have code that passes large data structures. This would result in the structures appearing in memory both in the C heap and in Guile-administered memory. For large data, it is best to operate on the data exclusively in C or Guile. For more about that, see the chapter FIXME or SMOBs.

Guile's Compound Data Types

pair

pairs are data containers that hold exactly two elements. The first element is the car and the second element is the cdr. The elements may be any type of Guile data, be it simple or compound.

In practice, a pair is low-level type, and is normally used in the construction of the more commonly used higher-level types, such as lists.

list

The list is Scheme's most common and most important compound data type. It is a one-dimensional vector where each element may be any type of Guile data.

The list is constructed to be as malleable as possible. It can be grown or shrunk, appended, spliced, and chopped. This versatlity comes at a cost, however. It takes longer to operate on elements at the later in the list that those at the beginning.

The elements of a list may be accessed like a C array, or by application of the car and cdr commands. When a list is accessed like a C array, it is zero-indexed, where the first element of the list is element zero, and the last element of a list of length N is element N - 1.

Alternately one can use the car and cdr functions. The car of a list is its first element. The cdr of a list is all the elements except the first one.

By extension, the second element of a list is the car of the cdr of the list, which is called the cadr of the list. The third element of a list is the car of the cdr of the cdr of the list, which is called the caddr of the list.

Everything but the first element is the cdr of the list. Everything but the first two elements is the cdr of the cdr of the list, which is the cddr of the list. Everything but the first three elements is the cdddr, and so on.

vector

The vector is also a one-dimensional collection where each element may be any type of Guile data. Unlike lists, the vector has a set number of elements, and may not grow or shrink.

The vector doesn't suffer from the list problem where it takes longer to reference later elements. It takes an equal amount of time to operate the elements of an vector no matter where they are in the collection.

Vectors are zero-indexed, just like C arrays.

Conventional Arrays

The Guile array is much like the vector in that it is a fixed-size container of objects, except that arrays can be multidimensional.

By default, conventional arrays are zero-indexed, but, they can be declared to have other integer indexing.

Uniform Arrays

Uniform arrays are a special type of array where each element of the collection must have the same type. Because it is not necessary to store information about the type of each element in the collection, uniform arrays take up less memory that conventional arrays.

By default, uniform arrays are zero-indexed, but, they can be defined so that each dimension has a different indexing scheme.

Association Lists

If a vector is a collection where each element is referenced with an index, an association list is a collection where each element is referenced with a string. Association lists, or alists collect data into "keys" and "values", where each value has a key that can be used to address it.

alists are really just lists where each element is a pair. The first element of the pair is the key and the second element of the pair is the value.

Even though they are not truly a separate data type, they are mentioned specifically because there are Guile functions to search and operate on alists.