Three years of A Technophile’s Indulgence…

Back when I started writing articles for A Technophile’s Indulgence, I was a failing chemistry student whose interest in computers had often left me astray from what I was meant to be studying. While I had been writing articles for other audiences for a couple of years by that stage, A Technophile’s Indulgence was the first time when I set a consistent schedule (or at least mostly consistent – there have been some changes or omissions from the fortnightly schedule, often because of exams) for my writing. Nevertheless, I never would have guessed that I’d still be blogging on this site and consistently writing articles today.

Today, I’m a rather more academically successful student of Information Systems and Information Technology. The adjustment to a computer-related degree course wasn’t really all that unexpected; even when I started writing on this blog, I had expected that I would have to have some sort of contingency plan. What has been somewhat unexpected is the sharp jump in academic performance – from barely passing (or in the case of my first attempt at second year or my attempt at third year, not even then) to consistently high marks, some of them top of my class.

Strictly speaking, this isn’t the third anniversary of when this site was set up; that anniversary occurred about three months ago, commemorating the occasion when I started the site to host the first chapter of a short story I was then writing. However, having grown displeased with my progress on the story and indeed with the material I was writing, I removed the story and decided to repurpose the site as a repository for my first series of articles that I had written after finishing my exams in May 2009.

(Incidentally, that story, called Of Questionable Contact, shared several elements with my currently slowly-progressing short story, The Unexpected Satellite, including a focus on hard science fiction and reasonably realistic technology, the same setting in late 21st century space with the European Commonwealth Space Agency, and even the same main character. A short reference to the intended course of events in that story was included in the introduction to The Unexpected Satellite, which I will continue work on this summer.)

On an anniversary like this, I’m inclined to look at the statistics for my page views and take a look at which aspects of my writing have been most popular and which have been most prominent on search engines. The site statistics reveal that there are two articles which have a substantial proportion – in excess of 10% each – of my site views, with another series of articles with a rather more gradual progression. The graph below shows this phenomenon manifest with my top 10 articles:

ati-stats

Variable Types in C: Auto, Static, Register and External – A
Brief Guide
1820
Revolutionary Technology in Formula One: The Monocoque Chassis 1642
Unicomp Customizer UB434H6 – A Technological Review 593
Probing The Inaccuracies: Mecha 572
FROM THE ARCHIVE: Probing The Inaccuracies – The Automobile 539
Historical Operating Systems: Version 7 Unix 443
My first fortnight with the Raspberry Pi 400
Fundamentals of Binary Input/Output in C 398
Historical Operating Systems – AmigaOS 368
ARM vs. Intel: Clash of the Processing Titans 333

A common theme can quickly be established between most of the top 10 articles – they are all technical in nature, as opposed to the less technical side of the site manifest in my reviews of games and movies, along with my fictional writing. A few other common elements can be identified – with the exception of the post on “My first fortnight with the Raspberry Pi”, which was very much an introspective post, all of the articles are fairly heavy on the technical talk. This is especially true of the posts on C programming, which are straight-up tutorials on how to use various elements on the C programming language, and the Historical Operating Systems series, which I feel has never backed down on what makes an operating system memorable, including the technical aspects.

Most of the posts are also written in a very serious style, the notable exceptions in this list being the Probing The Inaccuracies articles which incorporate a ranty style which was intentionally designed to invoke responses when I was writing for other sources.

I’m somewhat surprised about the volume of page views for the first two posts on the list, particularly the first one. While the post on “Revolutionary Technology in Formula One: The Monocoque Chassis” was always intended as part of a series and I might have expected some interest based on that, I’m surprised that the subject of monocoque chassis layout doesn’t have more representation on Google, given its status as an imperative part of modern automotive design.

Similarly, I’m quite shocked at the page view count for “Variable Types in C: Auto, Static, Register and External – A Brief Guide”. Again, I would really have expected there to be more material present on Google for the subject. What’s more, the article was not originally written for A Technophile’s Indulgence – it was part of a series of articles written as tutorials back when I was studying C in college and found myself rather adept at it while others were struggling.

In any case, I think this shows that continuing along the lines of technical articles might not be a bad idea as I continue to write. Here’s to the three years passed – and perhaps to three or more years into the future!

Advertisements

Basics of Structures in C

With this post, we get to one of the most useful elements of C: Structures. While structures didn’t exist in some of the early versions of C, they were soon added in as the developers of C and the Unix operating system saw how useful they could be for extending the utility of their pre-existing programs, and indeed, developing programs outside of the field of system programming. The C data structure is, along with functions, one of the elements that gives it extra utility compared to the programming languages that came before, like FORTRAN and BASIC.

A structure is, speaking succinctly, a way of wrapping multiple variables up into one “package”. To demonstrate why you might want to do this, we could say that you’re designing a simple data processing application like a rudimentary stock control system for a small shop. This stock control system works on a number of different variables, such as char arrays for storing the names of the products, float variables for storing the cost and sell price of these products, and int variables for storing the number of products left in stock. Without structures, we have to declare all of these variables discretely, as illustrated below:

char *name[100];

int *number;

double *cost;

double *price;

The reason we have declared all of these as pointers is because we will want to declare arrays of these variables, but unless you know exactly how many product types are in your database, you won’t know how much memory to allocate, and so, you will have to use malloc(), calloc() or realloc() to create arrays of the appropriate size.

The problem we have here is that none of the elements of these arrays has any particular connection to the corresponding elements in the other arrays. Let’s say you wanted to sort the products into alphabetical order. You’d perform your sorting algorithm on the name[] array, which would work perfectly well, but if you forgot to appropriately change around the other three arrays, you’d end up with a situation where all of the arrays were knocked out of sync. This demonstrates how fragile this stock control system would be – and how inefficient, as you’d have to appropriately change the positions of three other arrays in addition to the name[] array!

What’s more, this particular stock control program wouldn’t work very well when it came to passing the elements into a function. In order to perform a stock reordering function, what we’d have to do is pass four separate variables into the function. Let’s say that we just wanted to reorder a single product, which might use a function with a prototype like this:

int reorder(char **name, int *number, double *cost, double *price);

Pretty messy, don’t you think? One of the preferable things about structures is that they allow you to send a single variable into this function. The function prototype can be rewritten in the following fashion:

int reorder(struct item *product);

This is a lot cleaner and easier to handle. There’s a single variable in the arguments of type “pointer to struct item“, which contains all of the discrete variables which we had before. I’ll explain more about pointers to structures later. First of all, we’re going to leave our stock control system for a few moments and look at a simpler structure declaration.

One useful application that C can and regularly has been used for is graphical manipulation. In a raster graphical format, such as BMP, JPEG, GIF or PNG, each pixel has an x and a y coordinate. In the light of what we’ve seen above with the lack of robustness with discrete variables, we’re going to want to create a “package” – or structure – to hold both components of a pixel location together. We’ll create a structure type for doing this:

struct pixel {
    int x;
    int y;
};

What we’ve just created is a structure declaration, which essentially says, “Let’s create a new type, called struct pixel, whose elements will be an int variable named x, and an int variable named y“. This structure declaration now allows us to create variables of type struct pixel in other parts of the program.

Once we have the structure declaration, we create an element of this structure type like this:

struct pixel pixel_1;

The element pixel_1 is now created, just as we’d create an int, char or float variable. It follows some of the same rules as any of these other variables, in that it fills a certain amount of space (in this case, as it contains two int variables, its size will be sizeof(2 * int)). However, because it contains more than one variable itself, changing the values of these internal variables involves a different step than we’ve taken before.

To do this, we introduce the structure member operator, “.” (a single full stop). Let’s illustrate how this works by setting the value of the x coordinate in pixel_1 to 150:

pixel_1.x = 150;

The structure member operator is very high in the order of precedence in C; it has a higher precedence than the pointer dereference operator, and equivalent precedence to brackets and square brackets.

We’ll put this all together in a simple program:

#include <stdio.h>

struct pixel {
    int x;
    int y;
};

int main(void)
{
    /* We'll create a variable of type struct pixel */
    struct pixel pixel_1;

    /* Set the elements of pixel_1 /*
    pixel1.x = 200;
    pixel1.y = 350;
    return 0;
}

The elements of a structure can be any legal variable, such as int, char, float and double. Notably, a structure, once it has been declared, is a legal variable. This means that structures are themselves legal elements of a structure – as long as they are structures of a different type. Extending our metaphor of a graphics program, we can use these so-called nested structures to create a representation of a rectangle. Our simple representation of a rectangle will consist of two pixels representing the origin and the diagonally opposite corner:

struct rect {
    struct pixel pt_1;
    struct pixel pt_2;
};

Indeed, the dimensions of a standard computer screen can be represented by a rectangular structure. We will illustrate how we would use our nested structure here for a 1920×1200 screen:

struct rect screen;

screen.pt_1.x = 0;
screen.pt_1.y = 0;
screen.pt_2.x = 1919;
screen.pt_2.y = 1199;

Unfortunately, without graphical functions to act upon these structures, we can’t do much with our pixel and rectangle structures. We return, therefore, to the stock control program described earlier. Having seen some simple structures, we can now design a structure which will represent a data entry.

struct item {
  char name[100];
  int number;
  float cost;
  float price;
};

This structure contains the variables for which we would have had to declare separate arrays in the previous program. In using a structure, we have packaged all of these variables together, giving them a mutual association. We’ll demonstrate how this may be used, given a database containing the following, which we can save to a file:

Bread 12 1.65 2.25
Milk 23 1.40 1.80
Eggs 20 1.60 2.15
Cheese 11 1.85 2.50
Bacon 6 2.65 3.15

Now, we write a program to take these values in and to print them out. This program will take a single additional command-line argument, which will be the file name of the database that we defined above:

#include <stdio.h>

struct item {
    char name[100];
    int number;
    float cost;
    float price;
};

int main(int argc, char *argv[])
{
    struct item item_array[5];
    FILE *ifp;
    int i;

    if (argc != 2) {
	printf("Usage: basic_database <filename>\n");
	return -1;
    } else if ((ifp = fopen(argv[1], "r")) == NULL) {
	printf("Error: Could not open file %s\n", argv[1]);
	return -2;
    } else {
	for (i = 0; i <= 5; i++) {
	    fscanf(ifp, "%s%d%f%f", item_array[i].name,
	    &item_array[i].number,
	    &item_array[i].price, &item_array[i].cost);
	}
	fclose(ifp);
    }

    for (i = 0; i <= 4; i++) {
	printf("%s %5d %8.2f %8.2f\n", item_array[i].name, item_array[i].number,
	item_array[i].price, item_array[i].cost);
    }
    return 0;
}

When we call it with our data from before, we get the following result:

Bread 12 1.65 2.25 
Milk 23 1.40 1.80 
Eggs 20 1.60 2.15 
Cheese 11 1.85 2.50 
Bacon 6 2.65 3.15

So, we’ve defined the framework for a simple database. We could do things with these values, such as searching through them to find the item with the highest price for the consumer:

float highest_price = 0;
int highest_price_pos = 0;

for (i = 0; i < 5; i++) {
    if (item_array[i].price > highest_price) {
	highest_price = item_array[i].price;
	highest_price_pos = i;
    }
}

A problem arises, though, when we look at this in greater detail. What happens if the database – as is very likely – contains more than five items? We don’t want to have to recompile our program every time we get a new item to sell, so we do what we would do if we required additional variables of other types: We use dynamic memory allocation. In order to do this, we will require a pointer to our structure type, but before I change our program, we need to discuss pointers to structures.

Pointers to structures operate similarly to pointers to other variables, in that they store memory addresses to structures. They are also declared similarly, with the following example being a pointer to type struct item:

struct item *item_pointer;

However, the order of precedence in C creates a bit of a problem when it comes to using the traditional dereference operator to access the contents of a pointer to a structure. The traditional syntax for accessing data in the elements of a pointer to a structure looks peculiar and confusing:

item_pointer = item_array[3];

(*item_pointer).price = 2.50;

Note that we have to use brackets around the dereference operator and the thing it’s dereferencing. The order of precedence for the structure member operator is higher than that of the dereference operator, and therefore we have to use brackets, which have the same precedence as the structure member operator, in order to allow the pointer to access its memory address before the structure member is called.

Luckily, there is a separate operator used specifically for allowing a pointer to a structure to access its elements without having the potential source of confusion caused by the dereference operator in this circumstance. The structure pointer operator, -> (a minus sign, followed by a greater-than sign), has the same precedence as the structure member operator, and reduces the ambiguity existing with the traditional syntax.

Having discussed that, we can modify our program from before to dynamically allocate memory based on the number of elements contained in the list. We will use malloc() to initially allocate memory for a single element of type struct item *, then use fscanf() to read in the values into the elements of this structure. For each line that fscanf() doesn’t read in the EOF character, the program increases the size of the dynamically allocated memory by the size of the struct item type using the realloc() function. Then, the elements of the structure array are read out as above.

#include <stdio.h>
#include <stdlib.h>

struct item {
    char name[100];
    int number;
    float cost;
    float price;
};

int main(int argc, char *argv[])
{
    struct item *item_array;
    FILE *ifp;
    int i;
    int array_elements = 0;
    int el_size;

    /* Store the size of a single element */
    el_size = sizeof(struct item);

    /* Start off by declaring a single element */
    item_array = (struct item *) malloc (el_size);

    if (argc != 2) {
	printf("Usage: basic_database <filename>\n");
	return -1;
    } else if ((ifp = fopen(argv[1], "r")) == NULL) {
	printf("Error: Could not open file %s\n", argv[1]);
	return -2;
    } else {
	/* Keep going until fscanf hits the EOF character */
	while (fscanf(ifp, "%s%d%f%f", item_array[array_elements].name,
		      &item_array[array_elements].number,
		      &item_array[array_elements].cost,
		      &item_array[array_elements].price) != EOF) {
	    /* Reallocate memory, then increment the elements counter */
	    /* NOTE: We've got a bodge in here. The memory allocated will
	     * always be one element more than we need. */
	    item_array = (struct item *)
		realloc(item_array, (el_size * ++array_elements) + el_size);
	    if (item_array == NULL) {
		printf("Error: Cannot allocate memory\n");
		return -1;
	    }
	}
	fclose(ifp);
    }
    for (i = 0; i < array_elements; i++) {
	printf("%s %d %.2f %.2f\n", item_array[i].name, item_array[i].number,
	       item_array[i].cost, item_array[i].price);
    }

    free(item_array);
    return 0;
}