Basics of Structures in C

With this post, we get to one of the most useful elements of C: structures. While structures didn’t exist in the earliest versions of C, they were soon added as the developers of C and the Unix operating system saw how useful they could be, both for extending their existing programs and for developing programs outside the field of system programming. Along with functions, structures are one of the features that give C extra utility compared to the programming languages that came before, like FORTRAN and BASIC.

A structure is, put succinctly, a way of wrapping multiple variables up into one “package”. To demonstrate why you might want to do this, suppose that you’re designing a simple data processing application like a rudimentary stock control system for a small shop. This stock control system works on a number of different variables, such as char arrays for storing the names of the products, floating-point variables for storing the cost and sell price of these products, and int variables for storing the number of products left in stock. Without structures, we have to declare all of these variables discretely, as illustrated below:

char **name;
int *number;
double *cost;
double *price;

We have declared all of these as pointers because we will want arrays of these variables. Unless you know exactly how many product types are in your database, you won’t know how much memory to allocate, so you will have to use malloc(), calloc() or realloc() to create arrays of the appropriate size.

The problem we have here is that none of the elements of these arrays has any particular connection to the corresponding elements in the other arrays. Let’s say you wanted to sort the products into alphabetical order. You’d perform your sorting algorithm on the name[] array, which would work perfectly well, but if you forgot to rearrange the other three arrays in the same way, all of the arrays would be knocked out of sync. This demonstrates how fragile this stock control system would be, and how inefficient, as you’d have to reorder three other arrays in addition to the name[] array!

What’s more, this particular stock control program wouldn’t work very well when it came to passing the elements into a function. In order to perform a stock reordering function, what we’d have to do is pass four separate variables into the function. Let’s say that we just wanted to reorder a single product, which might use a function with a prototype like this:

int reorder(char **name, int *number, double *cost, double *price);

Pretty messy, don’t you think? One of the advantages of structures is that they allow you to send a single variable into this function. The function prototype can be rewritten in the following fashion:

int reorder(struct item *product);

This is a lot cleaner and easier to handle. There’s a single argument of type “pointer to struct item”, which gives the function access to all of the discrete variables which we had before. I’ll explain more about pointers to structures later. First of all, we’re going to leave our stock control system for a few moments and look at a simpler structure declaration.

One application for which C can be, and regularly has been, used is graphical manipulation. In a raster graphical format, such as BMP, JPEG, GIF or PNG, each pixel has an x and a y coordinate. In light of the fragility we’ve seen above with discrete variables, we’re going to want to create a “package”, or structure, to hold both components of a pixel location together. We’ll create a structure type for doing this:

struct pixel {
    int x;
    int y;
};

What we’ve just created is a structure declaration, which essentially says, “Let’s create a new type, called struct pixel, whose elements will be an int variable named x, and an int variable named y”. This structure declaration now allows us to create variables of type struct pixel in other parts of the program.

Once we have the structure declaration, we create a variable of this structure type like this:

struct pixel pixel_1;

The variable pixel_1 is now created, just as we’d create an int, char or float variable. It follows some of the same rules as any of these other variables, in that it fills a certain amount of space (in this case, as it contains two int variables, its size will be at least 2 * sizeof(int); the compiler is allowed to add padding, so use sizeof(struct pixel) to get the exact figure). However, because it contains more than one variable itself, changing the values of these internal variables involves a different step than we’ve taken before.

To do this, we introduce the structure member operator, “.” (a single full stop). Let’s illustrate how this works by setting the value of the x coordinate in pixel_1 to 150:

pixel_1.x = 150;

The structure member operator is very high in the order of precedence in C; it has a higher precedence than the pointer dereference operator, and the same precedence as function-call brackets and array subscripts (square brackets).

We’ll put this all together in a simple program:

#include <stdio.h>

struct pixel {
    int x;
    int y;
};

int main(void)
{
    /* We'll create a variable of type struct pixel */
    struct pixel pixel_1;

    /* Set the elements of pixel_1 */
    pixel_1.x = 200;
    pixel_1.y = 350;
    return 0;
}
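
The program above sets the two elements but never reads them back. As a minimal sketch, the same member operator can be used to read values, and sizeof can report how much space the structure occupies. The helper names pixel_sum() and print_pixel() are made up for this illustration and aren’t part of any library:

```c
#include <stdio.h>

struct pixel {
    int x;
    int y;
};

/* Hypothetical helper: reads both members with the "." operator */
int pixel_sum(struct pixel p)
{
    return p.x + p.y;
}

/* Hypothetical helper: prints the members and the size of the structure */
void print_pixel(struct pixel p)
{
    printf("x = %d, y = %d\n", p.x, p.y);
    /* The size is at least that of two ints; padding may make it larger */
    printf("sizeof(struct pixel) = %lu bytes\n",
           (unsigned long) sizeof(struct pixel));
}
```

Calling print_pixel(pixel_1) after the assignments in main() would print x = 200, y = 350, followed by the size, which depends on the compiler and platform.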

The elements of a structure can be of any legal type, such as int, char, float and double. Notably, a structure type, once it has been declared, is a legal type. This means that structures are themselves legal elements of a structure, as long as they are structures of a different type (a structure cannot contain a variable of its own type). Extending our example of a graphics program, we can use these so-called nested structures to create a representation of a rectangle. Our simple representation of a rectangle will consist of two pixels representing the origin and the diagonally opposite corner:

struct rect {
    struct pixel pt_1;
    struct pixel pt_2;
};

Indeed, the dimensions of a standard computer screen can be represented by a rectangular structure. We will illustrate how we would use our nested structure here for a 1920×1200 screen:

struct rect screen;

screen.pt_1.x = 0;
screen.pt_1.y = 0;
screen.pt_2.x = 1919;
screen.pt_2.y = 1199;
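
Even without a graphics library, we can do simple arithmetic on nested structures. The following sketch assumes that both corners are inclusive, as in the screen example, and uses two made-up helper functions, rect_width() and rect_height(); structures can be passed to functions by value just like other variables:

```c
struct pixel {
    int x;
    int y;
};

struct rect {
    struct pixel pt_1;
    struct pixel pt_2;
};

/* Hypothetical helper: width in pixels, counting both corner columns */
int rect_width(struct rect r)
{
    return r.pt_2.x - r.pt_1.x + 1;
}

/* Hypothetical helper: height in pixels, counting both corner rows */
int rect_height(struct rect r)
{
    return r.pt_2.y - r.pt_1.y + 1;
}
```

With the screen variable set up as above, rect_width(screen) gives 1920 and rect_height(screen) gives 1200.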

Unfortunately, without graphical functions to act upon these structures, we can’t do much with our pixel and rectangle structures. We return, therefore, to the stock control program described earlier. Having seen some simple structures, we can now design a structure which will represent a data entry.

struct item {
  char name[100];
  int number;
  float cost;
  float price;
};

This structure contains the variables for which we would have had to declare separate arrays in the previous program. In using a structure, we have packaged all of these variables together, giving them a mutual association. We’ll demonstrate how this may be used, given a database containing the following, which we can save to a file:

Bread 12 1.65 2.25
Milk 23 1.40 1.80
Eggs 20 1.60 2.15
Cheese 11 1.85 2.50
Bacon 6 2.65 3.15

Now, we write a program to take these values in and to print them out. This program will take a single additional command-line argument, which will be the file name of the database that we defined above:

#include <stdio.h>

struct item {
    char name[100];
    int number;
    float cost;
    float price;
};

int main(int argc, char *argv[])
{
    struct item item_array[5];
    FILE *ifp;
    int i;

    if (argc != 2) {
	printf("Usage: basic_database <filename>\n");
	return -1;
    } else if ((ifp = fopen(argv[1], "r")) == NULL) {
	printf("Error: Could not open file %s\n", argv[1]);
	return -2;
    } else {
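	for (i = 0; i < 5; i++) {
	    /* Fields appear in the file as: name, number, cost, price */
	    if (fscanf(ifp, "%99s%d%f%f", item_array[i].name,
		       &item_array[i].number,
		       &item_array[i].cost, &item_array[i].price) != 4) {
		printf("Error: Could not read item %d\n", i + 1);
		fclose(ifp);
		return -3;
	    }
	}
	fclose(ifp);
    }

    for (i = 0; i < 5; i++) {
	printf("%s %5d %8.2f %8.2f\n", item_array[i].name, item_array[i].number,
	item_array[i].cost, item_array[i].price);
    }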
    return 0;
}

When we call it with our data from before, we get the following result:

Bread    12     1.65     2.25
Milk    23     1.40     1.80
Eggs    20     1.60     2.15
Cheese    11     1.85     2.50
Bacon     6     2.65     3.15

So, we’ve defined the framework for a simple database. We could do things with these values, such as searching through them to find the item with the highest price for the consumer:

float highest_price = 0;
int highest_price_pos = 0;

for (i = 0; i < 5; i++) {
    if (item_array[i].price > highest_price) {
	highest_price = item_array[i].price;
	highest_price_pos = i;
    }
}
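
The same array also solves the sorting problem described at the start of this post. Because each element of item_array is a whole structure, swapping two elements moves the name, number, cost and price together. The sketch below uses a simple selection sort and a made-up function name, sort_by_name(); it isn’t the only way to do this (the standard library’s qsort() would also work), just one that needs nothing beyond what we’ve covered so far:

```c
#include <string.h>

struct item {
    char name[100];
    int number;
    float cost;
    float price;
};

/* Hypothetical helper: selection sort into alphabetical order by name */
void sort_by_name(struct item items[], int count)
{
    int i, j, min;
    struct item tmp;

    for (i = 0; i < count - 1; i++) {
        /* Find the alphabetically smallest name in the unsorted part */
        min = i;
        for (j = i + 1; j < count; j++) {
            if (strcmp(items[j].name, items[min].name) < 0) {
                min = j;
            }
        }
        if (min != i) {
            /* Structure assignment copies every member at once */
            tmp = items[i];
            items[i] = items[min];
            items[min] = tmp;
        }
    }
}
```

Calling sort_by_name(item_array, 5) on the data above would leave the items in the order Bacon, Bread, Cheese, Eggs, Milk, each with its own stock count and prices still attached.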

A problem arises, though, when we look at this in greater detail. What happens if the database – as is very likely – contains more than five items? We don’t want to have to recompile our program every time we get a new item to sell, so we do what we would do if we required additional variables of other types: We use dynamic memory allocation. In order to do this, we will require a pointer to our structure type, but before I change our program, we need to discuss pointers to structures.

Pointers to structures operate similarly to pointers to other variables, in that they store the memory addresses of structures. They are also declared similarly, with the following example being a pointer to type struct item:

struct item *item_pointer;

However, the order of precedence in C creates a bit of a problem when it comes to using the traditional dereference operator to access the contents of a pointer to a structure. The traditional syntax for accessing data in the elements of a pointer to a structure looks peculiar and confusing:

item_pointer = &item_array[3];

(*item_pointer).price = 2.50;

Note that we have to use brackets around the dereference operator and the thing it’s dereferencing. The order of precedence for the structure member operator is higher than that of the dereference operator, so without the brackets the expression would be read as *(item_pointer.price), which is an error because item_pointer is a pointer, not a structure. The brackets force the pointer to be dereferenced first, and only then is the structure member accessed.

Luckily, there is a separate operator used specifically for allowing a pointer to a structure to access its elements without having the potential source of confusion caused by the dereference operator in this circumstance. The structure pointer operator, -> (a minus sign, followed by a greater-than sign), has the same precedence as the structure member operator, and reduces the ambiguity existing with the traditional syntax.
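
As a short sketch of the structure pointer operator in use, here is one possible body for the reorder() prototype shown earlier. What the function actually does is an assumption made for illustration: the REORDER_LEVEL and REORDER_AMOUNT values are invented, and the function simply tops up the stock when it runs low.

```c
struct item {
    char name[100];
    int number;
    float cost;
    float price;
};

#define REORDER_LEVEL  10   /* assumed threshold, for illustration only */
#define REORDER_AMOUNT 24   /* assumed batch size, for illustration only */

/* Returns 1 if more stock was ordered, 0 if there was enough already */
int reorder(struct item *product)
{
    /* product->number means exactly the same as (*product).number */
    if (product->number < REORDER_LEVEL) {
        product->number += REORDER_AMOUNT;
        return 1;
    }
    return 0;
}
```

Given the item_array from before, reorder(&item_array[4]) would pass the address of the Bacon entry, whose stock of 6 is below the assumed threshold.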

Having discussed that, we can modify our program from before to dynamically allocate memory based on the number of elements contained in the list. We will use malloc() to initially allocate memory for a single element of type struct item, then use fscanf() to read the values into the elements of this structure. For each line that fscanf() reads in before it reports the end of the file (which it signals through its return value, not by reading an “EOF character”), the program increases the size of the dynamically allocated memory by the size of the struct item type using the realloc() function. Then, the elements of the structure array are read out as above.

#include <stdio.h>
#include <stdlib.h>

struct item {
    char name[100];
    int number;
    float cost;
    float price;
};

int main(int argc, char *argv[])
{
    struct item *item_array;
    FILE *ifp;
    int i;
    int array_elements = 0;
    int el_size;

    /* Store the size of a single element */
    el_size = sizeof(struct item);

    /* Start off by allocating a single element */
    item_array = (struct item *) malloc (el_size);
    if (item_array == NULL) {
	printf("Error: Cannot allocate memory\n");
	return -1;
    }

    if (argc != 2) {
	printf("Usage: basic_database <filename>\n");
	return -1;
    } else if ((ifp = fopen(argv[1], "r")) == NULL) {
	printf("Error: Could not open file %s\n", argv[1]);
	return -2;
    } else {
	/* Keep going while fscanf reads all four fields of a line */
	while (fscanf(ifp, "%99s%d%f%f", item_array[array_elements].name,
		      &item_array[array_elements].number,
		      &item_array[array_elements].cost,
		      &item_array[array_elements].price) == 4) {
	    /* Reallocate memory, then increment the elements counter */
	    /* NOTE: We've got a bodge in here. The memory allocated will
	     * always be one element more than we need. */
	    struct item *tmp = (struct item *)
		realloc(item_array, (el_size * ++array_elements) + el_size);
	    if (tmp == NULL) {
		/* The old block is still valid, so release it before leaving */
		printf("Error: Cannot allocate memory\n");
		free(item_array);
		fclose(ifp);
		return -1;
	    }
	    item_array = tmp;
	}
	fclose(ifp);
    }
    for (i = 0; i < array_elements; i++) {
	printf("%s %d %.2f %.2f\n", item_array[i].name, item_array[i].number,
	       item_array[i].cost, item_array[i].price);
    }

    free(item_array);
    return 0;
}
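
The “bodge” mentioned in the comments comes from reading directly into the array, which means there must always be a spare element waiting. One way around it, sketched below, is to read each line into a temporary structure and only grow the array once a complete record has arrived. The function name read_items() and its interface are made up for this example; it relies on the fact that realloc() behaves like malloc() when passed a NULL pointer.

```c
#include <stdio.h>
#include <stdlib.h>

struct item {
    char name[100];
    int number;
    float cost;
    float price;
};

/* Hypothetical helper: reads every record from ifp into a newly allocated
 * array and stores the number of records in *count. Returns NULL if the
 * file holds no records or if memory runs out. The caller frees the array. */
struct item *read_items(FILE *ifp, int *count)
{
    struct item *items = NULL;
    struct item *tmp;
    struct item entry;
    int used = 0;

    *count = 0;
    while (fscanf(ifp, "%99s%d%f%f", entry.name, &entry.number,
                  &entry.cost, &entry.price) == 4) {
        /* Grow the array by exactly one element for the record just read */
        tmp = realloc(items, (used + 1) * sizeof(struct item));
        if (tmp == NULL) {
            free(items);
            return NULL;
        }
        items = tmp;
        items[used] = entry;    /* structure assignment copies all members */
        used++;
    }
    *count = used;
    return items;
}
```

Growing by one element per record keeps the sketch close to the program above; a larger program might grow the array in bigger steps to reduce the number of realloc() calls.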