Handling collection key changes

How do you handle changing values of properties that are used as collection keys in a .NET 1.1 program? I’ve dealt with this problem before in a number of different ways, and none of them are completely satisfactory. Now I am writing the business logic tier of an application and I need to decide whether to take a new approach or not. Let me explain.

Think of a class that represents a business object, such as Book in a bookshop management system. The class Book has several properties that correspond to the attributes of books, such as ISBN, Title and Authors. In addition, all business objects in my application have an integer id, so the class Book has an Id property as well.

Business objects are stored in a database and retrieved from it when needed. Many business logic methods need to return a set of Books, so I create a typed collection to hold books. Let’s call it Books. This typed collection probably inherits from some base class in System.Collections.Specialized, although this is not relevant here. What is relevant is that this Books collection must hold zero, one or more instances of Book, and be able to index them by either an ordinal position (an integer) or the book id. It is easy to use some built-in methods of the classes in System.Collections.Specialized to do this, either through a regular method or an indexer. The Books class would look something like this:

public class Books
{
  //some code...

  public Book this[int id]
  {
    get { /* return book with given id */ }
  }

  public Book GetByIndex(int index)
  {
    /* return book at the given position */
  }

  //some more code...
}

So far so good. Now, I also want my Books collection to be indexed by ISBN. This introduces an issue, which can be explained from an abstract point of view or from a very specific one. From an abstract perspective, the problem is that indexing a collection by a piece of data that has meaning to the end user is, in my experience, dangerous. From a more concrete perspective, ISBN is a property of the class Book (which, of course, has meaning to the end user), and we are trying to use this very same ISBN to index collections of books. In principle, I can add a new indexer to my collection, like this:

public Book this[string isbn]
{
  get { /* return book with given ISBN */ }
}

How do I implement this? Usually, by deriving my Books class from NameObjectCollectionBase or by embedding a Hashtable object in the collection class to associate ISBNs with book objects. Each time a new book is added to the collection, I add an entry to the hashtable, and when a book is removed from the collection, the corresponding entry is also removed from the hashtable. Like this:

private Hashtable m_hashByIsbn = new Hashtable();

public void Add(Book b)
{
  //some code...

  m_hashByIsbn.Add(b.Isbn, b);

  //some more code...
}

public void Remove(Book b)
{
  //some code...

  m_hashByIsbn.Remove(b.Isbn);

  //some more code...
}

This way, I can use the hashtable to retrieve books from the collection by their ISBN, in addition to id and index:

public Book this[string isbn]
{
  get { return (Book)m_hashByIsbn[isbn]; }
}

Still, so far so good. The problem comes when we realise that ISBNs can change. Although this is not common, the ISBN property, as we have discussed, has meaning to the end user, and is therefore under their control. Because mistakes and other weird circumstances do happen, it must be possible for an instance of Book to change its ISBN. That is, the ISBN property of Book must be read/write, not read-only. Let’s assume that the Book class uses a private field to store its ISBN, and the public Isbn property returns and sets that field. Each time client code writes to the property, the current value is discarded and a new one is stored. However, the book instance being modified may be a member of one or more Books collections, which are using the “old” value of the ISBN in the hashtable to index the book instance. Imagine this:

//my collection of books.
Books bs = new Books();
//let's create an instance and add it to the collection.
Book bNew = new Book("0-07-135895-2", "Fire in the Valley");
bs.Add(bNew);

//I used the wrong ISBN, so I need to change it.
bNew.Isbn = "0-07-135895-1";

//let's try to retrieve the book from the collection.
Book bRetrieved = bs["0-07-135895-1"];

When I assign a new value to bNew.Isbn, the ISBN of the book actually changes, but the collection bs, of which bNew is a member, is still using the original ISBN value in the hashtable to index the book. Therefore, when I try to retrieve the book from the collection into bRetrieved using the new value of the ISBN, I get a null value. The book is there, but it is indexed through the old ISBN value.

Of course, any book instance can be a member of zero, one or more book collections at any time, and book instances do not keep track of which collections they belong to. If they did, it would be fairly easy to notify each collection of the ISBN change so that it can update its hashtable. But making Book depend on Books does not look like neat design to me. To start with, Books depends on Book already (by definition!), so adding the new dependency would create a circular dependency between the two classes. Secondly, different collection classes may exist, such as BooksOfPublisher and BooksOfAuthor, which makes everything more complex. I have discarded the option of having the business object class keep track of the collections it belongs to. The only solution I can think of is to use an event-based mechanism to let collections know about key changes.

My current implementation makes the Book class raise an IsbnChanged event each time the Isbn property changes. This is quite generic and can also be useful for updating user interface elements, for example. When a book is added to a collection, the collection subscribes to the book’s event, and when a book is removed from a collection, the collection unsubscribes. This way, the collection is notified every time a member book changes its ISBN and can re-key its hashtable entry. Events rely on multicast delegates invoking their handlers sequentially, which has its risks (for example, a handler that throws an exception prevents the remaining handlers from running), but I think this approach is good enough for my needs. What do you think? Are there other approaches? Can you see any issues with my approach?
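The event-based mechanism described above can be sketched roughly as follows, in .NET 1.1-style C# (no generics). This is only an illustration under stated assumptions: IsbnChangedEventArgs, IsbnChangedEventHandler, OldIsbn and OnBookIsbnChanged are hypothetical names, and indexing by id and by ordinal position is left out.

```csharp
using System;
using System.Collections;

// Hypothetical event payload: carries the ISBN the book had before the change,
// so that a collection can find the stale hashtable entry.
public class IsbnChangedEventArgs : EventArgs
{
    private readonly string m_oldIsbn;
    public IsbnChangedEventArgs(string oldIsbn) { m_oldIsbn = oldIsbn; }
    public string OldIsbn { get { return m_oldIsbn; } }
}

public delegate void IsbnChangedEventHandler(object sender, IsbnChangedEventArgs e);

public class Book
{
    private string m_isbn;
    private string m_title;

    public event IsbnChangedEventHandler IsbnChanged;

    public Book(string isbn, string title) { m_isbn = isbn; m_title = title; }

    public string Title { get { return m_title; } }

    public string Isbn
    {
        get { return m_isbn; }
        set
        {
            if (value == m_isbn) return;   // unchanged: nothing to announce
            string old = m_isbn;
            m_isbn = value;
            if (IsbnChanged != null)
                IsbnChanged(this, new IsbnChangedEventArgs(old));
        }
    }
}

public class Books
{
    private Hashtable m_hashByIsbn = new Hashtable();

    public void Add(Book b)
    {
        m_hashByIsbn.Add(b.Isbn, b);
        b.IsbnChanged += new IsbnChangedEventHandler(OnBookIsbnChanged);
    }

    public void Remove(Book b)
    {
        b.IsbnChanged -= new IsbnChangedEventHandler(OnBookIsbnChanged);
        m_hashByIsbn.Remove(b.Isbn);
    }

    public Book this[string isbn]
    {
        get { return (Book)m_hashByIsbn[isbn]; }
    }

    // Re-keys the member book: drop the entry under the old ISBN, add it under the new one.
    private void OnBookIsbnChanged(object sender, IsbnChangedEventArgs e)
    {
        Book b = (Book)sender;
        m_hashByIsbn.Remove(e.OldIsbn);
        m_hashByIsbn.Add(b.Isbn, b);
    }
}
```

One thing this sketch does not address: if the new ISBN is already used by another member of the same collection, Hashtable.Add throws an ArgumentException from inside the property setter, after the book’s field has already changed. A real implementation would have to decide how to handle that collision.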

Things get more complex. I have also implemented dual- and triple-keyed collections, which index their members by two and three keys, respectively. For example, consider the Country and PhoneArea classes. Countries have a Prefix property, which is the phone prefix you need to dial to call that country, such as 61 for Australia or 34 for Spain. The PhoneArea class represents a phone prefix area within a country. For example, in Australia, Sydney is 02 and Brisbane is 07. This class also has a Prefix property, which again is the phone prefix for that particular area. For example:

Country c1 = new Country("Australia", 61);
Country c2 = new Country("Spain", 34);
PhoneArea pa1 = new PhoneArea(c1, "Sydney", 02);
PhoneArea pa2 = new PhoneArea(c1, "Brisbane", 07);

As you can see, a relationship exists between the two classes. Every phone area belongs to exactly one country. The Country class has a PhoneAreas collection member to expose this:

Countries cs = RetrieveAllCountries();
PhoneArea pa = cs[61].PhoneAreas[02];

In this example, I retrieve all the countries as a country collection and then index the collection by prefix to obtain a particular country. In the same line, I use the PhoneAreas property to navigate to the phone area collection of that country, and index it with the phone area prefix to obtain a particular phone area.

Now, sometimes I need to manage a list of phone areas regardless of the country they belong to. And I need a linear list of phone areas; I do not want a hierarchical list of countries with nested phone areas. So I need a new collection class, call it PhoneAreasGlobal, which holds phone areas very much like PhoneAreas. The difference is that PhoneAreasGlobal cannot use the phone prefix to index its members, because phone area prefixes are not guaranteed to be unique across countries. So, it needs to use a combination of the country phone prefix and the area prefix. Like this:

PhoneAreasGlobal pasg = RetrieveAllPhoneAreas();
PhoneArea pa = pasg[61, 02];

In this example, I first obtain a collection of all phone areas, and then retrieve a single phone area from the collection using a double key: the country phone prefix plus the phone area prefix. Internally, this collection just merges both integers to obtain a single integer that is used as a unique index. The problem is that this collection must be notified when a member phone area changes its prefix code (just as book collections are notified when member books change their ISBN), and also when any country changes its prefix code. Although the collection does not store country instances at all, it must also subscribe to their prefix change event. So, each time a phone area is added to a phone area global collection, the collection subscribes to the phone area’s prefix change event and to the prefix change event of the phone area’s country. Unsubscribing works similarly. Triple-keyed collections follow the same principles, and are useful for hierarchies three levels deep.
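As a rough sketch of such a dual-keyed collection, again in .NET 1.1-style C#: the MakeKey helper, the PrefixChanged event name and the factor of 10000 (which assumes area prefixes are non-negative and below 10000) are all assumptions made for illustration. To keep it short, the handler rebuilds the whole index on any prefix change instead of re-keying individual entries, which is a simplification and not necessarily how the collection described above works.

```csharp
using System;
using System.Collections;

public class Country
{
    private string m_name;
    private int m_prefix;

    public event EventHandler PrefixChanged;

    public Country(string name, int prefix) { m_name = name; m_prefix = prefix; }

    public string Name { get { return m_name; } }

    public int Prefix
    {
        get { return m_prefix; }
        set
        {
            if (value == m_prefix) return;
            m_prefix = value;
            if (PrefixChanged != null) PrefixChanged(this, EventArgs.Empty);
        }
    }
}

public class PhoneArea
{
    private Country m_country;
    private string m_name;
    private int m_prefix;

    public event EventHandler PrefixChanged;

    public PhoneArea(Country country, string name, int prefix)
    {
        m_country = country; m_name = name; m_prefix = prefix;
    }

    public Country Country { get { return m_country; } }
    public string Name { get { return m_name; } }

    public int Prefix
    {
        get { return m_prefix; }
        set
        {
            if (value == m_prefix) return;
            m_prefix = value;
            if (PrefixChanged != null) PrefixChanged(this, EventArgs.Empty);
        }
    }
}

public class PhoneAreasGlobal
{
    private Hashtable m_hashByKey = new Hashtable();

    // Merges both prefixes into one integer (assumes 0 <= areaPrefix < 10000).
    private static int MakeKey(int countryPrefix, int areaPrefix)
    {
        return countryPrefix * 10000 + areaPrefix;
    }

    public void Add(PhoneArea pa)
    {
        m_hashByKey.Add(MakeKey(pa.Country.Prefix, pa.Prefix), pa);
        // Subscribe to both halves of the key: the area and its country.
        pa.PrefixChanged += new EventHandler(OnPrefixChanged);
        pa.Country.PrefixChanged += new EventHandler(OnPrefixChanged);
    }

    public void Remove(PhoneArea pa)
    {
        pa.PrefixChanged -= new EventHandler(OnPrefixChanged);
        pa.Country.PrefixChanged -= new EventHandler(OnPrefixChanged);
        m_hashByKey.Remove(MakeKey(pa.Country.Prefix, pa.Prefix));
    }

    public PhoneArea this[int countryPrefix, int areaPrefix]
    {
        get { return (PhoneArea)m_hashByKey[MakeKey(countryPrefix, areaPrefix)]; }
    }

    // Simplification: rebuild the whole index from the current prefixes.
    private void OnPrefixChanged(object sender, EventArgs e)
    {
        ArrayList members = new ArrayList(m_hashByKey.Values);
        m_hashByKey.Clear();
        foreach (PhoneArea pa in members)
            m_hashByKey.Add(MakeKey(pa.Country.Prefix, pa.Prefix), pa);
    }
}
```

Because the collection subscribes to the country’s event once per member area, the handler runs once for each member of a country whose prefix changes. Rebuilding from scratch makes those repeated calls harmless, at the cost of a full pass over the collection on every change.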

What do you think of this approach? Any comments? Thanks!


1 Response to “Handling collection key changes”

  1. 1 rp 21 November 2005 at 1:43

    I think you already provide the answer to your question: don’t use mutable collection keys at all. My own collection classes don’t even expose keys at all: you can iterate over the collection, but you cannot take the index value of an element.
